From patchwork Mon Dec 5 22:16:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Reid X-Patchwork-Id: 39619 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a21:999a:b0:a4:2148:650a with SMTP id ve26csp3609404pzb; Mon, 5 Dec 2022 14:17:09 -0800 (PST) X-Google-Smtp-Source: AA0mqf7poydFm0M3BIrKd3JegbGxyvBe6bf73DdSdN1IER+6AQff91/A6ZxC1UcIeNVd6W1kH/Aa X-Received: by 2002:a17:906:5a8a:b0:7c1:39:5474 with SMTP id l10-20020a1709065a8a00b007c100395474mr3131031ejq.758.1670278629608; Mon, 05 Dec 2022 14:17:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670278629; cv=none; d=google.com; s=arc-20160816; b=DFEGURnL6sir5uexSGf69hrbaanN0JXklYpztakbhFhE0ba8HP6XF6ptk2FBktUD4a AH9Izz0TG4lTWnw5TPYXP3jLmkf3eHE4w0D2iHoCBN7FGIETwO/WAa8BuUtC5Abfq3Ni 1remGYlJ5hYrM1CpJhV7MjWSkakiV4ucnuuleDyqZQbrgFntJW2DoS3zpJkksi/uorem 6OPAAGHWbR0LQCDmTXpX6De3jm4abV30MZhb2/K68PM6hLhzHe2yKVgE+HsMu/AVrC0p QUcW1xg6ePcYH8pdiYYPqb4w4DqHfy77fH6GUr5tZ+t3hJIHUOXjtCqDlYpX4NsJHC3e QHpA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=ZxpdLg3/Dz68+vC4yJ7jhAcNkCSCcxVDOkACZ0rYf2I=; b=eq9uzuVh4aIDFmK0V5F8AFz64DaWxRKrDWMjxnu9R1p6gQbtgS6N4uKB9Gq4heJGvc +DRayvZ/5GLGrVMefDMp4MwixFiqowAmvMv6F1/2hiletWvkH8viXFczu4TvaPJISDM8 HV6E2eh2PStaLA7taxThIa/nq957PKW0drLIsjR52xVZA3l6M86i/LRijvFR+o2S+qUF 3pf82729QhzrfQ6tSR+UJY9prizQtPrGx4tdOCin8uvzuz7uFHqGmwRbZg94vT01T0dl XGHJYJa40q2xYXUGCdUViH3zaxoCVMzrva8RxplSfWVTV/KiM4mo2O2pDGKS4APssc1+ JWZg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20210112 header.b=dgTDkYck; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id qb28-20020a1709077e9c00b007c0f114b370si3581655ejc.955.2022.12.05.14.17.09; Mon, 05 Dec 2022 14:17:09 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20210112 header.b=dgTDkYck; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BFB0868B9F2; Tue, 6 Dec 2022 00:17:05 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pl1-f173.google.com (mail-pl1-f173.google.com [209.85.214.173]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id DB4F068BB37 for ; Tue, 6 Dec 2022 00:16:58 +0200 (EET) Received: by mail-pl1-f173.google.com with SMTP id jl24so12155684plb.8 for ; Mon, 05 Dec 2022 14:16:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=byysOls1OPOcvHksnWGgZd6E9GrdSAcJX1FZiN2305E=; b=dgTDkYckfLalzwZSIBcak8xtFI6ch7McjvqWyST56RVepVlJfnUx+r53IM+qlB7JTi nm3FShQL3Re1LPbhv0riLWhazj54LV9TTSQr1bkV0g3LtEx9UEb4hOz/qUinT/fXj2oU S2yQ9NkRjyIDXa1sftQ8pNUHUy2/lslMWO8y3hlhmb//P5pLXCNtQys6UchpXl8is/9U WtdsSx+fUTOtYs+cupvDamwNdyPuQ2bSp9Kelc0acz2sxv+E1CiCKHo1M0Nf2AdGBNCv et9eNLGkJjvtDdeWXOTPfNDjHtBnWZcBWx+TEUI3jnYGUPjdsd/SccrWeE++G/Ckiz9T 68og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=byysOls1OPOcvHksnWGgZd6E9GrdSAcJX1FZiN2305E=; b=eTw5ADhuP5jAztdjeyEvyDCcRrqP9S5A0ETAaq54DcT4JuZBDqR+Om6M14UbZ9WYzK rTg20JvO2DXgY8pggi0QL8/pXZDF5Z+nIKOwOuElN52BCGr695NmyBrxLpK79sPZhJk9 H2tZ83t1AnW2hFoVihUAttvbpWhIW3BUCDtUxtNQeZ81RckwdYW1SWBpkq3n/YR8sfvA MaIKOFVIwoWJLH6dIIw5fFPMSpurfxRWYMJ9ZI7rxMF2nh6JNJ+gCVIWMvY8FEmNeJ7T 8lgEVLyMpgIJoTBpFf69OUayd/5SpbJ7MYZTcYWY77eiCzePLMaGEPPFMd/ZABsi+LY7 EaYw== X-Gm-Message-State: ANoB5pl/p8Rzc0d6biBT9Us84aG4iRFJF9LrbI0LerA/wN42WP7lmMSW I7N2r1LCJ4RF9cmh48ylrwmXndT9t2U= X-Received: by 2002:a17:903:234c:b0:189:dfa8:b7b2 with SMTP id c12-20020a170903234c00b00189dfa8b7b2mr4032934plh.168.1670278616685; Mon, 05 Dec 2022 14:16:56 -0800 (PST) Received: from localhost.localdomain (S0106bc4dfba470f3.vc.shawcable.net. [174.7.244.175]) by smtp.gmail.com with ESMTPSA id b4-20020a170902650400b00176dc67df44sm11097913plk.132.2022.12.05.14.16.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Dec 2022 14:16:56 -0800 (PST) From: mindmark@gmail.com To: ffmpeg-devel@ffmpeg.org Date: Mon, 5 Dec 2022 14:16:41 -0800 Message-Id: <20221205221641.1215-2-mindmark@gmail.com> X-Mailer: git-send-email 2.31.1.windows.1 In-Reply-To: <20221205221641.1215-1-mindmark@gmail.com> References: <20221205221641.1215-1-mindmark@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v2 2/2] libswscale: add AVBSwapDSPContext and use X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Mark Reid Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: cvCf2VjEHcKS From: Mark Reid There are some places in input.c that could use it too but they aren't currently being pass the SwsContext --- libswscale/output.c | 36 +++++++++++++++-------------------- libswscale/swscale_internal.h | 3 +++ libswscale/swscale_unscaled.c | 26 +++++++++---------------- libswscale/utils.c | 2 ++ 4 files changed, 29 insertions(+), 38 deletions(-) diff --git a/libswscale/output.c b/libswscale/output.c index 5c85bff971..cd44081e3d 100644 --- a/libswscale/output.c +++ b/libswscale/output.c @@ -2313,13 +2313,11 @@ yuv2gbrp_full_X_c(SwsContext *c, const int16_t *lumFilter, } } if (SH != 22 && (!isBE(c->dstFormat)) != (!HAVE_BIGENDIAN)) { - for (i = 0; i < dstW; i++) { - dest16[0][i] = av_bswap16(dest16[0][i]); - dest16[1][i] = av_bswap16(dest16[1][i]); - dest16[2][i] = av_bswap16(dest16[2][i]); - if (hasAlpha) - dest16[3][i] = av_bswap16(dest16[3][i]); - } + c->bsdsp.bswap16_buf(dest16[0], dest16[0], dstW); + c->bsdsp.bswap16_buf(dest16[1], dest16[1], dstW); + c->bsdsp.bswap16_buf(dest16[2], dest16[2], dstW); + if (hasAlpha) + c->bsdsp.bswap16_buf(dest16[3], dest16[3], dstW); } } @@ -2385,13 +2383,11 @@ yuv2gbrp16_full_X_c(SwsContext *c, const int16_t *lumFilter, dest16[3][i] = av_clip_uintp2(A, 30) >> 14; } if ((!isBE(c->dstFormat)) != (!HAVE_BIGENDIAN)) { - for (i = 0; i < dstW; i++) { - dest16[0][i] = av_bswap16(dest16[0][i]); - dest16[1][i] = av_bswap16(dest16[1][i]); - dest16[2][i] = av_bswap16(dest16[2][i]); - if (hasAlpha) - dest16[3][i] = av_bswap16(dest16[3][i]); - } + c->bsdsp.bswap16_buf(dest16[0], dest16[0], dstW); + c->bsdsp.bswap16_buf(dest16[1], dest16[1], dstW); + c->bsdsp.bswap16_buf(dest16[2], dest16[2], dstW); + if (hasAlpha) + c->bsdsp.bswap16_buf(dest16[3], dest16[3], dstW); } } @@ -2461,13 +2457,11 @@ yuv2gbrpf32_full_X_c(SwsContext *c, const int16_t *lumFilter, dest32[3][i] = av_float2int(float_mult * (float)(av_clip_uintp2(A, 30) >> 14)); } if ((!isBE(c->dstFormat)) != (!HAVE_BIGENDIAN)) { - for (i = 0; i < dstW; i++) { - dest32[0][i] = av_bswap32(dest32[0][i]); - dest32[1][i] = av_bswap32(dest32[1][i]); - dest32[2][i] = av_bswap32(dest32[2][i]); - if (hasAlpha) - dest32[3][i] = av_bswap32(dest32[3][i]); - } + c->bsdsp.bswap32_buf(dest32[0], dest32[0], dstW); + c->bsdsp.bswap32_buf(dest32[1], dest32[1], dstW); + c->bsdsp.bswap32_buf(dest32[2], dest32[2], dstW); + if (hasAlpha) + c->bsdsp.bswap32_buf(dest32[3], dest32[3], dstW); } } diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index abeebbb002..400f0bc8ed 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -26,6 +26,7 @@ #include "config.h" #include "libavutil/avassert.h" +#include "libavutil/bswapdsp.h" #include "libavutil/common.h" #include "libavutil/frame.h" #include "libavutil/intreadwrite.h" @@ -682,6 +683,8 @@ typedef struct SwsContext { atomic_int data_unaligned_warned; Half2FloatTables *h2f_tables; + + AVBSwapDSPContext bsdsp; } SwsContext; //FIXME check init (where 0) diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index 9af2e7ecc3..0010ab24d1 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -468,7 +468,7 @@ static int bswap_16bpc(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]) { - int i, j, p; + int i, p; for (p = 0; p < 4; p++) { int srcstr = srcStride[p] / 2; @@ -480,9 +480,7 @@ static int bswap_16bpc(SwsContext *c, const uint8_t *src[], continue; dstPtr += (srcSliceY >> c->chrDstVSubSample) * dststr; for (i = 0; i < (srcSliceH >> c->chrDstVSubSample); i++) { - for (j = 0; j < min_stride; j++) { - dstPtr[j] = av_bswap16(srcPtr[j]); - } + c->bsdsp.bswap16_buf(dstPtr, srcPtr, min_stride); srcPtr += srcstr; dstPtr += dststr; } @@ -495,7 +493,7 @@ static int bswap_32bpc(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]) { - int i, j, p; + int i, p; for (p = 0; p < 4; p++) { int srcstr = srcStride[p] / 4; @@ -507,9 +505,7 @@ static int bswap_32bpc(SwsContext *c, const uint8_t *src[], continue; dstPtr += (srcSliceY >> c->chrDstVSubSample) * dststr; for (i = 0; i < (srcSliceH >> c->chrDstVSubSample); i++) { - for (j = 0; j < min_stride; j++) { - dstPtr[j] = av_bswap32(srcPtr[j]); - } + c->bsdsp.bswap32_buf(dstPtr, srcPtr, min_stride); srcPtr += srcstr; dstPtr += dststr; } @@ -1616,19 +1612,17 @@ static int rgbToRgbWrapper(SwsContext *c, const uint8_t *src[], int srcStride[], conv(srcPtr, dstPtr + dstStride[0] * srcSliceY, (srcSliceH - 1) * srcStride[0] + c->srcW * srcBpp); else { - int i, j; + int i; dstPtr += dstStride[0] * srcSliceY; for (i = 0; i < srcSliceH; i++) { if(src_bswap) { - for(j=0; jsrcW; j++) - ((uint16_t*)c->formatConvBuffer)[j] = av_bswap16(((uint16_t*)srcPtr)[j]); + c->bsdsp.bswap16_buf((uint16_t*)c->formatConvBuffer, (uint16_t*)srcPtr, c->srcW); conv(c->formatConvBuffer, dstPtr, c->srcW * srcBpp); }else conv(srcPtr, dstPtr, c->srcW * srcBpp); if(dst_bswap) - for(j=0; jsrcW; j++) - ((uint16_t*)dstPtr)[j] = av_bswap16(((uint16_t*)dstPtr)[j]); + c->bsdsp.bswap16_buf((uint16_t*)dstPtr, (uint16_t*)dstPtr, c->srcW); srcPtr += srcStride[0]; dstPtr += dstStride[0]; } @@ -1932,16 +1926,14 @@ static int planarCopyWrapper(SwsContext *c, const uint8_t *src[], isBE(c->srcFormat) != isBE(c->dstFormat)) { for (i = 0; i < height; i++) { - for (j = 0; j < length; j++) - ((uint16_t *) dstPtr)[j] = av_bswap16(((const uint16_t *) srcPtr)[j]); + c->bsdsp.bswap16_buf((uint16_t *)dstPtr, (const uint16_t *)srcPtr, length); srcPtr += srcStride[plane]; dstPtr += dstStride[plane]; } } else if (isFloat(c->srcFormat) && isFloat(c->dstFormat) && isBE(c->srcFormat) != isBE(c->dstFormat)) { /* swap float plane */ for (i = 0; i < height; i++) { - for (j = 0; j < length; j++) - ((uint32_t *) dstPtr)[j] = av_bswap32(((const uint32_t *) srcPtr)[j]); + c->bsdsp.bswap32_buf((uint32_t *)dstPtr, (const uint32_t *)srcPtr, length); srcPtr += srcStride[plane]; dstPtr += dstStride[plane]; } diff --git a/libswscale/utils.c b/libswscale/utils.c index 90734f66ef..0514062d85 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -1921,6 +1921,8 @@ static av_cold int sws_init_single_context(SwsContext *c, SwsFilter *srcFilter, return 0; } + av_bswapdsp_init(&c->bsdsp); + /* unscaled special cases */ if (unscaled && !usesHFilter && !usesVFilter && (c->srcRange == c->dstRange || isAnyRGB(dstFormat) ||