From patchwork Sun Apr 26 02:37:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nelson Gomez X-Patchwork-Id: 19245 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 9AD7644949E for ; Sun, 26 Apr 2020 05:37:18 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8551C68C574; Sun, 26 Apr 2020 05:37:18 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id EA0AA68C500 for ; Sun, 26 Apr 2020 05:37:09 +0300 (EEST) Received: from linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net (linux.microsoft.com [13.77.154.182]) by linux.microsoft.com (Postfix) with ESMTPSA id AD6CE20B4749 for ; Sat, 25 Apr 2020 19:37:08 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com AD6CE20B4749 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1587868628; bh=QGHnO0oPEMabRdTqBeROwi1IYh7cXOTBqFOILMNHYQ8=; h=From:To:Subject:Date:In-Reply-To:References:From; b=mXPUmYY4puAgnZMCpapObq8WD0Jp5RPIm+b9/B7we2mRTE8edCFp+l/4VzGDfd6Ou IFleHZdJPXqbQ6gsG5yxvcASNfkP0QMtLJhznZihXzUrSc0SpdO0ibo13U+APbx+4q e6mPLA3YtKlb0EKKQ2GZzXnWDs56woYV1/Ak1ViQ= From: Nelson Gomez To: ffmpeg-devel@ffmpeg.org Date: Sat, 25 Apr 2020 19:37:01 -0700 Message-Id: <1587868623-97667-2-git-send-email-negomez@linux.microsoft.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1587868623-97667-1-git-send-email-negomez@linux.microsoft.com> References: <1587868623-97667-1-git-send-email-negomez@linux.microsoft.com> Subject: [FFmpeg-devel] [PATCH v3 1/3] swscale: make yuv2interleavedX more asm-friendly X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches MIME-Version: 1.0 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Nelson Gomez Extracting information from SwsContext in assembly is difficult, and rearranging SwsContext just for asm access didn't look good. These functions only need a couple of fields from it anyway, so just make them parameters in their own right. Signed-off-by: Nelson Gomez --- libswscale/output.c | 12 +++++------- libswscale/swscale_internal.h | 5 +++-- libswscale/vscale.c | 2 +- 3 files changed, 9 insertions(+), 10 deletions(-) diff --git a/libswscale/output.c b/libswscale/output.c index 68f43ffba3..2e5d6076ab 100644 --- a/libswscale/output.c +++ b/libswscale/output.c @@ -180,7 +180,7 @@ yuv2planeX_16_c_template(const int16_t *filter, int filterSize, } } -static void yuv2p016cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterSize, +static void yuv2p016cX_c(enum AVPixelFormat dstFormat, const uint8_t *chrDither, const int16_t *chrFilter, int chrFilterSize, const int16_t **chrUSrc, const int16_t **chrVSrc, uint8_t *dest8, int chrDstW) { @@ -188,7 +188,7 @@ static void yuv2p016cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS const int32_t **uSrc = (const int32_t **)chrUSrc; const int32_t **vSrc = (const int32_t **)chrVSrc; int shift = 15; - int big_endian = c->dstFormat == AV_PIX_FMT_P016BE; + int big_endian = dstFormat == AV_PIX_FMT_P016BE; int i, j; for (i = 0; i < chrDstW; i++) { @@ -402,12 +402,10 @@ static void yuv2plane1_8_c(const int16_t *src, uint8_t *dest, int dstW, } } -static void yuv2nv12cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterSize, +static void yuv2nv12cX_c(enum AVPixelFormat dstFormat, const uint8_t *chrDither, const int16_t *chrFilter, int chrFilterSize, const int16_t **chrUSrc, const int16_t **chrVSrc, uint8_t *dest, int chrDstW) { - enum AVPixelFormat dstFormat = c->dstFormat; - const uint8_t *chrDither = c->chrDither8; int i; if (dstFormat == AV_PIX_FMT_NV12 || @@ -477,13 +475,13 @@ static void yuv2p010lX_c(const int16_t *filter, int filterSize, } } -static void yuv2p010cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterSize, +static void yuv2p010cX_c(enum AVPixelFormat dstFormat, const uint8_t *chrDither, const int16_t *chrFilter, int chrFilterSize, const int16_t **chrUSrc, const int16_t **chrVSrc, uint8_t *dest8, int chrDstW) { uint16_t *dest = (uint16_t*)dest8; int shift = 17; - int big_endian = c->dstFormat == AV_PIX_FMT_P010BE; + int big_endian = dstFormat == AV_PIX_FMT_P010BE; int i, j; for (i = 0; i < chrDstW; i++) { diff --git a/libswscale/swscale_internal.h b/libswscale/swscale_internal.h index 9dda53eead..42fd87e887 100644 --- a/libswscale/swscale_internal.h +++ b/libswscale/swscale_internal.h @@ -119,7 +119,8 @@ typedef void (*yuv2planarX_fn)(const int16_t *filter, int filterSize, * Write one line of horizontally scaled chroma to interleaved output * with multi-point vertical scaling between input pixels. * - * @param c SWS scaling context + * @param dstFormat destination pixel format + * @param chrDither ordered dither array of type uint8_t and size 8 * @param chrFilter vertical chroma scaling coefficients, 12 bits [0,4096] * @param chrUSrc scaled chroma (U) source data, 15 bits for 8-10-bit * output, 19 bits for 16-bit output (in int32_t) @@ -130,7 +131,7 @@ typedef void (*yuv2planarX_fn)(const int16_t *filter, int filterSize, * output, this is in uint16_t * @param dstW width of chroma planes */ -typedef void (*yuv2interleavedX_fn)(struct SwsContext *c, +typedef void (*yuv2interleavedX_fn)(enum AVPixelFormat dstFormat, const uint8_t *chrDither, const int16_t *chrFilter, int chrFilterSize, const int16_t **chrUSrc, diff --git a/libswscale/vscale.c b/libswscale/vscale.c index 72352dedb3..cac85921da 100644 --- a/libswscale/vscale.c +++ b/libswscale/vscale.c @@ -85,7 +85,7 @@ static int chr_planar_vscale(SwsContext *c, SwsFilterDescriptor *desc, int slice uint16_t *filter = inst->filter[0] + (inst->isMMX ? 0 : chrSliceY * inst->filter_size); if (c->yuv2nv12cX) { - ((yuv2interleavedX_fn)inst->pfn)(c, filter, inst->filter_size, (const int16_t**)src1, (const int16_t**)src2, dst1[0], dstW); + ((yuv2interleavedX_fn)inst->pfn)(c->dstFormat, c->chrDither8, filter, inst->filter_size, (const int16_t**)src1, (const int16_t**)src2, dst1[0], dstW); } else if (inst->filter_size == 1) { ((yuv2planar1_fn)inst->pfn)((const int16_t*)src1[0], dst1[0], dstW, c->chrDither8, 0); ((yuv2planar1_fn)inst->pfn)((const int16_t*)src2[0], dst2[0], dstW, c->chrDither8, 3);