From patchwork Fri Jan 20 15:20:47 2023
X-Patchwork-Submitter: Tomas Härdin
X-Patchwork-Id: 40078
From: Tomas Härdin
To: FFmpeg development discussions and patches
Date: Fri, 20 Jan 2023 16:20:47 +0100
Subject: [FFmpeg-devel] sws/swscale_unscaled.c: Faster yuv422p10 -> yuv422p conversion

I have in mind a more general solution that handles 9, 12, 14 and 16-bit
too, and 444p and maybe 420p

/Tomas

From 99cc73053cc9a544ae923e5c8e3f4686f3c05454 Mon Sep 17 00:00:00 2001
From: Tomas Härdin
Date: Wed, 18 Jan 2023 17:28:53 +0100
Subject: [PATCH] sws/swscale_unscaled.c: Faster yuv422p10 -> yuv422p conversion

Based on work by Paul B Mahol.
---
 libswscale/swscale_unscaled.c | 46 +++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 9af2e7ecc3..6c71ecb34d 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -371,6 +371,50 @@ static int yuv422pToUyvyWrapper(SwsContext *c, const uint8_t *src[],
     return srcSliceH;
 }
 
+static int yuv422p10ToYuv422p(SwsContext *c, const uint8_t *src[],
+                              int srcStride[], int srcSliceY, int srcSliceH,
+                              uint8_t *dstParam[], int dstStride[])
+{
+    const uint16_t *ysrc = (const uint16_t *)(src[0]);
+    const uint16_t *usrc = (const uint16_t *)(src[1]);
+    const uint16_t *vsrc = (const uint16_t *)(src[2]);
+
+    uint8_t *ydst = dstParam[0] + dstStride[0] * srcSliceY;
+    uint8_t *udst = dstParam[1] + dstStride[1] * srcSliceY;
+    uint8_t *vdst = dstParam[2] + dstStride[2] * srcSliceY;
+
+    for (int y = 0; y < srcSliceH; y++) {
+        int x = 0;
+
+#define BLOCK 4
+        for (; x < (c->dstW / 2 / BLOCK)*BLOCK; x += BLOCK) {
+            for (int x2 = x; x2 < x + BLOCK; x2++) {
+                ydst[2*x2+0] = ysrc[2*x2+0] >> 2;
+                ydst[2*x2+1] = ysrc[2*x2+1] >> 2;
+                udst[x2]     = usrc[x2] >> 2;
+                vdst[x2]     = vsrc[x2] >> 2;
+            }
+        }
+
+        for (; x < c->dstW / 2; x++) {
+            ydst[2*x+0] = ysrc[2*x+0] >> 2;
+            ydst[2*x+1] = ysrc[2*x+1] >> 2;
+            udst[x]     = usrc[x] >> 2;
+            vdst[x]     = vsrc[x] >> 2;
+        }
+
+        ysrc += srcStride[0] / 2;
+        usrc += srcStride[1] / 2;
+        vsrc += srcStride[2] / 2;
+
+        ydst += dstStride[0];
+        udst += dstStride[1];
+        vdst += dstStride[2];
+    }
+
+    return srcSliceH;
+}
+
 static int yuyvToYuv420Wrapper(SwsContext *c, const uint8_t *src[],
                                int srcStride[], int srcSliceY, int srcSliceH,
                                uint8_t *dstParam[], int dstStride[])
@@ -2223,6 +2267,8 @@ void ff_get_unscaled_swscale(SwsContext *c)
         c->convert_unscaled = planarCopyWrapper;
     }
 
+    if (srcFormat == AV_PIX_FMT_YUV422P10 && dstFormat == AV_PIX_FMT_YUV422P)
+        c->convert_unscaled = yuv422p10ToYuv422p;
 #if ARCH_PPC
     ff_get_unscaled_swscale_ppc(c);
 #elif ARCH_ARM
-- 
2.30.2
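
The note at the top of this mail mentions a more general solution covering
other bit depths and chroma layouts. As a rough illustration only, the sketch
below shows one way the per-sample shift could be parameterised by bit depth
and applied per plane. It is standalone C rather than libswscale code. The
names shift_row_to_8bit and shift_plane_to_8bit and the depth parameter are
hypothetical, and the sketch assumes native-endian samples stored in uint16_t
and plain truncation (no dithering or rounding), as in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch or of libswscale: truncate one
 * row of native-endian high-bit-depth samples (stored in uint16_t) to 8 bits.
 * `depth` is the source bit depth (9..16); the shift is depth - 8, which is
 * the ">> 2" used in the patch when depth == 10.
 */
static void shift_row_to_8bit(uint8_t *dst, const uint16_t *src,
                              size_t width, int depth)
{
    assert(depth > 8 && depth <= 16);
    const int shift = depth - 8;

    for (size_t x = 0; x < width; x++)
        dst[x] = (uint8_t)(src[x] >> shift);
}

/*
 * Hypothetical plane wrapper: strides are in bytes, as in the patch, so the
 * source pointer advances by src_stride / 2 uint16_t elements per row.
 */
static void shift_plane_to_8bit(uint8_t *dst, ptrdiff_t dst_stride,
                                const uint16_t *src, ptrdiff_t src_stride,
                                size_t width, size_t height, int depth)
{
    for (size_t y = 0; y < height; y++) {
        shift_row_to_8bit(dst, src, width, depth);
        dst += dst_stride;
        src += src_stride / 2;
    }
}
```

Under these assumptions the chroma layout only changes the plane dimensions
passed in: 444p would use the luma width and height for all three planes,
422p would halve the chroma width, and 420p would halve both chroma width and
height. Whether a plain per-plane loop like this is as fast as the interleaved
loop in the patch would have to be measured.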