From patchwork Sun Aug 20 15:10:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Cox
X-Patchwork-Id: 43272
From: John Cox
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 20 Aug 2023 15:10:17 +0000
Message-Id: <20230820151022.2204421-2-jc@kynesim.co.uk>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230820151022.2204421-1-jc@kynesim.co.uk>
References: <20230820151022.2204421-1-jc@kynesim.co.uk>
Subject: [FFmpeg-devel] [PATCH v1 1/6] fate-filter-fps: Set swscale bitexact for tests that do conversions
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: John Cox
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

-bitexact as a general flag doesn't affect swscale so add swscale option
too to get correct CRCs in all circumstances.

Signed-off-by: John Cox
---
 tests/fate/filter-video.mak | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/fate/filter-video.mak b/tests/fate/filter-video.mak
index 789ec6414c..811a96d124 100644
--- a/tests/fate/filter-video.mak
+++ b/tests/fate/filter-video.mak
@@ -391,8 +391,8 @@ fate-filter-fps-start-drop: CMD = framecrc -lavfi testsrc2=r=7:d=3.5,fps=3:start
 fate-filter-fps-start-fill: CMD = framecrc -lavfi testsrc2=r=7:d=1.5,setpts=PTS+14,fps=3:start_time=1.5
 
 FATE_FILTER_SAMPLES-$(call FILTERDEMDEC, FPS SCALE, MOV, QTRLE) += fate-filter-fps-cfr fate-filter-fps
-fate-filter-fps-cfr: CMD = framecrc -auto_conversion_filters -i $(TARGET_SAMPLES)/qtrle/apple-animation-variable-fps-bug.mov -r 30 -vsync cfr -pix_fmt yuv420p
-fate-filter-fps: CMD = framecrc -auto_conversion_filters -i $(TARGET_SAMPLES)/qtrle/apple-animation-variable-fps-bug.mov -vf fps=30 -pix_fmt yuv420p
+fate-filter-fps-cfr: CMD = framecrc -auto_conversion_filters -i $(TARGET_SAMPLES)/qtrle/apple-animation-variable-fps-bug.mov -r 30 -vsync cfr -vf scale=sws_flags=bitexact -pix_fmt yuv420p
+fate-filter-fps: CMD = framecrc -auto_conversion_filters -i $(TARGET_SAMPLES)/qtrle/apple-animation-variable-fps-bug.mov -vf fps=30,scale=sws_flags=bitexact -pix_fmt yuv420p
 
 FATE_FILTER_ALPHAEXTRACT_ALPHAMERGE := $(addprefix fate-filter-alphaextract_alphamerge_, rgb yuv)
 FATE_FILTER_VSYNTH_PGMYUV-$(call ALLYES, SCALE_FILTER FORMAT_FILTER SPLIT_FILTER ALPHAEXTRACT_FILTER ALPHAMERGE_FILTER) += $(FATE_FILTER_ALPHAEXTRACT_ALPHAMERGE)

From patchwork Sun Aug 20 15:10:18 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Cox
X-Patchwork-Id: 43273
From: John Cox
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 20 Aug 2023 15:10:18 +0000
Message-Id: <20230820151022.2204421-3-jc@kynesim.co.uk>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230820151022.2204421-1-jc@kynesim.co.uk>
References: <20230820151022.2204421-1-jc@kynesim.co.uk>
Subject: [FFmpeg-devel] [PATCH v1 2/6] swscale: Rename BGR24->YUV conversion functions as bgr...

Rename the swscale functions that convert BGR24 frames to YUV as
bgr24toyv12 rather than rgb24toyv12, as the old name is just confusing
and would be even more confusing with the addition of RGB24 converters.

Signed-off-by: John Cox
---
 libswscale/bayer_template.c       | 2 +-
 libswscale/rgb2rgb.c              | 2 +-
 libswscale/rgb2rgb.h              | 4 ++--
 libswscale/rgb2rgb_template.c     | 4 ++--
 libswscale/swscale_unscaled.c     | 2 +-
 libswscale/x86/rgb2rgb_template.c | 8 ++++----
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/libswscale/bayer_template.c b/libswscale/bayer_template.c
index 46b5a4984d..06d917c97f 100644
--- a/libswscale/bayer_template.c
+++ b/libswscale/bayer_template.c
@@ -188,7 +188,7 @@
  * invoke ff_rgb24toyv12 for 2x2 pixels
  */
 #define rgb24toyv12_2x2(src, dstY, dstU, dstV, luma_stride, src_stride, rgb2yuv) \
-    ff_rgb24toyv12(src, dstY, dstV, dstU, 2, 2, luma_stride, 0, src_stride, rgb2yuv)
+    ff_bgr24toyv12(src, dstY, dstV, dstU, 2, 2, luma_stride, 0, src_stride, rgb2yuv)
 
 static void BAYER_RENAME(rgb24_copy)(const uint8_t *src, int src_stride, uint8_t *dst, int dst_stride, int width)
 {
diff --git a/libswscale/rgb2rgb.c b/libswscale/rgb2rgb.c
index e98fdac8ea..8707917800 100644
--- a/libswscale/rgb2rgb.c
+++ b/libswscale/rgb2rgb.c
@@ -78,7 +78,7 @@ void (*yuy2toyv12)(const uint8_t *src, uint8_t *ydst,
                    uint8_t *udst, uint8_t *vdst,
                    int width, int height,
                    int lumStride, int chromStride, int srcStride);
-void (*ff_rgb24toyv12)(const uint8_t *src, uint8_t *ydst,
+void (*ff_bgr24toyv12)(const uint8_t *src, uint8_t *ydst,
                        uint8_t *udst, uint8_t *vdst,
                        int width, int height,
                        int lumStride, int chromStride, int srcStride,
diff --git a/libswscale/rgb2rgb.h b/libswscale/rgb2rgb.h
index f3951d523e..305b830920 100644
--- a/libswscale/rgb2rgb.h
+++ b/libswscale/rgb2rgb.h
@@ -76,7 +76,7 @@ void rgb15tobgr15(const uint8_t *src, uint8_t *dst, int src_size);
 void rgb12tobgr12(const uint8_t *src, uint8_t *dst, int src_size);
 void rgb12to15(const uint8_t *src, uint8_t *dst, int src_size);
 
-void ff_rgb24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+void ff_bgr24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
                       uint8_t *vdst, int width, int height, int lumStride,
                       int chromStride, int srcStride, int32_t *rgb2yuv);
 
@@ -124,7 +124,7 @@ extern void (*yuv422ptouyvy)(const uint8_t *ysrc, const uint8_t *usrc, const uin
  * Chrominance data is only taken from every second line, others are ignored.
  * FIXME: Write high quality version.
  */
-extern void (*ff_rgb24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
+extern void (*ff_bgr24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
                               int width, int height,
                               int lumStride, int chromStride, int srcStride,
                               int32_t *rgb2yuv);
diff --git a/libswscale/rgb2rgb_template.c b/libswscale/rgb2rgb_template.c
index 42c69801ba..8ef4a2cf5d 100644
--- a/libswscale/rgb2rgb_template.c
+++ b/libswscale/rgb2rgb_template.c
@@ -646,7 +646,7 @@ static inline void uyvytoyv12_c(const uint8_t *src, uint8_t *ydst,
  * others are ignored in the C version.
  * FIXME: Write HQ version.
  */
-void ff_rgb24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+void ff_bgr24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
                    uint8_t *vdst, int width, int height, int lumStride,
                    int chromStride, int srcStride, int32_t *rgb2yuv)
 {
@@ -979,7 +979,7 @@ static av_cold void rgb2rgb_init_c(void)
     yuv422ptouyvy      = yuv422ptouyvy_c;
     yuy2toyv12         = yuy2toyv12_c;
     planar2x           = planar2x_c;
-    ff_rgb24toyv12     = ff_rgb24toyv12_c;
+    ff_bgr24toyv12     = ff_bgr24toyv12_c;
     interleaveBytes    = interleaveBytes_c;
     deinterleaveBytes  = deinterleaveBytes_c;
     vu9_to_vu12        = vu9_to_vu12_c;
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 9af2e7ecc3..32e0d7f63c 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -1641,7 +1641,7 @@ static int bgr24ToYv12Wrapper(SwsContext *c, const uint8_t *src[],
                               int srcStride[], int srcSliceY, int srcSliceH,
                               uint8_t *dst[], int dstStride[])
 {
-    ff_rgb24toyv12(
+    ff_bgr24toyv12(
         src[0],
         dst[0] +  srcSliceY       * dstStride[0],
         dst[1] + (srcSliceY >> 1) * dstStride[1],
diff --git a/libswscale/x86/rgb2rgb_template.c b/libswscale/x86/rgb2rgb_template.c
index 4aba25dd51..dc2b4e205a 100644
--- a/libswscale/x86/rgb2rgb_template.c
+++ b/libswscale/x86/rgb2rgb_template.c
@@ -1544,7 +1544,7 @@ static inline void RENAME(uyvytoyv12)(const uint8_t *src, uint8_t *ydst, uint8_t
  * FIXME: Write HQ version.
  */
 #if HAVE_7REGS
-static inline void RENAME(rgb24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
+static inline void RENAME(bgr24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
                                        int width, int height,
                                        int lumStride, int chromStride, int srcStride,
                                        int32_t *rgb2yuv)
@@ -1556,7 +1556,7 @@ static inline void RENAME(rgb24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_
     const x86_reg chromWidth= width>>1;
 
     if (height > 2) {
-        ff_rgb24toyv12_c(src, ydst, udst, vdst, width, 2, lumStride, chromStride, srcStride, rgb2yuv);
+        ff_bgr24toyv12_c(src, ydst, udst, vdst, width, 2, lumStride, chromStride, srcStride, rgb2yuv);
         src  += 2*srcStride;
         ydst += 2*lumStride;
         udst += chromStride;
@@ -1737,7 +1737,7 @@ static inline void RENAME(rgb24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_
                          SFENCE"     \n\t"
                          :::"memory");
 
-     ff_rgb24toyv12_c(src, ydst, udst, vdst, width, height-y, lumStride, chromStride, srcStride, rgb2yuv);
+     ff_bgr24toyv12_c(src, ydst, udst, vdst, width, height-y, lumStride, chromStride, srcStride, rgb2yuv);
 }
 #endif /* HAVE_7REGS */
 #endif /* !COMPILE_TEMPLATE_SSE2 */
@@ -2434,7 +2434,7 @@ static av_cold void RENAME(rgb2rgb_init)(void)
 
     planar2x           = RENAME(planar2x);
 #if HAVE_7REGS
-    ff_rgb24toyv12     = RENAME(rgb24toyv12);
+    ff_bgr24toyv12     = RENAME(bgr24toyv12);
 #endif /* HAVE_7REGS */
 
     yuyvtoyuv420       = RENAME(yuyvtoyuv420);

From patchwork Sun Aug 20 15:10:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Cox
X-Patchwork-Id: 43274
From: John Cox
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 20 Aug 2023 15:10:19 +0000
Message-Id: <20230820151022.2204421-4-jc@kynesim.co.uk>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230820151022.2204421-1-jc@kynesim.co.uk>
References: <20230820151022.2204421-1-jc@kynesim.co.uk>
Subject: [FFmpeg-devel] [PATCH v1 3/6] swscale: Add explicit rgb24->yv12 conversion

Add a rgb24->yuv420p conversion. Uses the same code as the existing
bgr24->yuv converter but permutes the conversion array to swap R & B
coefficients.

Signed-off-by: John Cox
---
 libswscale/rgb2rgb.c          |  5 +++++
 libswscale/rgb2rgb.h          |  7 +++++++
 libswscale/rgb2rgb_template.c | 38 ++++++++++++++++++++++++++++++-----
 libswscale/swscale_unscaled.c | 24 +++++++++++++++++++++-
 4 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/libswscale/rgb2rgb.c b/libswscale/rgb2rgb.c
index 8707917800..de90e5193f 100644
--- a/libswscale/rgb2rgb.c
+++ b/libswscale/rgb2rgb.c
@@ -83,6 +83,11 @@ void (*ff_bgr24toyv12)(const uint8_t *src, uint8_t *ydst,
                        int width, int height,
                        int lumStride, int chromStride, int srcStride,
                        int32_t *rgb2yuv);
+void (*ff_rgb24toyv12)(const uint8_t *src, uint8_t *ydst,
+                       uint8_t *udst, uint8_t *vdst,
+                       int width, int height,
+                       int lumStride, int chromStride, int srcStride,
+                       int32_t *rgb2yuv);
 void (*planar2x)(const uint8_t *src, uint8_t *dst, int width, int height,
                  int srcStride, int dstStride);
 void (*interleaveBytes)(const uint8_t *src1, const uint8_t *src2, uint8_t *dst,
diff --git a/libswscale/rgb2rgb.h b/libswscale/rgb2rgb.h
index 305b830920..f7a76a92ba 100644
--- a/libswscale/rgb2rgb.h
+++ b/libswscale/rgb2rgb.h
@@ -79,6 +79,9 @@ void rgb12to15(const uint8_t *src, uint8_t *dst, int src_size);
 void ff_bgr24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
                       uint8_t *vdst, int width, int height, int lumStride,
                       int chromStride, int srcStride, int32_t *rgb2yuv);
+void ff_rgb24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+                      uint8_t *vdst, int width, int height, int lumStride,
+                      int chromStride, int srcStride, int32_t *rgb2yuv);
 
 /**
  * Height should be a multiple of 2 and width should be a multiple of 16.
@@ -128,6 +131,10 @@ extern void (*ff_bgr24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
                               int width, int height,
                               int lumStride, int chromStride, int srcStride,
                               int32_t *rgb2yuv);
+extern void (*ff_rgb24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
+                              int width, int height,
+                              int lumStride, int chromStride, int srcStride,
+                              int32_t *rgb2yuv);
 extern void (*planar2x)(const uint8_t *src, uint8_t *dst, int width, int height,
                         int srcStride, int dstStride);
 
diff --git a/libswscale/rgb2rgb_template.c b/libswscale/rgb2rgb_template.c
index 8ef4a2cf5d..e57bfa6545 100644
--- a/libswscale/rgb2rgb_template.c
+++ b/libswscale/rgb2rgb_template.c
@@ -646,13 +646,14 @@ static inline void uyvytoyv12_c(const uint8_t *src, uint8_t *ydst,
  * others are ignored in the C version.
  * FIXME: Write HQ version.
  */
-void ff_bgr24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+static void rgb24toyv12_x(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
                    uint8_t *vdst, int width, int height, int lumStride,
-                   int chromStride, int srcStride, int32_t *rgb2yuv)
+                   int chromStride, int srcStride, int32_t *rgb2yuv,
+                   const uint8_t x[9])
 {
-    int32_t ry = rgb2yuv[RY_IDX], gy = rgb2yuv[GY_IDX], by = rgb2yuv[BY_IDX];
-    int32_t ru = rgb2yuv[RU_IDX], gu = rgb2yuv[GU_IDX], bu = rgb2yuv[BU_IDX];
-    int32_t rv = rgb2yuv[RV_IDX], gv = rgb2yuv[GV_IDX], bv = rgb2yuv[BV_IDX];
+    int32_t ry = rgb2yuv[x[0]], gy = rgb2yuv[x[1]], by = rgb2yuv[x[2]];
+    int32_t ru = rgb2yuv[x[3]], gu = rgb2yuv[x[4]], bu = rgb2yuv[x[5]];
+    int32_t rv = rgb2yuv[x[6]], gv = rgb2yuv[x[7]], bv = rgb2yuv[x[8]];
     int y;
     const int chromWidth = width >> 1;
 
@@ -707,6 +708,32 @@ void ff_bgr24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
     }
 }
 
+static const uint8_t x_bgr[9] = {
+    RY_IDX, GY_IDX, BY_IDX,
+    RU_IDX, GU_IDX, BU_IDX,
+    RV_IDX, GV_IDX, BV_IDX,
+};
+
+static const uint8_t x_rgb[9] = {
+    BY_IDX, GY_IDX, RY_IDX,
+    BU_IDX, GU_IDX, RU_IDX,
+    BV_IDX, GV_IDX, RV_IDX,
+};
+
+void ff_bgr24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+                   uint8_t *vdst, int width, int height, int lumStride,
+                   int chromStride, int srcStride, int32_t *rgb2yuv)
+{
+    rgb24toyv12_x(src, ydst, udst, vdst, width, height, lumStride, chromStride, srcStride, rgb2yuv, x_bgr);
+}
+
+void ff_rgb24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+                   uint8_t *vdst, int width, int height, int lumStride,
+                   int chromStride, int srcStride, int32_t *rgb2yuv)
+{
+    rgb24toyv12_x(src, ydst, udst, vdst, width, height, lumStride, chromStride, srcStride, rgb2yuv, x_rgb);
+}
+
 static void interleaveBytes_c(const uint8_t *src1, const uint8_t *src2,
                               uint8_t *dest, int width, int height,
                               int src1Stride, int src2Stride, int dstStride)
@@ -979,6 +1006,7 @@ static av_cold void rgb2rgb_init_c(void)
     yuv422ptouyvy      = yuv422ptouyvy_c;
     yuy2toyv12         = yuy2toyv12_c;
     planar2x           = planar2x_c;
+    ff_rgb24toyv12     = ff_rgb24toyv12_c;
     ff_bgr24toyv12     = ff_bgr24toyv12_c;
     interleaveBytes    = interleaveBytes_c;
     deinterleaveBytes  = deinterleaveBytes_c;
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 32e0d7f63c..751bdcb2e4 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -1654,6 +1654,23 @@ static int bgr24ToYv12Wrapper(SwsContext *c, const uint8_t *src[],
     return srcSliceH;
 }
 
+static int rgb24ToYv12Wrapper(SwsContext *c, const uint8_t *src[],
+                              int srcStride[], int srcSliceY, int srcSliceH,
+                              uint8_t *dst[], int dstStride[])
+{
+    ff_rgb24toyv12(
+        src[0],
+        dst[0] +  srcSliceY       * dstStride[0],
+        dst[1] + (srcSliceY >> 1) * dstStride[1],
+        dst[2] + (srcSliceY >> 1) * dstStride[2],
+        c->srcW, srcSliceH,
+        dstStride[0], dstStride[1], srcStride[0],
+        c->input_rgb2yuv_table);
+    if (dst[3])
+        fillPlane(dst[3], dstStride[3], c->srcW, srcSliceH, srcSliceY, 255);
+    return srcSliceH;
+}
+
 static int yvu9ToYv12Wrapper(SwsContext *c, const uint8_t *src[],
                              int srcStride[], int srcSliceY, int srcSliceH,
                              uint8_t *dst[], int dstStride[])
@@ -2035,8 +2052,13 @@ void ff_get_unscaled_swscale(SwsContext *c)
     /* bgr24toYV12 */
     if (srcFormat == AV_PIX_FMT_BGR24 &&
         (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
-        !(flags & SWS_ACCURATE_RND) && !(dstW&1))
+        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)) && !(dstW&1))
         c->convert_unscaled = bgr24ToYv12Wrapper;
+    /* rgb24toYV12 */
+    if (srcFormat == AV_PIX_FMT_RGB24 &&
+        (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
+        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)) && !(dstW&1))
+        c->convert_unscaled = rgb24ToYv12Wrapper;
 
     /* RGB/BGR -> RGB/BGR (no dither needed forms) */
     if (isAnyRGB(srcFormat) && isAnyRGB(dstFormat) && findRgbConvFn(c)

From patchwork Sun Aug 20 15:10:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Cox
X-Patchwork-Id: 43275
:list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=ISmiB2oB/zIxJQO2dV8UEw3uteESweDu1xFjFDgLlSs=; fh=9QDi6dFFPFAV43XzYhuUbqo2pwrpR9p92hw/7eQiArk=; b=N6hDcTMZvYHKWMK6An1N6ClC5qgVdGYX7EE+kT9/Fy9dtEXGBLpOlEq+MV3eSZkSxA xgf1NktdAgkR/LmmvDHFJNe/jxuf6mI0T1N1Mtv60awXKVhi0gKH52JHDLOso5pF0mH3 3U6MVHuP0HF1siQHSGuB2QM4/QfFoJE5EL+9aRlhcQWp5QkuHKyrIwIzTr5/FjA9TWUL MCW0gexsFIr9MUb5RMqHeXvIkxQNPXNgzGdbEh0kGQ1TQNToVL1MSABHMte9iREePOb2 17i99l6m6L0Lrf5ZSjMfQzL8V25MLP/UO7zEAByHbj/mzeoPjNva9Yjfl8D20B6fGeA+ pFJg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@kynesim-co-uk.20221208.gappssmtp.com header.s=20221208 header.b=cFahH2ie; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id rv6-20020a17090710c600b00965cb784a27si4066286ejb.699.2023.08.20.08.11.20; Sun, 20 Aug 2023 08:11:21 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@kynesim-co-uk.20221208.gappssmtp.com header.s=20221208 header.b=cFahH2ie; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 43D8968C364; Sun, 20 Aug 2023 18:10:44 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-wr1-f43.google.com (mail-wr1-f43.google.com [209.85.221.43]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 2D53668BFEE for ; Sun, 20 Aug 2023 18:10:35 
+0300 (EEST) Received: by mail-wr1-f43.google.com with SMTP id ffacd0b85a97d-3197808bb08so2334086f8f.2 for ; Sun, 20 Aug 2023 08:10:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kynesim-co-uk.20221208.gappssmtp.com; s=20221208; t=1692544234; x=1693149034; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=X1WbP+pSWdXklWqcofKp5THMMSn+sUNB7aEv72qzwI8=; b=cFahH2ie/ijxeFhsbnCrBk/hZ/V/wAiMjYFy1uD1wHnU9buwRMq/ecbzLedvNvad9p YKG4sPmsbRItFq7rnprifX0LqrR2rNkjyHrvX1KaZJ1toKpZnNMGQTiekLODibQWpS98 gflMT0BCYJOB5iR55RX+XNY9e5qAMqVxrE4RaDK9hWDpt8HC2itlno8OCQJrHpdDWE4+ arRlZM0lsVqcaa9wh+Aj3cyM3PNwpTS9/2DyTpp2A3+aYcq5taT5jcprgLf2eGTwRdbn nEjCEEsMq4wxl5pqG2/JH4OTUvV5YjkHJaavkMxIL+HfDMdIMRnhZOHWMUS9cUSd8TJR vXOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692544234; x=1693149034; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=X1WbP+pSWdXklWqcofKp5THMMSn+sUNB7aEv72qzwI8=; b=T++aSUZBEu8kEpSYfmM8YGT1tXokoa4T2J0VxWC7JU+Tqu3N5ylmEQdh9tYjPWtIxQ b0tZaMhnXjq8tPIT8F2Nhy+RiZDn66qhwX+pP2auRAwux0iPz6gSAzkowPGoV+lTSPWG oydC3ZzjtWKU+sHVcdsz+2wYNkdP1YyEqUBYFCjrOSERSupfn2yjcvZKls2AXnBXJlqq 30kFZEd56XkFVhVsQBedzfVFkIhHmj665n7m0z0Zk0ZuxKdbh3wb0VRCZHn329lB4T18 kQES+xRpwwwRFyJ3iw4Ij7di6ufIobRXaLSySCxUE8V4wawG7ffj7jQtrOaRD4VHUSQN Kjxg== X-Gm-Message-State: AOJu0Yw9i1JY5Roac7wT5vdOIuJ3DW5w6KXYYBT59nJ3+sCwY6IXemKO R51teupueeqDqj84MKL3H4uJVL9Y9AZIL2N2qxo= X-Received: by 2002:adf:e704:0:b0:319:6d20:49c7 with SMTP id c4-20020adfe704000000b003196d2049c7mr3457963wrm.3.1692544234609; Sun, 20 Aug 2023 08:10:34 -0700 (PDT) Received: from sucnaath.outer.uphall.net (cpc1-cmbg20-2-0-cust759.5-4.cable.virginm.net. 
[86.21.218.248]) by smtp.gmail.com with ESMTPSA id b4-20020adff904000000b003197c7d08ddsm9494476wrr.71.2023.08.20.08.10.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 20 Aug 2023 08:10:34 -0700 (PDT) From: John Cox To: ffmpeg-devel@ffmpeg.org Date: Sun, 20 Aug 2023 15:10:20 +0000 Message-Id: <20230820151022.2204421-5-jc@kynesim.co.uk> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230820151022.2204421-1-jc@kynesim.co.uk> References: <20230820151022.2204421-1-jc@kynesim.co.uk> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v1 4/6] swscale: RGB24->YUV allow odd widths & improve C rounding X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: John Cox Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: PRi/w2hjE5rl Allow odd widths for conversion it costs very little and simplifies setup slightly. x86 asm will fall back to the C code if width is odd. Round to nearest rather than just down. This reduces the Y error reported by tests/swscale from 3 to 1. x86 asm doesn't mirror the C so exact correspondence isn't an issue there. 
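
For illustration only (a standalone sketch, not part of the patch): the rounding change amounts to folding the +16 luma offset and a +0.5 rounding term into a single constant that is added before the shift. The helper names below are hypothetical, and RGB2YUV_SHIFT is assumed to be 15 as in libswscale; acc stands for the weighted sum ry * r + gy * g + by * b.

```c
#include <stdint.h>

/* Assumption: libswscale defines RGB2YUV_SHIFT as 15. */
#define RGB2YUV_SHIFT 15

/* Hypothetical helper: old behaviour, shift (truncate) and then add the
 * +16 luma offset. */
unsigned y_truncated(int32_t acc)
{
    return (unsigned)(acc >> RGB2YUV_SHIFT) + 16;
}

/* Hypothetical helper: new behaviour. ky is 16.5 in fixed point, i.e. the
 * +16 offset plus a +0.5 rounding term, added before the shift. */
unsigned y_rounded(int32_t acc)
{
    const int32_t ky = ((16 << 1) + 1) << (RGB2YUV_SHIFT - 1);
    return (unsigned)((acc + ky) >> RGB2YUV_SHIFT);
}
```

For an accumulator that represents 3.5, the old form yields 19 and the new form yields 20; anything just below 3.5 still yields 19 in both.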
Signed-off-by: John Cox
---
 libswscale/rgb2rgb_template.c     | 42 ++++++++++++++++++-------------
 libswscale/swscale_unscaled.c     |  5 ++--
 libswscale/x86/rgb2rgb_template.c |  5 ++++
 3 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/libswscale/rgb2rgb_template.c b/libswscale/rgb2rgb_template.c
index e57bfa6545..5503e58a29 100644
--- a/libswscale/rgb2rgb_template.c
+++ b/libswscale/rgb2rgb_template.c
@@ -656,6 +656,8 @@ static void rgb24toyv12_x(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
     int32_t rv = rgb2yuv[x[6]], gv = rgb2yuv[x[7]], bv = rgb2yuv[x[8]];
     int y;
     const int chromWidth = width >> 1;
+    const int32_t ky = ((16 << 1) + 1) << (RGB2YUV_SHIFT - 1);
+    const int32_t kc = ((128 << 1) + 1) << (RGB2YUV_SHIFT - 1);

     for (y = 0; y < height; y += 2) {
         int i;
@@ -664,9 +666,9 @@ static void rgb24toyv12_x(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
             unsigned int g = src[6 * i + 1];
             unsigned int r = src[6 * i + 2];

-            unsigned int Y = ((ry * r + gy * g + by * b) >> RGB2YUV_SHIFT) + 16;
-            unsigned int V = ((rv * r + gv * g + bv * b) >> RGB2YUV_SHIFT) + 128;
-            unsigned int U = ((ru * r + gu * g + bu * b) >> RGB2YUV_SHIFT) + 128;
+            unsigned int Y = (ry * r + gy * g + by * b + ky) >> RGB2YUV_SHIFT;
+            unsigned int V = (rv * r + gv * g + bv * b + kc) >> RGB2YUV_SHIFT;
+            unsigned int U = (ru * r + gu * g + bu * b + kc) >> RGB2YUV_SHIFT;

             udst[i] = U;
             vdst[i] = V;
@@ -676,30 +678,36 @@ static void rgb24toyv12_x(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
             g = src[6 * i + 4];
             r = src[6 * i + 5];

-            Y = ((ry * r + gy * g + by * b) >> RGB2YUV_SHIFT) + 16;
+            Y = ((ry * r + gy * g + by * b + ky) >> RGB2YUV_SHIFT);
             ydst[2 * i + 1] = Y;
         }
-        ydst += lumStride;
-        src += srcStride;
-
-        if (y+1 == height)
-            break;
-
-        for (i = 0; i < chromWidth; i++) {
+        if ((width & 1) != 0) {
             unsigned int b = src[6 * i + 0];
             unsigned int g = src[6 * i + 1];
             unsigned int r = src[6 * i + 2];

-            unsigned int Y = ((ry * r + gy * g + by * b) >> RGB2YUV_SHIFT) + 16;
+            unsigned int Y = (ry * r + gy * g + by * b + ky) >> RGB2YUV_SHIFT;
+            unsigned int V = (rv * r + gv * g + bv * b + kc) >> RGB2YUV_SHIFT;
+            unsigned int U = (ru * r + gu * g + bu * b + kc) >> RGB2YUV_SHIFT;

+            udst[i] = U;
+            vdst[i] = V;
             ydst[2 * i] = Y;
+        }
+        ydst += lumStride;
+        src += srcStride;

-            b = src[6 * i + 3];
-            g = src[6 * i + 4];
-            r = src[6 * i + 5];
+        if (y+1 == height)
+            break;

-            Y = ((ry * r + gy * g + by * b) >> RGB2YUV_SHIFT) + 16;
-            ydst[2 * i + 1] = Y;
+        for (i = 0; i < width; i++) {
+            unsigned int b = src[3 * i + 0];
+            unsigned int g = src[3 * i + 1];
+            unsigned int r = src[3 * i + 2];
+
+            unsigned int Y = (ry * r + gy * g + by * b + ky) >> RGB2YUV_SHIFT;
+
+            ydst[i] = Y;
         }
         udst += chromStride;
         vdst += chromStride;
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 751bdcb2e4..e10f967755 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -1994,7 +1994,6 @@ void ff_get_unscaled_swscale(SwsContext *c)
     const enum AVPixelFormat dstFormat = c->dstFormat;
     const int flags = c->flags;
     const int dstH = c->dstH;
-    const int dstW = c->dstW;
     int needsDither;

     needsDither = isAnyRGB(dstFormat) &&
@@ -2052,12 +2051,12 @@ void ff_get_unscaled_swscale(SwsContext *c)
     /* bgr24toYV12 */
     if (srcFormat == AV_PIX_FMT_BGR24 &&
         (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
-        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)) && !(dstW&1))
+        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)))
         c->convert_unscaled = bgr24ToYv12Wrapper;
     /* rgb24toYV12 */
     if (srcFormat == AV_PIX_FMT_RGB24 &&
         (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
-        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)) && !(dstW&1))
+        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)))
         c->convert_unscaled = rgb24ToYv12Wrapper;

     /* RGB/BGR -> RGB/BGR (no dither needed forms) */
diff --git a/libswscale/x86/rgb2rgb_template.c b/libswscale/x86/rgb2rgb_template.c
index dc2b4e205a..f90527aa08 100644
--- a/libswscale/x86/rgb2rgb_template.c
+++ b/libswscale/x86/rgb2rgb_template.c
@@ -1555,6 +1555,11 @@ static inline void RENAME(bgr24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_
     int y;
     const x86_reg chromWidth= width>>1;

+    if ((width & 1) != 0) {
+        ff_bgr24toyv12_c(src, ydst, udst, vdst, width, height, lumStride, chromStride, srcStride, rgb2yuv);
+        return;
+    }
+
     if (height > 2) {
         ff_bgr24toyv12_c(src, ydst, udst, vdst, width, 2, lumStride, chromStride, srcStride, rgb2yuv);
         src += 2*srcStride;

From patchwork Sun Aug 20 15:10:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Cox
X-Patchwork-Id: 43276
From: John Cox
To: ffmpeg-devel@ffmpeg.org
Cc: John Cox
Date: Sun, 20 Aug 2023 15:10:21 +0000
Message-Id: <20230820151022.2204421-6-jc@kynesim.co.uk>
In-Reply-To: <20230820151022.2204421-1-jc@kynesim.co.uk>
References: <20230820151022.2204421-1-jc@kynesim.co.uk>
Subject: [FFmpeg-devel] [PATCH v1 5/6] swscale: Add unscaled XRGB->YUV420P functions

Add simple C functions for converting XRGB to YUV420P. Same logic as the
RGB24 functions but dropping the A channel.
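
For illustration only (a standalone sketch, not the patch's code): with 4-byte pixels the ignored byte can be skipped just by choosing where the source pointer starts, which is why the padding-first layouts can reuse the same loop with src + 1. The helper name and its coefficient parameters are hypothetical, and RGB2YUV_SHIFT is assumed to be 15 as in libswscale.

```c
#include <stdint.h>

/* Assumption: libswscale defines RGB2YUV_SHIFT as 15. */
#define RGB2YUV_SHIFT 15

/* Hypothetical helper: luma for one row of 4-byte pixels. c0..c2 are the
 * fixed-point weights applied to bytes 0..2 of each pixel; byte 3 is never
 * read, so it can hold alpha or padding. */
void luma_row_packed32(const uint8_t *src, uint8_t *ydst, int width,
                       int32_t c0, int32_t c1, int32_t c2)
{
    /* +16 offset and +0.5 rounding folded into one constant. */
    const int32_t ky = ((16 << 1) + 1) << (RGB2YUV_SHIFT - 1);

    for (int i = 0; i < width; i++) {
        const uint8_t *p = src + 4 * i;
        ydst[i] = (uint8_t)((c0 * p[0] + c1 * p[1] + c2 * p[2] + ky) >> RGB2YUV_SHIFT);
    }
}
```

Calling it with src for a layout whose padding byte comes last, or with src + 1 for a layout whose padding byte comes first, reads the same three colour bytes per pixel. With the + 1 offset the last pixel still reads only bytes 1..3 of that pixel, so the access stays inside the row.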
Signed-off-by: John Cox --- libswscale/rgb2rgb.c | 20 +++++++ libswscale/rgb2rgb.h | 16 +++++ libswscale/rgb2rgb_template.c | 106 ++++++++++++++++++++++++++++++++++ libswscale/swscale_unscaled.c | 89 ++++++++++++++++++++++++++++ 4 files changed, 231 insertions(+) diff --git a/libswscale/rgb2rgb.c b/libswscale/rgb2rgb.c index de90e5193f..b976341e70 100644 --- a/libswscale/rgb2rgb.c +++ b/libswscale/rgb2rgb.c @@ -88,6 +88,26 @@ void (*ff_rgb24toyv12)(const uint8_t *src, uint8_t *ydst, int width, int height, int lumStride, int chromStride, int srcStride, int32_t *rgb2yuv); +void (*ff_rgbxtoyv12)(const uint8_t *src, uint8_t *ydst, + uint8_t *udst, uint8_t *vdst, + int width, int height, + int lumStride, int chromStride, int srcStride, + int32_t *rgb2yuv); +void (*ff_bgrxtoyv12)(const uint8_t *src, uint8_t *ydst, + uint8_t *udst, uint8_t *vdst, + int width, int height, + int lumStride, int chromStride, int srcStride, + int32_t *rgb2yuv); +void (*ff_xrgbtoyv12)(const uint8_t *src, uint8_t *ydst, + uint8_t *udst, uint8_t *vdst, + int width, int height, + int lumStride, int chromStride, int srcStride, + int32_t *rgb2yuv); +void (*ff_xbgrtoyv12)(const uint8_t *src, uint8_t *ydst, + uint8_t *udst, uint8_t *vdst, + int width, int height, + int lumStride, int chromStride, int srcStride, + int32_t *rgb2yuv); void (*planar2x)(const uint8_t *src, uint8_t *dst, int width, int height, int srcStride, int dstStride); void (*interleaveBytes)(const uint8_t *src1, const uint8_t *src2, uint8_t *dst, diff --git a/libswscale/rgb2rgb.h b/libswscale/rgb2rgb.h index f7a76a92ba..0015b1568a 100644 --- a/libswscale/rgb2rgb.h +++ b/libswscale/rgb2rgb.h @@ -135,6 +135,22 @@ extern void (*ff_rgb24toyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, int width, int height, int lumStride, int chromStride, int srcStride, int32_t *rgb2yuv); +extern void (*ff_rgbxtoyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst, + int width, int height, + int lumStride, int chromStride, int 
srcStride, + int32_t *rgb2yuv); +extern void (*ff_bgrxtoyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst, + int width, int height, + int lumStride, int chromStride, int srcStride, + int32_t *rgb2yuv); +extern void (*ff_xrgbtoyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst, + int width, int height, + int lumStride, int chromStride, int srcStride, + int32_t *rgb2yuv); +extern void (*ff_xbgrtoyv12)(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst, + int width, int height, + int lumStride, int chromStride, int srcStride, + int32_t *rgb2yuv); extern void (*planar2x)(const uint8_t *src, uint8_t *dst, int width, int height, int srcStride, int dstStride); diff --git a/libswscale/rgb2rgb_template.c b/libswscale/rgb2rgb_template.c index 5503e58a29..22326807c5 100644 --- a/libswscale/rgb2rgb_template.c +++ b/libswscale/rgb2rgb_template.c @@ -742,6 +742,108 @@ void ff_rgb24toyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst, rgb24toyv12_x(src, ydst, udst, vdst, width, height, lumStride, chromStride, srcStride, rgb2yuv, x_rgb); } +static void rgbxtoyv12_x(const uint8_t *src, uint8_t *ydst, uint8_t *udst, + uint8_t *vdst, int width, int height, int lumStride, + int chromStride, int srcStride, int32_t *rgb2yuv, + const uint8_t x[9]) +{ + int32_t ry = rgb2yuv[x[0]], gy = rgb2yuv[x[1]], by = rgb2yuv[x[2]]; + int32_t ru = rgb2yuv[x[3]], gu = rgb2yuv[x[4]], bu = rgb2yuv[x[5]]; + int32_t rv = rgb2yuv[x[6]], gv = rgb2yuv[x[7]], bv = rgb2yuv[x[8]]; + int y; + const int chromWidth = width >> 1; + // Constants with both rounding and offset + const int32_t ky = ((16 << 1) + 1) << (RGB2YUV_SHIFT - 1); + const int32_t kc = ((128 << 1) + 1) << (RGB2YUV_SHIFT - 1); + + for (y = 0; y < height; y += 2) { + int i; + for (i = 0; i < chromWidth; i++) { + unsigned int b = src[8 * i + 0]; + unsigned int g = src[8 * i + 1]; + unsigned int r = src[8 * i + 2]; + + unsigned int Y = (ry * r + gy * g + by * b + ky) >> RGB2YUV_SHIFT; + unsigned 
int V = (rv * r + gv * g + bv * b + kc) >> RGB2YUV_SHIFT; + unsigned int U = (ru * r + gu * g + bu * b + kc) >> RGB2YUV_SHIFT; + + udst[i] = U; + vdst[i] = V; + ydst[2 * i] = Y; + + b = src[8 * i + 4]; + g = src[8 * i + 5]; + r = src[8 * i + 6]; + + Y = (ry * r + gy * g + by * b + ky) >> RGB2YUV_SHIFT; + ydst[2 * i + 1] = Y; + } + if ((width & 1) != 0) { + unsigned int b = src[8 * i + 0]; + unsigned int g = src[8 * i + 1]; + unsigned int r = src[8 * i + 2]; + + unsigned int Y = (ry * r + gy * g + by * b + ky) >> RGB2YUV_SHIFT; + unsigned int V = (rv * r + gv * g + bv * b + kc) >> RGB2YUV_SHIFT; + unsigned int U = (ru * r + gu * g + bu * b + kc) >> RGB2YUV_SHIFT; + + udst[i] = U; + vdst[i] = V; + ydst[2 * i] = Y; + } + ydst += lumStride; + src += srcStride; + + if (y+1 == height) + break; + + for (i = 0; i < width; i++) { + unsigned int b = src[4 * i + 0]; + unsigned int g = src[4 * i + 1]; + unsigned int r = src[4 * i + 2]; + + unsigned int Y = (ry * r + gy * g + by * b + ky) >> RGB2YUV_SHIFT; + + ydst[i] = Y; + } + udst += chromStride; + vdst += chromStride; + ydst += lumStride; + src += srcStride; + } +} + +static void ff_rgbxtoyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst, + uint8_t *vdst, int width, int height, int lumStride, + int chromStride, int srcStride, int32_t *rgb2yuv) +{ + rgbxtoyv12_x(src, ydst, udst, vdst, width, height, lumStride, chromStride, srcStride, rgb2yuv, x_rgb); +} + +static void ff_bgrxtoyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst, + uint8_t *vdst, int width, int height, int lumStride, + int chromStride, int srcStride, int32_t *rgb2yuv) +{ + rgbxtoyv12_x(src, ydst, udst, vdst, width, height, lumStride, chromStride, srcStride, rgb2yuv, x_bgr); +} + +// As the general code does no SIMD-like ops simply adding 1 to the src address +// will fix the ignored alpha position +static void ff_xrgbtoyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst, + uint8_t *vdst, int width, int height, int lumStride, + int chromStride,
int srcStride, int32_t *rgb2yuv) +{ + rgbxtoyv12_x(src + 1, ydst, udst, vdst, width, height, lumStride, chromStride, srcStride, rgb2yuv, x_rgb); +} + +static void ff_xbgrtoyv12_c(const uint8_t *src, uint8_t *ydst, uint8_t *udst, + uint8_t *vdst, int width, int height, int lumStride, + int chromStride, int srcStride, int32_t *rgb2yuv) +{ + rgbxtoyv12_x(src + 1, ydst, udst, vdst, width, height, lumStride, chromStride, srcStride, rgb2yuv, x_bgr); +} + + static void interleaveBytes_c(const uint8_t *src1, const uint8_t *src2, uint8_t *dest, int width, int height, int src1Stride, int src2Stride, int dstStride) @@ -1016,6 +1118,10 @@ static av_cold void rgb2rgb_init_c(void) planar2x = planar2x_c; ff_rgb24toyv12 = ff_rgb24toyv12_c; ff_bgr24toyv12 = ff_bgr24toyv12_c; + ff_rgbxtoyv12 = ff_rgbxtoyv12_c; + ff_bgrxtoyv12 = ff_bgrxtoyv12_c; + ff_xrgbtoyv12 = ff_xrgbtoyv12_c; + ff_xbgrtoyv12 = ff_xbgrtoyv12_c; interleaveBytes = interleaveBytes_c; deinterleaveBytes = deinterleaveBytes_c; vu9_to_vu12 = vu9_to_vu12_c; diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index e10f967755..ff682d367c 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -1671,6 +1671,74 @@ static int rgb24ToYv12Wrapper(SwsContext *c, const uint8_t *src[], return srcSliceH; } +static int bgrxToYv12Wrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, int srcSliceH, + uint8_t *dst[], int dstStride[]) +{ + ff_bgrxtoyv12( + src[0], + dst[0] + srcSliceY * dstStride[0], + dst[1] + (srcSliceY >> 1) * dstStride[1], + dst[2] + (srcSliceY >> 1) * dstStride[2], + c->srcW, srcSliceH, + dstStride[0], dstStride[1], srcStride[0], + c->input_rgb2yuv_table); + if (dst[3]) + fillPlane(dst[3], dstStride[3], c->srcW, srcSliceH, srcSliceY, 255); + return srcSliceH; +} + +static int rgbxToYv12Wrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, int srcSliceH, + uint8_t *dst[], int dstStride[]) +{ + ff_rgbxtoyv12( + src[0], 
+ dst[0] + srcSliceY * dstStride[0], + dst[1] + (srcSliceY >> 1) * dstStride[1], + dst[2] + (srcSliceY >> 1) * dstStride[2], + c->srcW, srcSliceH, + dstStride[0], dstStride[1], srcStride[0], + c->input_rgb2yuv_table); + if (dst[3]) + fillPlane(dst[3], dstStride[3], c->srcW, srcSliceH, srcSliceY, 255); + return srcSliceH; +} + +static int xbgrToYv12Wrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, int srcSliceH, + uint8_t *dst[], int dstStride[]) +{ + ff_xbgrtoyv12( + src[0], + dst[0] + srcSliceY * dstStride[0], + dst[1] + (srcSliceY >> 1) * dstStride[1], + dst[2] + (srcSliceY >> 1) * dstStride[2], + c->srcW, srcSliceH, + dstStride[0], dstStride[1], srcStride[0], + c->input_rgb2yuv_table); + if (dst[3]) + fillPlane(dst[3], dstStride[3], c->srcW, srcSliceH, srcSliceY, 255); + return srcSliceH; +} + +static int xrgbToYv12Wrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, int srcSliceH, + uint8_t *dst[], int dstStride[]) +{ + ff_xrgbtoyv12( + src[0], + dst[0] + srcSliceY * dstStride[0], + dst[1] + (srcSliceY >> 1) * dstStride[1], + dst[2] + (srcSliceY >> 1) * dstStride[2], + c->srcW, srcSliceH, + dstStride[0], dstStride[1], srcStride[0], + c->input_rgb2yuv_table); + if (dst[3]) + fillPlane(dst[3], dstStride[3], c->srcW, srcSliceH, srcSliceY, 255); + return srcSliceH; +} + static int yvu9ToYv12Wrapper(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dst[], int dstStride[]) @@ -2059,6 +2127,27 @@ void ff_get_unscaled_swscale(SwsContext *c) !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT))) c->convert_unscaled = rgb24ToYv12Wrapper; + /* bgrxtoYV12 */ + if (((srcFormat == AV_PIX_FMT_BGRA && dstFormat == AV_PIX_FMT_YUV420P) || + (srcFormat == AV_PIX_FMT_BGR0 && (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P))) && + !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT))) + c->convert_unscaled = bgrxToYv12Wrapper; + /* rgbx24toYV12 */ + if (((srcFormat == 
AV_PIX_FMT_RGBA && dstFormat == AV_PIX_FMT_YUV420P) ||
+         (srcFormat == AV_PIX_FMT_RGB0 && (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P))) &&
+        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)))
+        c->convert_unscaled = rgbxToYv12Wrapper;
+    /* xbgrtoYV12 */
+    if (((srcFormat == AV_PIX_FMT_ABGR && dstFormat == AV_PIX_FMT_YUV420P) ||
+         (srcFormat == AV_PIX_FMT_0BGR && (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P))) &&
+        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)))
+        c->convert_unscaled = xbgrToYv12Wrapper;
+    /* xrgb24toYV12 */
+    if (((srcFormat == AV_PIX_FMT_ARGB && dstFormat == AV_PIX_FMT_YUV420P) ||
+         (srcFormat == AV_PIX_FMT_0RGB && (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P))) &&
+        !(flags & (SWS_ACCURATE_RND | SWS_BITEXACT)))
+        c->convert_unscaled = xrgbToYv12Wrapper;
+
     /* RGB/BGR -> RGB/BGR (no dither needed forms) */
     if (isAnyRGB(srcFormat) && isAnyRGB(dstFormat) && findRgbConvFn(c) &&
         (!needsDither || (c->flags&(SWS_FAST_BILINEAR|SWS_POINT))))

From patchwork Sun Aug 20 15:10:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Cox
X-Patchwork-Id: 43277
From: John Cox
To: ffmpeg-devel@ffmpeg.org
Cc: John Cox
Date: Sun, 20 Aug 2023 15:10:22 +0000
Message-Id: <20230820151022.2204421-7-jc@kynesim.co.uk>
In-Reply-To: <20230820151022.2204421-1-jc@kynesim.co.uk>
References: <20230820151022.2204421-1-jc@kynesim.co.uk>
Subject: [FFmpeg-devel] [PATCH v1 6/6] swscale: Add aarch64 functions for RGB24->YUV420P

Neon RGB24->YUV420P and BGR24->YUV420P functions.
Works on 16 pixel blocks and can do any width or height, though for
widths less than 32 or so the C is likely faster.

Signed-off-by: John Cox
---
 libswscale/aarch64/rgb2rgb.c      |   8 +
 libswscale/aarch64/rgb2rgb_neon.S | 356 ++++++++++++++++++++++++++++++
 2 files changed, 364 insertions(+)

diff --git a/libswscale/aarch64/rgb2rgb.c b/libswscale/aarch64/rgb2rgb.c
index a9bf6ff9e0..b2d68c1df3 100644
--- a/libswscale/aarch64/rgb2rgb.c
+++ b/libswscale/aarch64/rgb2rgb.c
@@ -30,6 +30,12 @@
 void ff_interleave_bytes_neon(const uint8_t *src1, const uint8_t *src2,
                               uint8_t *dest, int width, int height,
                               int src1Stride, int src2Stride, int dstStride);
+void ff_bgr24toyv12_neon(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+                         uint8_t *vdst, int width, int height, int lumStride,
+                         int chromStride, int srcStride, int32_t *rgb2yuv);
+void ff_rgb24toyv12_neon(const uint8_t *src, uint8_t *ydst, uint8_t *udst,
+                         uint8_t *vdst, int width, int height, int lumStride,
+                         int chromStride, int srcStride, int32_t *rgb2yuv);
 
 av_cold void rgb2rgb_init_aarch64(void)
 {
@@ -37,5 +43,7 @@ av_cold void rgb2rgb_init_aarch64(void)
 
     if (have_neon(cpu_flags)) {
         interleaveBytes = ff_interleave_bytes_neon;
+        ff_rgb24toyv12 = ff_rgb24toyv12_neon;
+        ff_bgr24toyv12 = ff_bgr24toyv12_neon;
     }
 }
diff --git a/libswscale/aarch64/rgb2rgb_neon.S b/libswscale/aarch64/rgb2rgb_neon.S
index d81110ec57..b15e69a3bd 100644
--- a/libswscale/aarch64/rgb2rgb_neon.S
+++ b/libswscale/aarch64/rgb2rgb_neon.S
@@ -77,3 +77,359 @@ function ff_interleave_bytes_neon, export=1
 0:
         ret
 endfunc
+
+// Expand rgb2 into r0+r1/g0+g1/b0+b1
+.macro XRGB3Y r0, g0, b0, r1, g1, b1, r2, g2, b2
+        uxtl        \r0\().8h, \r2\().8b
+        uxtl        \g0\().8h, \g2\().8b
+        uxtl        \b0\().8h, \b2\().8b
+
+        uxtl2       \r1\().8h, \r2\().16b
+        uxtl2       \g1\().8h, \g2\().16b
+        uxtl2       \b1\().8h, \b2\().16b
+.endm
+
+// Expand rgb2 into r0+r1/g0+g1/b0+b1
+// and pick every other el to put back into rgb2 for chroma
+.macro XRGB3YC r0, g0, b0, r1, g1, b1, r2, g2, b2
+        XRGB3Y      \r0, \g0, \b0, \r1, \g1, \b1, \r2, \g2, \b2
+
+        bic         \r2\().8h, #0xff, LSL #8
+        bic         \g2\().8h, #0xff, LSL #8
+        bic         \b2\().8h, #0xff, LSL #8
+.endm
+
+.macro SMLAL3 d0, d1, s0, s1, s2, c0, c1, c2
+        smull       \d0\().4s, \s0\().4h, \c0
+        smlal       \d0\().4s, \s1\().4h, \c1
+        smlal       \d0\().4s, \s2\().4h, \c2
+        smull2      \d1\().4s, \s0\().8h, \c0
+        smlal2      \d1\().4s, \s1\().8h, \c1
+        smlal2      \d1\().4s, \s2\().8h, \c2
+.endm
+
+// d0 may be s0
+// s0, s2 corrupted
+.macro SHRN_Y d0, s0, s1, s2, s3, k128h
+        shrn        \s0\().4h, \s0\().4s, #12
+        shrn2       \s0\().8h, \s1\().4s, #12
+        add         \s0\().8h, \s0\().8h, \k128h\().8h    // +128 (>> 3 = 16)
+        sqrshrun    \d0\().8b, \s0\().8h, #3
+        shrn        \s2\().4h, \s2\().4s, #12
+        shrn2       \s2\().8h, \s3\().4s, #12
+        add         \s2\().8h, \s2\().8h, \k128h\().8h
+        sqrshrun2   \d0\().16b, \s2\().8h, #3
+.endm
+
+.macro SHRN_C d0, s0, s1, k128b
+        shrn        \s0\().4h, \s0\().4s, #14
+        shrn2       \s0\().8h, \s1\().4s, #14
+        sqrshrn     \s0\().8b, \s0\().8h, #1
+        add         \d0\().8b, \s0\().8b, \k128b\().8b    // +128
+.endm
+
+.macro STB2V s0, n, a
+        st1         {\s0\().b}[(\n+0)], [\a], #1
+        st1         {\s0\().b}[(\n+1)], [\a], #1
+.endm
+
+.macro STB4V s0, n, a
+        STB2V       \s0, (\n+0), \a
+        STB2V       \s0, (\n+2), \a
+.endm
+
+
+// void ff_bgr24toyv12_neon(
+//     const uint8_t *src,     // x0
+//     uint8_t *ydst,          // x1
+//     uint8_t *udst,          // x2
+//     uint8_t *vdst,          // x3
+//     int width,              // w4
+//     int height,             // w5
+//     int lumStride,          // w6
+//     int chromStride,        // w7
+//     int srcStride,          // [sp, #0]
+//     int32_t *rgb2yuv);      // [sp, #8]
+
+function ff_bgr24toyv12_neon, export=1
+        ldr         x15, [sp, #8]
+        ld3         {v3.s, v4.s, v5.s}[0], [x15], #12
+        ld3         {v3.s, v4.s, v5.s}[1], [x15], #12
+        ld3         {v3.s, v4.s, v5.s}[2], [x15]
+        mov         v6.16b, v3.16b
+        mov         v3.16b, v5.16b
+        mov         v5.16b, v6.16b
+        b           99f
+endfunc
+
+// void ff_rgb24toyv12_neon(
+//     const uint8_t *src,     // x0
+//     uint8_t *ydst,          // x1
+//     uint8_t *udst,          // x2
+//     uint8_t *vdst,          // x3
+//     int width,              // w4
+//     int height,             // w5
+//     int lumStride,          // w6
+//     int chromStride,        // w7
+//     int srcStride,          // [sp, #0]
+//     int32_t *rgb2yuv);      // [sp, #8] (including Mac)
+
+// regs
+// v0-2         Src bytes - reused as chroma src
+// v3-5         Coeffs (packed very inefficiently - could be squashed)
+// v6           128h
+// v7           128b
+// v8-15        Reserved
+// v16-18       Lo Src expanded as H
+// v19          -
+// v20-22       Hi Src expanded as H
+// v23          -
+// v24          U out
+// v25          U tmp
+// v26          Y out
+// v27-29       Y tmp
+// v30          V out
+// v31          V tmp
+
+function ff_rgb24toyv12_neon, export=1
+        ldr         x15, [sp, #8]
+        ld3         {v3.s, v4.s, v5.s}[0], [x15], #12
+        ld3         {v3.s, v4.s, v5.s}[1], [x15], #12
+        ld3         {v3.s, v4.s, v5.s}[2], [x15]
+
+99:
+        ldr         w14, [sp, #0]
+        movi        v7.8b, #128
+        uxtl        v6.8h, v7.8b
+        // Ensure if nothing to do then we do nothing
+        cmp         w4, #0
+        b.le        90f
+        cmp         w5, #0
+        b.le        90f
+        // If w % 16 != 0 then -16 so we do main loop 1 fewer times with
+        // the remainder done in the tail
+        tst         w4, #15
+        b.eq        1f
+        sub         w4, w4, #16
+1:
+
+// -------------------- Even line body - YUV
+11:
+        subs        w9, w4, #0
+        mov         x10, x0
+        mov         x11, x1
+        mov         x12, x2
+        mov         x13, x3
+        b.lt        12f
+
+        ld3         {v0.16b, v1.16b, v2.16b}, [x10], #48
+        subs        w9, w9, #16
+        b.le        13f
+
+10:
+        XRGB3YC     v16, v17, v18, v20, v21, v22, v0, v1, v2
+
+        // Testing shows it is faster to stack the smull/smlal ops together
+        // rather than interleave them between channels and indeed even the
+        // shift/add sections seem happier not interleaved
+
+        // Y0
+        SMLAL3      v26, v27, v16, v17, v18, v3.h[0], v4.h[0], v5.h[0]
+        // Y1
+        SMLAL3      v28, v29, v20, v21, v22, v3.h[0], v4.h[0], v5.h[0]
+        SHRN_Y      v26, v26, v27, v28, v29, v6
+
+        // U
+        // Vector subscript *2 as we loaded into S but are only using H
+        SMLAL3      v24, v25, v0, v1, v2, v3.h[2], v4.h[2], v5.h[2]
+
+        // V
+        SMLAL3      v30, v31, v0, v1, v2, v3.h[4], v4.h[4], v5.h[4]
+
+        ld3         {v0.16b, v1.16b, v2.16b}, [x10], #48
+
+        SHRN_C      v24, v24, v25, v7
+        SHRN_C      v30, v30, v31, v7
+
+        subs        w9, w9, #16
+
+        st1         {v26.16b}, [x11], #16
+        st1         {v24.8b}, [x12], #8
+        st1         {v30.8b}, [x13], #8
+
+        b.gt        10b
+
+// -------------------- Even line tail - YUV
+// If width % 16 == 0 then simply runs once with preloaded RGB
+// If other then deals with preload & then does remaining tail
+
+13:
+        // Body is simple copy of main loop body minus preload
+
+        XRGB3YC     v16, v17, v18, v20, v21, v22, v0, v1, v2
+        // Y0
+        SMLAL3      v26, v27, v16, v17, v18, v3.h[0], v4.h[0], v5.h[0]
+        // Y1
+        SMLAL3      v28, v29, v20, v21, v22, v3.h[0], v4.h[0], v5.h[0]
+        SHRN_Y      v26, v26, v27, v28, v29, v6
+        // U
+        SMLAL3      v24, v25, v0, v1, v2, v3.h[2], v4.h[2], v5.h[2]
+        // V
+        SMLAL3      v30, v31, v0, v1, v2, v3.h[4], v4.h[4], v5.h[4]
+
+        cmp         w9, #-16
+
+        SHRN_C      v24, v24, v25, v7
+        SHRN_C      v30, v30, v31, v7
+
+        // Here:
+        //   w9 == 0      width % 16 == 0, tail done
+        //   w9 > -16     1st tail done (16 pels), remainder still to go
+        //   w9 == -16    shouldn't happen
+        //   w9 > -32     2nd tail done
+        //   w9 <= -32    shouldn't happen
+
+        b.lt        2f
+        st1         {v26.16b}, [x11], #16
+        st1         {v24.8b}, [x12], #8
+        st1         {v30.8b}, [x13], #8
+        cbz         w9, 3f
+
+12:
+        sub         w9, w9, #16
+
+        tbz         w9, #3, 1f
+        ld3         {v0.8b, v1.8b, v2.8b}, [x10], #24
+1:      tbz         w9, #2, 1f
+        ld3         {v0.b, v1.b, v2.b}[8], [x10], #3
+        ld3         {v0.b, v1.b, v2.b}[9], [x10], #3
+        ld3         {v0.b, v1.b, v2.b}[10], [x10], #3
+        ld3         {v0.b, v1.b, v2.b}[11], [x10], #3
+1:      tbz         w9, #1, 1f
+        ld3         {v0.b, v1.b, v2.b}[12], [x10], #3
+        ld3         {v0.b, v1.b, v2.b}[13], [x10], #3
+1:      tbz         w9, #0, 13b
+        ld3         {v0.b, v1.b, v2.b}[14], [x10], #3
+        b           13b
+
+2:
+        tbz         w9, #3, 1f
+        st1         {v26.8b}, [x11], #8
+        STB4V       v24, 0, x12
+        STB4V       v30, 0, x13
+1:      tbz         w9, #2, 1f
+        STB4V       v26, 8, x11
+        STB2V       v24, 4, x12
+        STB2V       v30, 4, x13
+1:      tbz         w9, #1, 1f
+        STB2V       v26, 12, x11
+        st1         {v24.b}[6], [x12], #1
+        st1         {v30.b}[6], [x13], #1
+1:      tbz         w9, #0, 1f
+        st1         {v26.b}[14], [x11]
+        st1         {v24.b}[7], [x12]
+        st1         {v30.b}[7], [x13]
+1:
+3:
+
+// -------------------- Odd line body - Y only
+
+        subs        w5, w5, #1
+        b.eq        90f
+
+        subs        w9, w4, #0
+        add         x0, x0, w14, sxtw
+        add         x1, x1, w6, sxtw
+        mov         x10, x0
+        mov         x11, x1
+        b.lt        12f
+
+        ld3         {v0.16b, v1.16b, v2.16b}, [x10], #48
+        subs        w9, w9, #16
+        b.le        13f
+
+10:
+        XRGB3Y      v16, v17, v18, v20, v21, v22, v0, v1, v2
+        // Y0
+        SMLAL3      v26, v27, v16, v17, v18, v3.h[0], v4.h[0], v5.h[0]
+        // Y1
+        SMLAL3      v28, v29, v20, v21, v22, v3.h[0], v4.h[0], v5.h[0]
+
+        ld3         {v0.16b, v1.16b, v2.16b}, [x10], #48
+
+        SHRN_Y      v26, v26, v27, v28, v29, v6
+
+        subs        w9, w9, #16
+
+        st1         {v26.16b}, [x11], #16
+
+        b.gt        10b
+
+// -------------------- Odd line tail - Y
+// If width % 16 == 0 then simply runs once with preloaded RGB
+// If other then deals with preload & then does remaining tail
+
+13:
+        // Body is simple copy of main loop body minus preload
+
+        XRGB3Y      v16, v17, v18, v20, v21, v22, v0, v1, v2
+        // Y0
+        SMLAL3      v26, v27, v16, v17, v18, v3.h[0], v4.h[0], v5.h[0]
+        // Y1
+        SMLAL3      v28, v29, v20, v21, v22, v3.h[0], v4.h[0], v5.h[0]
+
+        cmp         w9, #-16
+
+        SHRN_Y      v26, v26, v27, v28, v29, v6
+
+        // Here:
+        //   w9 == 0      width % 16 == 0, tail done
+        //   w9 > -16     1st tail done (16 pels), remainder still to go
+        //   w9 == -16    shouldn't happen
+        //   w9 > -32     2nd tail done
+        //   w9 <= -32    shouldn't happen
+
+        b.lt        2f
+        st1         {v26.16b}, [x11], #16
+        cbz         w9, 3f
+
+12:
+        sub         w9, w9, #16
+
+        tbz         w9, #3, 1f
+        ld3         {v0.8b, v1.8b, v2.8b}, [x10], #24
+1:      tbz         w9, #2, 1f
+        ld3         {v0.b, v1.b, v2.b}[8], [x10], #3
+        ld3         {v0.b, v1.b, v2.b}[9], [x10], #3
+        ld3         {v0.b, v1.b, v2.b}[10], [x10], #3
+        ld3         {v0.b, v1.b, v2.b}[11], [x10], #3
+1:      tbz         w9, #1, 1f
+        ld3         {v0.b, v1.b, v2.b}[12], [x10], #3
+        ld3         {v0.b, v1.b, v2.b}[13], [x10], #3
+1:      tbz         w9, #0, 13b
+        ld3         {v0.b, v1.b, v2.b}[14], [x10], #3
+        b           13b
+
+2:
+        tbz         w9, #3, 1f
+        st1         {v26.8b}, [x11], #8
+1:      tbz         w9, #2, 1f
+        STB4V       v26, 8, x11
+1:      tbz         w9, #1, 1f
+        STB2V       v26, 12, x11
+1:      tbz         w9, #0, 1f
+        st1         {v26.b}[14], [x11]
+1:
+3:
+
+// ------------------- Loop to start
+
+        add         x0, x0, w14, sxtw
+        add         x1, x1, w6, sxtw
+        add         x2, x2, w7, sxtw
+        add         x3, x3, w7, sxtw
+        subs        w5, w5, #1
+        b.gt        11b
+90:
+        ret
+endfunc
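
For anyone checking the rounding in SHRN_Y and SHRN_C against a scalar build, the following is a rough C model of what those two macros do to one 32-bit accumulator. It is only a sketch and not part of the patch: the names ref_luma, ref_chroma and ref_selfcheck are made up for illustration, the coefficients in the self-check (8414, 16519, 3208) are assumed BT.601 limited-range values at a 15-bit scale rather than anything read from the rgb2yuv table, and it relies on >> of a negative int being an arithmetic shift, which is what common compilers do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SHRN_Y.  acc is ry*r + gy*g + by*b with the
 * coefficients assumed to be scaled by 1 << 15. */
static uint8_t ref_luma(int32_t acc)
{
    int32_t t = acc >> 12;        /* shrn #12: truncating narrow to 16 bits */
    int32_t r;

    t += 128;                     /* +128 here is +16 after the final >> 3 */
    r = (t + 4) >> 3;             /* sqrshrun #3: round to nearest ... */
    if (r < 0)                    /* ... then saturate to unsigned 8 bits */
        r = 0;
    if (r > 255)
        r = 255;
    return (uint8_t)r;
}

/* Hypothetical scalar model of SHRN_C for one U or V accumulator. */
static uint8_t ref_chroma(int32_t acc)
{
    int32_t t = acc >> 14;        /* shrn #14 */
    int32_t r = (t + 1) >> 1;     /* sqrshrn #1: round to nearest ... */

    if (r < -128)                 /* ... then saturate to signed 8 bits */
        r = -128;
    if (r > 127)
        r = 127;
    return (uint8_t)(r + 128);    /* byte add of the 128 bias */
}

/* Spot checks using the assumed coefficients: black and white pixels for
 * luma, zero and +/-112 (in 15-bit scale) for chroma. */
static void ref_selfcheck(void)
{
    assert(ref_luma(0) == 16);
    assert(ref_luma(255 * (8414 + 16519 + 3208)) == 235);
    assert(ref_chroma(0) == 128);
    assert(ref_chroma(112 * 32768) == 240);
    assert(ref_chroma(-112 * 32768) == 16);
}
```

Feeding the same accumulators through both this model and the Neon path should make any rounding mismatch between the two easy to spot; it says nothing about which source pixels feed the chroma sums.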