From patchwork Mon Sep 23 12:40:03 2024
X-Patchwork-Submitter: Ramiro Polla
X-Patchwork-Id: 35183
From: Ramiro Polla
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 23 Sep 2024 14:40:03 +0200
Message-Id: <20240923124017.33659-1-ramiro.polla@gmail.com>
Subject: [FFmpeg-devel] [PATCH 00/14] swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats

There is an issue with the constants used in YUV to YUV range conversion, where the upper bound is not respected when converting to mpeg range.

With this patchset, the constants are calculated at runtime, depending on the bit depth. This approach also allows us to more easily understand how the constants are derived.

NOTE: simd optimizations for x86 and aarch64 have been updated, but riscv and loongarch are still missing (and therefore disabled).

NOTE2: the same issue still exists in rgb2yuv conversions, which is not addressed in this patchset.
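As a rough illustration of deriving the limits from the bit depth, the sketch below maps full-range luma to mpeg range. It assumes the usual convention that mpeg-range luma spans 16..235 at 8 bits and that both limits scale by a left shift at higher depths. The names RangeBounds, range_bounds and luma_full_to_limited are hypothetical and do not exist in libswscale, and this is not the code from the patches, only a reference computation using a plain rounded division.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration only (not a libswscale type). */
typedef struct RangeBounds {
    int full_max; /* largest full-range code value: (1 << depth) - 1 */
    int luma_min; /* mpeg-range black level:  16 << (depth - 8)      */
    int luma_max; /* mpeg-range white level: 235 << (depth - 8)      */
} RangeBounds;

/* Derive the bounds at runtime from the bit depth (assumed 8..16). */
static RangeBounds range_bounds(int depth)
{
    RangeBounds b;

    assert(depth >= 8 && depth <= 16);
    b.full_max = (1 << depth) - 1;
    b.luma_min = 16  << (depth - 8);
    b.luma_max = 235 << (depth - 8);
    return b;
}

/* Map a full-range luma sample to mpeg range with round-to-nearest.
 * The 64-bit intermediate avoids overflow at 16 bits per sample. */
static int luma_full_to_limited(int y, int depth)
{
    RangeBounds b = range_bounds(depth);
    int64_t span  = b.luma_max - b.luma_min;

    return b.luma_min + (int)((y * span + b.full_max / 2) / b.full_max);
}
```

With bounds derived this way, the largest full-range input lands exactly on the mpeg-range upper bound at every depth, for example 1023 maps to 940 at 10 bits, which is the property the cover letter says the old constants did not respect.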

Ramiro Polla (14):
  swscale/range_convert: call arch-specific init functions from main init function
  swscale/range_convert: drop redundant conditionals from arch-specific init functions
  swscale/range_convert: indent after previous commit
  checkasm: use FF_ARRAY_ELEMS instead of hardcoding size of arrays
  checkasm/sw_range_convert: use YUV pixel formats instead of YUVJ
  checkasm/sw_range_convert: reduce number of input sizes tested
  checkasm/sw_range_convert: only run benchmarks on largest input width
  checkasm/sw_range_convert: test all supported bit depths
  checkasm/sw_range_convert: indent after previous couple of commits
  swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats
  swscale/x86/range_convert: update sse2 and avx2 range_convert functions to new API
  swscale/x86: add sse2, sse4, and avx2 {lum,chr}ConvertRange16
  swscale/aarch64/range_convert: update neon range_convert functions to new API
  swscale/aarch64: add neon {lum,chr}ConvertRange16

 libswscale/aarch64/range_convert_neon.S | 141 +++++++++---
 libswscale/aarch64/swscale.c | 41 +++-
 libswscale/hscale.c | 8 +-
 libswscale/loongarch/swscale_init_loongarch.c | 35 ++-
 libswscale/riscv/swscale.c | 12 +-
 libswscale/swscale.c | 119 +++++++++--
 libswscale/swscale_internal.h | 13 +-
 libswscale/utils.c | 10 +-
 libswscale/x86/range_convert.asm | 201 ++++++++++++++----
 libswscale/x86/swscale.c | 56 +++--
 tests/checkasm/sw_gbrp.c | 15 +-
 tests/checkasm/sw_range_convert.c | 192 ++++++++++++-----
 tests/checkasm/sw_scale.c | 11 +-
 .../fate/filter-alphaextract_alphamerge_rgb | 100 ++++-----
 tests/ref/fate/filter-pixdesc-gray10be | 2 +-
 tests/ref/fate/filter-pixdesc-gray10le | 2 +-
 tests/ref/fate/filter-pixdesc-gray12be | 2 +-
 tests/ref/fate/filter-pixdesc-gray12le | 2 +-
 tests/ref/fate/filter-pixdesc-gray14be | 2 +-
 tests/ref/fate/filter-pixdesc-gray14le | 2 +-
 tests/ref/fate/filter-pixdesc-gray16be | 2 +-
 tests/ref/fate/filter-pixdesc-gray16le | 2 +-
 tests/ref/fate/filter-pixdesc-gray9be | 2 +-
 tests/ref/fate/filter-pixdesc-gray9le | 2 +-
 tests/ref/fate/filter-pixdesc-ya16be | 2 +-
 tests/ref/fate/filter-pixdesc-ya16le | 2 +-
 tests/ref/fate/filter-pixdesc-yuvj411p | 2 +-
 tests/ref/fate/filter-pixdesc-yuvj420p | 2 +-
 tests/ref/fate/filter-pixdesc-yuvj422p | 2 +-
 tests/ref/fate/filter-pixdesc-yuvj440p | 2 +-
 tests/ref/fate/filter-pixdesc-yuvj444p | 2 +-
 tests/ref/fate/filter-pixfmts-copy | 34 +--
 tests/ref/fate/filter-pixfmts-crop | 34 +--
 tests/ref/fate/filter-pixfmts-field | 34 +--
 tests/ref/fate/filter-pixfmts-fieldorder | 30 +--
 tests/ref/fate/filter-pixfmts-hflip | 34 +--
 tests/ref/fate/filter-pixfmts-il | 34 +--
 tests/ref/fate/filter-pixfmts-lut | 18 +-
 tests/ref/fate/filter-pixfmts-null | 34 +--
 tests/ref/fate/filter-pixfmts-pad | 22 +-
 tests/ref/fate/filter-pixfmts-pullup | 10 +-
 tests/ref/fate/filter-pixfmts-rotate | 4 +-
 tests/ref/fate/filter-pixfmts-scale | 34 +--
 tests/ref/fate/filter-pixfmts-swapuv | 10 +-
 .../ref/fate/filter-pixfmts-tinterlace_cvlpf | 8 +-
 .../ref/fate/filter-pixfmts-tinterlace_merge | 8 +-
 tests/ref/fate/filter-pixfmts-tinterlace_pad | 8 +-
 tests/ref/fate/filter-pixfmts-tinterlace_vlpf | 8 +-
 tests/ref/fate/filter-pixfmts-transpose | 28 +--
 tests/ref/fate/filter-pixfmts-vflip | 34 +--
 tests/ref/fate/fitsenc-gray | 2 +-
 tests/ref/fate/fitsenc-gray16be | 10 +-
 tests/ref/fate/gifenc-gray | 186 ++++++++--------
 tests/ref/fate/idroq-video-encode | 2 +-
 tests/ref/fate/jpg-icc | 8 +-
 tests/ref/fate/sws-yuv-colorspace | 2 +-
 tests/ref/fate/sws-yuv-range | 2 +-
 tests/ref/fate/vvc-conformance-SCALING_A_1 | 128 +++++------
 tests/ref/lavf/gray16be.fits | 4 +-
 tests/ref/lavf/gray16be.pam | 4 +-
 tests/ref/lavf/gray16be.png | 6 +-
 tests/ref/lavf/jpg | 6 +-
 tests/ref/lavf/smjpeg | 6 +-
 tests/ref/pixfmt/yuvj420p | 2 +-
 tests/ref/pixfmt/yuvj422p | 2 +-
 tests/ref/pixfmt/yuvj440p | 2 +-
 tests/ref/pixfmt/yuvj444p | 2 +-
 tests/ref/seek/lavf-jpg | 8 +-
 tests/ref/seek/vsynth_lena-mjpeg | 40 ++--
 tests/ref/seek/vsynth_lena-roqvideo | 2 +-
 tests/ref/vsynth/vsynth1-amv | 8 +-
 tests/ref/vsynth/vsynth1-mjpeg | 6 +-
 tests/ref/vsynth/vsynth1-mjpeg-422 | 6 +-
 tests/ref/vsynth/vsynth1-mjpeg-444 | 6 +-
 tests/ref/vsynth/vsynth1-mjpeg-huffman | 6 +-
 tests/ref/vsynth/vsynth1-mjpeg-trell | 8 +-
 tests/ref/vsynth/vsynth1-mjpeg-trell-huffman | 8 +-
 tests/ref/vsynth/vsynth1-roqvideo | 8 +-
 tests/ref/vsynth/vsynth2-amv | 6 +-
 tests/ref/vsynth/vsynth2-mjpeg | 6 +-
 tests/ref/vsynth/vsynth2-mjpeg-422 | 6 +-
 tests/ref/vsynth/vsynth2-mjpeg-444 | 6 +-
 tests/ref/vsynth/vsynth2-mjpeg-huffman | 6 +-
 tests/ref/vsynth/vsynth2-mjpeg-trell | 8 +-
 tests/ref/vsynth/vsynth2-mjpeg-trell-huffman | 8 +-
 tests/ref/vsynth/vsynth2-roqvideo | 8 +-
 tests/ref/vsynth/vsynth3-amv | 8 +-
 tests/ref/vsynth/vsynth3-mjpeg | 8 +-
 tests/ref/vsynth/vsynth3-mjpeg-422 | 8 +-
 tests/ref/vsynth/vsynth3-mjpeg-444 | 6 +-
 tests/ref/vsynth/vsynth3-mjpeg-huffman | 8 +-
 tests/ref/vsynth/vsynth3-mjpeg-trell | 6 +-
 tests/ref/vsynth/vsynth3-mjpeg-trell-huffman | 6 +-
 tests/ref/vsynth/vsynth_lena-amv | 6 +-
 tests/ref/vsynth/vsynth_lena-mjpeg | 8 +-
 tests/ref/vsynth/vsynth_lena-mjpeg-422 | 6 +-
 tests/ref/vsynth/vsynth_lena-mjpeg-444 | 6 +-
 tests/ref/vsynth/vsynth_lena-mjpeg-huffman | 8 +-
 tests/ref/vsynth/vsynth_lena-mjpeg-trell | 8 +-
 .../vsynth/vsynth_lena-mjpeg-trell-huffman | 8 +-
 tests/ref/vsynth/vsynth_lena-roqvideo | 8 +-
 101 files changed, 1229 insertions(+), 827 deletions(-)