[FFmpeg-devel,00/14] swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats

Message ID 20240923124017.33659-1-ramiro.polla@gmail.com

Ramiro Polla Sept. 23, 2024, 12:40 p.m. UTC
There is an issue with the constants used in YUV to YUV range conversion:
the upper bound is not respected when converting to mpeg (limited) range.

With this patchset, the constants are calculated at runtime, depending on
the bit depth. This approach also allows us to more easily understand how
the constants are derived.
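
To illustrate the idea of deriving the constants from the bit depth, here is
a minimal sketch for the luma full-range to mpeg-range case. It is not the
code from this patchset: the names RangeCoeffs, luma_to_mpeg_coeffs and
luma_to_mpeg are hypothetical, the 18-bit fixed-point shift is an arbitrary
choice, and the sketch works directly on sample values, whereas the real code
may operate on a different intermediate representation. It assumes the usual
limits of 16..235 scaled by the bit depth for mpeg-range luma.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the fixed-point constants (not a libswscale type). */
typedef struct RangeCoeffs {
    int64_t coeff;  /* scale factor in fixed point */
    int64_t offset; /* black-level offset plus rounding term, in fixed point */
    int     shift;  /* number of fractional bits */
} RangeCoeffs;

/* Derive the constants for full range -> mpeg range luma at a given bit depth. */
static RangeCoeffs luma_to_mpeg_coeffs(int bit_depth)
{
    assert(bit_depth >= 8 && bit_depth <= 16);

    const int     shift    = 18; /* assumption: enough precision up to 16 bits */
    const int64_t jpeg_max = (1 << bit_depth) - 1;   /* full-range white */
    const int64_t mpeg_min = 16  << (bit_depth - 8); /* mpeg-range black */
    const int64_t mpeg_max = 235 << (bit_depth - 8); /* mpeg-range white */

    RangeCoeffs c;
    c.shift  = shift;
    /* Integer division rounds down, so the scaled value cannot overshoot
     * mpeg_max when the input is jpeg_max. */
    c.coeff  = ((mpeg_max - mpeg_min) << shift) / jpeg_max;
    /* Add half an output step so the final shift rounds to nearest. */
    c.offset = (mpeg_min << shift) + ((int64_t)1 << (shift - 1));
    return c;
}

/* Apply the conversion to one full-range luma sample. */
static int luma_to_mpeg(int v, RangeCoeffs c)
{
    return (int)((v * c.coeff + c.offset) >> c.shift);
}
```

With these constants, a full-range input of 0 maps to 16 << (bit_depth - 8)
and an input of (1 << bit_depth) - 1 maps to 235 << (bit_depth - 8), so both
bounds of the mpeg range are respected at every supported bit depth. The
chroma case would follow the same pattern with 240 as the upper limit.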

NOTE: SIMD optimizations for x86 and aarch64 have been updated, but those
      for riscv and loongarch have not been updated yet (and are therefore
      disabled).

NOTE2: the same issue still exists in rgb2yuv conversions; it is not
       addressed in this patchset.

Ramiro Polla (14):
  swscale/range_convert: call arch-specific init functions from main
    init function
  swscale/range_convert: drop redundant conditionals from arch-specific
    init functions
  swscale/range_convert: indent after previous commit
  checkasm: use FF_ARRAY_ELEMS instead of hardcoding size of arrays
  checkasm/sw_range_convert: use YUV pixel formats instead of YUVJ
  checkasm/sw_range_convert: reduce number of input sizes tested
  checkasm/sw_range_convert: only run benchmarks on largest input width
  checkasm/sw_range_convert: test all supported bit depths
  checkasm/sw_range_convert: indent after previous couple of commits
  swscale/range_convert: fix mpeg ranges in yuv range conversion for
    non-8-bit pixel formats
  swscale/x86/range_convert: update sse2 and avx2 range_convert
    functions to new API
  swscale/x86: add sse2, sse4, and avx2 {lum,chr}ConvertRange16
  swscale/aarch64/range_convert: update neon range_convert functions to
    new API
  swscale/aarch64: add neon {lum,chr}ConvertRange16

 libswscale/aarch64/range_convert_neon.S       | 141 +++++++++---
 libswscale/aarch64/swscale.c                  |  41 +++-
 libswscale/hscale.c                           |   8 +-
 libswscale/loongarch/swscale_init_loongarch.c |  35 ++-
 libswscale/riscv/swscale.c                    |  12 +-
 libswscale/swscale.c                          | 119 +++++++++--
 libswscale/swscale_internal.h                 |  13 +-
 libswscale/utils.c                            |  10 +-
 libswscale/x86/range_convert.asm              | 201 ++++++++++++++----
 libswscale/x86/swscale.c                      |  56 +++--
 tests/checkasm/sw_gbrp.c                      |  15 +-
 tests/checkasm/sw_range_convert.c             | 192 ++++++++++++-----
 tests/checkasm/sw_scale.c                     |  11 +-
 .../fate/filter-alphaextract_alphamerge_rgb   | 100 ++++-----
 tests/ref/fate/filter-pixdesc-gray10be        |   2 +-
 tests/ref/fate/filter-pixdesc-gray10le        |   2 +-
 tests/ref/fate/filter-pixdesc-gray12be        |   2 +-
 tests/ref/fate/filter-pixdesc-gray12le        |   2 +-
 tests/ref/fate/filter-pixdesc-gray14be        |   2 +-
 tests/ref/fate/filter-pixdesc-gray14le        |   2 +-
 tests/ref/fate/filter-pixdesc-gray16be        |   2 +-
 tests/ref/fate/filter-pixdesc-gray16le        |   2 +-
 tests/ref/fate/filter-pixdesc-gray9be         |   2 +-
 tests/ref/fate/filter-pixdesc-gray9le         |   2 +-
 tests/ref/fate/filter-pixdesc-ya16be          |   2 +-
 tests/ref/fate/filter-pixdesc-ya16le          |   2 +-
 tests/ref/fate/filter-pixdesc-yuvj411p        |   2 +-
 tests/ref/fate/filter-pixdesc-yuvj420p        |   2 +-
 tests/ref/fate/filter-pixdesc-yuvj422p        |   2 +-
 tests/ref/fate/filter-pixdesc-yuvj440p        |   2 +-
 tests/ref/fate/filter-pixdesc-yuvj444p        |   2 +-
 tests/ref/fate/filter-pixfmts-copy            |  34 +--
 tests/ref/fate/filter-pixfmts-crop            |  34 +--
 tests/ref/fate/filter-pixfmts-field           |  34 +--
 tests/ref/fate/filter-pixfmts-fieldorder      |  30 +--
 tests/ref/fate/filter-pixfmts-hflip           |  34 +--
 tests/ref/fate/filter-pixfmts-il              |  34 +--
 tests/ref/fate/filter-pixfmts-lut             |  18 +-
 tests/ref/fate/filter-pixfmts-null            |  34 +--
 tests/ref/fate/filter-pixfmts-pad             |  22 +-
 tests/ref/fate/filter-pixfmts-pullup          |  10 +-
 tests/ref/fate/filter-pixfmts-rotate          |   4 +-
 tests/ref/fate/filter-pixfmts-scale           |  34 +--
 tests/ref/fate/filter-pixfmts-swapuv          |  10 +-
 .../ref/fate/filter-pixfmts-tinterlace_cvlpf  |   8 +-
 .../ref/fate/filter-pixfmts-tinterlace_merge  |   8 +-
 tests/ref/fate/filter-pixfmts-tinterlace_pad  |   8 +-
 tests/ref/fate/filter-pixfmts-tinterlace_vlpf |   8 +-
 tests/ref/fate/filter-pixfmts-transpose       |  28 +--
 tests/ref/fate/filter-pixfmts-vflip           |  34 +--
 tests/ref/fate/fitsenc-gray                   |   2 +-
 tests/ref/fate/fitsenc-gray16be               |  10 +-
 tests/ref/fate/gifenc-gray                    | 186 ++++++++--------
 tests/ref/fate/idroq-video-encode             |   2 +-
 tests/ref/fate/jpg-icc                        |   8 +-
 tests/ref/fate/sws-yuv-colorspace             |   2 +-
 tests/ref/fate/sws-yuv-range                  |   2 +-
 tests/ref/fate/vvc-conformance-SCALING_A_1    | 128 +++++------
 tests/ref/lavf/gray16be.fits                  |   4 +-
 tests/ref/lavf/gray16be.pam                   |   4 +-
 tests/ref/lavf/gray16be.png                   |   6 +-
 tests/ref/lavf/jpg                            |   6 +-
 tests/ref/lavf/smjpeg                         |   6 +-
 tests/ref/pixfmt/yuvj420p                     |   2 +-
 tests/ref/pixfmt/yuvj422p                     |   2 +-
 tests/ref/pixfmt/yuvj440p                     |   2 +-
 tests/ref/pixfmt/yuvj444p                     |   2 +-
 tests/ref/seek/lavf-jpg                       |   8 +-
 tests/ref/seek/vsynth_lena-mjpeg              |  40 ++--
 tests/ref/seek/vsynth_lena-roqvideo           |   2 +-
 tests/ref/vsynth/vsynth1-amv                  |   8 +-
 tests/ref/vsynth/vsynth1-mjpeg                |   6 +-
 tests/ref/vsynth/vsynth1-mjpeg-422            |   6 +-
 tests/ref/vsynth/vsynth1-mjpeg-444            |   6 +-
 tests/ref/vsynth/vsynth1-mjpeg-huffman        |   6 +-
 tests/ref/vsynth/vsynth1-mjpeg-trell          |   8 +-
 tests/ref/vsynth/vsynth1-mjpeg-trell-huffman  |   8 +-
 tests/ref/vsynth/vsynth1-roqvideo             |   8 +-
 tests/ref/vsynth/vsynth2-amv                  |   6 +-
 tests/ref/vsynth/vsynth2-mjpeg                |   6 +-
 tests/ref/vsynth/vsynth2-mjpeg-422            |   6 +-
 tests/ref/vsynth/vsynth2-mjpeg-444            |   6 +-
 tests/ref/vsynth/vsynth2-mjpeg-huffman        |   6 +-
 tests/ref/vsynth/vsynth2-mjpeg-trell          |   8 +-
 tests/ref/vsynth/vsynth2-mjpeg-trell-huffman  |   8 +-
 tests/ref/vsynth/vsynth2-roqvideo             |   8 +-
 tests/ref/vsynth/vsynth3-amv                  |   8 +-
 tests/ref/vsynth/vsynth3-mjpeg                |   8 +-
 tests/ref/vsynth/vsynth3-mjpeg-422            |   8 +-
 tests/ref/vsynth/vsynth3-mjpeg-444            |   6 +-
 tests/ref/vsynth/vsynth3-mjpeg-huffman        |   8 +-
 tests/ref/vsynth/vsynth3-mjpeg-trell          |   6 +-
 tests/ref/vsynth/vsynth3-mjpeg-trell-huffman  |   6 +-
 tests/ref/vsynth/vsynth_lena-amv              |   6 +-
 tests/ref/vsynth/vsynth_lena-mjpeg            |   8 +-
 tests/ref/vsynth/vsynth_lena-mjpeg-422        |   6 +-
 tests/ref/vsynth/vsynth_lena-mjpeg-444        |   6 +-
 tests/ref/vsynth/vsynth_lena-mjpeg-huffman    |   8 +-
 tests/ref/vsynth/vsynth_lena-mjpeg-trell      |   8 +-
 .../vsynth/vsynth_lena-mjpeg-trell-huffman    |   8 +-
 tests/ref/vsynth/vsynth_lena-roqvideo         |   8 +-
 101 files changed, 1229 insertions(+), 827 deletions(-)