[FFmpeg-devel,1/1] libswscale: force a minimum size of the slide for bayer sources

Message ID 20220926161122.1352372-1-chemag@gmail.com
State Accepted
Commit bf64a75c5ae58ed575303f70b2ab9b2208ded339

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Chema Gonzalez Sept. 26, 2022, 4:11 p.m. UTC
Bayer sources are read in groups of 2 lines (e.g. for a
BGGR flavor, the first row contains only B and G samples,
while the second row contains only G and R samples). They
need to be read as a whole.
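
A minimal sketch of the BGGR layout described above, to make the two-row dependency concrete. The helper names `bggr_channel_at()` and `rows_have_all_channels()` are hypothetical and are not part of libswscale; they only model which color each sensor site samples.

```c
/* Hypothetical helper, not a libswscale function: returns the color
 * sampled at column x, row y of a BGGR Bayer mosaic. */
char bggr_channel_at(int x, int y)
{
    if ((y & 1) == 0)                 /* even rows: B G B G ... */
        return (x & 1) == 0 ? 'B' : 'G';
    return (x & 1) == 0 ? 'G' : 'R';  /* odd rows:  G R G R ... */
}

/* Returns 1 if rows [y0, y0 + h) contain at least one sample of
 * each of R, G and B, and 0 otherwise. */
int rows_have_all_channels(int y0, int h, int width)
{
    int seen_r = 0, seen_g = 0, seen_b = 0;

    for (int y = y0; y < y0 + h; y++) {
        for (int x = 0; x < width; x++) {
            switch (bggr_channel_at(x, y)) {
            case 'R': seen_r = 1; break;
            case 'G': seen_g = 1; break;
            case 'B': seen_b = 1; break;
            }
        }
    }
    return seen_r && seen_g && seen_b;
}
```

In this model a single row never contains all three colors, while any two consecutive rows do, which is why a 1-row slice cannot be converted on its own.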

Tested:

```
$ echo -ne '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' > image.raw
$ xxd image.raw
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000030: ffff ffff ffff ffff ffff ffff ffff ffff  ................
```

And then:
```
$ ./ffmpeg -y -f rawvideo -pixel_format bayer_bggr8 -s 8x8 \
    -i image.raw -f rawvideo -pix_fmt rgb24 \
    -video_size 8x8 image.raw.rgb
...
Assertion srcSliceH > 1 failed at libswscale/swscale_unscaled.c:1310
Aborted (core dumped)
```

We can see that the issue is related to ffmpeg's parallelization: forcing
a single filter thread avoids the assertion.
```
$ ffmpeg -y -filter_threads 1 -f rawvideo -pixel_format bayer_bggr8 \
    -s 8x8 -i image.raw -f rawvideo -pix_fmt rgb24 \
    -video_size 8x8 image.raw.rgb
...
frame=    1 fps=0.0 q=-0.0 Lsize=       0kB time=00:00:00.00
bitrate=N/A speed=   0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 0.000000%
$ xxd image.raw.rgb
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 7f00 3f7f 0000 7f00 3f7f 0000 0000 0000  ..?.....?.......
00000060: ffff ffff ffff 7fbf ff7f ffff 7fbf ff7f  ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000a0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000b0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
```

The problem seems to be that `ff_sws_slice_worker()`
[libswscale/swscale.c:1222] slices the input to parallelize
the scaling task, in my case into 16 different jobs (gdb'ing the process
shows `nb_threads == nb_jobs == 16`). The 8x8 input is therefore
divided into eight 8x1 slices (1-pixel height), which eventually breaks
in `bayer_to_rgb24_wrapper()`, as it asserts `srcSliceH > 1`. The
problem is the same in the 3 Bayer conversion functions
(`bayer_to_rgb24_wrapper()`, `bayer_to_rgb48_wrapper()`, and
`bayer_to_yv12_wrapper()`).

The solution was suggested by Anton Khirnov. We set the `dst_slice_align`
value to 2 for Bayer conversions.
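
The effect of the alignment on the slicing can be sketched as follows. This is a simplified model written for illustration, not the libswscale code: the function names `model_slice_height()` and `model_job_rows()` are hypothetical, and the assumption is that each job gets `ceil(rows / nb_jobs)` rows, rounded up to a multiple of the alignment.

```c
#include <assert.h>

/* Simplified model (an assumption, not the libswscale implementation):
 * height of each slice when `rows` rows are split across `nb_jobs`
 * jobs and rounded up to a multiple of `align`. */
int model_slice_height(int rows, int nb_jobs, int align)
{
    assert(rows > 0 && nb_jobs > 0 && align > 0);

    int h = (rows + nb_jobs - 1) / nb_jobs;   /* ceil(rows / nb_jobs) */
    if (h < 1)
        h = 1;
    return (h + align - 1) / align * align;   /* round up to align */
}

/* Number of rows that job `jobnr` receives in this model
 * (0 if the job starts past the end of the image). */
int model_job_rows(int rows, int nb_jobs, int align, int jobnr)
{
    int h     = model_slice_height(rows, nb_jobs, align);
    int start = jobnr * h;
    int end   = start + h;

    if (end > rows)
        end = rows;
    return end > start ? end - start : 0;
}
```

For the 8-row image and 16 jobs from the report, this model gives eight 1-row slices with an alignment of 1 and four 2-row slices with an alignment of 2; the remaining jobs get no rows.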

```
$ ./ffmpeg -y -f rawvideo -pixel_format bayer_bggr8 -s 8x8 \
    -i image.raw -f rawvideo -pix_fmt rgb24 \
    -video_size 8x8 image.raw.rgb
...
frame=    1 fps=0.0 q=-0.0 Lsize=       0kB time=00:00:00.00 bitrate=N/A speed=   0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
$ xxd image.raw.rgb
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000a0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000b0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
```

We can still see the slicing at work, though: the demosaicing does not
carry across slices (workers). Compare with forcing a single worker:

```
$ ./ffmpeg -y -filter_threads 1 -f rawvideo -pixel_format bayer_bggr8 \
    -s 8x8 -i image.raw -f rawvideo -pix_fmt rgb24 \
    -video_size 8x8 image.raw.alt.rgb
...
frame=    1 fps=0.0 q=-0.0 Lsize=       0kB time=00:00:00.00
bitrate=N/A speed=   0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 0.000000%
$ xxd image.raw.rgb
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
$ xxd /tmp/image.raw.alt.rgb
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 7f00 3f7f 0000 7f00 3f7f 0000 0000 0000  ..?.....?.......
00000060: ffff ffff ffff 7fbf ff7f ffff 7fbf ff7f  ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000a0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000b0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
```
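
Why the two outputs differ at the black/white boundary can be illustrated with a toy filter. This is NOT the libswscale demosaicing kernel; it is a hypothetical vertical average (`toy_filter_slice()` and `toy_filter()` are made-up names), under the assumption that interpolation reads neighboring rows and that each worker only sees the rows of its own slice.

```c
/* Toy vertical filter (not the libswscale kernel): each output row is
 * the average of the rows above and below it, with the neighbors
 * clamped to the slice [start, end). */
void toy_filter_slice(const unsigned char *in, unsigned char *out,
                      int start, int end)
{
    for (int y = start; y < end; y++) {
        int up   = y - 1 < start ? start   : y - 1;
        int down = y + 1 >= end  ? end - 1 : y + 1;
        out[y] = (unsigned char)((in[up] + in[down]) / 2);
    }
}

/* Filters `rows` rows as independent slices of `slice_h` rows each.
 * With slice_h == rows this behaves like a single worker. */
void toy_filter(const unsigned char *in, unsigned char *out,
                int rows, int slice_h)
{
    for (int start = 0; start < rows; start += slice_h) {
        int end = start + slice_h < rows ? start + slice_h : rows;
        toy_filter_slice(in, out, start, end);
    }
}
```

On a column of four 0x00 rows followed by four 0xff rows, the single-slice call (`slice_h = 8`) blends rows 3 and 4 to 0x7f, while 2-row slices leave them at 0x00 and 0xff, because no slice straddles the boundary. That is the same kind of difference seen between the two dumps.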
---
 libswscale/swscale_unscaled.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Anton Khirnov Sept. 28, 2022, 3:09 p.m. UTC | #1
Quoting Chema Gonzalez (2022-09-26 18:11:22)
> diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
> index 8838cc8b53..9af2e7ecc3 100644
> --- a/libswscale/swscale_unscaled.c
> +++ b/libswscale/swscale_unscaled.c
> @@ -2095,6 +2095,7 @@ void ff_get_unscaled_swscale(SwsContext *c)
>          c->convert_unscaled = rgbToPlanarRgbWrapper;
>  
>      if (isBayer(srcFormat)) {
> +        c->dst_slice_align = 2;

IMO it's better to put this next to the line that sets dst_slice_align
for non-bayer cases, makes it clearer what the final value is.
Chema Gonzalez Sept. 28, 2022, 4:20 p.m. UTC | #2
Hi,

On Wed, Sep 28, 2022 at 8:09 AM Anton Khirnov <anton@khirnov.net> wrote:
> >      if (isBayer(srcFormat)) {
> > +        c->dst_slice_align = 2;
>
> IMO it's better to put this next to the line that sets dst_slice_align
> for non-bayer cases, makes it clearer what the final value is.
Are you suggesting setting `dst_slice_align` in a different function?

The way I read `ff_get_unscaled_swscale()` is that it goes through the
quirks of all the different conversions (per source and destination
type). In all cases, it sets the `convert_unscaled` function pointer.
In the cases where there is the need to align (yuv2bgr and
yuv410p_to_yuv[a]420p), it also adds `dst_slice_align`. In the same
fashion, the conversions that affect Bayer sources are set in line
2097.

Thanks,
-Chema
Anton Khirnov Sept. 30, 2022, 2:15 p.m. UTC | #3
Quoting Chema Gonzalez (2022-09-28 18:20:22)
> Hi,
> 
> On Wed, Sep 28, 2022 at 8:09 AM Anton Khirnov <anton@khirnov.net> wrote:
> > >      if (isBayer(srcFormat)) {
> > > +        c->dst_slice_align = 2;
> >
> > IMO it's better to put this next to the line that sets dst_slice_align
> > for non-bayer cases, makes it clearer what the final value is.
> Are you suggesting setting `dst_slice_align` in a different function?
> 
> The way I read `ff_get_unscaled_swscale()` is that it goes through the
> quirks of all the different conversions (per source and destination
> type). In all cases, it sets the `convert_unscaled` function pointer.
> In the cases where there is the need to align (yuv2bgr and
> yuv410p_to_yuv[a]420p), it also adds `dst_slice_align`. In the same
> fashion, the conversions that affect Bayer sources are set in line
> 2097.

I suppose it depends on whether you consider the required alignment a
fundamental property of the pixel format or a specific property of the
chosen conversion kernel. My first hunch would be the former, but I
guess your argument is valid as well.

Does anybody else have an opinion? If not, I can push your patch as is.
Anton Khirnov Oct. 13, 2022, 3:07 p.m. UTC | #4
Will push.
Patch

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 8838cc8b53..9af2e7ecc3 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -2095,6 +2095,7 @@  void ff_get_unscaled_swscale(SwsContext *c)
         c->convert_unscaled = rgbToPlanarRgbWrapper;
 
     if (isBayer(srcFormat)) {
+        c->dst_slice_align = 2;
         if (dstFormat == AV_PIX_FMT_RGB24)
             c->convert_unscaled = bayer_to_rgb24_wrapper;
         else if (dstFormat == AV_PIX_FMT_RGB48)