Message ID | 20220926161122.1352372-1-chemag@gmail.com |
---|---|
State | Accepted |
Commit | bf64a75c5ae58ed575303f70b2ab9b2208ded339 |
Series | [FFmpeg-devel,1/1] libswscale: force a minimum size of the slide for bayer sources |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
Quoting Chema Gonzalez (2022-09-26 18:11:22)
> diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
> index 8838cc8b53..9af2e7ecc3 100644
> --- a/libswscale/swscale_unscaled.c
> +++ b/libswscale/swscale_unscaled.c
> @@ -2095,6 +2095,7 @@ void ff_get_unscaled_swscale(SwsContext *c)
>          c->convert_unscaled = rgbToPlanarRgbWrapper;
>
>      if (isBayer(srcFormat)) {
> +        c->dst_slice_align = 2;

IMO it's better to put this next to the line that sets dst_slice_align
for non-bayer cases, makes it clearer what the final value is.
Hi,

On Wed, Sep 28, 2022 at 8:09 AM Anton Khirnov <anton@khirnov.net> wrote:
> >      if (isBayer(srcFormat)) {
> > +        c->dst_slice_align = 2;
>
> IMO it's better to put this next to the line that sets dst_slice_align
> for non-bayer cases, makes it clearer what the final value is.

Are you suggesting setting `dst_slice_align` in a different function?

The way I read `ff_get_unscaled_swscale()` is that it goes through the
quirks of all the different conversions (per source and destination
type). In all cases, it sets the `convert_unscaled` function pointer.
In the cases where there is the need to align (yuv2bgr and
yuv410p_to_yuv[a]420p), it also adds `dst_slice_align`. In the same
fashion, the conversions that affect Bayer sources are set in line
2097.

Thanks,
-Chema
Quoting Chema Gonzalez (2022-09-28 18:20:22)
> Hi,
>
> On Wed, Sep 28, 2022 at 8:09 AM Anton Khirnov <anton@khirnov.net> wrote:
> > >      if (isBayer(srcFormat)) {
> > > +        c->dst_slice_align = 2;
> >
> > IMO it's better to put this next to the line that sets dst_slice_align
> > for non-bayer cases, makes it clearer what the final value is.
> Are you suggesting setting `dst_slice_align` in a different function?
>
> The way I read `ff_get_unscaled_swscale()` is that it goes through the
> quirks of all the different conversions (per source and destination
> type). In all cases, it sets the `convert_unscaled` function pointer.
> In the cases where there is the need to align (yuv2bgr and
> yuv410p_to_yuv[a]420p), it also adds `dst_slice_align`. In the same
> fashion, the conversions that affect Bayer sources are set in line
> 2097.

I suppose it depends on whether you consider the required alignment a
fundamental property of the pixel format or a specific property of the
chosen conversion kernel. My first hunch would be the former, but I
guess your argument is valid as well.

Does anybody else have an opinion? If not, I can push your patch as is.
Will push.
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 8838cc8b53..9af2e7ecc3 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -2095,6 +2095,7 @@ void ff_get_unscaled_swscale(SwsContext *c)
         c->convert_unscaled = rgbToPlanarRgbWrapper;
 
     if (isBayer(srcFormat)) {
+        c->dst_slice_align = 2;
         if (dstFormat == AV_PIX_FMT_RGB24)
             c->convert_unscaled = bayer_to_rgb24_wrapper;
         else if (dstFormat == AV_PIX_FMT_RGB48)
Bayer sources are read in groups of 2 lines (e.g. for a BGGR
flavor, the first row contains only B and G samples, while the
second row contains only G and R samples). They need to be read
as a whole.

Tested:

```
$ echo -ne '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff' > image.raw
$ xxd image.raw
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000030: ffff ffff ffff ffff ffff ffff ffff ffff  ................
```

And then:

```
$ ./ffmpeg -y -f rawvideo -pixel_format bayer_bggr8 -s 8x8 \
    -i image.raw -f rawvideo -pix_fmt rgb24 \
    -video_size 8x8 image.raw.rgb
...
Assertion srcSliceH > 1 failed at libswscale/swscale_unscaled.c:1310
Aborted (core dumped)
```

We can see that the issue relates to the ffmpeg parallelization.

```
$ ffmpeg -y -filter_threads 1 -f rawvideo -pixel_format bayer_bggr8 \
    -s 8x8 -i image.raw -f rawvideo -pix_fmt rgb24 \
    -video_size 8x8 image.raw.rgb
...
frame= 1 fps=0.0 q=-0.0 Lsize= 0kB time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
$ xxd image.raw.rgb
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 7f00 3f7f 0000 7f00 3f7f 0000 0000 0000  ..?.....?.......
00000060: ffff ffff ffff 7fbf ff7f ffff 7fbf ff7f  ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000a0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000b0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
```

The problem seems to be that `ff_sws_slice_worker()`
[libswscale/swscale.c:1222] tries to slice the input to parallelize
the scaling task, in my case into 16 different jobs (gdb'ing the
process shows `nb_threads == nb_jobs == 16`). The 8x8 input is
therefore divided into eight 8x1 slices (1 pixel high), which
eventually trips the `srcSliceH > 1` assertion in
`bayer_to_rgb24_wrapper()`.

The problem is the same in the 3 Bayer conversion functions
(`bayer_to_rgb24_wrapper()`, `bayer_to_rgb48_wrapper()`, and
`bayer_to_yv12_wrapper()`).

The solution was suggested by Anton Khirnov. We set the
`dst_slice_align` value to 2 for Bayer conversions.

```
$ ./ffmpeg -y -f rawvideo -pixel_format bayer_bggr8 -s 8x8 \
    -i image.raw -f rawvideo -pix_fmt rgb24 \
    -video_size 8x8 image.raw.rgb
...
frame= 1 fps=0.0 q=-0.0 Lsize= 0kB time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
$ xxd image.raw.rgb
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000a0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000b0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
```

We can see the slicing at work, though: the demosaicing does not
carry across slice (worker) boundaries. Compare to forcing a single
worker:

```
$ ./ffmpeg -y -filter_threads 1 -f rawvideo -pixel_format bayer_bggr8 \
    -s 8x8 -i image.raw -f rawvideo -pix_fmt rgb24 \
    -video_size 8x8 image.raw.alt.rgb
...
frame= 1 fps=0.0 q=-0.0 Lsize= 0kB time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
$ xxd image.raw.rgb
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
$ xxd /tmp/image.raw.alt.rgb
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 7f00 3f7f 0000 7f00 3f7f 0000 0000 0000  ..?.....?.......
00000060: ffff ffff ffff 7fbf ff7f ffff 7fbf ff7f  ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000080: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000090: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000a0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
000000b0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
```

---
 libswscale/swscale_unscaled.c | 1 +
 1 file changed, 1 insertion(+)
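
The slicing arithmetic described in the commit message can be modeled with a small standalone sketch. This is not the libswscale code: `slice_height()`, `busy_jobs()`, `print_slices()` and `ALIGN_UP` are hypothetical names used only for illustration, and the formula (rows divided evenly among jobs, at least one row each, rounded up to the alignment) is an assumption about how a worker might pick its slice. Under that assumption, an 8-row frame split across 16 jobs yields 1-row slices with an alignment of 1 and 2-row slices with an alignment of 2.

```c
#include <stdio.h>

/* Round x up to the next multiple of a (a must be a power of two).
 * Hypothetical helper, not an FFmpeg macro. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Rows handed to each job: ceil(height / nb_jobs), at least 1,
 * then rounded up to the required alignment (assumed model). */
static int slice_height(int height, int nb_jobs, int align)
{
    int h = (height + nb_jobs - 1) / nb_jobs;
    if (h < 1)
        h = 1;
    return ALIGN_UP(h, align);
}

/* Number of jobs that end up with at least one row to process. */
static int busy_jobs(int height, int nb_jobs, int align)
{
    int h = slice_height(height, nb_jobs, align);
    int n = (height + h - 1) / h;
    return n < nb_jobs ? n : nb_jobs;
}

/* Print the half-open row range [start, end) of every non-empty job. */
static void print_slices(int height, int nb_jobs, int align)
{
    int h = slice_height(height, nb_jobs, align);
    for (int job = 0; job < nb_jobs; job++) {
        int start = job * h;
        int end   = start + h < height ? start + h : height;
        if (start >= height)
            break; /* remaining jobs have nothing to do */
        printf("job %2d: rows [%d, %d)\n", job, start, end);
    }
}
```

Calling `print_slices(8, 16, 1)` lists eight one-row slices, the case that trips the `srcSliceH > 1` assertion, while `print_slices(8, 16, 2)` lists four two-row slices, so every slice covers a complete pair of Bayer rows.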