Message ID | 20220104052018.9541-1-jdek@itanimul.li |
---|---|
State | New |
Series | [FFmpeg-devel,v3,1/6] lavc/arm: dont assign hevc_qpel functions for non-multiple of 8 widths |
Context | Check | Description |
---|---|---|
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
andriy/make_ppc | success | Make finished |
andriy/make_fate_ppc | fail | Make fate failed |
J. Dekker:
> The assembly is written assuming that the width is a multiple of 8.
>
> However the real issue is the functions were erroneously assigned to
> the 2, 4, 6 & 12 widths. This behaviour never broke the decoder as
> samples which trigger the functions for these widths have not been found
> in the wild. This relies on the mappings in ff_hevc_pel_weight[].
>
> Signed-off-by: J. Dekker <jdek@itanimul.li>
> ---
>  libavcodec/arm/hevcdsp_init_neon.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
>  This set has already been reviewed by Martin, sending to list for
>  transparency.
>
> diff --git a/libavcodec/arm/hevcdsp_init_neon.c b/libavcodec/arm/hevcdsp_init_neon.c
> index 201a088dac..112edb5edd 100644
> --- a/libavcodec/arm/hevcdsp_init_neon.c
> +++ b/libavcodec/arm/hevcdsp_init_neon.c
> @@ -270,7 +270,8 @@ av_cold void ff_hevc_dsp_init_neon(HEVCDSPContext *c, const int bit_depth)
>          put_hevc_qpel_uw_neon[3][1] = ff_hevc_put_qpel_uw_h1v3_neon_8;
>          put_hevc_qpel_uw_neon[3][2] = ff_hevc_put_qpel_uw_h2v3_neon_8;
>          put_hevc_qpel_uw_neon[3][3] = ff_hevc_put_qpel_uw_h3v3_neon_8;
> -        for (x = 0; x < 10; x++) {
> +        for (x = 3; x < 10; x++) {
> +            if (x == 4) continue;
>              c->put_hevc_qpel[x][1][0] = ff_hevc_put_qpel_neon_wrapper;
>              c->put_hevc_qpel[x][0][1] = ff_hevc_put_qpel_neon_wrapper;
>              c->put_hevc_qpel[x][1][1] = ff_hevc_put_qpel_neon_wrapper;
>

This patchset led to regressions; see e.g.
http://fate.ffmpeg.org/report.cgi?time=20220104162724&slot=aarch64-linux-qemu-ubuntu-gcc-4.8

- Andreas
On Wed, 5 Jan 2022, Andreas Rheinhardt wrote:

> J. Dekker:
>> The assembly is written assuming that the width is a multiple of 8.
>>
>> However the real issue is the functions were erroneously assigned to
>> the 2, 4, 6 & 12 widths. This behaviour never broke the decoder as
>> samples which trigger the functions for these widths have not been found
>> in the wild. This relies on the mappings in ff_hevc_pel_weight[].
>>
>> Signed-off-by: J. Dekker <jdek@itanimul.li>
>> ---
>>  libavcodec/arm/hevcdsp_init_neon.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>>  This set has already been reviewed by Martin, sending to list for
>>  transparency.
>>
>> diff --git a/libavcodec/arm/hevcdsp_init_neon.c b/libavcodec/arm/hevcdsp_init_neon.c
>> index 201a088dac..112edb5edd 100644
>> --- a/libavcodec/arm/hevcdsp_init_neon.c
>> +++ b/libavcodec/arm/hevcdsp_init_neon.c
>> @@ -270,7 +270,8 @@ av_cold void ff_hevc_dsp_init_neon(HEVCDSPContext *c, const int bit_depth)
>>          put_hevc_qpel_uw_neon[3][1] = ff_hevc_put_qpel_uw_h1v3_neon_8;
>>          put_hevc_qpel_uw_neon[3][2] = ff_hevc_put_qpel_uw_h2v3_neon_8;
>>          put_hevc_qpel_uw_neon[3][3] = ff_hevc_put_qpel_uw_h3v3_neon_8;
>> -        for (x = 0; x < 10; x++) {
>> +        for (x = 3; x < 10; x++) {
>> +            if (x == 4) continue;
>>              c->put_hevc_qpel[x][1][0] = ff_hevc_put_qpel_neon_wrapper;
>>              c->put_hevc_qpel[x][0][1] = ff_hevc_put_qpel_neon_wrapper;
>>              c->put_hevc_qpel[x][1][1] = ff_hevc_put_qpel_neon_wrapper;
>>
>
> This patchset led to regressions; see e.g.
> http://fate.ffmpeg.org/report.cgi?time=20220104162724&slot=aarch64-linux-qemu-ubuntu-gcc-4.8

Indeed. I had only run fate-checkasm while reviewing it, assuming that it had been fully tested with fate-hevc by the patch author.

Instead of reverting the full 6 patch set, it's enough to revert a couple of patches out of it though (there's some cosmetic cleanup that we can keep in for now).

But I'm afraid we should disable the preexisting ff_hevc_sao_band_filter_8x8_8_neon function too. It's currently only run for the [0] case, which I think corresponds to width <= 8. But if that case also must handle widths that aren't an even multiple of 8, we'd have a lingering bug that isn't exercised by fate-hevc.

// Martin
diff --git a/libavcodec/arm/hevcdsp_init_neon.c b/libavcodec/arm/hevcdsp_init_neon.c
index 201a088dac..112edb5edd 100644
--- a/libavcodec/arm/hevcdsp_init_neon.c
+++ b/libavcodec/arm/hevcdsp_init_neon.c
@@ -270,7 +270,8 @@ av_cold void ff_hevc_dsp_init_neon(HEVCDSPContext *c, const int bit_depth)
         put_hevc_qpel_uw_neon[3][1] = ff_hevc_put_qpel_uw_h1v3_neon_8;
         put_hevc_qpel_uw_neon[3][2] = ff_hevc_put_qpel_uw_h2v3_neon_8;
         put_hevc_qpel_uw_neon[3][3] = ff_hevc_put_qpel_uw_h3v3_neon_8;
-        for (x = 0; x < 10; x++) {
+        for (x = 3; x < 10; x++) {
+            if (x == 4) continue;
             c->put_hevc_qpel[x][1][0] = ff_hevc_put_qpel_neon_wrapper;
             c->put_hevc_qpel[x][0][1] = ff_hevc_put_qpel_neon_wrapper;
             c->put_hevc_qpel[x][1][1] = ff_hevc_put_qpel_neon_wrapper;
The assembly is written assuming that the width is a multiple of 8.

However the real issue is the functions were erroneously assigned to
the 2, 4, 6 & 12 widths. This behaviour never broke the decoder as
samples which trigger the functions for these widths have not been found
in the wild. This relies on the mappings in ff_hevc_pel_weight[].

Signed-off-by: J. Dekker <jdek@itanimul.li>
---
 libavcodec/arm/hevcdsp_init_neon.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

 This set has already been reviewed by Martin, sending to list for
 transparency.
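For readers following the index arithmetic, here is a minimal sketch of how the patched loop bounds line up with block widths. It assumes the width-to-index table matches what ff_hevc_pel_weight[] contains in the FFmpeg tree (reproduced from memory here, so verify it against the source). The names pel_weight, qpel_index_for_width, neon_qpel_assigned and width_uses_neon_qpel are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed copy of the width -> index mapping used by the HEVC DSP tables.
 * The values are believed to match ff_hevc_pel_weight[], but check the tree. */
static const uint8_t pel_weight[65] = {
    [2]  = 0, [4]  = 1, [6]  = 2, [8]  = 3, [12] = 4,
    [16] = 5, [24] = 6, [32] = 7, [48] = 8, [64] = 9
};

/* Hypothetical helper: look up the table index for a block width. */
static int qpel_index_for_width(int width)
{
    assert(width >= 0 && width <= 64);
    return pel_weight[width];
}

/* Mirrors the patched loop: for (x = 3; x < 10; x++) { if (x == 4) continue; ... } */
static int neon_qpel_assigned(int idx)
{
    return idx >= 3 && idx < 10 && idx != 4;
}

/* Nonzero if a block of this width would get the NEON qpel wrapper. */
static int width_uses_neon_qpel(int width)
{
    return neon_qpel_assigned(qpel_index_for_width(width));
}
```

Under that assumed table, indices 0, 1, 2 and 4 correspond to widths 2, 4, 6 and 12, which are exactly the widths the commit message says should not receive the NEON functions. The remaining indices (8, 16, 24, 32, 48, 64) are all multiples of 8.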