Message ID | 20240821232455.238469-2-ramiro.polla@gmail.com |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel] avcodec/mpegvideo: remove redundant workaround to recalculate last nonzero coefficient | expand |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On Thu, Aug 22, 2024 at 1:24 AM Ramiro Polla <ramiro.polla@gmail.com> wrote: > The x86 optimized dct_quantize only calculates the last nonzero > coefficient correctly if the zigzag scan order is used. For the > alternate scan order, this value is incorrect. > > To work around this, the dct_unquantize functions process the entire > block if the alternate scan order is used. > > But a second workaround (bb198e198ab) was added that recalculates the > last nonzero coefficient after dct_quantize is called if the alternate > scan order is used. > > This commit removes the first workaround, which became redundant. > --- > libavcodec/mpegvideo.c | 9 +++------ > libavcodec/x86/mpegvideo.c | 6 ++---- > 2 files changed, 5 insertions(+), 10 deletions(-) Michael, could you please check if my analysis and the changes are correct? Thanks, Ramiro
On Wed, Aug 28, 2024 at 02:14:33PM +0200, Ramiro Polla wrote: > On Thu, Aug 22, 2024 at 1:24 AM Ramiro Polla <ramiro.polla@gmail.com> wrote: > > The x86 optimized dct_quantize only calculates the last nonzero > > coefficient correctly if the zigzag scan order is used. For the > > alternate scan order, this value is incorrect. > > > > To work around this, the dct_unquantize functions process the entire > > block if the alternate scan order is used. > > > > But a second workaround (bb198e198ab) was added that recalculates the > > last nonzero coefficient after dct_quantize is called if the alternate > > scan order is used. > > > > This commit removes the first workaround, which became redundant. > > --- > > libavcodec/mpegvideo.c | 9 +++------ > > libavcodec/x86/mpegvideo.c | 6 ++---- > > 2 files changed, 5 insertions(+), 10 deletions(-) > > Michael, could you please check if my analysis and the changes are correct? If you tested it and it works it has to be correct. This would result in significant errors if it was wrong thx [...]
On Thu, Aug 29, 2024 at 9:13 PM Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Aug 28, 2024 at 02:14:33PM +0200, Ramiro Polla wrote: > > On Thu, Aug 22, 2024 at 1:24 AM Ramiro Polla <ramiro.polla@gmail.com> wrote: > > > The x86 optimized dct_quantize only calculates the last nonzero > > > coefficient correctly if the zigzag scan order is used. For the > > > alternate scan order, this value is incorrect. > > > > > > To work around this, the dct_unquantize functions process the entire > > > block if the alternate scan order is used. > > > > > > But a second workaround (bb198e198ab) was added that recalculates the > > > last nonzero coefficient after dct_quantize is called if the alternate > > > scan order is used. > > > > > > This commit removes the first workaround, which became redundant. > > > --- > > > libavcodec/mpegvideo.c | 9 +++------ > > > libavcodec/x86/mpegvideo.c | 6 ++---- > > > 2 files changed, 5 insertions(+), 10 deletions(-) > > > > Michael, could you please check if my analysis and the changes are correct? > > If you tested it and it works it has to be correct. > This would result in significant errors if it was wrong Thanks. I had manually tested by removing or leaving in place the first and/or second workarounds, and checking that it failed without both workarounds and worked with either workarounds. Fate passed, but fate doesn't test alternate scan order. That seemed good enough, so I pushed it. Ramiro
diff --git a/libavcodec/mpegvideo.c b/libavcodec/mpegvideo.c index b9a0469335..01e310e483 100644 --- a/libavcodec/mpegvideo.c +++ b/libavcodec/mpegvideo.c @@ -110,8 +110,7 @@ static void dct_unquantize_mpeg2_intra_c(MpegEncContext *s, if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale]; else qscale <<= 1; - if(s->alternate_scan) nCoeffs= 63; - else nCoeffs= s->block_last_index[n]; + nCoeffs= s->block_last_index[n]; block[0] *= n < 4 ? s->y_dc_scale : s->c_dc_scale; quant_matrix = s->intra_matrix; @@ -141,8 +140,7 @@ static void dct_unquantize_mpeg2_intra_bitexact(MpegEncContext *s, if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale]; else qscale <<= 1; - if(s->alternate_scan) nCoeffs= 63; - else nCoeffs= s->block_last_index[n]; + nCoeffs= s->block_last_index[n]; block[0] *= n < 4 ? s->y_dc_scale : s->c_dc_scale; sum += block[0]; @@ -175,8 +173,7 @@ static void dct_unquantize_mpeg2_inter_c(MpegEncContext *s, if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale]; else qscale <<= 1; - if(s->alternate_scan) nCoeffs= 63; - else nCoeffs= s->block_last_index[n]; + nCoeffs= s->block_last_index[n]; quant_matrix = s->inter_matrix; for(i=0; i<=nCoeffs; i++) { diff --git a/libavcodec/x86/mpegvideo.c b/libavcodec/x86/mpegvideo.c index 73967cafda..9878607a81 100644 --- a/libavcodec/x86/mpegvideo.c +++ b/libavcodec/x86/mpegvideo.c @@ -312,8 +312,7 @@ static void dct_unquantize_mpeg2_intra_mmx(MpegEncContext *s, if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale]; else qscale <<= 1; - if(s->alternate_scan) nCoeffs= 63; //FIXME - else nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ]; + nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ]; if (n < 4) block0 = block[0] * s->y_dc_scale; @@ -380,8 +379,7 @@ static void dct_unquantize_mpeg2_inter_mmx(MpegEncContext *s, if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale]; else qscale <<= 1; - if(s->alternate_scan) nCoeffs= 63; //FIXME - else nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ]; + nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ]; quant_matrix = s->inter_matrix; __asm__ volatile(