diff mbox series

[FFmpeg-devel] avcodec/mpegvideo: remove redundant workaround to recalculate last nonzero coefficient

Message ID 20240821232455.238469-2-ramiro.polla@gmail.com
State New
Headers show
Series [FFmpeg-devel] avcodec/mpegvideo: remove redundant workaround to recalculate last nonzero coefficient | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Ramiro Polla Aug. 21, 2024, 11:24 p.m. UTC
The x86 optimized dct_quantize only calculates the last nonzero
coefficient correctly if the zigzag scan order is used. For the
alternate scan order, this value is incorrect.

To work around this, the dct_unquantize functions process the entire
block if the alternate scan order is used.

But a second workaround (bb198e198ab) was added that recalculates the
last nonzero coefficient after dct_quantize is called if the alternate
scan order is used.

This commit removes the first workaround, which became redundant.
---
 libavcodec/mpegvideo.c     | 9 +++------
 libavcodec/x86/mpegvideo.c | 6 ++----
 2 files changed, 5 insertions(+), 10 deletions(-)

Comments

Ramiro Polla Aug. 28, 2024, 12:14 p.m. UTC | #1
On Thu, Aug 22, 2024 at 1:24 AM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> The x86 optimized dct_quantize only calculates the last nonzero
> coefficient correctly if the zigzag scan order is used. For the
> alternate scan order, this value is incorrect.
>
> To work around this, the dct_unquantize functions process the entire
> block if the alternate scan order is used.
>
> But a second workaround (bb198e198ab) was added that recalculates the
> last nonzero coefficient after dct_quantize is called if the alternate
> scan order is used.
>
> This commit removes the first workaround, which became redundant.
> ---
>  libavcodec/mpegvideo.c     | 9 +++------
>  libavcodec/x86/mpegvideo.c | 6 ++----
>  2 files changed, 5 insertions(+), 10 deletions(-)

Michael, could you please check if my analysis and the changes are correct?

Thanks,
Ramiro
Michael Niedermayer Aug. 29, 2024, 7:13 p.m. UTC | #2
On Wed, Aug 28, 2024 at 02:14:33PM +0200, Ramiro Polla wrote:
> On Thu, Aug 22, 2024 at 1:24 AM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> > The x86 optimized dct_quantize only calculates the last nonzero
> > coefficient correctly if the zigzag scan order is used. For the
> > alternate scan order, this value is incorrect.
> >
> > To work around this, the dct_unquantize functions process the entire
> > block if the alternate scan order is used.
> >
> > But a second workaround (bb198e198ab) was added that recalculates the
> > last nonzero coefficient after dct_quantize is called if the alternate
> > scan order is used.
> >
> > This commit removes the first workaround, which became redundant.
> > ---
> >  libavcodec/mpegvideo.c     | 9 +++------
> >  libavcodec/x86/mpegvideo.c | 6 ++----
> >  2 files changed, 5 insertions(+), 10 deletions(-)
> 
> Michael, could you please check if my analysis and the changes are correct?

If you tested it and it works it has to be correct.
This would result in significant errors if it was wrong

thx

[...]
Ramiro Polla Sept. 3, 2024, 3:19 p.m. UTC | #3
On Thu, Aug 29, 2024 at 9:13 PM Michael Niedermayer
<michael@niedermayer.cc> wrote:
> On Wed, Aug 28, 2024 at 02:14:33PM +0200, Ramiro Polla wrote:
> > On Thu, Aug 22, 2024 at 1:24 AM Ramiro Polla <ramiro.polla@gmail.com> wrote:
> > > The x86 optimized dct_quantize only calculates the last nonzero
> > > coefficient correctly if the zigzag scan order is used. For the
> > > alternate scan order, this value is incorrect.
> > >
> > > To work around this, the dct_unquantize functions process the entire
> > > block if the alternate scan order is used.
> > >
> > > But a second workaround (bb198e198ab) was added that recalculates the
> > > last nonzero coefficient after dct_quantize is called if the alternate
> > > scan order is used.
> > >
> > > This commit removes the first workaround, which became redundant.
> > > ---
> > >  libavcodec/mpegvideo.c     | 9 +++------
> > >  libavcodec/x86/mpegvideo.c | 6 ++----
> > >  2 files changed, 5 insertions(+), 10 deletions(-)
> >
> > Michael, could you please check if my analysis and the changes are correct?
>
> If you tested it and it works it has to be correct.
> This would result in significant errors if it was wrong

Thanks. I had manually tested by removing or leaving in place the
first and/or second workarounds, and checking that it failed without
both workarounds and worked with either workarounds. Fate passed, but
fate doesn't test alternate scan order. That seemed good enough, so I
pushed it.

Ramiro
diff mbox series

Patch

diff --git a/libavcodec/mpegvideo.c b/libavcodec/mpegvideo.c
index b9a0469335..01e310e483 100644
--- a/libavcodec/mpegvideo.c
+++ b/libavcodec/mpegvideo.c
@@ -110,8 +110,7 @@  static void dct_unquantize_mpeg2_intra_c(MpegEncContext *s,
     if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale];
     else                 qscale <<= 1;
 
-    if(s->alternate_scan) nCoeffs= 63;
-    else nCoeffs= s->block_last_index[n];
+    nCoeffs= s->block_last_index[n];
 
     block[0] *= n < 4 ? s->y_dc_scale : s->c_dc_scale;
     quant_matrix = s->intra_matrix;
@@ -141,8 +140,7 @@  static void dct_unquantize_mpeg2_intra_bitexact(MpegEncContext *s,
     if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale];
     else                 qscale <<= 1;
 
-    if(s->alternate_scan) nCoeffs= 63;
-    else nCoeffs= s->block_last_index[n];
+    nCoeffs= s->block_last_index[n];
 
     block[0] *= n < 4 ? s->y_dc_scale : s->c_dc_scale;
     sum += block[0];
@@ -175,8 +173,7 @@  static void dct_unquantize_mpeg2_inter_c(MpegEncContext *s,
     if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale];
     else                 qscale <<= 1;
 
-    if(s->alternate_scan) nCoeffs= 63;
-    else nCoeffs= s->block_last_index[n];
+    nCoeffs= s->block_last_index[n];
 
     quant_matrix = s->inter_matrix;
     for(i=0; i<=nCoeffs; i++) {
diff --git a/libavcodec/x86/mpegvideo.c b/libavcodec/x86/mpegvideo.c
index 73967cafda..9878607a81 100644
--- a/libavcodec/x86/mpegvideo.c
+++ b/libavcodec/x86/mpegvideo.c
@@ -312,8 +312,7 @@  static void dct_unquantize_mpeg2_intra_mmx(MpegEncContext *s,
     if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale];
     else                 qscale <<= 1;
 
-    if(s->alternate_scan) nCoeffs= 63; //FIXME
-    else nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ];
+    nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ];
 
     if (n < 4)
         block0 = block[0] * s->y_dc_scale;
@@ -380,8 +379,7 @@  static void dct_unquantize_mpeg2_inter_mmx(MpegEncContext *s,
     if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale];
     else                 qscale <<= 1;
 
-    if(s->alternate_scan) nCoeffs= 63; //FIXME
-    else nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ];
+    nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ];
 
         quant_matrix = s->inter_matrix;
 __asm__ volatile(