Message ID | AS8P250MB07442A1CD0C5A69943DF4DA78FA1A@AS8P250MB0744.EURP250.PROD.OUTLOOK.COM |
---|---|
State | New |
Series | [FFmpeg-devel] avcodec/mpegvideo: Remove spec-incompliant inverse quantisation |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On Mon, 30 Oct 2023 at 13:10, Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote:
> Section 7.4.4 of the MPEG-2 specifications requires that the
> last bit of the last coefficient be toggled so that the sum
> of all coefficients is odd; both our decoder and encoder
> did this only if the bitexact flag has been set (although
> stuff like this should be behind AV_CODEC_FLAG2_FAST).
> This patch changes this by removing the spec-incompliant
> functions.

LGTM
On Mon, Oct 30, 2023 at 02:11:27PM +0100, Andreas Rheinhardt wrote:
> Section 7.4.4 of the MPEG-2 specifications requires that the
> last bit of the last coefficient be toggled so that the sum
> of all coefficients is odd; both our decoder and encoder
> did this only if the bitexact flag has been set (although
> stuff like this should be behind AV_CODEC_FLAG2_FAST).
> This patch changes this by removing the spec-incompliant
> functions.

This commit message should include benchmarks documenting the speed loss (of the unquantize, the IDCT and overall). It is expected that the speed of some IDCTs will be impacted negatively, as the non-zero terms will prevent the skipping of some significant code.

It should also include information about how much PSNR improves (relative to the encoder input).

Also, the change is a +-1 in one spot before the IDCT. The IDCT is not bit-exactly specified in MPEG-2, so one could think of this as a correct implementation followed by an IDCT that was sometimes +-1 off, instead of spec non-compliance.

Only after the benchmarks and PSNR are presented should we decide if this is a change we want.

thx

[...]
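The concern about IDCT shortcuts can be illustrated with a small sketch. It assumes a plain raster-order 8x8 block and a hypothetical helper `count_nonzero_rows` (not an FFmpeg function; real IDCTs may use permuted layouts and different skip tests). It only shows why a toggled last coefficient can leave one more row for a row-skipping IDCT to process.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not FFmpeg API): count how many rows of an 8x8
 * coefficient block, stored in raster order, contain at least one
 * non-zero coefficient. A row-skipping IDCT could, under this
 * assumption, bypass every row that is entirely zero.
 */
static int count_nonzero_rows(const int16_t block[64])
{
    int rows = 0;

    for (int r = 0; r < 8; r++) {
        int any = 0;

        /* OR all coefficients of the row; the result is non-zero
         * exactly when some coefficient in the row is non-zero. */
        for (int c = 0; c < 8; c++)
            any |= block[r * 8 + c];
        rows += any != 0;
    }
    return rows;
}
```

For a block whose only non-zero coefficient is an even DC value, this returns 1; once the last coefficient has been toggled to 1 to make the sum odd, it returns 2, because the final row is no longer all zero.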
Quoting Michael Niedermayer (2023-10-31 09:40:44)
> On Mon, Oct 30, 2023 at 02:11:27PM +0100, Andreas Rheinhardt wrote:
> > Section 7.4.4 of the MPEG-2 specifications requires that the
> > last bit of the last coefficient be toggled so that the sum
> > of all coefficients is odd; both our decoder and encoder
> > did this only if the bitexact flag has been set (although
> > stuff like this should be behind AV_CODEC_FLAG2_FAST).
> > This patch changes this by removing the spec-incompliant
> > functions.
>
> This commit message should include benchmarks documenting the speed loss
> (of the unquantize, the IDCT and overall)
> It is expected that the speed of some IDCTs will be impacted negatively
> as the non zero terms will prevent the skipping of some significant code
>
> as well as information about how much PSNR improves (to the encoder input)
>
> Also the change is a +-1 in one spot before the IDCT, the IDCT is not bitexactly
> specified in MPEG-2 so one could think of this as a
> correct implementation followed by a IDCT that was sometimes +-1 off
> instead of spec non compliance
>
> Only after the benchmarks and PSNR is presented should we decide if this
> is a change we want

I disagree that the burden of proof should be on Andreas here. It should be up to whoever wants to keep this code to show that it is useful.
On 2023-11-08 12:40 +0100, Anton Khirnov wrote:
> Quoting Michael Niedermayer (2023-10-31 09:40:44)
> > On Mon, Oct 30, 2023 at 02:11:27PM +0100, Andreas Rheinhardt wrote:
> > > Section 7.4.4 of the MPEG-2 specifications requires that the
> > > last bit of the last coefficient be toggled so that the sum
> > > of all coefficients is odd; both our decoder and encoder
> > > did this only if the bitexact flag has been set (although
> > > stuff like this should be behind AV_CODEC_FLAG2_FAST).
> > > This patch changes this by removing the spec-incompliant
> > > functions.
> >
> > This commit message should include benchmarks documenting the speed loss
> > (of the unquantize, the IDCT and overall)
> > It is expected that the speed of some IDCTs will be impacted negatively
> > as the non zero terms will prevent the skipping of some significant code
> >
> > as well as information about how much PSNR improves (to the encoder input)
> >
> > Also the change is a +-1 in one spot before the IDCT, the IDCT is not bitexactly
> > specified in MPEG-2 so one could think of this as a
> > correct implementation followed by a IDCT that was sometimes +-1 off
> > instead of spec non compliance
> >
> > Only after the benchmarks and PSNR is presented should we decide if this
> > is a change we want
>
> I disagree that the burden of proof should be on Andreas here. It should
> be up to whoever wants to keep this code to show that it is useful.

There was an argument presented.

That argument could be challenged, or it could otherwise be explained why it is more important to have this always behave like with bitexact.

This could lead to "OK, I think removal is better"; if not, benchmarks could lead to one or the other decision.

Saying the burden is on whoever wants to keep the code sounds like a way for arbitrary code removal. While I agree getting rid of code can be a good thing, this would definitely take it too far.

Best regards,
Alexander
On Wed, Nov 8, 2023 at 3:46 PM Alexander Strasser <eclipse7@gmx.net> wrote:
> On 2023-11-08 12:40 +0100, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2023-10-31 09:40:44)
> > > On Mon, Oct 30, 2023 at 02:11:27PM +0100, Andreas Rheinhardt wrote:
> > > > Section 7.4.4 of the MPEG-2 specifications requires that the
> > > > last bit of the last coefficient be toggled so that the sum
> > > > of all coefficients is odd; both our decoder and encoder
> > > > did this only if the bitexact flag has been set (although
> > > > stuff like this should be behind AV_CODEC_FLAG2_FAST).
> > > > This patch changes this by removing the spec-incompliant
> > > > functions.
> > >
> > > This commit message should include benchmarks documenting the speed loss
> > > (of the unquantize, the IDCT and overall)
> > > It is expected that the speed of some IDCTs will be impacted negatively
> > > as the non zero terms will prevent the skipping of some significant code
> > >
> > > as well as information about how much PSNR improves (to the encoder input)
> > >
> > > Also the change is a +-1 in one spot before the IDCT, the IDCT is not bitexactly
> > > specified in MPEG-2 so one could think of this as a
> > > correct implementation followed by a IDCT that was sometimes +-1 off
> > > instead of spec non compliance
> > >
> > > Only after the benchmarks and PSNR is presented should we decide if this
> > > is a change we want
> >
> > I disagree that the burden of proof should be on Andreas here. It should
> > be up to whoever wants to keep this code to show that it is useful.
>
> There was an argument presented.
>
> That argument could be challenged or otherwise explained why it more
> important to have this always behave like with bitexact.
>
> This could lead to "OK, I think removal is better" or if not benchmarks
> could lead to one or the other decision.
>
> Saying the burden is on whoever wants to keep the code sounds like a way
> for arbitrary code removal. While I agree getting rid of code can be a good
> thing, this would definitely take it too far.

To be fair, this is spec-noncompliant code; it shouldn't be present at all, since it produces results inconsistent with the spec and there is no device particularly needing this functionality. It's not arbitrary code removal; it's removing something that is not needed any more, since the speed impact (pro or against) is negligible on modern computers.

I'm of the opinion that presenting an argument against such a targeted and specific code removal with no supportive use case should be noted but not acted upon, until relevant proof is brought over. Yes, sadly, that burden should fall on whoever is presenting the argument.
Quoting Alexander Strasser (2023-11-08 21:55:10)
> On 2023-11-08 12:40 +0100, Anton Khirnov wrote:
> > Quoting Michael Niedermayer (2023-10-31 09:40:44)
> > > On Mon, Oct 30, 2023 at 02:11:27PM +0100, Andreas Rheinhardt wrote:
> > > > Section 7.4.4 of the MPEG-2 specifications requires that the
> > > > last bit of the last coefficient be toggled so that the sum
> > > > of all coefficients is odd; both our decoder and encoder
> > > > did this only if the bitexact flag has been set (although
> > > > stuff like this should be behind AV_CODEC_FLAG2_FAST).
> > > > This patch changes this by removing the spec-incompliant
> > > > functions.
> > >
> > > This commit message should include benchmarks documenting the speed loss
> > > (of the unquantize, the IDCT and overall)
> > > It is expected that the speed of some IDCTs will be impacted negatively
> > > as the non zero terms will prevent the skipping of some significant code
> > >
> > > as well as information about how much PSNR improves (to the encoder input)
> > >
> > > Also the change is a +-1 in one spot before the IDCT, the IDCT is not bitexactly
> > > specified in MPEG-2 so one could think of this as a
> > > correct implementation followed by a IDCT that was sometimes +-1 off
> > > instead of spec non compliance
> > >
> > > Only after the benchmarks and PSNR is presented should we decide if this
> > > is a change we want
> >
> > I disagree that the burden of proof should be on Andreas here. It should
> > be up to whoever wants to keep this code to show that it is useful.
>
> There was an argument presented.

I see no argument for why the code in question is useful; can you point to the exact text?

> That argument could be challenged or otherwise explained why it more
> important to have this always behave like with bitexact.
>
> This could lead to "OK, I think removal is better" or if not benchmarks
> could lead to one or the other decision.
>
> Saying the burden is on whoever wants to keep the code sounds like a way
> for arbitrary code removal. While I agree getting rid of code can be a good
> thing, this would definitely take it too far.

All code is a maintenance burden, therefore all code should have a reason for its presence in the codebase; otherwise it should be removed.
On 2023-11-09 11:13 +0100, Anton Khirnov wrote:
> Quoting Alexander Strasser (2023-11-08 21:55:10)
> > On 2023-11-08 12:40 +0100, Anton Khirnov wrote:
> > > Quoting Michael Niedermayer (2023-10-31 09:40:44)
> > > > On Mon, Oct 30, 2023 at 02:11:27PM +0100, Andreas Rheinhardt wrote:
> > > > > Section 7.4.4 of the MPEG-2 specifications requires that the
> > > > > last bit of the last coefficient be toggled so that the sum
> > > > > of all coefficients is odd; both our decoder and encoder
> > > > > did this only if the bitexact flag has been set (although
> > > > > stuff like this should be behind AV_CODEC_FLAG2_FAST).
> > > > > This patch changes this by removing the spec-incompliant
> > > > > functions.
> > > >
> > > > This commit message should include benchmarks documenting the speed loss
> > > > (of the unquantize, the IDCT and overall)
> > > > It is expected that the speed of some IDCTs will be impacted negatively
> > > > as the non zero terms will prevent the skipping of some significant code
> > > >
> > > > as well as information about how much PSNR improves (to the encoder input)
> > > >
> > > > Also the change is a +-1 in one spot before the IDCT, the IDCT is not bitexactly
> > > > specified in MPEG-2 so one could think of this as a
> > > > correct implementation followed by a IDCT that was sometimes +-1 off
> > > > instead of spec non compliance
> > > >
> > > > Only after the benchmarks and PSNR is presented should we decide if this
> > > > is a change we want
> > >
> > > I disagree that the burden of proof should be on Andreas here. It should
> > > be up to whoever wants to keep this code to show that it is useful.
> >
> > There was an argument presented.
>
> I see no argument for why the code in question is useful, can you point
> to the exact text?

First this:

> > > > It is expected that the speed of some IDCTs will be impacted negatively
> > > > as the non zero terms will prevent the skipping of some significant code

Second, there was an argument for compliance:

> > > > Also the change is a +-1 in one spot before the IDCT, the IDCT is not bitexactly
> > > > specified in MPEG-2 so one could think of this as a
> > > > correct implementation followed by a IDCT that was sometimes +-1 off
> > > > instead of spec non compliance

Third, there was no rejection of the change, but a request for measurement of the effect. I would expect an approval of the patch if the measurement leads to insignificant enough results.

> > That argument could be challenged or otherwise explained why it more
> > important to have this always behave like with bitexact.
> >
> > This could lead to "OK, I think removal is better" or if not benchmarks
> > could lead to one or the other decision.
> >
> > Saying the burden is on whoever wants to keep the code sounds like a way
> > for arbitrary code removal. While I agree getting rid of code can be a good
> > thing, this would definitely take it too far.
>
> All code is a maintenance burden, therefore all code should have a
> reason for its presence in the codebase, otherwise it should be removed.

I can't see how the reason for the presence of code can be ultimately defined objectively and non-arbitrarily.

Alexander
On Thursday, 9 November 2023, 22:45:35 EET, Alexander Strasser wrote:
> I can't see how the reason for the presence of code can be ultimately
> defined objectively and non-arbitrary.

Ultimately, this was discussed and decided in a meeting, which Michael attended (albeit remotely) and for which meeting notes were published.

That being the case, I don't see why Andreas should have to perform extensive testing and write extensive justification. He could have done so, and that would have been nice, but that is all.

In this situation, it is up to whoever disagrees (and presumably was not in the meeting) to provide extensive justification why the decision should be reversed. And I'm not seeing any such thing.

Also, anything to shrink the amount of MMX code Looks Good To Me.
On 2023-11-08 17:58 -0500, Vittorio Giovara wrote:
> On Wed, Nov 8, 2023 at 3:46 PM Alexander Strasser <eclipse7@gmx.net> wrote:
>
> > On 2023-11-08 12:40 +0100, Anton Khirnov wrote:
> > > Quoting Michael Niedermayer (2023-10-31 09:40:44)
> > > > On Mon, Oct 30, 2023 at 02:11:27PM +0100, Andreas Rheinhardt wrote:
> > > > > Section 7.4.4 of the MPEG-2 specifications requires that the
> > > > > last bit of the last coefficient be toggled so that the sum
> > > > > of all coefficients is odd; both our decoder and encoder
> > > > > did this only if the bitexact flag has been set (although
> > > > > stuff like this should be behind AV_CODEC_FLAG2_FAST).
> > > > > This patch changes this by removing the spec-incompliant
> > > > > functions.
> > > >
> > > > This commit message should include benchmarks documenting the speed loss
> > > > (of the unquantize, the IDCT and overall)
> > > > It is expected that the speed of some IDCTs will be impacted negatively
> > > > as the non zero terms will prevent the skipping of some significant code
> > > >
> > > > as well as information about how much PSNR improves (to the encoder input)
> > > >
> > > > Also the change is a +-1 in one spot before the IDCT, the IDCT is not bitexactly
> > > > specified in MPEG-2 so one could think of this as a
> > > > correct implementation followed by a IDCT that was sometimes +-1 off
> > > > instead of spec non compliance
> > > >
> > > > Only after the benchmarks and PSNR is presented should we decide if this
> > > > is a change we want
> > >
> > > I disagree that the burden of proof should be on Andreas here. It should
> > > be up to whoever wants to keep this code to show that it is useful.
> >
> > There was an argument presented.
> >
> > That argument could be challenged or otherwise explained why it more
> > important to have this always behave like with bitexact.
> >
> > This could lead to "OK, I think removal is better" or if not benchmarks
> > could lead to one or the other decision.
> >
> > Saying the burden is on whoever wants to keep the code sounds like a way
> > for arbitrary code removal. While I agree getting rid of code can be a good
> > thing, this would definitely take it too far.
> >
>
> To be fair, this is noncompliant spec code, it shouldn't be present at all
> since it produces inconsistent results (with the spec) and there is no
> device particularly needing this functionality. It's not arbitrary code
> removal, it's removing something that is not needed any more since the
> speed impact (pro or against) is negligible on modern computers.

Please see the reply to Anton. I hope it's clearer now what I meant.

> I'm of the opinion that presenting an argument against such a targeted and
> specific code removal with no supportive use case should be noted but not
> acted upon, until relevant proof is brought over. Yes sadly that burden
> should fall on whoever is presenting the argument.

I would disagree in general. Maybe this case is more special (targeted and specific?) than my current understanding of the matter.

IIUC the leading reason for this patch is removing the code because of spec incompliance, which I think is not so clear and could be argued both ways.

Alexander
On Thu, Nov 09, 2023 at 10:52:19PM +0200, Rémi Denis-Courmont wrote:
> On Thursday, 9 November 2023, 22:45:35 EET, Alexander Strasser wrote:
> > I can't see how the reason for the presence of code can be ultimately
> > defined objectively and non-arbitrary.
>
> Ultimately, this was discussed and decided in a meeting, which Michael
> attended (albeit remotely) and for which meeting notes were published.

Maybe I misremember, as I was a bit sick that day, but there are 2 pieces of code. There was the "fast mode code" that I thought was discussed; this is disabled by default and we agreed to remove it. The change here is about the default code path, so this is very different: it will affect users with default options. I do not remember this being discussed, but it's quite possible people had a different interpretation of what the words that were said meant.

This definitely should be tested before it's applied. Whoever the burden falls on is not my argument, but ATM I have many things to do, so I will not be able to test this in the next days.

> That being the case, I don't see why Andreas should have to perform extensive
> testing and write extensive justification. He could have done and that would
> have been nice, but that is all.

I never meant that there should be extensive testing, but the removal of an optimization from the default code path should be tested. Also, about spec non-compliance: if this was so bad, being in the default path, there would be bug reports, so I am a bit sceptical. I'm not saying it shouldn't be removed, just saying we should look before removing it.

thx

[...]
diff --git a/libavcodec/mips/mpegvideo_init_mips.c b/libavcodec/mips/mpegvideo_init_mips.c index f687ad18f1..1b383ee3f5 100644 --- a/libavcodec/mips/mpegvideo_init_mips.c +++ b/libavcodec/mips/mpegvideo_init_mips.c @@ -33,10 +33,6 @@ av_cold void ff_mpv_common_init_mips(MpegEncContext *s) s->dct_unquantize_mpeg1_intra = ff_dct_unquantize_mpeg1_intra_mmi; s->dct_unquantize_mpeg1_inter = ff_dct_unquantize_mpeg1_inter_mmi; - if (!(s->avctx->flags & AV_CODEC_FLAG_BITEXACT)) - if (!s->q_scale_type) - s->dct_unquantize_mpeg2_intra = ff_dct_unquantize_mpeg2_intra_mmi; - s->denoise_dct= ff_denoise_dct_mmi; } diff --git a/libavcodec/mips/mpegvideo_mips.h b/libavcodec/mips/mpegvideo_mips.h index 760d7b3295..0bc0e375bd 100644 --- a/libavcodec/mips/mpegvideo_mips.h +++ b/libavcodec/mips/mpegvideo_mips.h @@ -31,8 +31,6 @@ void ff_dct_unquantize_mpeg1_intra_mmi(MpegEncContext *s, int16_t *block, int n, int qscale); void ff_dct_unquantize_mpeg1_inter_mmi(MpegEncContext *s, int16_t *block, int n, int qscale); -void ff_dct_unquantize_mpeg2_intra_mmi(MpegEncContext *s, int16_t *block, - int n, int qscale); void ff_denoise_dct_mmi(MpegEncContext *s, int16_t *block); #endif /* AVCODEC_MIPS_MPEGVIDEO_MIPS_H */ diff --git a/libavcodec/mips/mpegvideo_mmi.c b/libavcodec/mips/mpegvideo_mmi.c index 3d5b5e20ab..d0bf1a3c10 100644 --- a/libavcodec/mips/mpegvideo_mmi.c +++ b/libavcodec/mips/mpegvideo_mmi.c @@ -342,99 +342,6 @@ void ff_dct_unquantize_mpeg1_inter_mmi(MpegEncContext *s, int16_t *block, ); } -void ff_dct_unquantize_mpeg2_intra_mmi(MpegEncContext *s, int16_t *block, - int n, int qscale) -{ - uint64_t nCoeffs; - const uint16_t *quant_matrix; - int block0; - double ftmp[10]; - uint64_t tmp[1]; - mips_reg addr[1]; - DECLARE_VAR_ALL64; - DECLARE_VAR_ADDRT; - - assert(s->block_last_index[n]>=0); - - if (s->alternate_scan) - nCoeffs = 63; - else - nCoeffs = s->intra_scantable.raster_end[s->block_last_index[n]]; - - if (n < 4) - block0 = block[0] * s->y_dc_scale; - else - block0 = block[0] 
* s->c_dc_scale; - - quant_matrix = s->intra_matrix; - - __asm__ volatile ( - "dli %[tmp0], 0x0f \n\t" - "pcmpeqh %[ftmp0], %[ftmp0], %[ftmp0] \n\t" - "mtc1 %[tmp0], %[ftmp3] \n\t" - "mtc1 %[qscale], %[ftmp9] \n\t" - "psrlh %[ftmp0], %[ftmp0], %[ftmp3] \n\t" - "packsswh %[ftmp9], %[ftmp9], %[ftmp9] \n\t" - "packsswh %[ftmp9], %[ftmp9], %[ftmp9] \n\t" - "or %[addr0], %[nCoeffs], $0 \n\t" - ".p2align 4 \n\t" - - "1: \n\t" - MMI_LDXC1(%[ftmp1], %[addr0], %[block], 0x00) - MMI_LDXC1(%[ftmp2], %[addr0], %[block], 0x08) - "mov.d %[ftmp3], %[ftmp1] \n\t" - "mov.d %[ftmp4], %[ftmp2] \n\t" - MMI_LDXC1(%[ftmp5], %[addr0], %[quant], 0x00) - MMI_LDXC1(%[ftmp6], %[addr0], %[quant], 0x08) - "pmullh %[ftmp5], %[ftmp5], %[ftmp9] \n\t" - "pmullh %[ftmp6], %[ftmp6], %[ftmp9] \n\t" - "pxor %[ftmp7], %[ftmp7], %[ftmp7] \n\t" - "pxor %[ftmp8], %[ftmp8], %[ftmp8] \n\t" - "pcmpgth %[ftmp7], %[ftmp7], %[ftmp1] \n\t" - "pcmpgth %[ftmp8], %[ftmp8], %[ftmp2] \n\t" - "pxor %[ftmp1], %[ftmp1], %[ftmp7] \n\t" - "pxor %[ftmp2], %[ftmp2], %[ftmp8] \n\t" - "psubh %[ftmp1], %[ftmp1], %[ftmp7] \n\t" - "psubh %[ftmp2], %[ftmp2], %[ftmp8] \n\t" - "pmullh %[ftmp1], %[ftmp1], %[ftmp5] \n\t" - "pmullh %[ftmp2], %[ftmp2], %[ftmp6] \n\t" - "pxor %[ftmp5], %[ftmp5], %[ftmp5] \n\t" - "pxor %[ftmp6], %[ftmp6], %[ftmp6] \n\t" - "pcmpeqh %[ftmp5], %[ftmp5], %[ftmp3] \n\t" - "dli %[tmp0], 0x03 \n\t" - "pcmpeqh %[ftmp6] , %[ftmp6], %[ftmp4] \n\t" - "mtc1 %[tmp0], %[ftmp3] \n\t" - "psrah %[ftmp1], %[ftmp1], %[ftmp3] \n\t" - "psrah %[ftmp2], %[ftmp2], %[ftmp3] \n\t" - "pxor %[ftmp1], %[ftmp1], %[ftmp7] \n\t" - "pxor %[ftmp2], %[ftmp2], %[ftmp8] \n\t" - "psubh %[ftmp1], %[ftmp1], %[ftmp7] \n\t" - "psubh %[ftmp2], %[ftmp2], %[ftmp8] \n\t" - "pandn %[ftmp5], %[ftmp5], %[ftmp1] \n\t" - "pandn %[ftmp6], %[ftmp6], %[ftmp2] \n\t" - MMI_SDXC1(%[ftmp5], %[addr0], %[block], 0x00) - MMI_SDXC1(%[ftmp6], %[addr0], %[block], 0x08) - PTR_ADDIU "%[addr0], %[addr0], 0x10 \n\t" - "blez %[addr0], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), 
[ftmp1]"=&f"(ftmp[1]), - [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), - [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), - [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), - [ftmp8]"=&f"(ftmp[8]), [ftmp9]"=&f"(ftmp[9]), - [tmp0]"=&r"(tmp[0]), - RESTRICT_ASM_ALL64 - RESTRICT_ASM_ADDRT - [addr0]"=&r"(addr[0]) - : [block]"r"((mips_reg)(block+nCoeffs)), - [quant]"r"((mips_reg)(quant_matrix+nCoeffs)), - [nCoeffs]"r"((mips_reg)(2*(-nCoeffs))), - [qscale]"r"(qscale) - : "memory" - ); - - block[0]= block0; -} - void ff_denoise_dct_mmi(MpegEncContext *s, int16_t *block) { const int intra = s->mb_intra; diff --git a/libavcodec/mpegvideo.c b/libavcodec/mpegvideo.c index 81796e42bb..dadb8462e1 100644 --- a/libavcodec/mpegvideo.c +++ b/libavcodec/mpegvideo.c @@ -104,36 +104,6 @@ static void dct_unquantize_mpeg2_intra_c(MpegEncContext *s, { int i, level, nCoeffs; const uint16_t *quant_matrix; - - if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale]; - else qscale <<= 1; - - if(s->alternate_scan) nCoeffs= 63; - else nCoeffs= s->block_last_index[n]; - - block[0] *= n < 4 ? 
s->y_dc_scale : s->c_dc_scale; - quant_matrix = s->intra_matrix; - for(i=1;i<=nCoeffs;i++) { - int j= s->intra_scantable.permutated[i]; - level = block[j]; - if (level) { - if (level < 0) { - level = -level; - level = (int)(level * qscale * quant_matrix[j]) >> 4; - level = -level; - } else { - level = (int)(level * qscale * quant_matrix[j]) >> 4; - } - block[j] = level; - } - } -} - -static void dct_unquantize_mpeg2_intra_bitexact(MpegEncContext *s, - int16_t *block, int n, int qscale) -{ - int i, level, nCoeffs; - const uint16_t *quant_matrix; int sum=-1; if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale]; @@ -295,8 +265,6 @@ static av_cold int dct_init(MpegEncContext *s) s->dct_unquantize_mpeg1_intra = dct_unquantize_mpeg1_intra_c; s->dct_unquantize_mpeg1_inter = dct_unquantize_mpeg1_inter_c; s->dct_unquantize_mpeg2_intra = dct_unquantize_mpeg2_intra_c; - if (s->avctx->flags & AV_CODEC_FLAG_BITEXACT) - s->dct_unquantize_mpeg2_intra = dct_unquantize_mpeg2_intra_bitexact; s->dct_unquantize_mpeg2_inter = dct_unquantize_mpeg2_inter_c; #if HAVE_INTRINSICS_NEON diff --git a/libavcodec/x86/mpegvideo.c b/libavcodec/x86/mpegvideo.c index 73967cafda..f3384dfaa5 100644 --- a/libavcodec/x86/mpegvideo.c +++ b/libavcodec/x86/mpegvideo.c @@ -23,7 +23,6 @@ #include "libavutil/cpu.h" #include "libavutil/x86/asm.h" #include "libavutil/x86/cpu.h" -#include "libavcodec/avcodec.h" #include "libavcodec/mpegvideo.h" #include "libavcodec/mpegvideodata.h" @@ -300,75 +299,6 @@ __asm__ volatile( ); } -static void dct_unquantize_mpeg2_intra_mmx(MpegEncContext *s, - int16_t *block, int n, int qscale) -{ - x86_reg nCoeffs; - const uint16_t *quant_matrix; - int block0; - - av_assert2(s->block_last_index[n]>=0); - - if (s->q_scale_type) qscale = ff_mpeg2_non_linear_qscale[qscale]; - else qscale <<= 1; - - if(s->alternate_scan) nCoeffs= 63; //FIXME - else nCoeffs= s->intra_scantable.raster_end[ s->block_last_index[n] ]; - - if (n < 4) - block0 = block[0] * s->y_dc_scale; - else - 
block0 = block[0] * s->c_dc_scale; - quant_matrix = s->intra_matrix; -__asm__ volatile( - "pcmpeqw %%mm7, %%mm7 \n\t" - "psrlw $15, %%mm7 \n\t" - "movd %2, %%mm6 \n\t" - "packssdw %%mm6, %%mm6 \n\t" - "packssdw %%mm6, %%mm6 \n\t" - "mov %3, %%"FF_REG_a" \n\t" - ".p2align 4 \n\t" - "1: \n\t" - "movq (%0, %%"FF_REG_a"), %%mm0 \n\t" - "movq 8(%0, %%"FF_REG_a"), %%mm1\n\t" - "movq (%1, %%"FF_REG_a"), %%mm4 \n\t" - "movq 8(%1, %%"FF_REG_a"), %%mm5\n\t" - "pmullw %%mm6, %%mm4 \n\t" // q=qscale*quant_matrix[i] - "pmullw %%mm6, %%mm5 \n\t" // q=qscale*quant_matrix[i] - "pxor %%mm2, %%mm2 \n\t" - "pxor %%mm3, %%mm3 \n\t" - "pcmpgtw %%mm0, %%mm2 \n\t" // block[i] < 0 ? -1 : 0 - "pcmpgtw %%mm1, %%mm3 \n\t" // block[i] < 0 ? -1 : 0 - "pxor %%mm2, %%mm0 \n\t" - "pxor %%mm3, %%mm1 \n\t" - "psubw %%mm2, %%mm0 \n\t" // abs(block[i]) - "psubw %%mm3, %%mm1 \n\t" // abs(block[i]) - "pmullw %%mm4, %%mm0 \n\t" // abs(block[i])*q - "pmullw %%mm5, %%mm1 \n\t" // abs(block[i])*q - "pxor %%mm4, %%mm4 \n\t" - "pxor %%mm5, %%mm5 \n\t" // FIXME slow - "pcmpeqw (%0, %%"FF_REG_a"), %%mm4 \n\t" // block[i] == 0 ? -1 : 0 - "pcmpeqw 8(%0, %%"FF_REG_a"), %%mm5\n\t" // block[i] == 0 ? 
-1 : 0 - "psraw $4, %%mm0 \n\t" - "psraw $4, %%mm1 \n\t" - "pxor %%mm2, %%mm0 \n\t" - "pxor %%mm3, %%mm1 \n\t" - "psubw %%mm2, %%mm0 \n\t" - "psubw %%mm3, %%mm1 \n\t" - "pandn %%mm0, %%mm4 \n\t" - "pandn %%mm1, %%mm5 \n\t" - "movq %%mm4, (%0, %%"FF_REG_a") \n\t" - "movq %%mm5, 8(%0, %%"FF_REG_a")\n\t" - - "add $16, %%"FF_REG_a" \n\t" - "jng 1b \n\t" - ::"r" (block+nCoeffs), "r"(quant_matrix+nCoeffs), "rm" (qscale), "g" (-2*nCoeffs) - : "%"FF_REG_a, "memory" - ); - block[0]= block0; - //Note, we do not do mismatch control for intra as errors cannot accumulate -} - static void dct_unquantize_mpeg2_inter_mmx(MpegEncContext *s, int16_t *block, int n, int qscale) { @@ -461,8 +391,6 @@ av_cold void ff_mpv_common_init_x86(MpegEncContext *s) s->dct_unquantize_h263_inter = dct_unquantize_h263_inter_mmx; s->dct_unquantize_mpeg1_intra = dct_unquantize_mpeg1_intra_mmx; s->dct_unquantize_mpeg1_inter = dct_unquantize_mpeg1_inter_mmx; - if (!(s->avctx->flags & AV_CODEC_FLAG_BITEXACT)) - s->dct_unquantize_mpeg2_intra = dct_unquantize_mpeg2_intra_mmx; s->dct_unquantize_mpeg2_inter = dct_unquantize_mpeg2_inter_mmx; } #endif /* HAVE_MMX_INLINE */
Section 7.4.4 of the MPEG-2 specifications requires that the
last bit of the last coefficient be toggled so that the sum
of all coefficients is odd; both our decoder and encoder
did this only if the bitexact flag has been set (although
stuff like this should be behind AV_CODEC_FLAG2_FAST).
This patch changes this by removing the spec-incompliant
functions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
 libavcodec/mips/mpegvideo_init_mips.c |  4 --
 libavcodec/mips/mpegvideo_mips.h      |  2 -
 libavcodec/mips/mpegvideo_mmi.c       | 93 ---------------------------
 libavcodec/mpegvideo.c                | 32 ---------
 libavcodec/x86/mpegvideo.c            | 72 ---------------------
 5 files changed, 203 deletions(-)
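
The mismatch control described in the commit message can be sketched in a few lines of C. This is a minimal illustration, not the code retained in libavcodec/mpegvideo.c: the function name `mismatch_control` is hypothetical, the block is assumed to hold the 64 dequantised coefficients with the last one at index 63, and saturation is left out.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not FFmpeg API): make the sum of all 64
 * coefficients odd by toggling the least significant bit of the
 * last coefficient, as the commit message describes.
 */
static void mismatch_control(int16_t block[64])
{
    int sum = 0;

    for (int i = 0; i < 64; i++)
        sum += block[i];

    /* If the sum is even, flip the lowest bit of the last coefficient:
     * an odd value decreases by 1, an even value increases by 1,
     * so the sum becomes odd either way. */
    if ((sum & 1) == 0)
        block[63] ^= 1;
}
```

With coefficients 8 and -4 and all others zero, the sum is 4, so the last coefficient becomes 1. A block whose sum is already odd is left untouched.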