Message ID | 20230715235832.64221-1-jamrial@gmail.com |
---|---|
State | New |
Series | [FFmpeg-devel] avcodec/x86/mathops: use constrained immediate operands |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On Sunday, 16 July 2023, 2.58.32 EEST, James Almer wrote:
> Should fix assembling with binutils as >= 2.41
>
> Signed-off-by: James Almer <jamrial@gmail.com>
> ---
> This is IMO a big breakage. binutils' as has until now clipped these values
> on its own, and never required the compiler to do it.

TBH, silently clipping immediate constants sounds like a nasty bug that could
cause really nasty surprises if somebody ever passes an out-of-range constant.

This has happened to me many times, typically with incidentally out-of-range
immediate offsets in loads/stores.

(...)

>      __asm__ ("shrl %1, %0\n\t"
>           : "+r" (a)
> -         : "ic" ((uint8_t)(-s))
> +         : "Ic" ((uint8_t)(-s))

Note that this is not equivalent. Now, if `s` is constant but out of range,
the compiler will be required to fit it. And it does that by moving it into
ECX. This is probably not what you want.

AFAICT, you should keep the constraint as it is, and fix the operand value
instead by masking it, e.g.:

    if (__builtin_constant_p(s))
        __asm__ ("shrl %1, %0\n\t"
                 : "+r" (a)
                 : "i" ((-s) & 0x1f)
        );
    else
        __asm__ ("shrl %1, %0\n\t"
                 : "+r" (a)
                 : "c" (-s)
        );

(Not sure if the 0x1f mask is correct, but you get the idea.)
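For reference, the suggestion above can be written out as a complete function. This is only a sketch under stated assumptions, not the code that was committed: the name `neg_usr32_sketch`, the plain-C fallback, and the `__OPTIMIZE__` guard are all made up here. The `(uint8_t)` cast in the register branch is also an addition, so that the count operand is 8 bits wide and can be emitted as `%cl`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the shape of NEG_USR32 from the patch. */
static inline uint32_t neg_usr32_sketch(uint32_t a, int8_t s)
{
#if defined(__GNUC__) && defined(__OPTIMIZE__) && \
    (defined(__i386__) || defined(__x86_64__))
    /* The asm path is only enabled in optimized builds: the dispatch below
     * relies on the compiler discarding the branch that is not taken. */
    if (__builtin_constant_p(s)) {
        /* Constant count: mask it to 0..31 so the immediate is in range. */
        __asm__ ("shrl %1, %0\n\t"
                 : "+r" (a)
                 : "i" ((-s) & 0x1f)
        );
    } else {
        /* Runtime count: pass it in CL; the CPU uses only its low 5 bits. */
        __asm__ ("shrl %1, %0\n\t"
                 : "+r" (a)
                 : "c" ((uint8_t)(-s))
        );
    }
#else
    /* Assumed fallback for other targets and for unoptimized builds. */
    a >>= ((-s) & 0x1f);
#endif
    return a;
}

/* Self-check: s = 28 gives (uint8_t)(-s) == 228, but (-28) & 0x1f == 4. */
static void neg_usr32_selfcheck(void)
{
    assert(neg_usr32_sketch(0xF0u, 28) == 0x0Fu);
}
```

Both branches produce the same shift; the split only decides whether the count is encoded as an immediate or loaded into a register.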
James Almer (12023-07-15):
> Should fix assembling with binutils as >= 2.41
>
> Signed-off-by: James Almer <jamrial@gmail.com>
> ---
> This is IMO a big breakage. binutils' as has until now clipped these values on
> its own, and never required the compiler to do it.

I confirm it fixes the build failures on up-to-date Debian testing.

OTOH, I ran a benchmark (decoding some x264):

474134 mod
488751 orig
494359 mod
498554 orig
508958 orig
514246 orig
518160 mod
528427 orig
530223 mod
534762 mod
536415 orig
548434 orig
550789 orig
551716 mod
553951 orig
561754 orig
572688 mod
580254 mod
581205 orig
583856 mod
583939 orig
584748 orig
594143 orig
600681 mod
607596 mod
612757 mod
621033 orig
624567 orig
626346 mod
627309 mod
628242 mod
638344 mod

The numbers are the sum of the “user” column of the -benchmark_all output, on
an AMD Ryzen 3 3200U and Debian stable. The mod lines are the runs where I
disabled the two faulty functions.

They are all over the place, it is hard to be sure, but it seems to indicate
that, as you suspected, the benefit is not that big.

Regards,
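One way to get a rough handle on numbers this noisy is to compare the gap between the two groups with the spread inside each group. The sketch below does that with the figures from the list above; the helper names are made up for illustration, and this is a quick summary rather than a significance test.

```c
#include <assert.h>
#include <stddef.h>

/* Timings copied from the benchmark list, already in ascending order. */
static const double mod_runs[] = {
    474134, 494359, 518160, 530223, 534762, 551716, 572688, 580254,
    583856, 600681, 607596, 612757, 626346, 627309, 628242, 638344,
};
static const double orig_runs[] = {
    488751, 498554, 508958, 514246, 528427, 536415, 548434, 550789,
    553951, 561754, 581205, 583939, 584748, 594143, 621033, 624567,
};

#define N_RUNS(a) (sizeof(a) / sizeof((a)[0]))

/* Arithmetic mean of n samples. */
static double run_mean(const double *v, size_t n)
{
    double sum = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Median of an array that is already sorted in ascending order. */
static double run_median_sorted(const double *v, size_t n)
{
    assert(n > 0);
    return (n % 2) ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
}

/* Slowest minus fastest run, relative to the fastest (sorted input). */
static double run_relative_spread(const double *v, size_t n)
{
    assert(n > 0);
    return (v[n - 1] - v[0]) / v[0];
}
```

If the difference between the two means or medians is small compared with `run_relative_spread` of either group, the measurement cannot say much about which variant is faster.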
On 7/16/2023 6:23 AM, Rémi Denis-Courmont wrote:
> On Sunday, 16 July 2023, 2.58.32 EEST, James Almer wrote:
>> Should fix assembling with binutils as >= 2.41
>>
>> Signed-off-by: James Almer <jamrial@gmail.com>
>> ---
>> This is IMO a big breakage. binutils' as has until now clipped these values
>> on its own, and never required the compiler to do it.
>
> TBH, silently clipping immediate constants sounds like a nasty bug that could
> cause really nasty surprises if somebody ever passes an out-of-range constant.

We're passing it out-of-range constants alright. I tried adding an
av_assert0((uint8_t)(-s) <= 31) and most fate tests started failing.

> This has happened to me many times, typically with incidentally out-of-range
> immediate offsets in loads/stores.
>
> (...)
>
>>      __asm__ ("shrl %1, %0\n\t"
>>           : "+r" (a)
>> -         : "ic" ((uint8_t)(-s))
>> +         : "Ic" ((uint8_t)(-s))
>
> Note that this is not equivalent. Now, if `s` is constant but out of range,
> the compiler will be required to fit it. And it does that by moving it into
> ECX. This is probably not what you want.
>
> AFAICT, you should keep the constraint as it is, and fix the operand value
> instead by masking it, e.g.:
>
>     if (__builtin_constant_p(s))
>         __asm__ ("shrl %1, %0\n\t"
>                  : "+r" (a)
>                  : "i" ((-s) & 0x1f)
>         );
>     else
>         __asm__ ("shrl %1, %0\n\t"
>                  : "+r" (a)
>                  : "c" (-s)
>         );
>
> (Not sure if the 0x1f mask is correct, but you get the idea.)

It is, just tested.
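Both observations in this reply (the assert firing, and the 0x1f mask being correct) can be checked without any assembly. The sketch below relies on x86 32-bit shifts using only the low 5 bits of the count. It walks every `int8_t` value of `s`, counts how often `(uint8_t)(-s)` exceeds 31, and checks that the mask keeps exactly those low 5 bits. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Count as the original operand expression produced it: (uint8_t)(-s). */
static unsigned raw_count(int8_t s)
{
    return (uint8_t)(-s);
}

/* Count after the proposed masking: (-s) & 0x1f, always in 0..31. */
static unsigned masked_count(int8_t s)
{
    unsigned c = (unsigned)((-s) & 0x1f);
    assert(c <= 31);
    return c;
}

/* Number of int8_t values of s for which (uint8_t)(-s) <= 31 does not hold. */
static int count_out_of_range(void)
{
    int n = 0;
    for (int s = INT8_MIN; s <= INT8_MAX; s++)
        if (raw_count((int8_t)s) > 31)
            n++;
    return n;
}

/* Returns 1 if the mask equals the low 5 bits of the raw count for every s. */
static int mask_matches_low5(void)
{
    for (int s = INT8_MIN; s <= INT8_MAX; s++)
        if (masked_count((int8_t)s) != (raw_count((int8_t)s) & 31u))
            return 0;
    return 1;
}
```

Only `s` in -31..0 yields an in-range raw count, so any caller passing a positive `s` trips the assert even though the masked shift amount is well defined.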
On Sunday, 16 July 2023, 14.55.43 EEST, James Almer wrote:
> On 7/16/2023 6:23 AM, Rémi Denis-Courmont wrote:
> > On Sunday, 16 July 2023, 2.58.32 EEST, James Almer wrote:
> >> Should fix assembling with binutils as >= 2.41
> >>
> >> Signed-off-by: James Almer <jamrial@gmail.com>
> >> ---
> >> This is IMO a big breakage. binutils' as has until now clipped these
> >> values on its own, and never required the compiler to do it.
> >
> > TBH, silently clipping immediate constants sounds like a nasty bug that
> > could cause really nasty surprises if somebody ever passes an
> > out-of-range constant.
>
> We're passing it out-of-range constants alright. I tried adding an
> av_assert0((uint8_t)(-s) <= 31) and most fate tests started failing.

Well, yes. That's why recent binutils is complaining. It wouldn't if the
constant values were always in range.

I'm not versed in the x86 subdomain of black magic, so I'm not sure if you
imply that it was intentional that FFmpeg fed out-of-range values that would
be cropped, or if it was unintentional. In the latter case, I think that the
existing assembler constraint should actually be kept as it is precisely to
detect errors, and the calling code path ought to be fixed instead.

Either way, changing "i" for "I" will generate suboptimal-looking code as I
pointed out up-thread. If we don't even care about that, then we might as
well shift in C code, AFAICT.

> > This has happened to me many times, typically with incidentally
> > out-of-range immediate offsets in loads/stores.
> >
> > (...)
> >
> >>      __asm__ ("shrl %1, %0\n\t"
> >>           : "+r" (a)
> >> -         : "ic" ((uint8_t)(-s))
> >> +         : "Ic" ((uint8_t)(-s))
> >
> > Note that this is not equivalent. Now, if `s` is constant but out of
> > range, the compiler will be required to fit it. And it does that by
> > moving it into ECX. This is probably not what you want.
> >
> > AFAICT, you should keep the constraint as it is, and fix the operand value
> > instead by masking it, e.g.:
> >
> >     if (__builtin_constant_p(s))
> >         __asm__ ("shrl %1, %0\n\t"
> >                  : "+r" (a)
> >                  : "i" ((-s) & 0x1f)
> >         );
> >     else
> >         __asm__ ("shrl %1, %0\n\t"
> >                  : "+r" (a)
> >                  : "c" (-s)
> >         );
> >
> > (Not sure if the 0x1f mask is correct, but you get the idea.)
>
> It is, just tested.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
diff --git a/libavcodec/x86/mathops.h b/libavcodec/x86/mathops.h
index 6298f5ed19..a08c6193bf 100644
--- a/libavcodec/x86/mathops.h
+++ b/libavcodec/x86/mathops.h
@@ -39,7 +39,7 @@ static av_always_inline av_const int MULL(int a, int b, unsigned shift)
         "imull %3 \n\t"
         "shrdl %4, %%edx, %%eax \n\t"
         :"=a"(rt), "=d"(dummy)
-        :"a"(a), "rm"(b), "ci"((uint8_t)shift)
+        :"a"(a), "rm"(b), "cI"((uint8_t)shift)
     );
     return rt;
 }
@@ -115,16 +115,17 @@ __asm__ volatile(\
 static inline int32_t NEG_SSR32( int32_t a, int8_t s){
     __asm__ ("sarl %1, %0\n\t"
          : "+r" (a)
-         : "ic" ((uint8_t)(-s))
+         : "Ic" ((uint8_t)(-s))
     );
     return a;
 }
 
 #define NEG_USR32 NEG_USR32
 static inline uint32_t NEG_USR32(uint32_t a, int8_t s){
+
     __asm__ ("shrl %1, %0\n\t"
          : "+r" (a)
-         : "ic" ((uint8_t)(-s))
+         : "Ic" ((uint8_t)(-s))
     );
     return a;
 }
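To make the up-thread remark about shifting in C code concrete, here is a minimal plain-C sketch of what the three functions touched by the diff compute. The `_c` names are hypothetical and this is not FFmpeg's generic implementation. It assumes the shift counts behave as the x86 instructions treat them (only the low 5 bits are used), and that right-shifting a negative value is arithmetic, as it is with GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* imull + shrdl: 64-bit signed product shifted right, low 32 bits kept.
 * Masking the count to 0..31 is an assumption that mimics the hardware. */
static inline int mull_c(int a, int b, unsigned shift)
{
    return (int)(((int64_t)a * b) >> (shift & 31));
}

/* sarl with a count of -s: arithmetic right shift by (-s) mod 32. */
static inline int32_t neg_ssr32_c(int32_t a, int8_t s)
{
    return a >> ((-s) & 31);
}

/* shrl with a count of -s: logical right shift by (-s) mod 32. */
static inline uint32_t neg_usr32_c(uint32_t a, int8_t s)
{
    return a >> ((-s) & 31);
}

/* Self-check: s = 1 means a shift by (-1) & 31 == 31. */
static void plain_c_selfcheck(void)
{
    assert(neg_usr32_c(0x80000000u, 1) == 1u);
}
```

Whether a compiler turns these into code as tight as the inline assembly is exactly the question the benchmark in this thread tries to answer.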
Should fix assembling with binutils as >= 2.41

Signed-off-by: James Almer <jamrial@gmail.com>
---
This is IMO a big breakage. binutils' as has until now clipped these values on
its own, and never required the compiler to do it.

 libavcodec/x86/mathops.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)