Message ID | 74f9b1f3-698d-7a3d-1a5-1b43f88ea9f2@softwolves.pp.se |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel] configure: Remove dcbzl check for e500v1 and e500v2 architectures | expand |
Context | Check | Description |
---|---|---|
yinshiyou/configure_loongarch64 | warning | Failed to apply patch |
andriy/configure_x86 | warning | Failed to apply patch |
Le 26 septembre 2022 13:51:44 GMT+03:00, Peter Krefting <peter@softwolves.pp.se> a écrit : >The DCBZL instruction is not available for the e500v1 and e500v2 >architectures, but may still be recognized by the toolchain, so we need to >remove the test for it explicitly for these architectures. Isn't this the sort of thing that's supposed ti be guarded by run-time CPU flags rather than in the configure script? >References: PowerPC™ e500 Core Family Reference Manual (Freescale) > >Found-by: Ståle Kristoffersen <staalebk@ifi.uio.no> >Compare: Commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895 >--- > configure | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > >This was originally fixed by commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895 >in version 2.2, but later broke in a merge which introduced a "disable dcbzl", >forgot to check for it in the ppc branch. > >diff --git a/configure b/configure >index 7a62f0c248..5d01833f40 100755 >--- a/configure >+++ b/configure >@@ -6058,7 +6058,9 @@ elif enabled ppc; then > > enable local_aligned > >- check_inline_asm dcbzl '"dcbzl 0, %0" :: "r"(0)' >+ if enabled dcbzl; then >+ check_inline_asm dcbzl '"dcbzl 0, %0" :: "r"(0)' >+ fi > check_inline_asm ibm_asm '"add 0, 0, 0"' > check_inline_asm ppc4xx '"maclhw r10, r11, r12"' > check_inline_asm xform_asm '"lwzx %1, %y0" :: "Z"(*(int*)0), "r"(0)'
Hi! >> The DCBZL instruction is not available for the e500v1 and e500v2 >> architectures, but may still be recognized by the toolchain, so we need to >> remove the test for it explicitly for these architectures. > Isn't this the sort of thing that's supposed ti be guarded by run-time CPU flags rather than in the configure script? Our compiler (powerpc-btech-linux-gnuspe-gcc (crosstool-NG 1.24.0) 8.3.0) recognizes the instruction, so the configure test succeeds, but the CPU (e500v2) crashes if it tries to execute it. I previously had a patch (d5733936d857ce5c7d28c0bc9e89a2e2548f8895) to suppress the instruction, but it broke at some point, this patch tries to fix it in a slightly better way. Having said that, the test is there due to the fix in a4adb60858f1fa0b35b08576ea34e531f0f83459 (from 2003), and disabling the instruction does not bring back the old optimizations as it just expects it not to work at all. But for our purposes this is not as important as h aving a working library.
Le 28 septembre 2022 15:52:46 GMT+03:00, Peter Krefting <peter@softwolves.pp.se> a écrit : >Hi! > >>> The DCBZL instruction is not available for the e500v1 and e500v2 >>> architectures, but may still be recognized by the toolchain, so we need to >>> remove the test for it explicitly for these architectures. >> Isn't this the sort of thing that's supposed ti be guarded by run-time CPU >> flags rather than in the configure script? > >Our compiler (powerpc-btech-linux-gnuspe-gcc (crosstool-NG 1.24.0) 8.3.0) >recognizes the instruction, so the configure test succeeds, but the CPU >(e500v2) crashes if it tries to execute it. Yes? >I previously had a patch (d5733936d857ce5c7d28c0bc9e89a2e2548f8895) to >suppress the instruction, but it broke at some point, this patch tries to fix >it in a slightly better way. AFAICT, this old changeset had the exact same problem as this patch. If somebody just compiles FFmpeg with default flags as one does, it remains broken. Normally, you would expect that the default flags result in something that works, if perhaps not optimally, no? I mean, the patch is basically correct in that it keeps DCBZL disabled if the selected CPU is known not to support it. Altivec already works the exact same way for that matter. But Altivec is also guarded at run-time, so it won't cause a crash if the target CPU is unspecified/unknown. Br,
Rémi Denis-Courmont: >> Our compiler (powerpc-btech-linux-gnuspe-gcc (crosstool-NG 1.24.0) 8.3.0) >> recognizes the instruction, so the configure test succeeds, but the CPU >> (e500v2) crashes if it tries to execute it. > Yes? Indeed. >> I previously had a patch (d5733936d857ce5c7d28c0bc9e89a2e2548f8895) to >> suppress the instruction, but it broke at some point, this patch tries to fix >> it in a slightly better way. > AFAICT, this old changeset had the exact same problem as this patch. If > somebody just compiles FFmpeg with default flags as one does, it remains > broken. Before d5733936, ffmpeg would crash on startup on e500v2 when the binary tried to use the unsupported instruction (when built with default configuration). At d5733936 it works as the instruction is disabled. At some point between d5733936 and HEAD, a default "configure" run for this CPU re-enabled the instruction, causing it again to crash on startup. Since the configure script was changed to set "disable" in the CPU selection header: e500v2) cpuflags="-mcpu=8548 -mhard-float -mfloat-gprs=double" disable altivec disable dcbzl <--- here (which I believe came in from the avconf fork merge), my patch fixes the ppc-specific branch to check if it was disabled in the above stanza. > Normally, you would expect that the default flags result in something > that works, if perhaps not optimally, no? Exactly, this is what I am trying to (re-)fix. (And yes, I know that the CPU I am running on is end-of-life, but the joy of working with embedded hardware is that you have to support it anyway.)
On Mon, Sep 26, 2022 at 11:51:44AM +0100, Peter Krefting wrote: > The DCBZL instruction is not available for the e500v1 and e500v2 > architectures, but may still be recognized by the toolchain, so we need to > remove the test for it explicitly for these architectures. > > References: PowerPC™ e500 Core Family Reference Manual (Freescale) > > Found-by: Ståle Kristoffersen <staalebk@ifi.uio.no> > Compare: Commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895 > --- > configure | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > This was originally fixed by commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895 > in version 2.2, but later broke in a merge which introduced a "disable dcbzl", > forgot to check for it in the ppc branch. > > diff --git a/configure b/configure > index 7a62f0c248..5d01833f40 100755 > --- a/configure > +++ b/configure > @@ -6058,7 +6058,9 @@ elif enabled ppc; then > > enable local_aligned > > - check_inline_asm dcbzl '"dcbzl 0, %0" :: "r"(0)' > + if enabled dcbzl; then > + check_inline_asm dcbzl '"dcbzl 0, %0" :: "r"(0)' > + fi something like this disabled dcbzl || check_inline_asm dcbzl '"dcbzl 0, %0" :: "r"(0)' seems more clear what this is supposed to do is to disable the instruction when it was explicitly disabled for the target CPU thx {...]
diff --git a/configure b/configure index 7a62f0c248..5d01833f40 100755 --- a/configure +++ b/configure @@ -6058,7 +6058,9 @@ elif enabled ppc; then enable local_aligned - check_inline_asm dcbzl '"dcbzl 0, %0" :: "r"(0)' + if enabled dcbzl; then + check_inline_asm dcbzl '"dcbzl 0, %0" :: "r"(0)' + fi check_inline_asm ibm_asm '"add 0, 0, 0"' check_inline_asm ppc4xx '"maclhw r10, r11, r12"' check_inline_asm xform_asm '"lwzx %1, %y0" :: "Z"(*(int*)0), "r"(0)'