diff mbox series

[FFmpeg-devel] configure: Remove dcbzl check for e500v1 and e500v2 architectures

Message ID 74f9b1f3-698d-7a3d-1a5-1b43f88ea9f2@softwolves.pp.se
State New
Headers show
Series [FFmpeg-devel] configure: Remove dcbzl check for e500v1 and e500v2 architectures | expand

Checks

Context Check Description
yinshiyou/configure_loongarch64 warning Failed to apply patch
andriy/configure_x86 warning Failed to apply patch

Commit Message

Peter Krefting Sept. 26, 2022, 10:51 a.m. UTC
The DCBZL instruction is not available for the e500v1 and e500v2
architectures, but may still be recognized by the toolchain, so we need to
remove the test for it explicitly for these architectures.

References: PowerPC™ e500 Core Family Reference Manual (Freescale)

Found-by: Ståle Kristoffersen <staalebk@ifi.uio.no>
Compare: Commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895
---
  configure | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

This was originally fixed by commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895
in version 2.2, but later broke in a merge which introduced a "disable dcbzl",
forgot to check for it in the ppc branch.

Comments

Rémi Denis-Courmont Sept. 26, 2022, 11:24 a.m. UTC | #1
Le 26 septembre 2022 13:51:44 GMT+03:00, Peter Krefting <peter@softwolves.pp.se> a écrit :
>The DCBZL instruction is not available for the e500v1 and e500v2
>architectures, but may still be recognized by the toolchain, so we need to
>remove the test for it explicitly for these architectures.

Isn't this the sort of thing that's supposed ti be guarded by run-time CPU flags rather than in the configure script?

>References: PowerPC™ e500 Core Family Reference Manual (Freescale)
>
>Found-by: Ståle Kristoffersen <staalebk@ifi.uio.no>
>Compare: Commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895
>---
> configure | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>This was originally fixed by commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895
>in version 2.2, but later broke in a merge which introduced a "disable dcbzl",
>forgot to check for it in the ppc branch.
>
>diff --git a/configure b/configure
>index 7a62f0c248..5d01833f40 100755
>--- a/configure
>+++ b/configure
>@@ -6058,7 +6058,9 @@ elif enabled ppc; then
>
>     enable local_aligned
>
>-    check_inline_asm dcbzl     '"dcbzl 0, %0" :: "r"(0)'
>+    if enabled dcbzl; then
>+        check_inline_asm dcbzl    '"dcbzl 0, %0" :: "r"(0)'
>+    fi
>     check_inline_asm ibm_asm   '"add 0, 0, 0"'
>     check_inline_asm ppc4xx    '"maclhw r10, r11, r12"'
>     check_inline_asm xform_asm '"lwzx %1, %y0" :: "Z"(*(int*)0), "r"(0)'
Peter Krefting Sept. 28, 2022, 12:52 p.m. UTC | #2
Hi!

>> The DCBZL instruction is not available for the e500v1 and e500v2
>> architectures, but may still be recognized by the toolchain, so we need to
>> remove the test for it explicitly for these architectures.
> Isn't this the sort of thing that's supposed ti be guarded by run-time CPU flags rather than in the configure script?

Our compiler (powerpc-btech-linux-gnuspe-gcc (crosstool-NG 1.24.0) 
8.3.0) recognizes the instruction, so the configure test succeeds, but 
the CPU (e500v2) crashes if it tries to execute it.

I previously had a patch (d5733936d857ce5c7d28c0bc9e89a2e2548f8895) to 
suppress the instruction, but it broke at some point, this patch tries 
to fix it in a slightly better way.

Having said that, the test is there due to the fix in 
a4adb60858f1fa0b35b08576ea34e531f0f83459 (from 2003), and disabling the 
instruction does not bring back the old optimizations as it just expects 
it not to work at all. But for our purposes this is not as important as h
aving a working library.
Rémi Denis-Courmont Sept. 28, 2022, 2:28 p.m. UTC | #3
Le 28 septembre 2022 15:52:46 GMT+03:00, Peter Krefting 
<peter@softwolves.pp.se> a écrit :
>Hi!
>
>>> The DCBZL instruction is not available for the e500v1 and e500v2
>>> architectures, but may still be recognized by the toolchain, so we need to
>>> remove the test for it explicitly for these architectures.
>> Isn't this the sort of thing that's supposed ti be guarded by run-time CPU
>> flags rather than in the configure script?
>
>Our compiler (powerpc-btech-linux-gnuspe-gcc (crosstool-NG 1.24.0) 8.3.0)
>recognizes the instruction, so the configure test succeeds, but the CPU
>(e500v2) crashes if it tries to execute it.

Yes?

>I previously had a patch (d5733936d857ce5c7d28c0bc9e89a2e2548f8895) to
>suppress the instruction, but it broke at some point, this patch tries to fix
>it in a slightly better way.

AFAICT, this old changeset had the exact same problem as this patch. If 
somebody just compiles FFmpeg with default flags as one does, it remains 
broken. Normally, you would expect that the default flags result in something 
that works, if perhaps not optimally, no?

I mean, the patch is basically correct in that it keeps DCBZL disabled if the 
selected CPU is known not to support it. Altivec already works the exact same 
way for that matter. But Altivec is also guarded at run-time, so it won't 
cause a crash if the target CPU is unspecified/unknown.

Br,
Peter Krefting Sept. 28, 2022, 3:15 p.m. UTC | #4
Rémi Denis-Courmont:

>> Our compiler (powerpc-btech-linux-gnuspe-gcc (crosstool-NG 1.24.0) 8.3.0)
>> recognizes the instruction, so the configure test succeeds, but the CPU
>> (e500v2) crashes if it tries to execute it.
> Yes?

Indeed.

>> I previously had a patch (d5733936d857ce5c7d28c0bc9e89a2e2548f8895) to
>> suppress the instruction, but it broke at some point, this patch tries to fix
>> it in a slightly better way.
> AFAICT, this old changeset had the exact same problem as this patch. If
> somebody just compiles FFmpeg with default flags as one does, it remains
> broken.

Before d5733936, ffmpeg would crash on startup on e500v2 when the 
binary tried to use the unsupported instruction (when built with 
default configuration). At d5733936 it works as the instruction is 
disabled.

At some point between d5733936 and HEAD, a default "configure" run for 
this CPU re-enabled the instruction, causing it again to crash on 
startup. Since the configure script was changed to set "disable" in 
the CPU selection header:

   e500v2)
     cpuflags="-mcpu=8548 -mhard-float -mfloat-gprs=double"
     disable altivec
     disable dcbzl  <--- here

(which I believe came in from the avconf fork merge), my patch fixes 
the ppc-specific branch to check if it was disabled in the above 
stanza.

> Normally, you would expect that the default flags result in something 
> that works, if perhaps not optimally, no?

Exactly, this is what I am trying to (re-)fix.

(And yes, I know that the CPU I am running on is end-of-life, but the 
joy of working with embedded hardware is that you have to support it 
anyway.)
Michael Niedermayer Jan. 5, 2024, 9:35 p.m. UTC | #5
On Mon, Sep 26, 2022 at 11:51:44AM +0100, Peter Krefting wrote:
> The DCBZL instruction is not available for the e500v1 and e500v2
> architectures, but may still be recognized by the toolchain, so we need to
> remove the test for it explicitly for these architectures.
> 
> References: PowerPC™ e500 Core Family Reference Manual (Freescale)
> 
> Found-by: Ståle Kristoffersen <staalebk@ifi.uio.no>
> Compare: Commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895
> ---
>  configure | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> This was originally fixed by commit d5733936d857ce5c7d28c0bc9e89a2e2548f8895
> in version 2.2, but later broke in a merge which introduced a "disable dcbzl",
> forgot to check for it in the ppc branch.
> 
> diff --git a/configure b/configure
> index 7a62f0c248..5d01833f40 100755
> --- a/configure
> +++ b/configure
> @@ -6058,7 +6058,9 @@ elif enabled ppc; then
> 
>      enable local_aligned
> 
> -    check_inline_asm dcbzl     '"dcbzl 0, %0" :: "r"(0)'
> +    if enabled dcbzl; then
> +        check_inline_asm dcbzl    '"dcbzl 0, %0" :: "r"(0)'
> +    fi

something like this
disabled dcbzl || check_inline_asm dcbzl    '"dcbzl 0, %0" :: "r"(0)'

seems more clear
what this is supposed to do is to disable the instruction when it was
explicitly disabled for the target CPU

thx

{...]
diff mbox series

Patch

diff --git a/configure b/configure
index 7a62f0c248..5d01833f40 100755
--- a/configure
+++ b/configure
@@ -6058,7 +6058,9 @@  elif enabled ppc; then

      enable local_aligned

-    check_inline_asm dcbzl     '"dcbzl 0, %0" :: "r"(0)'
+    if enabled dcbzl; then
+        check_inline_asm dcbzl    '"dcbzl 0, %0" :: "r"(0)'
+    fi
      check_inline_asm ibm_asm   '"add 0, 0, 0"'
      check_inline_asm ppc4xx    '"maclhw r10, r11, r12"'
      check_inline_asm xform_asm '"lwzx %1, %y0" :: "Z"(*(int*)0), "r"(0)'