Message ID | 20231024122258.210941-2-martin@martin.st |
---|---|
State | Accepted |
Commit | 2c3d2a02452d3f6a6702b1d09630df7e7febe311 |
Series | [FFmpeg-devel,1/2] aarch64: Simplify the linux runtime cpu detection code |

Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On Tue, 24 Oct 2023, Martin Storsjö wrote:

> Clang versions before 17 (Xcode versions up to and including 15.0)
> had a very annoying bug in their handling of the ".arch" directive
> in assembly. If the directive only contained a level, such as
> ".arch armv8.2-a", it did validate the name of the level, but it
> didn't apply the level to what instructions are allowed. The level
> was applied if the directive contained an extra feature enabled,
> such as ".arch armv8.2-a+crc" though. It was also applied on the
> next ".arch_extension" directive.
>
> This bug, combined with the fact that the same versions of Clang
> didn't support the dotprod/i8mm extension names in either
> ".arch <level>+<feature>" or in ".arch_extension", could lead to
> unexpected build failures.
>
> As the dotprod/i8mm extensions couldn't be enabled dynamically
> via the ".arch_extension" directive, someone building ffmpeg could
> try to enable them by configuring their build with
> --extra-cflags="-march=armv8.6-a".
>
> During configure, we test for support for the i8mm instructions
> like this:
>
>     # Built with -march=armv8.6-a
>     .arch armv8.2-a         # Has no visible effect here
>     #.arch_extension i8mm   # Omitted as the extension name isn't known
>     usdot v0.4s, v0.16b, v0.16b
>     # Successfully assembled as armv8.6-a is the effective level,
>     # and i8mm is enabled implicitly in armv8.6-a.
>
> Thus, we would enable assembling those instructions. However if
> we later check for another extension, such as sve (which those
> versions of Clang actually do support), we can later run into the
> following situation when building actual code:
>
>     # Built with -march=armv8.6-a
>     .arch armv8.2-a         # Has no visible effect here
>     #.arch_extension i8mm   # Omitted as the extension name isn't known
>     .arch_extension sve     # Included as "sve" is a supported extension name
>     # .arch_extension effectively activates the previous .arch directive,
>     # so the effective level is armv8.2-a+sve now.
>     usdot v0.4s, v0.16b, v0.16b
>     # Fails to build the instructions that require i8mm. Despite the
>     # configure check, the unrelated ".arch_extension sve" directive
>     # breaks the functionality of the i8mm feature.
>
> This patch avoids this situation:
> - By adding a dummy feature such as "+crc" on the .arch directive
>   (if supported), we make sure that it does get applied immediately,
>   avoiding it taking effect spuriously at a later unrelated
>   ".arch_extension" directive.
> - By checking for higher arch levels such as armv8.4-a and armv8.6-a,
>   we can assemble the dotprod and i8mm extensions without the user
>   needing to pass -march=armv8.6-a. This allows using the dotprod/i8mm
>   codepaths via runtime detection while keeping the binary runnable
>   on older versions. I.e. this enables the i8mm codepaths on Apple M2
>   machines while built with Xcode's Clang.
>
> TL;DR: Enable the I8MM extensions for Apple M2 without the user needing
> to do a custom configuration; avoid potential build breakage if a user
> does such a custom configuration.
>
> Once Xcode versions that have these issues fixed are prevalent, we
> can consider reverting this change.
> ---
>  configure | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)

Will push now.

// Martin
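
The deferred ".arch" behaviour described in the quoted message can be illustrated with a toy shell model. This is only a sketch of the behaviour as the message describes it, not Clang's actual implementation. The names `effective_level`, `pending_level`, `arch_directive` and `arch_extension_directive` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Toy model (an assumption of this sketch, not Clang's real code) of how
# Clang before version 17 is described as handling ".arch" directives.

effective_level="armv8.6-a"   # level given on the command line via -march
pending_level=""              # level from a plain ".arch" that is not applied yet

# A plain ".arch <level>" is only remembered; ".arch <level>+<feature>"
# takes effect right away.
arch_directive() {
    case "$1" in
        *+*) effective_level="$1"; pending_level="" ;;
        *)   pending_level="$1" ;;
    esac
}

# ".arch_extension <name>" first applies any pending level, then adds
# the named feature on top of it.
arch_extension_directive() {
    if [ -n "$pending_level" ]; then
        effective_level="$pending_level"
        pending_level=""
    fi
    effective_level="${effective_level}+$1"
}

arch_directive "armv8.2-a"
echo "after .arch armv8.2-a:     $effective_level"
arch_extension_directive "sve"
echo "after .arch_extension sve: $effective_level"
```

In this model the first line still reports armv8.6-a, so an i8mm instruction would assemble, while the second reports armv8.2-a+sve, at which point it would not. Passing "armv8.2-a+crc" to `arch_directive` instead applies the level immediately, which is the effect the patch relies on.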
diff --git a/configure b/configure
index f494da204c..e00fb6b719 100755
--- a/configure
+++ b/configure
@@ -6045,7 +6045,26 @@ check_inline_asm inline_asm_nonlocal_labels '"Label:\n"'
 if enabled aarch64; then
     as_arch_level="armv8-a"
     check_as as_arch_directive ".arch $as_arch_level"
-    enabled as_arch_directive && check_arch_level armv8.2-a
+    if enabled as_arch_directive; then
+        # Check for higher .arch levels. We only need armv8.2-a in order to
+        # enable the extensions we want below - we primarily want to control
+        # them via .arch_extension. However:
+        #
+        # Clang before version 17 (Xcode versions up to and including 15.0)
+        # didn't support controlling the dotprod/i8mm extensions via
+        # .arch_extension; thus try to enable them via the .arch level as well.
+        for level in armv8.2-a armv8.4-a armv8.6-a; do
+            check_arch_level $level
+        done
+        # Clang before version 17 (Xcode versions up to and including 15.0)
+        # also had a bug (https://github.com/llvm/llvm-project/issues/32220)
+        # causing a plain ".arch <level>" to not have any effect unless it
+        # had an extra "+<feature>" included - but it was activate on the next
+        # ".arch_extension" directive. Check if we can include "+crc" as dummy
+        # feature to make the .arch directive behave as expected and take
+        # effect right away.
+        check_arch_level "${as_arch_level}+crc"
+    fi
 
     enabled armv8 && check_insn armv8 'prfm pldl1strm, [x0]'
     # internal assembler in clang 3.3 does not support this instruction
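
The selection logic in the hunk above can be tried in isolation with a stubbed assembler probe. This is a minimal sketch that assumes each check keeps the most recent level the assembler accepts. `probe_arch`, `try_arch_level`, `select_arch_level` and `SUPPORTED_LEVELS` are hypothetical stand-ins, not FFmpeg's `check_as` or `check_arch_level`. The real configure script runs the assembler, whereas this stub only consults a list.

```shell
#!/bin/sh
# Hypothetical stand-in for an assembler probe: it succeeds only for the
# ".arch" operands listed in SUPPORTED_LEVELS (an assumption of this sketch).
SUPPORTED_LEVELS="armv8-a armv8.2-a armv8.4-a armv8.4-a+crc"

probe_arch() {
    for known in $SUPPORTED_LEVELS; do
        [ "$known" = "$1" ] && return 0
    done
    return 1
}

# Remember the most recent operand that the probe accepts.
try_arch_level() {
    if probe_arch "$1"; then
        as_arch_level="$1"
    fi
}

# Mirror the order used in the patch: raise the level first, then try to
# append "+crc" to whichever level was selected.
select_arch_level() {
    as_arch_level="armv8-a"
    for level in armv8.2-a armv8.4-a armv8.6-a; do
        try_arch_level "$level"
    done
    try_arch_level "${as_arch_level}+crc"
    echo "$as_arch_level"
}

select_arch_level
```

With the list above, the script prints armv8.4-a+crc: armv8.6-a is rejected, so the last accepted level is kept and then extended with the dummy feature. If the stub accepts only armv8-a, the baseline level is returned unchanged.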