diff mbox series

[FFmpeg-devel] x86: cabac: Disable the inline asm on clang on windows on i386

Message ID 20200523182917.11853-1-martin@martin.st
State New
Headers show
Series [FFmpeg-devel] x86: cabac: Disable the inline asm on clang on windows on i386 | expand

Checks

Context Check Description
andriy/default pending
andriy/make success Make finished
andriy/make_fate success Make fate finished

Commit Message

Martin Storsjö May 23, 2020, 6:29 p.m. UTC
The cabac inline assembly is very brittle to assemble properly
on i386 windows with clang; libavcodec/hevc_cabac.c fails to build
in the default mode (which is -march=pentium4), it only works if
ffmpeg is configured with --cpu=i686 (translating to -march=i686),
and likewise, the inline assembly fails to assemble in
libavcodec/h264_cabac.c if building with optimizations disabled.

Instead of trying to step around the problem (and end up bit by it
occasionally), just disable the inline assembly for this configuration.
---
 libavcodec/x86/cabac.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Carl Eugen Hoyos May 23, 2020, 10:42 p.m. UTC | #1
Am Sa., 23. Mai 2020 um 20:35 Uhr schrieb Martin Storsjö <martin@martin.st>:
>
> The cabac inline assembly is very brittle to assemble properly
> on i386 windows with clang; libavcodec/hevc_cabac.c fails to build
> in the default mode (which is -march=pentium4), it only works if
> ffmpeg is configured with --cpu=i686 (translating to -march=i686),
> and likewise, the inline assembly fails to assemble in
> libavcodec/h264_cabac.c if building with optimizations disabled.
>
> Instead of trying to step around the problem (and end up bit by it
> occasionally), just disable the inline assembly for this configuration.

I think I cannot reproduce this issue, what may I miss?

Carl Eugen
Martin Storsjö May 24, 2020, 7:53 p.m. UTC | #2
On Sun, 24 May 2020, Carl Eugen Hoyos wrote:

> Am Sa., 23. Mai 2020 um 20:35 Uhr schrieb Martin Storsjö <martin@martin.st>:
>>
>> The cabac inline assembly is very brittle to assemble properly
>> on i386 windows with clang; libavcodec/hevc_cabac.c fails to build
>> in the default mode (which is -march=pentium4), it only works if
>> ffmpeg is configured with --cpu=i686 (translating to -march=i686),
>> and likewise, the inline assembly fails to assemble in
>> libavcodec/h264_cabac.c if building with optimizations disabled.
>>
>> Instead of trying to step around the problem (and end up bit by it
>> occasionally), just disable the inline assembly for this configuration.
>
> I think I cannot reproduce this issue, what may I miss?

That's a good question - how are you building? I can reproduce it both in 
mingw and msvc mode.

In mingw mode (with my llvm-mingw toolchains from 
https://github.com/mstorsjo/llvm-mingw, but any clang configured to target 
i386 mingw should do), I can reproduce it (in a cross compile setup) like 
this (tested with Clang 10.0 but any version should do, I remember seeing 
the issue long ago):

$ configure --cross-prefix=i686-w64-mingw32- --target-os=mingw32 
--arch=i686 --enable-gpl
$ make V=1
...
i686-w64-mingw32-gcc -I. -Isrc/ -D_ISOC99_SOURCE -D_FILE_OFFSET_BITS=64 
-D_LARGEFILE_SOURCE -U__STRICT_ANSI__ -D__USE_MINGW_ANSI_STDIO=1 
-D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 -DHAVE_AV_CONFIG_H 
-DBUILDING_avcodec -std=c11 -fomit-frame-pointer -g 
-Wdeclaration-after-statement -Wall -Wdisabled-optimization 
-Wpointer-arith -Wredundant-decls -Wwrite-strings -Wtype-limits -Wundef 
-Wmissing-prototypes -Wno-pointer-to-int-cast -Wstrict-prototypes 
-Wempty-body -Wno-parentheses -Wno-switch -Wno-format-zero-length 
-Wno-pointer-sign -Wno-unused-const-variable -Wno-bool-operation 
-Wno-char-subscripts -O3 -fno-math-errno -fno-signed-zeros 
-Qunused-arguments -Werror=implicit-function-declaration 
-Werror=missing-prototypes -Werror=return-type  -MMD -MF 
libavcodec/hevc_cabac.d -MT libavcodec/hevc_cabac.o -c -o 
libavcodec/hevc_cabac.o src/libavcodec/hevc_cabac.c
warning: unknown warning option '-Wno-bool-operation'; did you mean 
'-Wno-bool-conversion'? [-Wunknown-warning-option]
src/libavcodec/hevc_cabac.c:37:21: warning: variable 'num_bins_in_se' is 
not needed and will not be emitted [-Wunneeded-internal-declaration]
static const int8_t num_bins_in_se[] = {
                     ^
In file included from src/libavcodec/hevc_cabac.c:27:
In file included from src/libavcodec/cabac_functions.h:46:
src/libavcodec/x86/cabac.h:193:9: error: inline assembly requires more 
registers than available
         BRANCHLESS_GET_CABAC("%0", "%q0", "(%4)", "%1", "%w1",
         ^


Same thing if I build in MSVC mode, by setting INCLUDE and LIB to point to 
the right MSVC folders, and configure like this:

$ configure --enable-gpl --arch=i686 --cc='clang -target i686-win32-msvc' 
--ld=lld-link --target-os=win32 --extra-ldflags='msvcrt.lib oldnames.lib' 
--enable-cross-compile --ar=llvm-ar --nm=llvm-nm --disable-stripping
$ make V=1
clang -target i686-win32-msvc -I. -Isrc/ -D_ISOC99_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_USE_MATH_DEFINES 
-D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_WARNINGS -D_WIN32_WINNT=0x0600 
-DHAVE_AV_CONFIG_H -DBUILDING_avcodec -std=c11 -fomit-frame-pointer -g 
-Wdeclaration-after-statement -Wall -Wdisabled-optimization 
-Wpointer-arith -Wredundant-decls -Wwrite-strings -Wtype-limits -Wundef 
-Wmissing-prototypes -Wno-pointer-to-int-cast -Wstrict-prototypes 
-Wempty-body -Wno-parentheses -Wno-switch -Wno-format-zero-length 
-Wno-pointer-sign -Wno-unused-const-variable -Wno-bool-operation 
-Wno-char-subscripts -O3 -fno-math-errno -fno-signed-zeros 
-Qunused-arguments -Werror=implicit-function-declaration 
-Werror=missing-prototypes -Werror=return-type  -MMD -MF 
libavcodec/hevc_cabac.d -MT libavcodec/hevc_cabac.o -c -o 
libavcodec/hevc_cabac.o src/libavcodec/hevc_cabac.c
src/libavcodec/hevc_cabac.c:37:21: warning: variable 'num_bins_in_se' is 
not needed and will not be emitted [-Wunneeded-internal-declaration]
static const int8_t num_bins_in_se[] = {
                     ^
In file included from src/libavcodec/hevc_cabac.c:27:
In file included from src/libavcodec/cabac_functions.h:46:
src/libavcodec/x86/cabac.h:193:9: error: inline assembly requires more 
registers than available
         BRANCHLESS_GET_CABAC("%0", "%q0", "(%4)", "%1", "%w1",
         ^

The same also seems to trigger in the same way if using the clang-cl 
frontend, e.g. configured like this:

configure --enable-gpl --arch=i686 --cc=clang-cl --ld=lld-link 
--target-os=win32 --toolchain=msvc --enable-cross-compile --ar=llvm-ar 
--nm=llvm-nm --disable-stripping --extra-cflags=-m32


In the first two cases, adding --cpu=i686 to the configure line, which 
converts into -march=i686 on the compile commands, makes the issue go 
away. For clang-cl, --cpu=i686 doesn't have any effect, so the issue 
doesn't easily go away there.

Now if configuring with --cpu=i686 --disable-optimizations, the same 
issues are back again, in both libavcodec/h264_cabac.o and 
libavcodec/hevc_cabac.o.

// Martin
Carl Eugen Hoyos May 24, 2020, 8:16 p.m. UTC | #3
Am So., 24. Mai 2020 um 21:53 Uhr schrieb Martin Storsjö <martin@martin.st>:

> configure --enable-gpl --arch=i686 --cc=clang-cl --ld=lld-link
> --target-os=win32 --toolchain=msvc --enable-cross-compile --ar=llvm-ar
> --nm=llvm-nm --disable-stripping --extra-cflags=-m32

Why are you cross-compiling?
On which system are you testing this?

Carl Eugen
Martin Storsjö May 24, 2020, 9:23 p.m. UTC | #4
On Sun, 24 May 2020, Carl Eugen Hoyos wrote:

> Am So., 24. Mai 2020 um 21:53 Uhr schrieb Martin Storsjö <martin@martin.st>:
>
>> configure --enable-gpl --arch=i686 --cc=clang-cl --ld=lld-link
>> --target-os=win32 --toolchain=msvc --enable-cross-compile --ar=llvm-ar
>> --nm=llvm-nm --disable-stripping --extra-cflags=-m32
>
> Why are you cross-compiling?
> On which system are you testing this?

Because I normally only ever cross compile for windows.

There seems to be a subtle difference in this case, between using clang-cl 
and "clang -target i686-win32-msvc"; clang-cl passes "-mdisable-fp-elim" 
to the compiler internals, while "clang -target i686-win32-msvc" doesn't.

And cross compiling does seem to affect this particular case; there's a 
check for whether ebp is available, which requires running the built 
executable. When cross compiling, it's assumed the built executable is ok 
and it isn't test run.

So with clang-cl running on windows, you'll end up with "#define 
HAVE_EBP_AVAILABLE 0" in config.h, and the x86 cabac code won't end up 
used at all - i.e. this patch wouldn't make any difference.

With clang-cl in cross compilation, and other invocations of "clang" 
instead of "clang-cl", targeting i686 windows, you'll end up with 
ebp_available enabled.

For a case where it does make a difference, in the same setup, try 
configuring this way instead:

--cc='clang -target i686-win32-msvc' --ld=lld-link 
--extra-ldflags='msvcrt.lib oldnames.lib' --toolchain=msvc

In this case, the ebp_available test will succeed, it will proceed to 
trying to build the x86 inline cabac code, which fails.

// Martin
Martin Storsjö May 28, 2020, 10:37 a.m. UTC | #5
On Mon, 25 May 2020, Martin Storsjö wrote:

> On Sun, 24 May 2020, Carl Eugen Hoyos wrote:
>
>> Am So., 24. Mai 2020 um 21:53 Uhr schrieb Martin Storsjö 
> <martin@martin.st>:
>>
>>> configure --enable-gpl --arch=i686 --cc=clang-cl --ld=lld-link
>>> --target-os=win32 --toolchain=msvc --enable-cross-compile --ar=llvm-ar
>>> --nm=llvm-nm --disable-stripping --extra-cflags=-m32
>>
>> Why are you cross-compiling?
>> On which system are you testing this?
>
> Because I normally only ever cross compile for windows.
>
> There seems to be a subtle difference in this case, between using clang-cl 
> and "clang -target i686-win32-msvc"; clang-cl passes "-mdisable-fp-elim" 
> to the compiler internals, while "clang -target i686-win32-msvc" doesn't.
>
> And cross compiling does seem to affect this particular case; there's a 
> check for whether ebp is available, which requires running the built 
> executable. When cross compiling, it's assumed the built executable is ok 
> and it isn't test run.
>
> So with clang-cl running on windows, you'll end up with "#define 
> HAVE_EBP_AVAILABLE 0" in config.h, and the x86 cabac code won't end up 
> used at all - i.e. this patch wouldn't make any difference.
>
> With clang-cl in cross compilation, and other invocations of "clang" 
> instead of "clang-cl", targeting i686 windows, you'll end up with 
> ebp_available enabled.
>
> For a case where it does make a difference, in the same setup, try 
> configuring this way instead:
>
> --cc='clang -target i686-win32-msvc' --ld=lld-link 
> --extra-ldflags='msvcrt.lib oldnames.lib' --toolchain=msvc
>
> In this case, the ebp_available test will succeed, it will proceed to 
> trying to build the x86 inline cabac code, which fails.

Any further comments on this patch?

// Martin
Carl Eugen Hoyos May 29, 2020, 9:52 p.m. UTC | #6
> Am 28.05.2020 um 12:37 schrieb Martin Storsjö <martin@martin.st>:
> 
>> On Mon, 25 May 2020, Martin Storsjö wrote:
>> 
>>> On Sun, 24 May 2020, Carl Eugen Hoyos wrote:
>>> 
>>> Am So., 24. Mai 2020 um 21:53 Uhr schrieb Martin Storsjö 
>> <martin@martin.st>:
>>> 
>>>> configure --enable-gpl --arch=i686 --cc=clang-cl --ld=lld-link
>>>> --target-os=win32 --toolchain=msvc --enable-cross-compile --ar=llvm-ar
>>>> --nm=llvm-nm --disable-stripping --extra-cflags=-m32
>>> 
>>> Why are you cross-compiling?
>>> On which system are you testing this?
>> 
>> Because I normally only ever cross compile for windows.
>> 
>> There seems to be a subtle difference in this case, between using clang-cl and "clang -target i686-win32-msvc"; clang-cl passes "-mdisable-fp-elim" to the compiler internals, while "clang -target i686-win32-msvc" doesn't.
>> 
>> And cross compiling does seem to affect this particular case; there's a check for whether ebp is available, which requires running the built executable. When cross compiling, it's assumed the built executable is ok and it isn't test run.
>> 
>> So with clang-cl running on windows, you'll end up with "#define HAVE_EBP_AVAILABLE 0" in config.h, and the x86 cabac code won't end up used at all - i.e. this patch wouldn't make any difference.
>> 
>> With clang-cl in cross compilation, and other invocations of "clang" instead of "clang-cl", targeting i686 windows, you'll end up with ebp_available enabled.
>> 
>> For a case where it does make a difference, in the same setup, try configuring this way instead:
>> 
>> --cc='clang -target i686-win32-msvc' --ld=lld-link --extra-ldflags='msvcrt.lib oldnames.lib' --toolchain=msvc
>> 
>> In this case, the ebp_available test will succeed, it will proceed to trying to build the x86 inline cabac code, which fails.
> 
> Any further comments on this patch?

Please wait a few days more, I am not happy about the patch and at least I’d like to understand better why this is a good and why there is no alternative.

Carl Eugen
Martin Storsjö May 29, 2020, 10:26 p.m. UTC | #7
On Fri, 29 May 2020, Carl Eugen Hoyos wrote:

>> Am 28.05.2020 um 12:37 schrieb Martin Storsjö <martin@martin.st>:
>> 
>> Any further comments on this patch?
>
> Please wait a few days more, I am not happy about the patch and at least 
> I’d like to understand better why this is a good and why there is no 
> alternative.

Fair enough, take your time.

A different alternative would maybe be to just disable HAVE_7REGS for this 
configuration, that would leave a couple more inline functions that might 
work. (I haven't tested that configuration to see if it fails as easily.)

// Martin
diff mbox series

Patch

diff --git a/libavcodec/x86/cabac.h b/libavcodec/x86/cabac.h
index cfd3b759c9..5f1b97cec0 100644
--- a/libavcodec/x86/cabac.h
+++ b/libavcodec/x86/cabac.h
@@ -27,7 +27,7 @@ 
 #include "libavutil/x86/asm.h"
 #include "config.h"
 
-#if   (defined(__i386) && defined(__clang__) && (__clang_major__<2 || (__clang_major__==2 && __clang_minor__<10)))\
+#if   (defined(__i386) && defined(__clang__) && (__clang_major__<2 || (__clang_major__==2 && __clang_minor__<10)) || defined(_WIN32))\
    || (                  !defined(__clang__) && defined(__llvm__) && __GNUC__==4 && __GNUC_MINOR__==2 && __GNUC_PATCHLEVEL__<=1)\
    || (defined(__INTEL_COMPILER) && defined(_MSC_VER))
 #       define BROKEN_COMPILER 1