[FFmpeg-devel,v2] add a configure flag to enable tree-vectorization with gcc

Message ID fd2ce78bb11d457481cdcca122c1ad64@amazon.com
State New

Checks

Context | Check | Description
yinshiyou/commit_msg_loongarch64 | warning | The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
andriy/commit_msg_x86 | warning | The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ".
yinshiyou/make_loongarch64 | success | Make finished
yinshiyou/make_fate_loongarch64 | success | Make fate finished
andriy/make_fate_x86 | success | Make fate finished
andriy/make_x86 | warning | New warnings during build

Commit Message

Swinney, Jonathan Aug. 8, 2022, 3:25 p.m. UTC
Recent versions of gcc have improved automatic vectorization. This flag
allows adventurous users to enable it. The known problems are primarily
related to inline assembly on x86; to address those, add a pragma that
explicitly disables automatic vectorization for the affected files.

Signed-off-by: Jonathan Swinney <jswinney@amazon.com>

--

Thank you for considering this patch. I believe it addresses the primary
concerns raised about my previous submission. There may be more files
that need the pragma to apply `-fno-tree-vectorize`, and I welcome
suggestions. This is meant as a compromise: it lets some users enable
vectorization without breaking mainstream builds, and it gives time to
work out any additional problems before vectorization is enabled more
broadly.

---
 configure              | 7 ++++++-
 libavcodec/x86/cabac.h | 4 ++++
 2 files changed, 10 insertions(+), 1 deletion(-)

Comments

Lynne Aug. 8, 2022, 7:31 p.m. UTC | #1
Aug 8, 2022, 17:25 by jswinney@amazon.com:

> Recent versions of gcc have improved automatic vectorization. This flag
> allows adventurous users to enable it. The known problems are primarily
> related to inline assembly on x86; to address those, add a pragma that
> explicitly disables automatic vectorization for the affected files.
>
> Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
>
> --
>
> Thank you for considering this patch. I believe it addresses the primary
> concerns raised about my previous submission. There may be more files
> that need the pragma to apply `-fno-tree-vectorize`, and I welcome
> suggestions. This is meant as a compromise: it lets some users enable
> vectorization without breaking mainstream builds, and it gives time to
> work out any additional problems before vectorization is enabled more
> broadly.
>

I dislike this. Pretty soon we'll end up with compiler version checks
whenever vectorization breaks.
Either gcc should fix the miscompilation, or patches should be sent
to write the code in assembly (which they should be anyway).

Patch

diff --git a/configure b/configure
index cbbb4dd9c8..8e842da1b8 100755
--- a/configure
+++ b/configure
@@ -110,6 +110,7 @@  Configuration options:
   --disable-swscale-alpha  disable alpha channel support in swscale
   --disable-all            disable building components, libraries and programs
   --disable-autodetect     disable automatically detected external libraries [no]
+  --enable-auto-vectorization enable compiler auto vectorization
 
 Program options:
   --disable-programs       do not build command line programs
@@ -1945,6 +1946,7 @@  FEATURE_LIST="
     small
     static
     swscale_alpha
+    auto_vectorization
 "
 
 # this list should be kept in linking order
@@ -7176,7 +7178,9 @@  if enabled icc; then
             disable aligned_stack
     fi
 elif enabled gcc; then
-    check_optflags -fno-tree-vectorize
+    if disabled auto_vectorization; then
+        check_optflags -fno-tree-vectorize
+    fi
     check_cflags -Werror=format-security
     check_cflags -Werror=implicit-function-declaration
     check_cflags -Werror=missing-prototypes
@@ -7569,6 +7573,7 @@  echo "pod2man enabled           ${pod2man-no}"
 echo "makeinfo enabled          ${makeinfo-no}"
 echo "makeinfo supports HTML    ${makeinfo_html-no}"
 echo "xmllint enabled           ${xmllint-no}"
+echo "auto-vectorization        ${auto_vectorization-no}"
 test -n "$random_seed" &&
     echo "random seed               ${random_seed}"
 echo
diff --git a/libavcodec/x86/cabac.h b/libavcodec/x86/cabac.h
index b046a56a6b..782e4cbda4 100644
--- a/libavcodec/x86/cabac.h
+++ b/libavcodec/x86/cabac.h
@@ -39,6 +39,10 @@ 
 
 #if HAVE_INLINE_ASM
 
+#ifdef __GNUC__
+    __attribute__((optimize("-fno-tree-vectorize")))
+#endif
+
 #ifndef UNCHECKED_BITSTREAM_READER
 #define UNCHECKED_BITSTREAM_READER !CONFIG_SAFE_BITSTREAM_READER
 #endif
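
For reference, gcc can also scope this kind of opt-out to a region of a file
rather than to everything that follows. The snippet below is a minimal
standalone sketch of that mechanism and is not part of the patch. The function
names (sum_default, sum_no_vectorize, check_sums) are hypothetical, and the
extra !defined(__clang__) guard is an assumption added because clang also
defines __GNUC__ but does not implement these pragmas.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compiled with whatever the command line selects,
 * so gcc is free to auto-vectorize this loop when vectorization is on. */
static int32_t sum_default(const int32_t *v, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Save the current optimization options, then turn the tree vectorizer
 * off for every function defined until the matching pop_options. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize("no-tree-vectorize")
#endif

/* Hypothetical helper: same loop, but the vectorizer is disabled here. */
static int32_t sum_no_vectorize(const int32_t *v, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Restore the options that were in effect before push_options. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif

/* Both variants must compute the same result; only code generation differs. */
static void check_sums(void)
{
    static const int32_t data[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    assert(sum_default(data, 8) == 36);
    assert(sum_no_vectorize(data, 8) == 36);
}
```

The push/pop pair keeps the setting from leaking into code that includes the
header afterwards. Whether that narrower scope is preferable to a file-wide
opt-out depends on how much code around the inline assembly is affected.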