[FFmpeg-devel] avutil/mem: limit alignment to maximum simd align

Message ID 20240113005716.16018-1-timo@rothenpieler.org
State New
Series [FFmpeg-devel] avutil/mem: limit alignment to maximum simd align

Checks

Context                          Check    Description
yinshiyou/make_loongarch64       success  Make finished
yinshiyou/make_fate_loongarch64  success  Make fate finished
andriy/make_x86                  success  Make finished
andriy/make_fate_x86             success  Make fate finished

Commit Message

Timo Rothenpieler Jan. 13, 2024, 12:57 a.m. UTC
FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs,
which then end up heap-allocated.
Declaring any variable in a struct, or tree of structs, to be 32 byte
aligned allows the compiler to safely assume the entire struct
itself is also 32 byte aligned.

This might make the compiler emit code which straight up crashes or
misbehaves in other ways, and in at least one instance it is now
documented to actually do so (see ticket 10549 on trac).
The issue there is that an unrelated variable in SingleChannelElement is
declared to have an alignment of 32 bytes. So if the compiler does a copy
in decode_cpe() with avx instructions, but ffmpeg is built with
--disable-avx, this results in a crash, since the memory is only 16 byte
aligned.

Mind you, even if the compiler does not emit avx instructions, the code
is still invalid and could misbehave. It just happens not to. Declaring
any variable in a struct with a 32 byte alignment promises 32 byte
alignment of the whole struct to the compiler.

This patch limits the maximum alignment to the maximum possible simd
alignment according to configure.
While not perfect, it at the very least gets rid of a lot of UB, by
matching up the maximum DECLARE_ALIGNED value with the alignment of heap
allocations done by lavu.
---
 libavutil/mem.c          |  2 +-
 libavutil/mem_internal.h | 20 +++++++++++---------
 2 files changed, 12 insertions(+), 10 deletions(-)
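
The propagation described in the commit message can be sketched in plain
C11. This is only an illustration, not code from the patch: `Element` and
`address_is_valid_for_element` are hypothetical names, and `Element` merely
stands in for a struct such as SingleChannelElement that has one
over-aligned member.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas, alignof */
#include <stdint.h>   /* uintptr_t */

/* Hypothetical stand-in: only one member asks for 32 byte alignment. */
typedef struct Element {
    int unrelated;
    alignas(32) float coeffs[8];
} Element;

/* The member's requirement becomes the requirement of the whole struct,
 * so the compiler may assume every Element sits on a 32 byte boundary. */
static_assert(alignof(Element) == 32,
              "struct inherits the alignment of its strictest member");

/* Returns nonzero if an object placed at addr would satisfy the
 * alignment that the compiler is entitled to assume for Element. */
static int address_is_valid_for_element(uintptr_t addr)
{
    return addr % alignof(Element) == 0;
}
```

An address that is only 16 byte aligned, which is what a heap allocator
tuned for 16 byte alignment may hand out, fails this check even though no
code ever touches `coeffs` with AVX instructions.
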

Comments

Timo Rothenpieler Jan. 13, 2024, 1 a.m. UTC | #1
On 13.01.2024 01:57, Timo Rothenpieler wrote:
> diff --git a/libavutil/mem.c b/libavutil/mem.c
> index 36b8940a0c..62163b4cb3 100644
> --- a/libavutil/mem.c
> +++ b/libavutil/mem.c
> @@ -62,7 +62,7 @@ void  free(void *ptr);
>   
>   #endif /* MALLOC_PREFIX */
>   
> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16))
> +#define ALIGN (HAVE_SIMD_ALIGN_64 ? 64 : (HAVE_SIMD_ALIGN_32 ? 32 : 16))
>   
>   /* NOTE: if you want to override these functions with your own
>    * implementations (not recommended) you have to link libav* as
> diff --git a/libavutil/mem_internal.h b/libavutil/mem_internal.h
> index 2448c606f1..ddd3c24806 100644
> --- a/libavutil/mem_internal.h
> +++ b/libavutil/mem_internal.h
> @@ -75,22 +75,24 @@
>    * @param v Name of the variable
>    */
>   
> +#define MAX_ALIGNMENT (HAVE_SIMD_ALIGN_64 ? 64 : (HAVE_SIMD_ALIGN_32 ? 32 : 16))

The downside of this approach is that code which the compiler could
previously optimize heavily will now only be optimized that way if
ffmpeg is built without disabling avx.
That is probably a fair tradeoff though, since building with
--disable-avx is a very niche case.

I also did not test this with MSVC or ICC, so I have no idea if they 
allow FFMIN in the middle of an alignment attribute.
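
For reference, a minimal sketch of the clamp in the GCC/clang spelling.
The `HAVE_SIMD_ALIGN_*` values below are hard-coded stand-ins for what
configure would provide, and `MIN_CONST`, `DECLARE_ALIGNED_CLAMPED` and
`Ctx` are hypothetical names used only for this illustration.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignof */

/* Assumption: a build where configure reports no 32 or 64 byte SIMD
 * alignment, e.g. after --disable-avx. */
#define HAVE_SIMD_ALIGN_64 0
#define HAVE_SIMD_ALIGN_32 0
#define MAX_ALIGNMENT (HAVE_SIMD_ALIGN_64 ? 64 : (HAVE_SIMD_ALIGN_32 ? 32 : 16))

/* Hypothetical min macro; a ternary on constants is still an integer
 * constant expression, which the aligned attribute accepts here. */
#define MIN_CONST(a, b) ((a) > (b) ? (b) : (a))

/* GCC/clang attribute syntax only; other compilers are not covered. */
#define DECLARE_ALIGNED_CLAMPED(n, t, v) \
    t __attribute__((aligned(MIN_CONST(n, MAX_ALIGNMENT)))) v

typedef struct Ctx {
    int unrelated;
    DECLARE_ALIGNED_CLAMPED(32, float, buf)[8]; /* asks for 32, gets 16 */
} Ctx;

static_assert(alignof(Ctx) == 16, "requested 32 is clamped to MAX_ALIGNMENT");
```

With the stand-in values above the struct only demands 16 byte alignment,
which is the most a 16 byte aligned heap allocation can guarantee.
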

Timo Rothenpieler Jan. 13, 2024, 3:24 p.m. UTC | #2
On 13.01.2024 01:57, Timo Rothenpieler wrote:
> diff --git a/libavutil/mem.c b/libavutil/mem.c
> index 36b8940a0c..62163b4cb3 100644
> --- a/libavutil/mem.c
> +++ b/libavutil/mem.c
> @@ -62,7 +62,7 @@ void  free(void *ptr);
>   
>   #endif /* MALLOC_PREFIX */
>   
> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16))
> +#define ALIGN (HAVE_SIMD_ALIGN_64 ? 64 : (HAVE_SIMD_ALIGN_32 ? 32 : 16))
>   
>   /* NOTE: if you want to override these functions with your own
>    * implementations (not recommended) you have to link libav* as
> diff --git a/libavutil/mem_internal.h b/libavutil/mem_internal.h
> index 2448c606f1..ddd3c24806 100644
> --- a/libavutil/mem_internal.h
> +++ b/libavutil/mem_internal.h
> @@ -75,22 +75,24 @@
>    * @param v Name of the variable
>    */
>   
> +#define MAX_ALIGNMENT (HAVE_SIMD_ALIGN_64 ? 64 : (HAVE_SIMD_ALIGN_32 ? 32 : 16))
> +
>   #if defined(__INTEL_COMPILER) && __INTEL_COMPILER < 1110 || defined(__SUNPRO_C)
> -    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (n))) v
> -    #define DECLARE_ASM_ALIGNED(n,t,v)  t __attribute__ ((aligned (n))) v
> -    #define DECLARE_ASM_CONST(n,t,v)    const t __attribute__ ((aligned (n))) v
> +    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
> +    #define DECLARE_ASM_ALIGNED(n,t,v)  t __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
> +    #define DECLARE_ASM_CONST(n,t,v)    const t __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
>   #elif defined(__DJGPP__)
>       #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (FFMIN(n, 16)))) v
>       #define DECLARE_ASM_ALIGNED(n,t,v)  t av_used __attribute__ ((aligned (FFMIN(n, 16)))) v
>       #define DECLARE_ASM_CONST(n,t,v)    static const t av_used __attribute__ ((aligned (FFMIN(n, 16)))) v
>   #elif defined(__GNUC__) || defined(__clang__)
> -    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (n))) v
> -    #define DECLARE_ASM_ALIGNED(n,t,v)  t av_used __attribute__ ((aligned (n))) v
> -    #define DECLARE_ASM_CONST(n,t,v)    static const t av_used __attribute__ ((aligned (n))) v
> +    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
> +    #define DECLARE_ASM_ALIGNED(n,t,v)  t av_used __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
> +    #define DECLARE_ASM_CONST(n,t,v)    static const t av_used __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
>   #elif defined(_MSC_VER)
> -    #define DECLARE_ALIGNED(n,t,v)      __declspec(align(n)) t v
> -    #define DECLARE_ASM_ALIGNED(n,t,v)  __declspec(align(n)) t v
> -    #define DECLARE_ASM_CONST(n,t,v)    __declspec(align(n)) static const t v
> +    #define DECLARE_ALIGNED(n,t,v)      __declspec(align(FFMIN(n, MAX_ALIGNMENT))) t v
> +    #define DECLARE_ASM_ALIGNED(n,t,v)  __declspec(align(FFMIN(n, MAX_ALIGNMENT))) t v
> +    #define DECLARE_ASM_CONST(n,t,v)    __declspec(align(FFMIN(n, MAX_ALIGNMENT))) static const t v

Just checked, this does in fact not work with msvc:
libavfilter/af_arnndn.c(122): error C2059: syntax error: '('

So I guess for MSVC, the alignment will always have to be the full 32 or 64.
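
One possible way around a compiler that rejects an expression in the
alignment slot is to do the clamping in the preprocessor, so that the
attribute only ever sees a plain integer literal. This is not part of the
patch and is untested with MSVC; all `CLAMP_ALIGN*` names and `Buf` are
hypothetical, and the `HAVE_SIMD_ALIGN_*` values are stand-ins for what
configure would provide.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas, alignof */

/* Assumption: stand-ins for the values configure would provide. */
#define HAVE_SIMD_ALIGN_64 0
#define HAVE_SIMD_ALIGN_32 0

/* One literal per supported request; requests above the maximum are
 * mapped down to it. Only the values listed here can be requested. */
#define CLAMP_ALIGN_4  4
#define CLAMP_ALIGN_8  8
#define CLAMP_ALIGN_16 16
#if HAVE_SIMD_ALIGN_64
#define CLAMP_ALIGN_32 32
#define CLAMP_ALIGN_64 64
#elif HAVE_SIMD_ALIGN_32
#define CLAMP_ALIGN_32 32
#define CLAMP_ALIGN_64 32
#else
#define CLAMP_ALIGN_32 16
#define CLAMP_ALIGN_64 16
#endif

/* Token pasting picks the literal; n must itself be a literal token,
 * not another macro or an expression. */
#define CLAMP_ALIGN(n) CLAMP_ALIGN_##n

typedef struct Buf {
    alignas(CLAMP_ALIGN(32)) float data[8]; /* expands to alignas(16) */
} Buf;

static_assert(alignof(Buf) == 16, "clamped at preprocessing time");
```

The cost is that every alignment value used in the tree needs its own
`CLAMP_ALIGN_<n>` entry, and callers can no longer pass a computed value.
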

>   #else
>       #define DECLARE_ALIGNED(n,t,v)      t v
>       #define DECLARE_ASM_ALIGNED(n,t,v)  t v
Patch

diff --git a/libavutil/mem.c b/libavutil/mem.c
index 36b8940a0c..62163b4cb3 100644
--- a/libavutil/mem.c
+++ b/libavutil/mem.c
@@ -62,7 +62,7 @@  void  free(void *ptr);
 
 #endif /* MALLOC_PREFIX */
 
-#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16))
+#define ALIGN (HAVE_SIMD_ALIGN_64 ? 64 : (HAVE_SIMD_ALIGN_32 ? 32 : 16))
 
 /* NOTE: if you want to override these functions with your own
  * implementations (not recommended) you have to link libav* as
diff --git a/libavutil/mem_internal.h b/libavutil/mem_internal.h
index 2448c606f1..ddd3c24806 100644
--- a/libavutil/mem_internal.h
+++ b/libavutil/mem_internal.h
@@ -75,22 +75,24 @@ 
  * @param v Name of the variable
  */
 
+#define MAX_ALIGNMENT (HAVE_SIMD_ALIGN_64 ? 64 : (HAVE_SIMD_ALIGN_32 ? 32 : 16))
+
 #if defined(__INTEL_COMPILER) && __INTEL_COMPILER < 1110 || defined(__SUNPRO_C)
-    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (n))) v
-    #define DECLARE_ASM_ALIGNED(n,t,v)  t __attribute__ ((aligned (n))) v
-    #define DECLARE_ASM_CONST(n,t,v)    const t __attribute__ ((aligned (n))) v
+    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
+    #define DECLARE_ASM_ALIGNED(n,t,v)  t __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
+    #define DECLARE_ASM_CONST(n,t,v)    const t __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
 #elif defined(__DJGPP__)
     #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (FFMIN(n, 16)))) v
     #define DECLARE_ASM_ALIGNED(n,t,v)  t av_used __attribute__ ((aligned (FFMIN(n, 16)))) v
     #define DECLARE_ASM_CONST(n,t,v)    static const t av_used __attribute__ ((aligned (FFMIN(n, 16)))) v
 #elif defined(__GNUC__) || defined(__clang__)
-    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (n))) v
-    #define DECLARE_ASM_ALIGNED(n,t,v)  t av_used __attribute__ ((aligned (n))) v
-    #define DECLARE_ASM_CONST(n,t,v)    static const t av_used __attribute__ ((aligned (n))) v
+    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
+    #define DECLARE_ASM_ALIGNED(n,t,v)  t av_used __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
+    #define DECLARE_ASM_CONST(n,t,v)    static const t av_used __attribute__ ((aligned (FFMIN(n, MAX_ALIGNMENT)))) v
 #elif defined(_MSC_VER)
-    #define DECLARE_ALIGNED(n,t,v)      __declspec(align(n)) t v
-    #define DECLARE_ASM_ALIGNED(n,t,v)  __declspec(align(n)) t v
-    #define DECLARE_ASM_CONST(n,t,v)    __declspec(align(n)) static const t v
+    #define DECLARE_ALIGNED(n,t,v)      __declspec(align(FFMIN(n, MAX_ALIGNMENT))) t v
+    #define DECLARE_ASM_ALIGNED(n,t,v)  __declspec(align(FFMIN(n, MAX_ALIGNMENT))) t v
+    #define DECLARE_ASM_CONST(n,t,v)    __declspec(align(FFMIN(n, MAX_ALIGNMENT))) static const t v
 #else
     #define DECLARE_ALIGNED(n,t,v)      t v
     #define DECLARE_ASM_ALIGNED(n,t,v)  t v