Message ID | 20231203201027.2255-1-timo@rothenpieler.org |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel] avutil/mem: always align by at least 32 bytes | expand |
Context | Check | Description |
---|---|---|
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On 03/12/2023 21:10, Timo Rothenpieler wrote: > FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, > which then end up heap-allocated. > By declaring any variable in a struct, or tree of structs, to be 32 byte > aligned, it allows the compiler to safely assume the entire struct > itself is also 32 byte aligned. > > This might make the compiler emit code which straight up crashes or > misbehaves in other ways, and at least in one instances is now > documented to actually do (see ticket 10549 on trac). > The issue there is that an unrelated variable in SingleChannelElement is > declared to have an alignment of 32 bytes. So if the compiler does a copy > in decode_cpe() with avx instructions, but ffmpeg is built with > --disable-avx, this results in a crash, since the memory is only 16 byte > aligned. > Mind you, even if the compiler does not emit avx instructions, the code > is still invalid and could misbehave. It just happens not to. Declaring > any variable in a struct with a 32 byte alignment promises 32 byte > alignment of the whole struct to the compiler. > > Instead of now going through all instances of variables in structs > being declared as 32 byte aligned, this patch bumps the minimum alignment > to 32 bytes. > --- > libavutil/mem.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libavutil/mem.c b/libavutil/mem.c > index 36b8940a0c..26a9b9753b 100644 > --- a/libavutil/mem.c > +++ b/libavutil/mem.c > @@ -62,7 +62,7 @@ void free(void *ptr); > > #endif /* MALLOC_PREFIX */ > > -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) > +#define ALIGN (HAVE_AVX512 ? 64 : 32) > > /* NOTE: if you want to override these functions with your own > * implementations (not recommended) you have to link libav* as ping
On 12/3/2023 5:10 PM, Timo Rothenpieler wrote: > FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, > which then end up heap-allocated. > By declaring any variable in a struct, or tree of structs, to be 32 byte > aligned, it allows the compiler to safely assume the entire struct > itself is also 32 byte aligned. > > This might make the compiler emit code which straight up crashes or > misbehaves in other ways, and at least in one instances is now > documented to actually do (see ticket 10549 on trac). > The issue there is that an unrelated variable in SingleChannelElement is > declared to have an alignment of 32 bytes. So if the compiler does a copy > in decode_cpe() with avx instructions, but ffmpeg is built with > --disable-avx, this results in a crash, since the memory is only 16 byte > aligned. Wont we run into similar issues with avx512 eventually? > Mind you, even if the compiler does not emit avx instructions, the code > is still invalid and could misbehave. It just happens not to. Declaring > any variable in a struct with a 32 byte alignment promises 32 byte > alignment of the whole struct to the compiler. > > Instead of now going through all instances of variables in structs > being declared as 32 byte aligned, this patch bumps the minimum alignment > to 32 bytes. > --- > libavutil/mem.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libavutil/mem.c b/libavutil/mem.c > index 36b8940a0c..26a9b9753b 100644 > --- a/libavutil/mem.c > +++ b/libavutil/mem.c > @@ -62,7 +62,7 @@ void free(void *ptr); > > #endif /* MALLOC_PREFIX */ > > -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) > +#define ALIGN (HAVE_AVX512 ? 64 : 32) > > /* NOTE: if you want to override these functions with your own > * implementations (not recommended) you have to link libav* as
Hi, On Sun, Dec 3, 2023 at 3:10 PM Timo Rothenpieler <timo@rothenpieler.org> wrote: > So if the compiler does a copy > in decode_cpe() with avx instructions, but ffmpeg is built with > --disable-avx > Ehm... What? That seems like the core bug then? Ronald
On 12/6/2023 9:50 AM, Ronald S. Bultje wrote: > Hi, > > On Sun, Dec 3, 2023 at 3:10 PM Timo Rothenpieler <timo@rothenpieler.org> > wrote: > >> So if the compiler does a copy >> in decode_cpe() with avx instructions, but ffmpeg is built with >> --disable-avx >> > > Ehm... What? That seems like the core bug then? --disable-avx will disable any ffmpeg specific AVX dsp function that depends on HAVE_AVX_EXTERNAL and similar, but will not do anything to the -march argument you can pass the compiler with the --cpu configure option. configure --cpu=haswell --disable-avx is completely valid. > > Ronald > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
On 06/12/2023 13:31, James Almer wrote: > On 12/3/2023 5:10 PM, Timo Rothenpieler wrote: >> FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, >> which then end up heap-allocated. >> By declaring any variable in a struct, or tree of structs, to be 32 byte >> aligned, it allows the compiler to safely assume the entire struct >> itself is also 32 byte aligned. >> >> This might make the compiler emit code which straight up crashes or >> misbehaves in other ways, and at least in one instances is now >> documented to actually do (see ticket 10549 on trac). >> The issue there is that an unrelated variable in SingleChannelElement is >> declared to have an alignment of 32 bytes. So if the compiler does a copy >> in decode_cpe() with avx instructions, but ffmpeg is built with >> --disable-avx, this results in a crash, since the memory is only 16 byte >> aligned. > > Wont we run into similar issues with avx512 eventually? It's only indirectly related to AVX. The core issue is that we have structs with elements that declare an alignment of 32 bytes all over the codebase. I checked all instances, and we do not have any struct members that declare a higher alignment requirement than 32. (There's two instances of 64 byte alignment, but those are not on struct members, but on stack variables.) This promises the compiler that the memory of the whole struct is aligned accordingly. So no matter if it breaks because of AVX or something else, the compiler could generate broken code if we heap allocate those structs with too small of an alignment. It could generate other, non-AVX code, that depends on that alignment. So we will only run into this again if someone decides to add a struct member with bigger alignment to a heap allocated struct somewhere. >> Mind you, even if the compiler does not emit avx instructions, the code >> is still invalid and could misbehave. It just happens not to. Declaring >> any variable in a struct with a 32 byte alignment promises 32 byte >> alignment of the whole struct to the compiler. >> >> Instead of now going through all instances of variables in structs >> being declared as 32 byte aligned, this patch bumps the minimum alignment >> to 32 bytes. >> --- >> libavutil/mem.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/libavutil/mem.c b/libavutil/mem.c >> index 36b8940a0c..26a9b9753b 100644 >> --- a/libavutil/mem.c >> +++ b/libavutil/mem.c >> @@ -62,7 +62,7 @@ void free(void *ptr); >> #endif /* MALLOC_PREFIX */ >> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) >> +#define ALIGN (HAVE_AVX512 ? 64 : 32) >> /* NOTE: if you want to override these functions with your own >> * implementations (not recommended) you have to link libav* as > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
On Sun, 3 Dec 2023, Timo Rothenpieler wrote: > FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, > which then end up heap-allocated. > By declaring any variable in a struct, or tree of structs, to be 32 byte > aligned, it allows the compiler to safely assume the entire struct > itself is also 32 byte aligned. > > This might make the compiler emit code which straight up crashes or > misbehaves in other ways, and at least in one instances is now > documented to actually do (see ticket 10549 on trac). > The issue there is that an unrelated variable in SingleChannelElement is > declared to have an alignment of 32 bytes. So if the compiler does a copy > in decode_cpe() with avx instructions, but ffmpeg is built with > --disable-avx, this results in a crash, since the memory is only 16 byte > aligned. > Mind you, even if the compiler does not emit avx instructions, the code > is still invalid and could misbehave. It just happens not to. Declaring > any variable in a struct with a 32 byte alignment promises 32 byte > alignment of the whole struct to the compiler. > > Instead of now going through all instances of variables in structs > being declared as 32 byte aligned, this patch bumps the minimum alignment > to 32 bytes. > --- > libavutil/mem.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libavutil/mem.c b/libavutil/mem.c > index 36b8940a0c..26a9b9753b 100644 > --- a/libavutil/mem.c > +++ b/libavutil/mem.c > @@ -62,7 +62,7 @@ void free(void *ptr); > > #endif /* MALLOC_PREFIX */ > > -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) > +#define ALIGN (HAVE_AVX512 ? 64 : 32) LGTM It could be good to add a comment here, to indicate how this value relates to the alignemnts used in structs. For others who commented in this thread, it all boils down to something like this: struct MyData { uint8_t __attribute__((aligned(32))) aligned_data[1024]; }; void func(void) { struct MyData *obj = malloc(sizeof(*obj)); // Uusally only aligned to 8 bytes // operate on obj->aligned_data[] } Due to how aligned_data is declared, we promise to the compiler that it is aligned to 32 bytes, and that the compiler can assume this wherever. Depending on -march or whatever, this can be to access it with instructions that assume 32 byte alignment. // Martin
On 06/12/2023 14:25, Martin Storsjö wrote: > On Sun, 3 Dec 2023, Timo Rothenpieler wrote: > >> FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, >> which then end up heap-allocated. >> By declaring any variable in a struct, or tree of structs, to be 32 byte >> aligned, it allows the compiler to safely assume the entire struct >> itself is also 32 byte aligned. >> >> This might make the compiler emit code which straight up crashes or >> misbehaves in other ways, and at least in one instances is now >> documented to actually do (see ticket 10549 on trac). >> The issue there is that an unrelated variable in SingleChannelElement is >> declared to have an alignment of 32 bytes. So if the compiler does a copy >> in decode_cpe() with avx instructions, but ffmpeg is built with >> --disable-avx, this results in a crash, since the memory is only 16 byte >> aligned. >> Mind you, even if the compiler does not emit avx instructions, the code >> is still invalid and could misbehave. It just happens not to. Declaring >> any variable in a struct with a 32 byte alignment promises 32 byte >> alignment of the whole struct to the compiler. >> >> Instead of now going through all instances of variables in structs >> being declared as 32 byte aligned, this patch bumps the minimum alignment >> to 32 bytes. >> --- >> libavutil/mem.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/libavutil/mem.c b/libavutil/mem.c >> index 36b8940a0c..26a9b9753b 100644 >> --- a/libavutil/mem.c >> +++ b/libavutil/mem.c >> @@ -62,7 +62,7 @@ void free(void *ptr); >> >> #endif /* MALLOC_PREFIX */ >> >> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) >> +#define ALIGN (HAVE_AVX512 ? 64 : 32) > > LGTM > > It could be good to add a comment here, to indicate how this value > relates to the alignemnts used in structs. > > For others who commented in this thread, it all boils down to something > like this: > > struct MyData { > uint8_t __attribute__((aligned(32))) aligned_data[1024]; > }; It's even a bit more complex than that. The case that's crashing right now is a member that has no alignment declared on itself at all. But another member of the same struct does, and so the compiler assumes the whole struct to be aligned. > void func(void) { > struct MyData *obj = malloc(sizeof(*obj)); // Uusally only aligned > to 8 bytes > // operate on obj->aligned_data[] > } > > Due to how aligned_data is declared, we promise to the compiler that it > is aligned to 32 bytes, and that the compiler can assume this wherever. > Depending on -march or whatever, this can be to access it with > instructions that assume 32 byte alignment. > > // Martin > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
On Wed, 6 Dec 2023, Timo Rothenpieler wrote: > > > On 06/12/2023 14:25, Martin Storsjö wrote: >> On Sun, 3 Dec 2023, Timo Rothenpieler wrote: >> >>> FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, >>> which then end up heap-allocated. >>> By declaring any variable in a struct, or tree of structs, to be 32 byte >>> aligned, it allows the compiler to safely assume the entire struct >>> itself is also 32 byte aligned. >>> >>> This might make the compiler emit code which straight up crashes or >>> misbehaves in other ways, and at least in one instances is now >>> documented to actually do (see ticket 10549 on trac). >>> The issue there is that an unrelated variable in SingleChannelElement is >>> declared to have an alignment of 32 bytes. So if the compiler does a copy >>> in decode_cpe() with avx instructions, but ffmpeg is built with >>> --disable-avx, this results in a crash, since the memory is only 16 byte >>> aligned. >>> Mind you, even if the compiler does not emit avx instructions, the code >>> is still invalid and could misbehave. It just happens not to. Declaring >>> any variable in a struct with a 32 byte alignment promises 32 byte >>> alignment of the whole struct to the compiler. >>> >>> Instead of now going through all instances of variables in structs >>> being declared as 32 byte aligned, this patch bumps the minimum alignment >>> to 32 bytes. >>> --- >>> libavutil/mem.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/libavutil/mem.c b/libavutil/mem.c >>> index 36b8940a0c..26a9b9753b 100644 >>> --- a/libavutil/mem.c >>> +++ b/libavutil/mem.c >>> @@ -62,7 +62,7 @@ void free(void *ptr); >>> >>> #endif /* MALLOC_PREFIX */ >>> >>> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) >>> +#define ALIGN (HAVE_AVX512 ? 64 : 32) >> >> LGTM >> >> It could be good to add a comment here, to indicate how this value >> relates to the alignemnts used in structs. >> >> For others who commented in this thread, it all boils down to something >> like this: >> >> struct MyData { >> uint8_t __attribute__((aligned(32))) aligned_data[1024]; >> }; > > It's even a bit more complex than that. > The case that's crashing right now is a member that has no alignment > declared on itself at all. > But another member of the same struct does, and so the compiler assumes > the whole struct to be aligned. Ah, tricky! Yeah, that's also a valid assumption for the compiler, but also a rather non-obvious one. // Martin
On 06.12.2023 14:25, Martin Storsjö wrote: > On Sun, 3 Dec 2023, Timo Rothenpieler wrote: > >> FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, >> which then end up heap-allocated. >> By declaring any variable in a struct, or tree of structs, to be 32 byte >> aligned, it allows the compiler to safely assume the entire struct >> itself is also 32 byte aligned. >> >> This might make the compiler emit code which straight up crashes or >> misbehaves in other ways, and at least in one instances is now >> documented to actually do (see ticket 10549 on trac). >> The issue there is that an unrelated variable in SingleChannelElement is >> declared to have an alignment of 32 bytes. So if the compiler does a copy >> in decode_cpe() with avx instructions, but ffmpeg is built with >> --disable-avx, this results in a crash, since the memory is only 16 byte >> aligned. >> Mind you, even if the compiler does not emit avx instructions, the code >> is still invalid and could misbehave. It just happens not to. Declaring >> any variable in a struct with a 32 byte alignment promises 32 byte >> alignment of the whole struct to the compiler. >> >> Instead of now going through all instances of variables in structs >> being declared as 32 byte aligned, this patch bumps the minimum alignment >> to 32 bytes. >> --- >> libavutil/mem.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/libavutil/mem.c b/libavutil/mem.c >> index 36b8940a0c..26a9b9753b 100644 >> --- a/libavutil/mem.c >> +++ b/libavutil/mem.c >> @@ -62,7 +62,7 @@ void free(void *ptr); >> >> #endif /* MALLOC_PREFIX */ >> >> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) >> +#define ALIGN (HAVE_AVX512 ? 64 : 32) > > LGTM > > It could be good to add a comment here, to indicate how this value > relates to the alignemnts used in structs. Thinking about this, I don't think a comment _here_ is a good idea. I'd be more inclined to add it to the DECLARE_ALIGNED macro. People defining a new aligned member variable are at least a little more likely to read that, compared to something next to this random define that's not even in a header. Will add that and push soon. I'll also check how far this will need backported. Likely to almost all versions ever. > For others who commented in this thread, it all boils down to something > like this: > > struct MyData { > uint8_t __attribute__((aligned(32))) aligned_data[1024]; > }; > > void func(void) { > struct MyData *obj = malloc(sizeof(*obj)); // Uusally only aligned > to 8 bytes > // operate on obj->aligned_data[] > } > > Due to how aligned_data is declared, we promise to the compiler that it > is aligned to 32 bytes, and that the compiler can assume this wherever. > Depending on -march or whatever, this can be to access it with > instructions that assume 32 byte alignment. > > // Martin > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
On Fri, 8 Dec 2023, Timo Rothenpieler wrote: > On 06.12.2023 14:25, Martin Storsjö wrote: >> On Sun, 3 Dec 2023, Timo Rothenpieler wrote: >> >>> FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, >>> which then end up heap-allocated. >>> By declaring any variable in a struct, or tree of structs, to be 32 > byte >>> aligned, it allows the compiler to safely assume the entire struct >>> itself is also 32 byte aligned. >>> >>> This might make the compiler emit code which straight up crashes or >>> misbehaves in other ways, and at least in one instances is now >>> documented to actually do (see ticket 10549 on trac). >>> The issue there is that an unrelated variable in SingleChannelElement > is >>> declared to have an alignment of 32 bytes. So if the compiler does a > copy >>> in decode_cpe() with avx instructions, but ffmpeg is built with >>> --disable-avx, this results in a crash, since the memory is only 16 > byte >>> aligned. >>> Mind you, even if the compiler does not emit avx instructions, the > code >>> is still invalid and could misbehave. It just happens not to. > Declaring >>> any variable in a struct with a 32 byte alignment promises 32 byte >>> alignment of the whole struct to the compiler. >>> >>> Instead of now going through all instances of variables in structs >>> being declared as 32 byte aligned, this patch bumps the minimum > alignment >>> to 32 bytes. >>> --- >>> libavutil/mem.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/libavutil/mem.c b/libavutil/mem.c >>> index 36b8940a0c..26a9b9753b 100644 >>> --- a/libavutil/mem.c >>> +++ b/libavutil/mem.c >>> @@ -62,7 +62,7 @@ void free(void *ptr); >>> >>> #endif /* MALLOC_PREFIX */ >>> >>> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) >>> +#define ALIGN (HAVE_AVX512 ? 64 : 32) >> >> LGTM >> >> It could be good to add a comment here, to indicate how this value >> relates to the alignemnts used in structs. > > Thinking about this, I don't think a comment _here_ is a good idea. > I'd be more inclined to add it to the DECLARE_ALIGNED macro. > People defining a new aligned member variable are at least a little more > likely to read that, compared to something next to this random define > that's not even in a header. I still think it'd be good to have a comment here, too, to explain this code to whoever is looking at it (why 32 or 64, why not 16/32/64 etc). I agree that it can be good to mention it near DECLARE_ALIGNED as well, but these comments serve different readers. // Martin
Timo Rothenpieler: > FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, > which then end up heap-allocated. > By declaring any variable in a struct, or tree of structs, to be 32 byte > aligned, it allows the compiler to safely assume the entire struct > itself is also 32 byte aligned. > > This might make the compiler emit code which straight up crashes or > misbehaves in other ways, and at least in one instances is now > documented to actually do (see ticket 10549 on trac). > The issue there is that an unrelated variable in SingleChannelElement is > declared to have an alignment of 32 bytes. So if the compiler does a copy > in decode_cpe() with avx instructions, but ffmpeg is built with > --disable-avx, this results in a crash, since the memory is only 16 byte > aligned. > Mind you, even if the compiler does not emit avx instructions, the code > is still invalid and could misbehave. It just happens not to. Declaring > any variable in a struct with a 32 byte alignment promises 32 byte > alignment of the whole struct to the compiler. > > Instead of now going through all instances of variables in structs > being declared as 32 byte aligned, this patch bumps the minimum alignment > to 32 bytes. > --- > libavutil/mem.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libavutil/mem.c b/libavutil/mem.c > index 36b8940a0c..26a9b9753b 100644 > --- a/libavutil/mem.c > +++ b/libavutil/mem.c > @@ -62,7 +62,7 @@ void free(void *ptr); > > #endif /* MALLOC_PREFIX */ > > -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) > +#define ALIGN (HAVE_AVX512 ? 64 : 32) > > /* NOTE: if you want to override these functions with your own > * implementations (not recommended) you have to link libav* as 1. There is another way in which this can be triggered: Namely if one uses a build with AVX, but combines it with a lavu built without it; it is also triggerable on non-x86 (having an insufficiently aligned pointer is always UB even if the CPU does not have instructions that would benefit from the additional alignment). You should mention this in the commit message. 2. This topic gave me headaches when creating RefStruct. I "solved" it by (ab)using STRIDE_ALIGN which mimicks the alignment of av_malloc(), thereby ensuring that RefStruct does not break lavc builds built with the avx dsp functions enabled (but it does not guard against using a lavu whose av_malloc() only provides less alignment). 3. There is a downside to your patch: It bumps alignment for non-x86 arches which wastes memory (and may make allocators slower). We could fix this by modifying the 32-byte-alignment macros to only provide 16 byte alignment if the ARCH_ (and potentially the HAVE_) defines indicate that no alignment bigger than 16 is needed. - Andreas
On 08.12.2023 11:01, Andreas Rheinhardt wrote: > Timo Rothenpieler: >> FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, >> which then end up heap-allocated. >> By declaring any variable in a struct, or tree of structs, to be 32 byte >> aligned, it allows the compiler to safely assume the entire struct >> itself is also 32 byte aligned. >> >> This might make the compiler emit code which straight up crashes or >> misbehaves in other ways, and at least in one instances is now >> documented to actually do (see ticket 10549 on trac). >> The issue there is that an unrelated variable in SingleChannelElement is >> declared to have an alignment of 32 bytes. So if the compiler does a copy >> in decode_cpe() with avx instructions, but ffmpeg is built with >> --disable-avx, this results in a crash, since the memory is only 16 byte >> aligned. >> Mind you, even if the compiler does not emit avx instructions, the code >> is still invalid and could misbehave. It just happens not to. Declaring >> any variable in a struct with a 32 byte alignment promises 32 byte >> alignment of the whole struct to the compiler. >> >> Instead of now going through all instances of variables in structs >> being declared as 32 byte aligned, this patch bumps the minimum alignment >> to 32 bytes. >> --- >> libavutil/mem.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/libavutil/mem.c b/libavutil/mem.c >> index 36b8940a0c..26a9b9753b 100644 >> --- a/libavutil/mem.c >> +++ b/libavutil/mem.c >> @@ -62,7 +62,7 @@ void free(void *ptr); >> >> #endif /* MALLOC_PREFIX */ >> >> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) >> +#define ALIGN (HAVE_AVX512 ? 64 : 32) >> >> /* NOTE: if you want to override these functions with your own >> * implementations (not recommended) you have to link libav* as > > 1. There is another way in which this can be triggered: Namely if one > uses a build with AVX, but combines it with a lavu built without it; it > is also triggerable on non-x86 (having an insufficiently aligned pointer > is always UB even if the CPU does not have instructions that would > benefit from the additional alignment). You should mention this in the > commit message. Is mixing the libraries really a scenario we need to care about/support? And yeah, this is only marginally related to AVX, in that it's what triggers a crash in this scenario. But as stated in the commit message, it's not a valid thing to do in any case, on any arch. It just happens not to crash. > 2. This topic gave me headaches when creating RefStruct. I "solved" it > by (ab)using STRIDE_ALIGN which mimicks the alignment of av_malloc(), > thereby ensuring that RefStruct does not break lavc builds built with > the avx dsp functions enabled (but it does not guard against using a > lavu whose av_malloc() only provides less alignment). > > 3. There is a downside to your patch: It bumps alignment for non-x86 > arches which wastes memory (and may make allocators slower). We could > fix this by modifying the 32-byte-alignment macros to only provide 16 > byte alignment if the ARCH_ (and potentially the HAVE_) defines indicate > that no alignment bigger than 16 is needed. But it's invalid on any other arch as well, just hasn't bitten us yet. I'm not sure if I'd want to start maintaining a growingly complex list of CPU extensions and make the DECLARE_ALIGNED macro lie if we think it doesn't need 32 byte alignment. We don't really know why someone wants a variable aligned after all. > - Andreas > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Timo Rothenpieler (12023-12-08):
> Is mixing the libraries really a scenario we need to care about/support?
No. We should merge all the libraries into a single libffmpeg.so. Having
separate libraries brings us no end of hassle and drawbacks, starting
with all the avpriv symbols and backward compatibility layers, and the
benefits it brings could be reached in simpler and more efficient ways.
But anytime I brought it up, the same naysayers would object, but when I
ask what precise benefit they think the current situation brings, with
the intent of explaining how it can be done better differently (I do not
bring that half-backed, I have thought about it beforehand) or in some
case explaining that no, this is not a benefit because linking does not
work like that. And then the naysayers would whine that I am making too
much a fuss.
Barring merging all libraries into a single libffmpeg.so, have configure
compute AV_LIBRARY_SIGNATURE as a 64 bits hash of the version and
configuration, then have in version.h of each library:
#define avsmth_version_check_signature() \
avsmth_version_check_signature_ ## AV_LIBRARY_SIGNATURE()
then have avsmth_version_check_signature() in each library call the ones
in the library it depends on, and core functions like *register_all()
call it. Then if the libraries are mixed at run time it will produce an
error message by the linker that can be searched on the web.
Regards,
Timo Rothenpieler: > On 08.12.2023 11:01, Andreas Rheinhardt wrote: >> Timo Rothenpieler: >>> FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, >>> which then end up heap-allocated. >>> By declaring any variable in a struct, or tree of structs, to be 32 byte >>> aligned, it allows the compiler to safely assume the entire struct >>> itself is also 32 byte aligned. >>> >>> This might make the compiler emit code which straight up crashes or >>> misbehaves in other ways, and at least in one instances is now >>> documented to actually do (see ticket 10549 on trac). >>> The issue there is that an unrelated variable in SingleChannelElement is >>> declared to have an alignment of 32 bytes. So if the compiler does a >>> copy >>> in decode_cpe() with avx instructions, but ffmpeg is built with >>> --disable-avx, this results in a crash, since the memory is only 16 byte >>> aligned. >>> Mind you, even if the compiler does not emit avx instructions, the code >>> is still invalid and could misbehave. It just happens not to. Declaring >>> any variable in a struct with a 32 byte alignment promises 32 byte >>> alignment of the whole struct to the compiler. >>> >>> Instead of now going through all instances of variables in structs >>> being declared as 32 byte aligned, this patch bumps the minimum >>> alignment >>> to 32 bytes. >>> --- >>> libavutil/mem.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/libavutil/mem.c b/libavutil/mem.c >>> index 36b8940a0c..26a9b9753b 100644 >>> --- a/libavutil/mem.c >>> +++ b/libavutil/mem.c >>> @@ -62,7 +62,7 @@ void free(void *ptr); >>> #endif /* MALLOC_PREFIX */ >>> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) >>> +#define ALIGN (HAVE_AVX512 ? 64 : 32) >>> /* NOTE: if you want to override these functions with your own >>> * implementations (not recommended) you have to link libav* as >> >> 1. There is another way in which this can be triggered: Namely if one >> uses a build with AVX, but combines it with a lavu built without it; it >> is also triggerable on non-x86 (having an insufficiently aligned pointer >> is always UB even if the CPU does not have instructions that would >> benefit from the additional alignment). You should mention this in the >> commit message. > > Is mixing the libraries really a scenario we need to care about/support? > IMO, no, but Anton cares about it a lot. > And yeah, this is only marginally related to AVX, in that it's what > triggers a crash in this scenario. > But as stated in the commit message, it's not a valid thing to do in any > case, on any arch. It just happens not to crash. > I know, but this patch also happens to fix this (at least to some degree), so this should be mentioned in the commit message. >> 2. This topic gave me headaches when creating RefStruct. I "solved" it >> by (ab)using STRIDE_ALIGN which mimicks the alignment of av_malloc(), >> thereby ensuring that RefStruct does not break lavc builds built with >> the avx dsp functions enabled (but it does not guard against using a >> lavu whose av_malloc() only provides less alignment). >> >> 3. There is a downside to your patch: It bumps alignment for non-x86 >> arches which wastes memory (and may make allocators slower). We could >> fix this by modifying the 32-byte-alignment macros to only provide 16 >> byte alignment if the ARCH_ (and potentially the HAVE_) defines indicate >> that no alignment bigger than 16 is needed. > > But it's invalid on any other arch as well, just hasn't bitten us yet. It is not invalid if we modify DECLARE_ALIGNED to never use more alignment than 16 on non-x86 arches. Then all the other arches can continue to use 16. > I'm not sure if I'd want to start maintaining a growingly complex list > of CPU extensions and make the DECLARE_ALIGNED macro lie if we think it > doesn't need 32 byte alignment. > We don't really know why someone wants a variable aligned after all. I am fine with that point. Although I don't think it would be that complicated if it is done at one point (namely in configure) and if all the other places would just use a macro for max alignment (that would be placed in config.h). - Andreas
On 09.12.2023 06:23, Andreas Rheinhardt wrote: > Timo Rothenpieler: >> On 08.12.2023 11:01, Andreas Rheinhardt wrote: >>> Timo Rothenpieler: >>>> FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs, >>>> which then end up heap-allocated. >>>> By declaring any variable in a struct, or tree of structs, to be 32 byte >>>> aligned, it allows the compiler to safely assume the entire struct >>>> itself is also 32 byte aligned. >>>> >>>> This might make the compiler emit code which straight up crashes or >>>> misbehaves in other ways, and at least in one instances is now >>>> documented to actually do (see ticket 10549 on trac). >>>> The issue there is that an unrelated variable in SingleChannelElement is >>>> declared to have an alignment of 32 bytes. So if the compiler does a >>>> copy >>>> in decode_cpe() with avx instructions, but ffmpeg is built with >>>> --disable-avx, this results in a crash, since the memory is only 16 byte >>>> aligned. >>>> Mind you, even if the compiler does not emit avx instructions, the code >>>> is still invalid and could misbehave. It just happens not to. Declaring >>>> any variable in a struct with a 32 byte alignment promises 32 byte >>>> alignment of the whole struct to the compiler. >>>> >>>> Instead of now going through all instances of variables in structs >>>> being declared as 32 byte aligned, this patch bumps the minimum >>>> alignment >>>> to 32 bytes. >>>> --- >>>> libavutil/mem.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/libavutil/mem.c b/libavutil/mem.c >>>> index 36b8940a0c..26a9b9753b 100644 >>>> --- a/libavutil/mem.c >>>> +++ b/libavutil/mem.c >>>> @@ -62,7 +62,7 @@ void free(void *ptr); >>>> #endif /* MALLOC_PREFIX */ >>>> -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) >>>> +#define ALIGN (HAVE_AVX512 ? 64 : 32) >>>> /* NOTE: if you want to override these functions with your own >>>> * implementations (not recommended) you have to link libav* as >>> >>> 1. There is another way in which this can be triggered: Namely if one >>> uses a build with AVX, but combines it with a lavu built without it; it >>> is also triggerable on non-x86 (having an insufficiently aligned pointer >>> is always UB even if the CPU does not have instructions that would >>> benefit from the additional alignment). You should mention this in the >>> commit message. >> >> Is mixing the libraries really a scenario we need to care about/support? >> > > IMO, no, but Anton cares about it a lot. > >> And yeah, this is only marginally related to AVX, in that it's what >> triggers a crash in this scenario. >> But as stated in the commit message, it's not a valid thing to do in any >> case, on any arch. It just happens not to crash. >> > > I know, but this patch also happens to fix this (at least to some > degree), so this should be mentioned in the commit message. > >>> 2. This topic gave me headaches when creating RefStruct. I "solved" it >>> by (ab)using STRIDE_ALIGN which mimicks the alignment of av_malloc(), >>> thereby ensuring that RefStruct does not break lavc builds built with >>> the avx dsp functions enabled (but it does not guard against using a >>> lavu whose av_malloc() only provides less alignment). >>> >>> 3. There is a downside to your patch: It bumps alignment for non-x86 >>> arches which wastes memory (and may make allocators slower). We could >>> fix this by modifying the 32-byte-alignment macros to only provide 16 >>> byte alignment if the ARCH_ (and potentially the HAVE_) defines indicate >>> that no alignment bigger than 16 is needed. >> >> But it's invalid on any other arch as well, just hasn't bitten us yet. > > It is not invalid if we modify DECLARE_ALIGNED to never use more > alignment than 16 on non-x86 arches. Then all the other arches can > continue to use 16. > >> I'm not sure if I'd want to start maintaining a growingly complex list >> of CPU extensions and make the DECLARE_ALIGNED macro lie if we think it >> doesn't need 32 byte alignment. >> We don't really know why someone wants a variable aligned after all. > > I am fine with that point. Although I don't think it would be that > complicated if it is done at one point (namely in configure) and if all > the other places would just use a macro for max alignment (that would be > placed in config.h). ping about this. I'm still not sure about the correct way forward here. Aside from the complexity of figuring out the reasonable max align, I'm also a not sure on how to modify the macro to make use of it at all. You can't put any kind of MIN/MAX construct into the body of the alignment macro after all.
diff --git a/libavutil/mem.c b/libavutil/mem.c index 36b8940a0c..26a9b9753b 100644 --- a/libavutil/mem.c +++ b/libavutil/mem.c @@ -62,7 +62,7 @@ void free(void *ptr); #endif /* MALLOC_PREFIX */ -#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16)) +#define ALIGN (HAVE_AVX512 ? 64 : 32) /* NOTE: if you want to override these functions with your own * implementations (not recommended) you have to link libav* as