Message ID | 20171109115837.32618-6-jdarnley@obe.tv |
---|---|
State | Accepted |
Commit | 8f86e6623811f7713d5e72c13797e20fffb3df62 |
Headers | show |
On 11/9/2017 8:58 AM, James Darnley wrote: > --- > configure | 2 ++ > libavcodec/internal.h | 4 +++- > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/configure b/configure > index 146a87324c..fce8030d91 100755 > --- a/configure > +++ b/configure > @@ -1886,6 +1886,7 @@ ARCH_FEATURES=" > local_aligned > simd_align_16 > simd_align_32 > + simd_align_64 > " > > BUILTIN_LIST=" > @@ -2385,6 +2386,7 @@ fast_clz_if_any="aarch64 alpha avr32 mips ppc x86" > fast_unaligned_if_any="aarch64 ppc x86" > simd_align_16_if_any="altivec neon sse" > simd_align_32_if_any="avx" > +simd_align_64_if_any="avx512" > > # system capabilities > symver_if_any="symver_asm_label symver_gnu_asm" > diff --git a/libavcodec/internal.h b/libavcodec/internal.h > index 7748f09f54..84070431ed 100644 > --- a/libavcodec/internal.h > +++ b/libavcodec/internal.h > @@ -87,7 +87,9 @@ > > #define FF_SIGNBIT(x) ((x) >> CHAR_BIT * sizeof(x) - 1) > > -#if HAVE_SIMD_ALIGN_32 > +#if HAVE_SIMD_ALIGN_64 > +# define STRIDE_ALIGN 64 /* AVX-512 */ > +#elif HAVE_SIMD_ALIGN_32 > # define STRIDE_ALIGN 32 > #elif HAVE_SIMD_ALIGN_16 > # define STRIDE_ALIGN 16 > LGTM, but I'd really like to find a way to start using av_get_cpu_max_align() as it was meant to be done when it was introduced, instead of keeping hardcoding alignment based on configure time options. Maybe adding a line to the av_force_cpu_flags() and av_set_cpu_flags_mask() doxy stating you should reinit all your decoder and encoder contexts after calling them or similar.
On 2017-11-10 02:38, James Almer wrote: > On 11/9/2017 8:58 AM, James Darnley wrote: >> --- >> configure | 2 ++ >> libavcodec/internal.h | 4 +++- >> 2 files changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/configure b/configure >> index 146a87324c..fce8030d91 100755 >> --- a/configure >> +++ b/configure >> @@ -1886,6 +1886,7 @@ ARCH_FEATURES=" >> local_aligned >> simd_align_16 >> simd_align_32 >> + simd_align_64 >> " >> >> BUILTIN_LIST=" >> @@ -2385,6 +2386,7 @@ fast_clz_if_any="aarch64 alpha avr32 mips ppc x86" >> fast_unaligned_if_any="aarch64 ppc x86" >> simd_align_16_if_any="altivec neon sse" >> simd_align_32_if_any="avx" >> +simd_align_64_if_any="avx512" >> >> # system capabilities >> symver_if_any="symver_asm_label symver_gnu_asm" >> diff --git a/libavcodec/internal.h b/libavcodec/internal.h >> index 7748f09f54..84070431ed 100644 >> --- a/libavcodec/internal.h >> +++ b/libavcodec/internal.h >> @@ -87,7 +87,9 @@ >> >> #define FF_SIGNBIT(x) ((x) >> CHAR_BIT * sizeof(x) - 1) >> >> -#if HAVE_SIMD_ALIGN_32 >> +#if HAVE_SIMD_ALIGN_64 >> +# define STRIDE_ALIGN 64 /* AVX-512 */ >> +#elif HAVE_SIMD_ALIGN_32 >> # define STRIDE_ALIGN 32 >> #elif HAVE_SIMD_ALIGN_16 >> # define STRIDE_ALIGN 16 >> > > LGTM, but I'd really like to find a way to start using > av_get_cpu_max_align() as it was meant to be done when it was > introduced, instead of keeping hardcoding alignment based on configure > time options. That sounds like a good idea to me. If time allows I will try to start a discussion about it. It sounds ripe for a lot of bikeshedding :) > Maybe adding a line to the av_force_cpu_flags() and > av_set_cpu_flags_mask() doxy stating you should reinit all your decoder > and encoder contexts after calling them or similar. Did you want me to add that one of my patches?
On 11/10/2017 10:58 AM, James Darnley wrote: > On 2017-11-10 02:38, James Almer wrote: >> On 11/9/2017 8:58 AM, James Darnley wrote: >>> --- >>> configure | 2 ++ >>> libavcodec/internal.h | 4 +++- >>> 2 files changed, 5 insertions(+), 1 deletion(-) >>> >>> diff --git a/configure b/configure >>> index 146a87324c..fce8030d91 100755 >>> --- a/configure >>> +++ b/configure >>> @@ -1886,6 +1886,7 @@ ARCH_FEATURES=" >>> local_aligned >>> simd_align_16 >>> simd_align_32 >>> + simd_align_64 >>> " >>> >>> BUILTIN_LIST=" >>> @@ -2385,6 +2386,7 @@ fast_clz_if_any="aarch64 alpha avr32 mips ppc x86" >>> fast_unaligned_if_any="aarch64 ppc x86" >>> simd_align_16_if_any="altivec neon sse" >>> simd_align_32_if_any="avx" >>> +simd_align_64_if_any="avx512" >>> >>> # system capabilities >>> symver_if_any="symver_asm_label symver_gnu_asm" >>> diff --git a/libavcodec/internal.h b/libavcodec/internal.h >>> index 7748f09f54..84070431ed 100644 >>> --- a/libavcodec/internal.h >>> +++ b/libavcodec/internal.h >>> @@ -87,7 +87,9 @@ >>> >>> #define FF_SIGNBIT(x) ((x) >> CHAR_BIT * sizeof(x) - 1) >>> >>> -#if HAVE_SIMD_ALIGN_32 >>> +#if HAVE_SIMD_ALIGN_64 >>> +# define STRIDE_ALIGN 64 /* AVX-512 */ >>> +#elif HAVE_SIMD_ALIGN_32 >>> # define STRIDE_ALIGN 32 >>> #elif HAVE_SIMD_ALIGN_16 >>> # define STRIDE_ALIGN 16 >>> >> >> LGTM, but I'd really like to find a way to start using >> av_get_cpu_max_align() as it was meant to be done when it was >> introduced, instead of keeping hardcoding alignment based on configure >> time options. > > That sounds like a good idea to me. If time allows I will try to start > a discussion about it. It sounds ripe for a lot of bikeshedding :) > >> Maybe adding a line to the av_force_cpu_flags() and >> av_set_cpu_flags_mask() doxy stating you should reinit all your decoder >> and encoder contexts after calling them or similar. > > Did you want me to add that one of my patches? No, it's unrelated to this set. It was a suggestion to solve the above issue. this patch is good as is while that's discussed.
diff --git a/configure b/configure index 146a87324c..fce8030d91 100755 --- a/configure +++ b/configure @@ -1886,6 +1886,7 @@ ARCH_FEATURES=" local_aligned simd_align_16 simd_align_32 + simd_align_64 " BUILTIN_LIST=" @@ -2385,6 +2386,7 @@ fast_clz_if_any="aarch64 alpha avr32 mips ppc x86" fast_unaligned_if_any="aarch64 ppc x86" simd_align_16_if_any="altivec neon sse" simd_align_32_if_any="avx" +simd_align_64_if_any="avx512" # system capabilities symver_if_any="symver_asm_label symver_gnu_asm" diff --git a/libavcodec/internal.h b/libavcodec/internal.h index 7748f09f54..84070431ed 100644 --- a/libavcodec/internal.h +++ b/libavcodec/internal.h @@ -87,7 +87,9 @@ #define FF_SIGNBIT(x) ((x) >> CHAR_BIT * sizeof(x) - 1) -#if HAVE_SIMD_ALIGN_32 +#if HAVE_SIMD_ALIGN_64 +# define STRIDE_ALIGN 64 /* AVX-512 */ +#elif HAVE_SIMD_ALIGN_32 # define STRIDE_ALIGN 32 #elif HAVE_SIMD_ALIGN_16 # define STRIDE_ALIGN 16