diff mbox

[FFmpeg-devel,05/11] avcodec: add stride alignment needed for AVX-512

Message ID 20171109115837.32618-6-jdarnley@obe.tv
State Accepted
Commit 8f86e6623811f7713d5e72c13797e20fffb3df62
Headers show

Commit Message

James Darnley Nov. 9, 2017, 11:58 a.m. UTC
---
 configure             | 2 ++
 libavcodec/internal.h | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

Comments

James Almer Nov. 10, 2017, 1:38 a.m. UTC | #1
On 11/9/2017 8:58 AM, James Darnley wrote:
> ---
>  configure             | 2 ++
>  libavcodec/internal.h | 4 +++-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/configure b/configure
> index 146a87324c..fce8030d91 100755
> --- a/configure
> +++ b/configure
> @@ -1886,6 +1886,7 @@ ARCH_FEATURES="
>      local_aligned
>      simd_align_16
>      simd_align_32
> +    simd_align_64
>  "
>  
>  BUILTIN_LIST="
> @@ -2385,6 +2386,7 @@ fast_clz_if_any="aarch64 alpha avr32 mips ppc x86"
>  fast_unaligned_if_any="aarch64 ppc x86"
>  simd_align_16_if_any="altivec neon sse"
>  simd_align_32_if_any="avx"
> +simd_align_64_if_any="avx512"
>  
>  # system capabilities
>  symver_if_any="symver_asm_label symver_gnu_asm"
> diff --git a/libavcodec/internal.h b/libavcodec/internal.h
> index 7748f09f54..84070431ed 100644
> --- a/libavcodec/internal.h
> +++ b/libavcodec/internal.h
> @@ -87,7 +87,9 @@
>  
>  #define FF_SIGNBIT(x) ((x) >> CHAR_BIT * sizeof(x) - 1)
>  
> -#if HAVE_SIMD_ALIGN_32
> +#if HAVE_SIMD_ALIGN_64
> +#   define STRIDE_ALIGN 64 /* AVX-512 */
> +#elif HAVE_SIMD_ALIGN_32
>  #   define STRIDE_ALIGN 32
>  #elif HAVE_SIMD_ALIGN_16
>  #   define STRIDE_ALIGN 16
> 

LGTM, but I'd really like to find a way to start using
av_get_cpu_max_align() as it was meant to be done when it was
introduced, instead of keeping hardcoding alignment based on configure
time options.

Maybe adding a line to the av_force_cpu_flags() and
av_set_cpu_flags_mask() doxy stating you should reinit all your decoder
and encoder contexts after calling them or similar.
James Darnley Nov. 10, 2017, 1:58 p.m. UTC | #2
On 2017-11-10 02:38, James Almer wrote:
> On 11/9/2017 8:58 AM, James Darnley wrote:
>> ---
>>  configure             | 2 ++
>>  libavcodec/internal.h | 4 +++-
>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index 146a87324c..fce8030d91 100755
>> --- a/configure
>> +++ b/configure
>> @@ -1886,6 +1886,7 @@ ARCH_FEATURES="
>>      local_aligned
>>      simd_align_16
>>      simd_align_32
>> +    simd_align_64
>>  "
>>  
>>  BUILTIN_LIST="
>> @@ -2385,6 +2386,7 @@ fast_clz_if_any="aarch64 alpha avr32 mips ppc x86"
>>  fast_unaligned_if_any="aarch64 ppc x86"
>>  simd_align_16_if_any="altivec neon sse"
>>  simd_align_32_if_any="avx"
>> +simd_align_64_if_any="avx512"
>>  
>>  # system capabilities
>>  symver_if_any="symver_asm_label symver_gnu_asm"
>> diff --git a/libavcodec/internal.h b/libavcodec/internal.h
>> index 7748f09f54..84070431ed 100644
>> --- a/libavcodec/internal.h
>> +++ b/libavcodec/internal.h
>> @@ -87,7 +87,9 @@
>>  
>>  #define FF_SIGNBIT(x) ((x) >> CHAR_BIT * sizeof(x) - 1)
>>  
>> -#if HAVE_SIMD_ALIGN_32
>> +#if HAVE_SIMD_ALIGN_64
>> +#   define STRIDE_ALIGN 64 /* AVX-512 */
>> +#elif HAVE_SIMD_ALIGN_32
>>  #   define STRIDE_ALIGN 32
>>  #elif HAVE_SIMD_ALIGN_16
>>  #   define STRIDE_ALIGN 16
>>
> 
> LGTM, but I'd really like to find a way to start using
> av_get_cpu_max_align() as it was meant to be done when it was
> introduced, instead of keeping hardcoding alignment based on configure
> time options.

That sounds like a good idea to me.  If time allows I will try to start
a discussion about it.  It sounds ripe for a lot of bikeshedding :)

> Maybe adding a line to the av_force_cpu_flags() and
> av_set_cpu_flags_mask() doxy stating you should reinit all your decoder
> and encoder contexts after calling them or similar.

Did you want me to add that one of my patches?
James Almer Nov. 10, 2017, 2:21 p.m. UTC | #3
On 11/10/2017 10:58 AM, James Darnley wrote:
> On 2017-11-10 02:38, James Almer wrote:
>> On 11/9/2017 8:58 AM, James Darnley wrote:
>>> ---
>>>  configure             | 2 ++
>>>  libavcodec/internal.h | 4 +++-
>>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/configure b/configure
>>> index 146a87324c..fce8030d91 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -1886,6 +1886,7 @@ ARCH_FEATURES="
>>>      local_aligned
>>>      simd_align_16
>>>      simd_align_32
>>> +    simd_align_64
>>>  "
>>>  
>>>  BUILTIN_LIST="
>>> @@ -2385,6 +2386,7 @@ fast_clz_if_any="aarch64 alpha avr32 mips ppc x86"
>>>  fast_unaligned_if_any="aarch64 ppc x86"
>>>  simd_align_16_if_any="altivec neon sse"
>>>  simd_align_32_if_any="avx"
>>> +simd_align_64_if_any="avx512"
>>>  
>>>  # system capabilities
>>>  symver_if_any="symver_asm_label symver_gnu_asm"
>>> diff --git a/libavcodec/internal.h b/libavcodec/internal.h
>>> index 7748f09f54..84070431ed 100644
>>> --- a/libavcodec/internal.h
>>> +++ b/libavcodec/internal.h
>>> @@ -87,7 +87,9 @@
>>>  
>>>  #define FF_SIGNBIT(x) ((x) >> CHAR_BIT * sizeof(x) - 1)
>>>  
>>> -#if HAVE_SIMD_ALIGN_32
>>> +#if HAVE_SIMD_ALIGN_64
>>> +#   define STRIDE_ALIGN 64 /* AVX-512 */
>>> +#elif HAVE_SIMD_ALIGN_32
>>>  #   define STRIDE_ALIGN 32
>>>  #elif HAVE_SIMD_ALIGN_16
>>>  #   define STRIDE_ALIGN 16
>>>
>>
>> LGTM, but I'd really like to find a way to start using
>> av_get_cpu_max_align() as it was meant to be done when it was
>> introduced, instead of keeping hardcoding alignment based on configure
>> time options.
> 
> That sounds like a good idea to me.  If time allows I will try to start
> a discussion about it.  It sounds ripe for a lot of bikeshedding :)
> 
>> Maybe adding a line to the av_force_cpu_flags() and
>> av_set_cpu_flags_mask() doxy stating you should reinit all your decoder
>> and encoder contexts after calling them or similar.
> 
> Did you want me to add that one of my patches?

No, it's unrelated to this set. It was a suggestion to solve the above
issue. this patch is good as is while that's discussed.
diff mbox

Patch

diff --git a/configure b/configure
index 146a87324c..fce8030d91 100755
--- a/configure
+++ b/configure
@@ -1886,6 +1886,7 @@  ARCH_FEATURES="
     local_aligned
     simd_align_16
     simd_align_32
+    simd_align_64
 "
 
 BUILTIN_LIST="
@@ -2385,6 +2386,7 @@  fast_clz_if_any="aarch64 alpha avr32 mips ppc x86"
 fast_unaligned_if_any="aarch64 ppc x86"
 simd_align_16_if_any="altivec neon sse"
 simd_align_32_if_any="avx"
+simd_align_64_if_any="avx512"
 
 # system capabilities
 symver_if_any="symver_asm_label symver_gnu_asm"
diff --git a/libavcodec/internal.h b/libavcodec/internal.h
index 7748f09f54..84070431ed 100644
--- a/libavcodec/internal.h
+++ b/libavcodec/internal.h
@@ -87,7 +87,9 @@ 
 
 #define FF_SIGNBIT(x) ((x) >> CHAR_BIT * sizeof(x) - 1)
 
-#if HAVE_SIMD_ALIGN_32
+#if HAVE_SIMD_ALIGN_64
+#   define STRIDE_ALIGN 64 /* AVX-512 */
+#elif HAVE_SIMD_ALIGN_32
 #   define STRIDE_ALIGN 32
 #elif HAVE_SIMD_ALIGN_16
 #   define STRIDE_ALIGN 16