diff mbox series

[FFmpeg-devel] configure: set IceLake-AVX512 as the minimum baseline

Message ID 20210817062948.638775-1-jianhua.wu@intel.com
State New
Headers show
Series [FFmpeg-devel] configure: set IceLake-AVX512 as the minimum baseline
Related show

Checks

Context Check Description
andriy/x86_make success Make finished
andriy/x86_make_fate success Make fate finished
andriy/PPC64_make success Make finished
andriy/PPC64_make_fate success Make fate finished

Commit Message

Wu Jianhua Aug. 17, 2021, 6:29 a.m. UTC
Based on IceLake-AVX512 and newer architecture, a broad
range of the subsets of AVX512 could be supported.

Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
---
 configure | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Hendrik Leppkes Aug. 17, 2021, 6:33 a.m. UTC | #1
On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> wrote:
>
> Based on IceLake-AVX512 and newer architecture, a broad
> range of the subsets of AVX512 could be supported.
>
> Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
> ---
>  configure | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/configure b/configure
> index 94b30afe74..04caa25736 100755
> --- a/configure
> +++ b/configure
> @@ -6057,7 +6057,9 @@ EOF
>              elf*) enabled debug && append X86ASMFLAGS $x86asm_debug ;;
>          esac
>
> -        enabled avx512 && check_x86asm avx512_external "vmovdqa32 [eax]{k1}{z}, zmm0"
> +        # Only IceLake and newer architectures could enable AVX512
> +        # F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ
> +        enabled avx512 && check_x86asm avx512_external "vpdpwssds zmm31{k1}{z}, zmm29, zmm28"
>          enabled avx2   && check_x86asm avx2_external   "vextracti128 xmm0, ymm0, 0"
>          enabled xop    && check_x86asm xop_external    "vpmacsdd xmm0, xmm1, xmm2, xmm3"
>          enabled fma4   && check_x86asm fma4_external   "vfmaddps ymm0, ymm1, ymm2, ymm3"

Note that you are just checking the functionality of the assembler
here, not having a runtime impact.
What you would likely want is to update avutil/x86/cpu.c as well to
only enable the AVX512 flag on those CPUs.

- Hendrik
Ronald S. Bultje Aug. 17, 2021, 3:25 p.m. UTC | #2
Hi,

On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com> wrote:

> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> wrote:
> > Based on IceLake-AVX512 and newer architecture, a broad
> > range of the subsets of AVX512 could be supported.
>
[..]

> > -        enabled avx512 && check_x86asm avx512_external "vmovdqa32
> [eax]{k1}{z}, zmm0"
> > +        # Only IceLake and newer architectures could enable AVX512
> > +        #
> F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ
> > +        enabled avx512 && check_x86asm avx512_external "vpdpwssds
> zmm31{k1}{z}, zmm29, zmm28"
> >          enabled avx2   && check_x86asm avx2_external   "vextracti128
> xmm0, ymm0, 0"
> >          enabled xop    && check_x86asm xop_external    "vpmacsdd xmm0,
> xmm1, xmm2, xmm3"
> >          enabled fma4   && check_x86asm fma4_external   "vfmaddps ymm0,
> ymm1, ymm2, ymm3"
>
> Note that you are just checking the functionality of the assembler
> here, not having a runtime impact.
> What you would likely want is to update avutil/x86/cpu.c as well to
> only enable the AVX512 flag on those CPUs.
>

[After IRC discussion] you want runtime checks for the
variants/combinations-of-subsets that we want to support. Right now, avx512
means skylake, so you may want to rename that flag to "avx512skl", and add
a new runtime flag + check for the icelake subset called "avx512icl". Then
in your implementations, you use the appropriate flag, and code components
can individually choose to use skylake- and/or icelake-optimized ax512
functions.

Ronald
James Almer Aug. 17, 2021, 7:24 p.m. UTC | #3
On 8/17/2021 12:25 PM, Ronald S. Bultje wrote:
> Hi,
> 
> On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com> wrote:
> 
>> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> wrote:
>>> Based on IceLake-AVX512 and newer architecture, a broad
>>> range of the subsets of AVX512 could be supported.
>>
> [..]
> 
>>> -        enabled avx512 && check_x86asm avx512_external "vmovdqa32
>> [eax]{k1}{z}, zmm0"
>>> +        # Only IceLake and newer architectures could enable AVX512
>>> +        #
>> F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ
>>> +        enabled avx512 && check_x86asm avx512_external "vpdpwssds
>> zmm31{k1}{z}, zmm29, zmm28"
>>>           enabled avx2   && check_x86asm avx2_external   "vextracti128
>> xmm0, ymm0, 0"
>>>           enabled xop    && check_x86asm xop_external    "vpmacsdd xmm0,
>> xmm1, xmm2, xmm3"
>>>           enabled fma4   && check_x86asm fma4_external   "vfmaddps ymm0,
>> ymm1, ymm2, ymm3"
>>
>> Note that you are just checking the functionality of the assembler
>> here, not having a runtime impact.
>> What you would likely want is to update avutil/x86/cpu.c as well to
>> only enable the AVX512 flag on those CPUs.
>>
> 
> [After IRC discussion] you want runtime checks for the
> variants/combinations-of-subsets that we want to support. Right now, avx512
> means skylake, so you may want to rename that flag to "avx512skl", and add
> a new runtime flag + check for the icelake subset called "avx512icl". Then
> in your implementations, you use the appropriate flag, and code components
> can individually choose to use skylake- and/or icelake-optimized ax512
> functions.

Does it really mean Skylake-X? Afaik the flag checks in cpu.c currently 
look for AVX-512 Foundation and ZMM support, so it means Knights Landing 
or newer.

What about just making the existing AVX512 flag mean F+VL+DQ+BW, so 
Skylake-X (Anything older just lacks useful instructions for 
multimedia), and if needed for this new code add a new avx512icl flag 
that also looks for something like GFNI.

> 
> Ronald
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>
James Almer Aug. 17, 2021, 7:27 p.m. UTC | #4
On 8/17/2021 4:24 PM, James Almer wrote:
> On 8/17/2021 12:25 PM, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com> 
>> wrote:
>>
>>> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> wrote:
>>>> Based on IceLake-AVX512 and newer architecture, a broad
>>>> range of the subsets of AVX512 could be supported.
>>>
>> [..]
>>
>>>> -        enabled avx512 && check_x86asm avx512_external "vmovdqa32
>>> [eax]{k1}{z}, zmm0"
>>>> +        # Only IceLake and newer architectures could enable AVX512
>>>> +        #
>>> F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ
>>>> +        enabled avx512 && check_x86asm avx512_external "vpdpwssds
>>> zmm31{k1}{z}, zmm29, zmm28"
>>>>           enabled avx2   && check_x86asm avx2_external   "vextracti128
>>> xmm0, ymm0, 0"
>>>>           enabled xop    && check_x86asm xop_external    "vpmacsdd 
>>>> xmm0,
>>> xmm1, xmm2, xmm3"
>>>>           enabled fma4   && check_x86asm fma4_external   "vfmaddps 
>>>> ymm0,
>>> ymm1, ymm2, ymm3"
>>>
>>> Note that you are just checking the functionality of the assembler
>>> here, not having a runtime impact.
>>> What you would likely want is to update avutil/x86/cpu.c as well to
>>> only enable the AVX512 flag on those CPUs.
>>>
>>
>> [After IRC discussion] you want runtime checks for the
>> variants/combinations-of-subsets that we want to support. Right now, 
>> avx512
>> means skylake, so you may want to rename that flag to "avx512skl", and 
>> add
>> a new runtime flag + check for the icelake subset called "avx512icl". 
>> Then
>> in your implementations, you use the appropriate flag, and code 
>> components
>> can individually choose to use skylake- and/or icelake-optimized ax512
>> functions.
> 
> Does it really mean Skylake-X? Afaik the flag checks in cpu.c currently 
> look for AVX-512 Foundation and ZMM support, so it means Knights Landing 
> or newer.

Nevermind, just noticed the comment in cpu.c that mentions the Skylake-X 
extensions.

> 
> What about just making the existing AVX512 flag mean F+VL+DQ+BW, so 
> Skylake-X (Anything older just lacks useful instructions for 
> multimedia), and if needed for this new code add a new avx512icl flag 
> that also looks for something like GFNI.

Assuming making Ice Lake the minimum supported SKU is not acceptable, 
then your suggestion is fine (Sans the renaming, since it's a breaking 
change).

> 
>>
>> Ronald
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
>>
>
Ronald S. Bultje Aug. 17, 2021, 8:47 p.m. UTC | #5
Hi,

On Tue, Aug 17, 2021 at 3:27 PM James Almer <jamrial@gmail.com> wrote:

> On 8/17/2021 4:24 PM, James Almer wrote:
> > On 8/17/2021 12:25 PM, Ronald S. Bultje wrote:
> >> On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com>
> >> wrote:
> >>> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com>
> wrote:
> >>>> Based on IceLake-AVX512 and newer architecture, a broad
> >>>> range of the subsets of AVX512 could be supported.
> >> [..]
> >>>> -        enabled avx512 && check_x86asm avx512_external "vmovdqa32
> >>> [eax]{k1}{z}, zmm0"
> >>>> +        # Only IceLake and newer architectures could enable AVX512
> >>>> +        #
> >>>
> F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ
> >>>> +        enabled avx512 && check_x86asm avx512_external "vpdpwssds
> >>> zmm31{k1}{z}, zmm29, zmm28"
> >>>>           enabled avx2   && check_x86asm avx2_external   "vextracti128
> >>> xmm0, ymm0, 0"
> >>>>           enabled xop    && check_x86asm xop_external    "vpmacsdd
> >>>> xmm0,
> >>> xmm1, xmm2, xmm3"
> >>>>           enabled fma4   && check_x86asm fma4_external   "vfmaddps
> >>>> ymm0,
> >>> ymm1, ymm2, ymm3"
> >>>
> >>> Note that you are just checking the functionality of the assembler
> >>> here, not having a runtime impact.
> >>> What you would likely want is to update avutil/x86/cpu.c as well to
> >>> only enable the AVX512 flag on those CPUs.
> >>
> >> [After IRC discussion] you want runtime checks for the
> >> variants/combinations-of-subsets that we want to support. Right now,
> >> avx512
> >> means skylake, so you may want to rename that flag to "avx512skl", and
> >> add
> >> a new runtime flag + check for the icelake subset called "avx512icl".
> >> Then
> >> in your implementations, you use the appropriate flag, and code
> >> components
> >> can individually choose to use skylake- and/or icelake-optimized ax512
> >> functions.
> [..]
>
> What about just making the existing AVX512 flag mean F+VL+DQ+BW, so
> > Skylake-X (Anything older just lacks useful instructions for
> > multimedia), and if needed for this new code add a new avx512icl flag
> > that also looks for something like GFNI.
>
> Assuming making Ice Lake the minimum supported SKU is not acceptable,
> then your suggestion is fine (Sans the renaming, since it's a breaking
> change).
>

Yes, agreed.

Ronald
Wu Jianhua Aug. 18, 2021, 2:13 a.m. UTC | #6
James Almer W	rote: 
> On 8/17/2021 4:24 PM, James Almer wrote:
> > On 8/17/2021 12:25 PM, Ronald S. Bultje wrote:
> >> Hi,
> >>
> >> On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com>
> >> wrote:
> >>
> >>> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com>
> wrote:
> >>>> Based on IceLake-AVX512 and newer architecture, a broad range of
> >>>> the subsets of AVX512 could be supported.
> >>>
> >> [..]
> >>
> >>>> -        enabled avx512 && check_x86asm avx512_external "vmovdqa32
> >>> [eax]{k1}{z}, zmm0"
> >>>> +        # Only IceLake and newer architectures could enable AVX512
> >>>> +        #
> >>>
> F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VA
> ES/VPCLMU
> >>> LQDQ
> >>>> +        enabled avx512 && check_x86asm avx512_external "vpdpwssds
> >>> zmm31{k1}{z}, zmm29, zmm28"
> >>>>           enabled avx2   && check_x86asm avx2_external
> >>>> "vextracti128
> >>> xmm0, ymm0, 0"
> >>>>           enabled xop    && check_x86asm xop_external    "vpmacsdd
> >>>> xmm0,
> >>> xmm1, xmm2, xmm3"
> >>>>           enabled fma4   && check_x86asm fma4_external   "vfmaddps
> >>>> ymm0,
> >>> ymm1, ymm2, ymm3"
> >>>
> >>> Note that you are just checking the functionality of the assembler
> >>> here, not having a runtime impact.
> >>> What you would likely want is to update avutil/x86/cpu.c as well to
> >>> only enable the AVX512 flag on those CPUs.
> >>>
> >>
> >> [After IRC discussion] you want runtime checks for the
> >> variants/combinations-of-subsets that we want to support. Right now,
> >> avx512
> >> means skylake, so you may want to rename that flag to "avx512skl",
> >> and add a new runtime flag + check for the icelake subset called
> >> "avx512icl".
> >> Then
> >> in your implementations, you use the appropriate flag, and code
> >> components can individually choose to use skylake- and/or
> >> icelake-optimized ax512 functions.
> >
> > Does it really mean Skylake-X? Afaik the flag checks in cpu.c
> > currently look for AVX-512 Foundation and ZMM support, so it means
> > Knights Landing or newer.
> 
> Nevermind, just noticed the comment in cpu.c that mentions the Skylake-X
> extensions.
> 
> >
> > What about just making the existing AVX512 flag mean F+VL+DQ+BW, so
> > Skylake-X (Anything older just lacks useful instructions for
> > multimedia), and if needed for this new code add a new avx512icl flag
> > that also looks for something like GFNI.
> 
> Assuming making Ice Lake the minimum supported SKU is not acceptable,
> then your suggestion is fine (Sans the renaming, since it's a breaking change).
> 
Hi,

The reason why we set the icelake-avx512 as the minimum baseline is that
we don't want FFmpeg run on a processor downclocking heavily, hence, keep
the current flag and add a new AVX512ICL make no sense. 

I was supposed to add a cpuid check for F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/
VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ in cpu.c in v2 patch
and no need to change flags. 

Jianhua
Ronald S. Bultje Aug. 18, 2021, 10:40 a.m. UTC | #7
Hi,

On Tue, Aug 17, 2021 at 10:13 PM Wu, Jianhua <jianhua.wu@intel.com> wrote:

> The reason why we set the icelake-avx512 as the minimum baseline is that
> we don't want FFmpeg run on a processor downclocking heavily, hence, keep
> the current flag and add a new AVX512ICL make no sense.
>

I don't think you can make that decision single-handedly. doesn't it depend
on the situation, and shouldn't individual users and/or developers be able
to make that decision themselves?

And still, you're adjusting the assembler check, you need to adjust the
runtime check. This won't do what you think it does. It's not a runtime
check and doesn't check host runtime abilities, just host build abilities.

Ronald
Joel Linn Aug. 18, 2021, 10:59 a.m. UTC | #8
Hello,

On 2021-08-18 12:40, Ronald S. Bultje wrote:
> 
> I don't think you can make that decision single-handedly. doesn't it 
> depend
> on the situation, and shouldn't individual users and/or developers be 
> able
> to make that decision themselves?
> 

I agree, not all AVX512 instructions slow down those CPUs and heat the 
room; especially when used only occasionally.

True, the AVX512 feature matrix is a mess but there are a lot of 
instructions which simplify work with 256bit registers as well because 
the restriction on 128bit lanes was lifted and a lot of new functions 
where added.
It was shown that the occasional use of AVX512 instructions does not 
significantly throttle CPUs. For example in a larger algorithm, 
replacing sequences of AVX permute and shuffle instructions with single 
AVX512 permute(ex)s could thus aid all AVX512 systems.
diff mbox series

Patch

diff --git a/configure b/configure
index 94b30afe74..04caa25736 100755
--- a/configure
+++ b/configure
@@ -6057,7 +6057,9 @@  EOF
             elf*) enabled debug && append X86ASMFLAGS $x86asm_debug ;;
         esac
 
-        enabled avx512 && check_x86asm avx512_external "vmovdqa32 [eax]{k1}{z}, zmm0"
+        # Only IceLake and newer architectures could enable AVX512
+        # F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ
+        enabled avx512 && check_x86asm avx512_external "vpdpwssds zmm31{k1}{z}, zmm29, zmm28"
         enabled avx2   && check_x86asm avx2_external   "vextracti128 xmm0, ymm0, 0"
         enabled xop    && check_x86asm xop_external    "vpmacsdd xmm0, xmm1, xmm2, xmm3"
         enabled fma4   && check_x86asm fma4_external   "vfmaddps ymm0, ymm1, ymm2, ymm3"