Message ID | 20210817062948.638775-1-jianhua.wu@intel.com |
---|---|
State | New |
Headers | show |
Series | [FFmpeg-devel] configure: set IceLake-AVX512 as the minimum baseline | expand |
Context | Check | Description |
---|---|---|
andriy/x86_make | success | Make finished |
andriy/x86_make_fate | success | Make fate finished |
andriy/PPC64_make | success | Make finished |
andriy/PPC64_make_fate | success | Make fate finished |
On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> wrote: > > Based on IceLake-AVX512 and newer architecture, a broad > range of the subsets of AVX512 could be supported. > > Signed-off-by: Wu Jianhua <jianhua.wu@intel.com> > --- > configure | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/configure b/configure > index 94b30afe74..04caa25736 100755 > --- a/configure > +++ b/configure > @@ -6057,7 +6057,9 @@ EOF > elf*) enabled debug && append X86ASMFLAGS $x86asm_debug ;; > esac > > - enabled avx512 && check_x86asm avx512_external "vmovdqa32 [eax]{k1}{z}, zmm0" > + # Only IceLake and newer architectures could enable AVX512 > + # F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ > + enabled avx512 && check_x86asm avx512_external "vpdpwssds zmm31{k1}{z}, zmm29, zmm28" > enabled avx2 && check_x86asm avx2_external "vextracti128 xmm0, ymm0, 0" > enabled xop && check_x86asm xop_external "vpmacsdd xmm0, xmm1, xmm2, xmm3" > enabled fma4 && check_x86asm fma4_external "vfmaddps ymm0, ymm1, ymm2, ymm3" Note that you are just checking the functionality of the assembler here, not having a runtime impact. What you would likely want is to update avutil/x86/cpu.c as well to only enable the AVX512 flag on those CPUs. - Hendrik
Hi, On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com> wrote: > On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> wrote: > > Based on IceLake-AVX512 and newer architecture, a broad > > range of the subsets of AVX512 could be supported. > [..] > > - enabled avx512 && check_x86asm avx512_external "vmovdqa32 > [eax]{k1}{z}, zmm0" > > + # Only IceLake and newer architectures could enable AVX512 > > + # > F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ > > + enabled avx512 && check_x86asm avx512_external "vpdpwssds > zmm31{k1}{z}, zmm29, zmm28" > > enabled avx2 && check_x86asm avx2_external "vextracti128 > xmm0, ymm0, 0" > > enabled xop && check_x86asm xop_external "vpmacsdd xmm0, > xmm1, xmm2, xmm3" > > enabled fma4 && check_x86asm fma4_external "vfmaddps ymm0, > ymm1, ymm2, ymm3" > > Note that you are just checking the functionality of the assembler > here, not having a runtime impact. > What you would likely want is to update avutil/x86/cpu.c as well to > only enable the AVX512 flag on those CPUs. > [After IRC discussion] you want runtime checks for the variants/combinations-of-subsets that we want to support. Right now, avx512 means skylake, so you may want to rename that flag to "avx512skl", and add a new runtime flag + check for the icelake subset called "avx512icl". Then in your implementations, you use the appropriate flag, and code components can individually choose to use skylake- and/or icelake-optimized ax512 functions. Ronald
On 8/17/2021 12:25 PM, Ronald S. Bultje wrote: > Hi, > > On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com> wrote: > >> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> wrote: >>> Based on IceLake-AVX512 and newer architecture, a broad >>> range of the subsets of AVX512 could be supported. >> > [..] > >>> - enabled avx512 && check_x86asm avx512_external "vmovdqa32 >> [eax]{k1}{z}, zmm0" >>> + # Only IceLake and newer architectures could enable AVX512 >>> + # >> F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ >>> + enabled avx512 && check_x86asm avx512_external "vpdpwssds >> zmm31{k1}{z}, zmm29, zmm28" >>> enabled avx2 && check_x86asm avx2_external "vextracti128 >> xmm0, ymm0, 0" >>> enabled xop && check_x86asm xop_external "vpmacsdd xmm0, >> xmm1, xmm2, xmm3" >>> enabled fma4 && check_x86asm fma4_external "vfmaddps ymm0, >> ymm1, ymm2, ymm3" >> >> Note that you are just checking the functionality of the assembler >> here, not having a runtime impact. >> What you would likely want is to update avutil/x86/cpu.c as well to >> only enable the AVX512 flag on those CPUs. >> > > [After IRC discussion] you want runtime checks for the > variants/combinations-of-subsets that we want to support. Right now, avx512 > means skylake, so you may want to rename that flag to "avx512skl", and add > a new runtime flag + check for the icelake subset called "avx512icl". Then > in your implementations, you use the appropriate flag, and code components > can individually choose to use skylake- and/or icelake-optimized ax512 > functions. Does it really mean Skylake-X? Afaik the flag checks in cpu.c currently look for AVX-512 Foundation and ZMM support, so it means Knights Landing or newer. What about just making the existing AVX512 flag mean F+VL+DQ+BW, so Skylake-X (Anything older just lacks useful instructions for multimedia), and if needed for this new code add a new avx512icl flag that also looks for something like GFNI. > > Ronald > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >
On 8/17/2021 4:24 PM, James Almer wrote: > On 8/17/2021 12:25 PM, Ronald S. Bultje wrote: >> Hi, >> >> On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com> >> wrote: >> >>> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> wrote: >>>> Based on IceLake-AVX512 and newer architecture, a broad >>>> range of the subsets of AVX512 could be supported. >>> >> [..] >> >>>> - enabled avx512 && check_x86asm avx512_external "vmovdqa32 >>> [eax]{k1}{z}, zmm0" >>>> + # Only IceLake and newer architectures could enable AVX512 >>>> + # >>> F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ >>>> + enabled avx512 && check_x86asm avx512_external "vpdpwssds >>> zmm31{k1}{z}, zmm29, zmm28" >>>> enabled avx2 && check_x86asm avx2_external "vextracti128 >>> xmm0, ymm0, 0" >>>> enabled xop && check_x86asm xop_external "vpmacsdd >>>> xmm0, >>> xmm1, xmm2, xmm3" >>>> enabled fma4 && check_x86asm fma4_external "vfmaddps >>>> ymm0, >>> ymm1, ymm2, ymm3" >>> >>> Note that you are just checking the functionality of the assembler >>> here, not having a runtime impact. >>> What you would likely want is to update avutil/x86/cpu.c as well to >>> only enable the AVX512 flag on those CPUs. >>> >> >> [After IRC discussion] you want runtime checks for the >> variants/combinations-of-subsets that we want to support. Right now, >> avx512 >> means skylake, so you may want to rename that flag to "avx512skl", and >> add >> a new runtime flag + check for the icelake subset called "avx512icl". >> Then >> in your implementations, you use the appropriate flag, and code >> components >> can individually choose to use skylake- and/or icelake-optimized ax512 >> functions. > > Does it really mean Skylake-X? Afaik the flag checks in cpu.c currently > look for AVX-512 Foundation and ZMM support, so it means Knights Landing > or newer. Nevermind, just noticed the comment in cpu.c that mentions the Skylake-X extensions. > > What about just making the existing AVX512 flag mean F+VL+DQ+BW, so > Skylake-X (Anything older just lacks useful instructions for > multimedia), and if needed for this new code add a new avx512icl flag > that also looks for something like GFNI. Assuming making Ice Lake the minimum supported SKU is not acceptable, then your suggestion is fine (Sans the renaming, since it's a breaking change). > >> >> Ronald >> _______________________________________________ >> ffmpeg-devel mailing list >> ffmpeg-devel@ffmpeg.org >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel >> >> To unsubscribe, visit link above, or email >> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe". >> >
Hi, On Tue, Aug 17, 2021 at 3:27 PM James Almer <jamrial@gmail.com> wrote: > On 8/17/2021 4:24 PM, James Almer wrote: > > On 8/17/2021 12:25 PM, Ronald S. Bultje wrote: > >> On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com> > >> wrote: > >>> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> > wrote: > >>>> Based on IceLake-AVX512 and newer architecture, a broad > >>>> range of the subsets of AVX512 could be supported. > >> [..] > >>>> - enabled avx512 && check_x86asm avx512_external "vmovdqa32 > >>> [eax]{k1}{z}, zmm0" > >>>> + # Only IceLake and newer architectures could enable AVX512 > >>>> + # > >>> > F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ > >>>> + enabled avx512 && check_x86asm avx512_external "vpdpwssds > >>> zmm31{k1}{z}, zmm29, zmm28" > >>>> enabled avx2 && check_x86asm avx2_external "vextracti128 > >>> xmm0, ymm0, 0" > >>>> enabled xop && check_x86asm xop_external "vpmacsdd > >>>> xmm0, > >>> xmm1, xmm2, xmm3" > >>>> enabled fma4 && check_x86asm fma4_external "vfmaddps > >>>> ymm0, > >>> ymm1, ymm2, ymm3" > >>> > >>> Note that you are just checking the functionality of the assembler > >>> here, not having a runtime impact. > >>> What you would likely want is to update avutil/x86/cpu.c as well to > >>> only enable the AVX512 flag on those CPUs. > >> > >> [After IRC discussion] you want runtime checks for the > >> variants/combinations-of-subsets that we want to support. Right now, > >> avx512 > >> means skylake, so you may want to rename that flag to "avx512skl", and > >> add > >> a new runtime flag + check for the icelake subset called "avx512icl". > >> Then > >> in your implementations, you use the appropriate flag, and code > >> components > >> can individually choose to use skylake- and/or icelake-optimized ax512 > >> functions. > [..] > > What about just making the existing AVX512 flag mean F+VL+DQ+BW, so > > Skylake-X (Anything older just lacks useful instructions for > > multimedia), and if needed for this new code add a new avx512icl flag > > that also looks for something like GFNI. > > Assuming making Ice Lake the minimum supported SKU is not acceptable, > then your suggestion is fine (Sans the renaming, since it's a breaking > change). > Yes, agreed. Ronald
James Almer W rote: > On 8/17/2021 4:24 PM, James Almer wrote: > > On 8/17/2021 12:25 PM, Ronald S. Bultje wrote: > >> Hi, > >> > >> On Tue, Aug 17, 2021 at 2:33 AM Hendrik Leppkes <h.leppkes@gmail.com> > >> wrote: > >> > >>> On Tue, Aug 17, 2021 at 8:30 AM Wu Jianhua <jianhua.wu@intel.com> > wrote: > >>>> Based on IceLake-AVX512 and newer architecture, a broad range of > >>>> the subsets of AVX512 could be supported. > >>> > >> [..] > >> > >>>> - enabled avx512 && check_x86asm avx512_external "vmovdqa32 > >>> [eax]{k1}{z}, zmm0" > >>>> + # Only IceLake and newer architectures could enable AVX512 > >>>> + # > >>> > F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VA > ES/VPCLMU > >>> LQDQ > >>>> + enabled avx512 && check_x86asm avx512_external "vpdpwssds > >>> zmm31{k1}{z}, zmm29, zmm28" > >>>> enabled avx2 && check_x86asm avx2_external > >>>> "vextracti128 > >>> xmm0, ymm0, 0" > >>>> enabled xop && check_x86asm xop_external "vpmacsdd > >>>> xmm0, > >>> xmm1, xmm2, xmm3" > >>>> enabled fma4 && check_x86asm fma4_external "vfmaddps > >>>> ymm0, > >>> ymm1, ymm2, ymm3" > >>> > >>> Note that you are just checking the functionality of the assembler > >>> here, not having a runtime impact. > >>> What you would likely want is to update avutil/x86/cpu.c as well to > >>> only enable the AVX512 flag on those CPUs. > >>> > >> > >> [After IRC discussion] you want runtime checks for the > >> variants/combinations-of-subsets that we want to support. Right now, > >> avx512 > >> means skylake, so you may want to rename that flag to "avx512skl", > >> and add a new runtime flag + check for the icelake subset called > >> "avx512icl". > >> Then > >> in your implementations, you use the appropriate flag, and code > >> components can individually choose to use skylake- and/or > >> icelake-optimized ax512 functions. > > > > Does it really mean Skylake-X? Afaik the flag checks in cpu.c > > currently look for AVX-512 Foundation and ZMM support, so it means > > Knights Landing or newer. > > Nevermind, just noticed the comment in cpu.c that mentions the Skylake-X > extensions. > > > > > What about just making the existing AVX512 flag mean F+VL+DQ+BW, so > > Skylake-X (Anything older just lacks useful instructions for > > multimedia), and if needed for this new code add a new avx512icl flag > > that also looks for something like GFNI. > > Assuming making Ice Lake the minimum supported SKU is not acceptable, > then your suggestion is fine (Sans the renaming, since it's a breaking change). > Hi, The reason why we set the icelake-avx512 as the minimum baseline is that we don't want FFmpeg run on a processor downclocking heavily, hence, keep the current flag and add a new AVX512ICL make no sense. I was supposed to add a cpuid check for F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/ VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ in cpu.c in v2 patch and no need to change flags. Jianhua
Hi, On Tue, Aug 17, 2021 at 10:13 PM Wu, Jianhua <jianhua.wu@intel.com> wrote: > The reason why we set the icelake-avx512 as the minimum baseline is that > we don't want FFmpeg run on a processor downclocking heavily, hence, keep > the current flag and add a new AVX512ICL make no sense. > I don't think you can make that decision single-handedly. doesn't it depend on the situation, and shouldn't individual users and/or developers be able to make that decision themselves? And still, you're adjusting the assembler check, you need to adjust the runtime check. This won't do what you think it does. It's not a runtime check and doesn't check host runtime abilities, just host build abilities. Ronald
Hello, On 2021-08-18 12:40, Ronald S. Bultje wrote: > > I don't think you can make that decision single-handedly. doesn't it > depend > on the situation, and shouldn't individual users and/or developers be > able > to make that decision themselves? > I agree, not all AVX512 instructions slow down those CPUs and heat the room; especially when used only occasionally. True, the AVX512 feature matrix is a mess but there are a lot of instructions which simplify work with 256bit registers as well because the restriction on 128bit lanes was lifted and a lot of new functions where added. It was shown that the occasional use of AVX512 instructions does not significantly throttle CPUs. For example in a larger algorithm, replacing sequences of AVX permute and shuffle instructions with single AVX512 permute(ex)s could thus aid all AVX512 systems.
diff --git a/configure b/configure index 94b30afe74..04caa25736 100755 --- a/configure +++ b/configure @@ -6057,7 +6057,9 @@ EOF elf*) enabled debug && append X86ASMFLAGS $x86asm_debug ;; esac - enabled avx512 && check_x86asm avx512_external "vmovdqa32 [eax]{k1}{z}, zmm0" + # Only IceLake and newer architectures could enable AVX512 + # F/CD/BW/DQ/VL/VNNI/IFMA/VBMI/VBMI2/VPOPCNTDQ/BITALG/GFNI/VAES/VPCLMULQDQ + enabled avx512 && check_x86asm avx512_external "vpdpwssds zmm31{k1}{z}, zmm29, zmm28" enabled avx2 && check_x86asm avx2_external "vextracti128 xmm0, ymm0, 0" enabled xop && check_x86asm xop_external "vpmacsdd xmm0, xmm1, xmm2, xmm3" enabled fma4 && check_x86asm fma4_external "vfmaddps ymm0, ymm1, ymm2, ymm3"
Based on IceLake-AVX512 and newer architecture, a broad range of the subsets of AVX512 could be supported. Signed-off-by: Wu Jianhua <jianhua.wu@intel.com> --- configure | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)