Message ID | MN0PR12MB6053315D52BDB52D9372A989D3A89@MN0PR12MB6053.namprd12.prod.outlook.com |
---|---|
State | New |
Series | [FFmpeg-devel] libavfilter/x86/vf_convolution.asm- fix missing decelerator for AVX512ICL sobel |
Context | Check | Description |
---|---|---|
andriy/commit_msg_x86 | warning | The first line of the commit message must start with a context terminated by a colon and a space, for example "lavu/opt: " or "doc: ". |
yinshiyou/configure_loongarch64 | warning | Failed to apply patch |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On Fri, 24 Feb 2023 at 03:00, Felix LeClair <felix.leclair123@hotmail.com> wrote:
> Fixes: Compilation of Sobel with AVX512ICL
> Caused: Comment left without deleniator in AVX512ICL version of SOBEL
>
> Testing:Confirmed working on AVX512 Alderlake (AKA SPR without AMX)

Seems fine, bit weird that FATE didn't pick it up.

Kieran
On 2/24/23 04:00, Felix LeClair wrote:
> Fixes: Compilation of Sobel with AVX512ICL
> Caused: Comment left without deleniator in AVX512ICL version of SOBEL
>
> Testing:Confirmed working on AVX512 Alderlake (AKA SPR without AMX)
> diff --git a/libavfilter/x86/vf_convolution.asm b/libavfilter/x86/vf_convolution.asm
> index 9ac9ef5d73..8b85897819 100644
> --- a/libavfilter/x86/vf_convolution.asm
> +++ b/libavfilter/x86/vf_convolution.asm
> @@ -232,8 +232,8 @@ cglobal filter_sobel, 4, 15, 7, dst, width, rdiv, bias, matrix, ptr, c0, c1, c2,
>      psubd m4, m5
>      vpermb m3, m6, m3
>      mova m5, m4
> -    vpdpbusd m4, m2, [sobel_mulA] {1to16}
> -    vpdpbusd m5, m3, [sobel_mulB] {1to16}
> +    vpdpbusd m4, m2, [sobel_mulA]; {1to16}
> +    vpdpbusd m5, m3, [sobel_mulB]; {1to16}
>
>      cvtdq2ps m4, m4
>      mulps m4, m4

Fix compilation with what?

I'm not familiar with the sobel algorithm/function, so I can't say whether the code is correct. However, those constants are only dword sized, and that is how you do a memory broadcast with avx512(icl). Furthermore, testing your change on an icl system results in a failure in checkasm.

So what program and what version fails to assemble that?

[re-sending to list]
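The broadcast point in the reply above can be illustrated with a small scalar model. The sketch below is Python rather than assembly so that it runs anywhere. It models `vpdpbusd` on a 512-bit register (16 dword lanes, unsigned bytes from the register source multiplied by signed bytes from the memory operand, accumulated without saturation) and compares a `{1to16}` broadcast operand with a plain 64-byte load. The function names, the coefficient bytes, and the zero padding after them are all assumptions for illustration; they are not the actual contents of `sobel_mulA` or of the memory that follows it.

```python
LANES = 16  # a 512-bit register holds 16 dword lanes


def to_s8(byte):
    """Reinterpret an unsigned byte (0..255) as a signed 8-bit value."""
    return byte - 256 if byte >= 128 else byte


def wrap_s32(value):
    """Wrap an integer to signed 32 bits (vpdpbusd does not saturate)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def vpdpbusd(acc, src_u8, mem, broadcast):
    """Scalar model of a 512-bit vpdpbusd.

    acc       -- 16 signed dword accumulators (the destination register)
    src_u8    -- 64 bytes, read as unsigned (the register source)
    mem       -- bytes found at the memory operand's address
    broadcast -- True models {1to16}: one dword is repeated to all lanes
    """
    if broadcast:
        operand = mem[:4] * LANES   # 4 bytes repeated 16 times
    else:
        operand = mem[:64]          # full 64-byte load
        if len(operand) < 64:
            raise ValueError("a non-broadcast operand needs 64 bytes")
    out = []
    for lane in range(LANES):
        total = acc[lane]
        for j in range(4):
            total += src_u8[4 * lane + j] * to_s8(operand[4 * lane + j])
        out.append(wrap_s32(total))
    return out


# Hypothetical dword constant holding the signed bytes (-1, 0, 1, 0).
coeffs = bytes([0xFF, 0x00, 0x01, 0x00])
# Hypothetical memory layout: the constant followed by 60 zero bytes.
mem = coeffs + bytes(60)
pixels = bytes(range(64))

with_bcast = vpdpbusd([0] * LANES, pixels, mem, broadcast=True)
without_bcast = vpdpbusd([0] * LANES, pixels, mem, broadcast=False)
print(with_bcast)
print(without_bcast)
```

With the broadcast, every lane sees the same four coefficients; without it, only lane 0 does, and the remaining lanes multiply by whatever bytes follow the constant in memory. That difference would be consistent with the checkasm failure reported for the patched code, though the thread does not confirm the cause.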
Without patch I hit:

```
CC libavdevice/version.o
AR libavdevice/libavdevice.a
CC libavfilter/version.o
X86ASM libavfilter/x86/vf_convolution.o
libavfilter/x86/vf_convolution.asm:302: error: operation size not specified
libavfilter/x86/vf_convolution.asm:235: ... from macro `FILTER_SOBEL' defined here
libavfilter/x86/vf_convolution.asm:302: error: operation size not specified
libavfilter/x86/vf_convolution.asm:236: ... from macro `FILTER_SOBEL' defined here
make: *** [ffbuild/common.mak:103: libavfilter/x86/vf_convolution.o] Error 1
```

This happens during compilation of commit ac6eec1fc258efce219e4fccb84312a1b13a7a23, configured with:

./configure --samples=fate-suite --enable-gpl --enable-ladspa --enable-libass --enable-libcodec2 --enable-libdav1d --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzvbi --enable-lv2 --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libx264 --enable-opencl --enable-nonfree --enable-libsvtav1 --disable-stripping --cpu=sapphire-rapids --enable-pic

Kernel 6.2.0, NASM 2.16, GCC 12.1.1

________________________________
From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> on behalf of James Darnley <james.darnley@gmail.com>
Sent: February 24, 2023 8:51 AM
To: ffmpeg-devel@ffmpeg.org <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] libavfilter/x86/vf_convolution.asm- fix missing decelerator for AVX512ICL sobel

[...]
Disregard: this was found to be an issue in NASM 2.16RC, fixed in upstream 2.16.01.

________________________________
From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> on behalf of Felix LeClair <felix.leclair123@hotmail.com>
Sent: February 24, 2023 10:12 AM
To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] libavfilter/x86/vf_convolution.asm- fix missing decelerator for AVX512ICL sobel

[...]
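Since the failure came down to the assembler version, a quick check of the NASM version string can save a debugging round. The following is a minimal sketch, assuming the string looks like `NASM version 2.16.01 ...` or `NASM version 2.16rc12 ...`; the helper names and sample strings are hypothetical. It encodes only what this thread states: a 2.16 release candidate was affected and 2.16.01 is fixed. Every other version is reported as unknown rather than guessed at.

```python
import re


def parse_nasm_version(text):
    """Extract (major, minor, patch, is_rc) from a NASM version string."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?(rc\d*)?", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major = int(match.group(1))
    minor = int(match.group(2))
    patch = int(match.group(3) or 0)
    is_rc = match.group(4) is not None
    return major, minor, patch, is_rc


def classify(text):
    """Classify a NASM version using only the facts given in this thread."""
    major, minor, patch, is_rc = parse_nasm_version(text)
    if (major, minor, patch) >= (2, 16, 1):
        return "fixed"      # 2.16.01 or newer
    if (major, minor) == (2, 16) and is_rc:
        return "affected"   # a 2.16 release candidate
    return "unknown"        # the thread makes no claim about other versions


# Illustrative strings, not captured from a real run.
for sample in ("NASM version 2.16.01", "NASM version 2.16rc12", "NASM version 2.15.05"):
    print(sample, "->", classify(sample))
```

In practice the input would be whatever `nasm -v` prints on the build machine.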
From 2b12db24d2bbe8a1544a9a7f3a08e1a693a6c2ce Mon Sep 17 00:00:00 2001
From: "Felix LeClair (FCLC)" <felix.leclair123@hotmail.com>
Date: Thu, 23 Feb 2023 21:36:32 -0500
Subject: [PATCH] libavfilter/x86/vf_convolution.asm

Fixes: Compilation of Sobel with AVX512ICL
Caused: Comment left without deleniator in AVX512ICL version of SOBEL

Testing:Confirmed working on AVX512 Alderlake (AKA SPR without AMX)

Signed-off-by: Felix LeClair (FCLC) <felix.leclair123@hotmail.com>
---
 libavfilter/x86/vf_convolution.asm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavfilter/x86/vf_convolution.asm b/libavfilter/x86/vf_convolution.asm
index 9ac9ef5d73..8b85897819 100644
--- a/libavfilter/x86/vf_convolution.asm
+++ b/libavfilter/x86/vf_convolution.asm
@@ -232,8 +232,8 @@ cglobal filter_sobel, 4, 15, 7, dst, width, rdiv, bias, matrix, ptr, c0, c1, c2,
     psubd m4, m5
     vpermb m3, m6, m3
     mova m5, m4
-    vpdpbusd m4, m2, [sobel_mulA] {1to16}
-    vpdpbusd m5, m3, [sobel_mulB] {1to16}
+    vpdpbusd m4, m2, [sobel_mulA]; {1to16}
+    vpdpbusd m5, m3, [sobel_mulB]; {1to16}

     cvtdq2ps m4, m4
     mulps m4, m4
--
2.34.1