[FFmpeg-devel,2/2] libavfilter/x86/vf_gblur: correct the order of loop step

Message ID 20210916073408.65561-2-jianhua.wu@intel.com
State Accepted
Commit 7bbad32d5ab69cb52bc92a5ec30c7b9838daa08a
Series [FFmpeg-devel,1/2] libavfilter/x86/vf_gblur: fixed the fate-test failed on MacOS

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished
andriy/make_ppc success Make finished
andriy/make_fate_ppc success Make fate finished

Commit Message

Wu Jianhua Sept. 16, 2021, 7:34 a.m. UTC
The problem occurred when the width of the processed block minus 1
is a multiple of the aligned number: the instruction
jle .bscale_scalar would then skip the Optimized Loop Step, leading
to incorrect sampling when more than one step is specified. Move
the Optimized Loop Step after .bscale_scalar to ensure the loop
step is always executed.

Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
---
 libavfilter/x86/vf_gblur.asm | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
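
The control-flow problem described in the commit message can be sketched in C as a simplified model. This is not the actual assembly: the function names, the `align` parameter, and the `(width - 1) % align` remainder test are assumptions made for illustration, based only on the wording of the commit message. The sketch counts how many filter passes a row receives, assuming `steps >= 1`.

```c
/* Hypothetical model of the jump order before the patch.
 * Returns the number of filter passes applied to one row. */
static int passes_before(int width, int align, int steps)
{
    int passes = 1;                      /* first pass, done by the main loops */
    int remainder = (width - 1) % align; /* assumed size of the scalar tail */

    if (remainder <= 0)
        goto bscale_scalar;              /* models "jle .bscale_scalar" */

    /* the scalar tail loop (.loop_x_scalar) would run here */

    passes += steps - 1;                 /* models OPTIMIZED_LOOP_STEP */
bscale_scalar:
    return passes;
}

/* Hypothetical model of the jump order after the patch:
 * the extra steps sit behind the label, so both paths reach them. */
static int passes_after(int width, int align, int steps)
{
    int passes = 1;
    int remainder = (width - 1) % align;

    if (remainder <= 0)
        goto bscale_scalar;

    /* the scalar tail loop (.loop_x_scalar) would run here */

bscale_scalar:
    passes += steps - 1;                 /* now executed on both paths */
    return passes;
}
```

Under these assumptions, `passes_before(9, 8, 3)` returns 1 because the jump bypasses the remaining steps, while `passes_after(9, 8, 3)` returns 3. For a width such as 10, where the tail is not empty, both versions return 3, which may explain why the bug only showed up for particular widths.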
Patch

diff --git a/libavfilter/x86/vf_gblur.asm b/libavfilter/x86/vf_gblur.asm
index 64c067538a..16e802e002 100644
--- a/libavfilter/x86/vf_gblur.asm
+++ b/libavfilter/x86/vf_gblur.asm
@@ -524,9 +524,8 @@  cglobal horiz_slice, 4, 9, 9, ptr, width, height, steps, nu, bscale, x, y, step,
         cmp xq,        0
         jg .loop_x_scalar
 
-    OPTIMIZED_LOOP_STEP
-
     .bscale_scalar:
+        OPTIMIZED_LOOP_STEP
         sub ptrq, 4
         sub localbufq, mmsize
         mulps m3, m1