[FFmpeg-devel] swscale: fix arm NEON hscale init

Message ID 20200515183514.22051-1-martin@martin.st
State Accepted
Commit 70b14cc8d6869f67dd8ebce473a00ea5fa0ea70c

Checks

Context Check Description
andriy/default pending
andriy/make success Make finished
andriy/make_fate success Make fate finished

Commit Message

Martin Storsjö May 15, 2020, 6:35 p.m. UTC
From: Josh de Kock <josh@itanimul.li>

The NEON hscale function only supports X8 filter sizes (multiples of 8)
and should only be selected when such sizes are in use. At the moment
filterAlign is set to 8, but when NEON assembly for other specific sizes
is added in the future, those sizes will need checks here too.

The immediate use case for this change is making the hscale checkasm
test simpler, without NEON-specific edge cases (x86 already has these
guards).

This applies the same fix from 718c8f9aa59751bb490e2688acf2b5cb68fd5ad1
to the 32-bit arm version of the function, fixing fate-checkasm-sw_scale
there.

Signed-off-by: Martin Storsjö <martin@martin.st>
---
 libswscale/arm/swscale.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
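
For readers who want to see the guard in isolation, here is a minimal sketch of the selection condition the patch introduces. It is not FFmpeg code: SketchContext and neon_hscale_eligible are hypothetical names, and the struct only mirrors the four SwsContext fields that the condition reads.

```c
/* Hypothetical stand-in for the SwsContext fields the guard reads.
 * The field names follow the patch; the struct itself is illustrative. */
typedef struct SketchContext {
    int srcBpc;          /* source bits per component */
    int dstBpc;          /* destination bits per component */
    int hLumFilterSize;  /* horizontal luma filter size */
    int hChrFilterSize;  /* horizontal chroma filter size */
} SketchContext;

/* Returns 1 when the condition from the patch holds, i.e. when the
 * NEON hscale function would be selected, and 0 otherwise. */
static int neon_hscale_eligible(const SketchContext *c)
{
    return c->srcBpc == 8 && c->dstBpc <= 14 &&
           (c->hLumFilterSize % 8) == 0 &&
           (c->hChrFilterSize % 8) == 0;
}
```

With this condition, a context with filter sizes 8 and 16 would take the NEON path, while one with a luma filter size of 4 would keep whatever hscale function was already set.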

Comments

Josh Dekker May 15, 2020, 7:13 p.m. UTC | #1
On 15/05/2020 19:35, Martin Storsjö wrote:
> From: Josh de Kock <josh@itanimul.li>
> 
> The NEON hscale function only supports X8 filter sizes (multiples of 8)
> and should only be selected when such sizes are in use. At the moment
> filterAlign is set to 8, but when NEON assembly for other specific sizes
> is added in the future, those sizes will need checks here too.
> 
> The immediate use case for this change is making the hscale checkasm
> test simpler, without NEON-specific edge cases (x86 already has these
> guards).
> 
> This applies the same fix from 718c8f9aa59751bb490e2688acf2b5cb68fd5ad1
> to the 32-bit arm version of the function, fixing fate-checkasm-sw_scale
> there.
> 
> Signed-off-by: Martin Storsjö <martin@martin.st>
> ---
>   libswscale/arm/swscale.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/libswscale/arm/swscale.c b/libswscale/arm/swscale.c
> index 1ec360fe24..7b8fbcbc79 100644
> --- a/libswscale/arm/swscale.c
> +++ b/libswscale/arm/swscale.c
> @@ -34,7 +34,10 @@ av_cold void ff_sws_init_swscale_arm(SwsContext *c)
>       int cpu_flags = av_get_cpu_flags();
>   
>       if (have_neon(cpu_flags)) {
> -        if (c->srcBpc == 8 && c->dstBpc <= 14) {
> +        if (c->srcBpc == 8 && c->dstBpc <= 14 &&
> +            (c->hLumFilterSize % 8) == 0 &&
> +            (c->hChrFilterSize % 8) == 0)
> +        {
>               c->hyScale = c->hcScale = ff_hscale_8_to_15_neon;
>           }
>           if (c->dstBpc == 8) {
> 

LGTM
Patch

diff --git a/libswscale/arm/swscale.c b/libswscale/arm/swscale.c
index 1ec360fe24..7b8fbcbc79 100644
--- a/libswscale/arm/swscale.c
+++ b/libswscale/arm/swscale.c
@@ -34,7 +34,10 @@  av_cold void ff_sws_init_swscale_arm(SwsContext *c)
     int cpu_flags = av_get_cpu_flags();
 
     if (have_neon(cpu_flags)) {
-        if (c->srcBpc == 8 && c->dstBpc <= 14) {
+        if (c->srcBpc == 8 && c->dstBpc <= 14 &&
+            (c->hLumFilterSize % 8) == 0 &&
+            (c->hChrFilterSize % 8) == 0)
+        {
             c->hyScale = c->hcScale = ff_hscale_8_to_15_neon;
         }
         if (c->dstBpc == 8) {