diff mbox series

[FFmpeg-devel] swresample/resample: Properly empty MMX state

Message ID DB6PR0101MB2214AB1CB783BCA7DED878B88FA99@DB6PR0101MB2214.eurprd01.prod.exchangelabs.com
State Accepted
Commit 55fc2c5a892c50feb1b9a8f55b74ec6594755ddb
Series [FFmpeg-devel] swresample/resample: Properly empty MMX state

Checks

Context                      Check    Description
andriy/make_x86              success  Make finished
andriy/make_fate_x86         success  Make fate finished
andriy/make_armv7_RPi4       success  Make finished
andriy/make_fate_armv7_RPi4  success  Make fate finished

Commit Message

Andreas Rheinhardt June 11, 2022, 10:25 p.m. UTC
There is an x86-32 MMXEXT implementation for resampling
planar 16-bit data. multiple_resample() therefore calls
emms_c() if it thinks that this is needed. And this is bad:

1. It is a maintenance nightmare because changes to the
x86 resample DSP code would necessitate changes to the check
whether to call emms_c().
2. The return value of av_get_cpu_flags() does not tell
whether the MMX DSP functions are in use, as they could
have been overridden by av_force_cpu_flags().
3. The MMX DSP functions will never be overridden in case of
an x86-32 build with --disable-sse2. In this scenario lots of
resampling tests (like swr-resample_exact_lin_async-s16p-8000-48000)
fail because the cpuflags indicate that SSE2 is available
(presuming that the test is run on a CPU with SSE2), so the
check skips emms_c() although the MMXEXT functions are in use.
4. The check includes a call to av_get_cpu_flags(). This is not
optimized away for arches other than x86-32.
5. The check takes about as much time as emms_c() itself,
making it pointless.

This commit therefore removes the check and calls emms_c()
unconditionally (it is a no-op for non-x86).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
---
The reason I don't add an ARCH_X86_32 check is that I intend
to remove this emms_c() again shortly. I have just updated
my branch [1], which removes obsolete MMX(EXT) code, with a
commit that removes the MMXEXT resampling functions that are
the cause of this issue. A follow-up commit then removes the
emms_c() completely.

[1]: https://github.com/mkver/FFmpeg/commits/mmx2

 libswresample/resample.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
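
Point 3 of the commit message can be modelled in a few lines. The following is only a sketch under stated assumptions: FLAG_MMXEXT and FLAG_SSE2 are stand-in bit values (real code would use the AV_CPU_FLAG_* constants), and mmxext_kernel_in_use() is a hypothetical model of function selection, not FFmpeg's actual init code. It shows why the removed heuristic disagrees with reality in a --disable-sse2 build running on an SSE2-capable CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits for this sketch only (assumption); real code would
 * use the AV_CPU_FLAG_* constants from libavutil/cpu.h. */
#define FLAG_MMXEXT 0x0002
#define FLAG_SSE2   0x0010

/* The heuristic removed by the patch: emms is considered necessary only
 * if the CPU flags report MMXEXT but not SSE2. */
bool old_need_emms(int cpu_flags, bool is_s16p, bool arch_x86_32)
{
    return is_s16p && arch_x86_32 &&
           (cpu_flags & (FLAG_MMXEXT | FLAG_SSE2)) == FLAG_MMXEXT;
}

/* Hypothetical model of which kernel ends up selected: the SSE2 version
 * can only replace the MMXEXT one if it was compiled in at all. */
bool mmxext_kernel_in_use(int cpu_flags, bool sse2_compiled_in)
{
    bool have_mmxext    = (cpu_flags & FLAG_MMXEXT) != 0;
    bool sse2_overrides = sse2_compiled_in && (cpu_flags & FLAG_SSE2) != 0;
    return have_mmxext && !sse2_overrides;
}

/* True when the heuristic skips emms although the MMXEXT kernel ran
 * (planar 16-bit data on x86-32 is assumed). */
bool emms_wrongly_skipped(int cpu_flags, bool sse2_compiled_in)
{
    return mmxext_kernel_in_use(cpu_flags, sse2_compiled_in) &&
           !old_need_emms(cpu_flags, true, true);
}

/* The scenario from point 3: CPU reports SSE2, build has it disabled. */
void check_disable_sse2_case(void)
{
    int flags = FLAG_MMXEXT | FLAG_SSE2;
    assert(emms_wrongly_skipped(flags, false));  /* --disable-sse2 build */
    assert(!emms_wrongly_skipped(flags, true));  /* regular build        */
}
```

Calling emms_c() unconditionally sidesteps the problem because the decision no longer depends on guessing which function was selected.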

Comments

Andreas Rheinhardt June 12, 2022, 8:02 p.m. UTC | #1
Andreas Rheinhardt:
> There is an x86-32 MMXEXT implementation for resampling
> planar 16-bit data. multiple_resample() therefore calls
> emms_c() if it thinks that this is needed. And this is bad:
> 
> 1. It is a maintenance nightmare because changes to the
> x86 resample DSP code would necessitate changes to the check
> whether to call emms_c().
> 2. The return value of av_get_cpu_flags() does not tell
> whether the MMX DSP functions are in use, as they could
> have been overridden by av_force_cpu_flags().
> 3. The MMX DSP functions will never be overridden in case of
> an x86-32 build with --disable-sse2. In this scenario lots of
> resampling tests (like swr-resample_exact_lin_async-s16p-8000-48000)
> fail because the cpuflags indicate that SSE2 is available
> (presuming that the test is run on a CPU with SSE2), so the
> check skips emms_c() although the MMXEXT functions are in use.
> 4. The check includes a call to av_get_cpu_flags(). This is not
> optimized away for arches other than x86-32.
> 5. The check takes about as much time as emms_c() itself,
> making it pointless.
> 
> This commit therefore removes the check and calls emms_c()
> unconditionally (it is a no-op for non-x86).
> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
> ---
> The reason I don't add an ARCH_X86_32 check is that I intend
> to remove this emms_c() again shortly. I have just updated
> my branch [1], which removes obsolete MMX(EXT) code, with a
> commit that removes the MMXEXT resampling functions that are
> the cause of this issue. A follow-up commit then removes the
> emms_c() completely.
> 
> [1]: https://github.com/mkver/FFmpeg/commits/mmx2
> 
>  libswresample/resample.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/libswresample/resample.c b/libswresample/resample.c
> index f1ec77f54b..9c5b7fee72 100644
> --- a/libswresample/resample.c
> +++ b/libswresample/resample.c
> @@ -452,9 +452,6 @@ static int set_compensation(ResampleContext *c, int sample_delta, int compensati
>  
>  static int multiple_resample(ResampleContext *c, AudioData *dst, int dst_size, AudioData *src, int src_size, int *consumed){
>      int i;
> -    int av_unused mm_flags = av_get_cpu_flags();
> -    int need_emms = c->format == AV_SAMPLE_FMT_S16P && ARCH_X86_32 &&
> -                    (mm_flags & (AV_CPU_FLAG_MMX2 | AV_CPU_FLAG_SSE2)) == AV_CPU_FLAG_MMX2;
>      int64_t max_src_size = (INT64_MAX/2 / c->phase_count) / c->src_incr;
>  
>      if (c->compensation_distance)
> @@ -500,8 +497,7 @@ static int multiple_resample(ResampleContext *c, AudioData *dst, int dst_size, A
>          }
>      }
>  
> -    if(need_emms)
> -        emms_c();
> +    emms_c();
>  
>      if (c->compensation_distance) {
>          c->compensation_distance -= dst_size;

Will apply this patchset tomorrow unless there are objections.

- Andreas

Patch

diff --git a/libswresample/resample.c b/libswresample/resample.c
index f1ec77f54b..9c5b7fee72 100644
--- a/libswresample/resample.c
+++ b/libswresample/resample.c
@@ -452,9 +452,6 @@  static int set_compensation(ResampleContext *c, int sample_delta, int compensati
 
 static int multiple_resample(ResampleContext *c, AudioData *dst, int dst_size, AudioData *src, int src_size, int *consumed){
     int i;
-    int av_unused mm_flags = av_get_cpu_flags();
-    int need_emms = c->format == AV_SAMPLE_FMT_S16P && ARCH_X86_32 &&
-                    (mm_flags & (AV_CPU_FLAG_MMX2 | AV_CPU_FLAG_SSE2)) == AV_CPU_FLAG_MMX2;
     int64_t max_src_size = (INT64_MAX/2 / c->phase_count) / c->src_incr;
 
     if (c->compensation_distance)
@@ -500,8 +497,7 @@  static int multiple_resample(ResampleContext *c, AudioData *dst, int dst_size, A
         }
     }
 
-    if(need_emms)
-        emms_c();
+    emms_c();
 
     if (c->compensation_distance) {
         c->compensation_distance -= dst_size;
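
The commit message's remark that the unconditional call is a no-op for non-x86 can be pictured with a wrapper along the following lines. This is a sketch, not FFmpeg's emms_c(): sketch_emms(), SKETCH_EMMS_IS_NOOP and sum_then_emms() are hypothetical names, the x86 branch assumes a GCC-compatible compiler with inline assembly, and it assumes the CPU actually supports MMX (true for every x86-64 CPU, not for every 32-bit one).

```c
#include <assert.h>

/* Hypothetical stand-in for an emms wrapper; NOT FFmpeg's emms_c(). */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define SKETCH_EMMS_IS_NOOP 0
static inline void sketch_emms(void)
{
    /* Marks all x87/MMX registers as empty again. */
    __asm__ volatile ("emms" ::: "memory");
}
#else
#define SKETCH_EMMS_IS_NOOP 1
static inline void sketch_emms(void)
{
    /* Nothing to do: no MMX state exists on this architecture. */
}
#endif

/* Hypothetical caller mirroring the patched control flow: do the work,
 * then clear MMX state without consulting any CPU flags. */
long sum_then_emms(const short *samples, int n)
{
    long sum = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        sum += samples[i];

    sketch_emms();  /* unconditional, as in the patch */
    return sum;
}
```

On non-x86 targets the empty inline function should compile to nothing, which is why no architecture check is required at the call site.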