Message ID | DB6PR0101MB2214AB1CB783BCA7DED878B88FA99@DB6PR0101MB2214.eurprd01.prod.exchangelabs.com |
---|---|
State | Accepted |
Commit | 55fc2c5a892c50feb1b9a8f55b74ec6594755ddb |
Series | [FFmpeg-devel] swresample/resample: Properly empty MMX state |
Context | Check | Description |
---|---|---|
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
andriy/make_armv7_RPi4 | success | Make finished |
andriy/make_fate_armv7_RPi4 | success | Make fate finished |
Andreas Rheinhardt: Will apply this patchset tomorrow unless there are objections. - Andreas
diff --git a/libswresample/resample.c b/libswresample/resample.c
index f1ec77f54b..9c5b7fee72 100644
--- a/libswresample/resample.c
+++ b/libswresample/resample.c
@@ -452,9 +452,6 @@ static int set_compensation(ResampleContext *c, int sample_delta, int compensati
 
 static int multiple_resample(ResampleContext *c, AudioData *dst, int dst_size, AudioData *src, int src_size, int *consumed){
     int i;
-    int av_unused mm_flags = av_get_cpu_flags();
-    int need_emms = c->format == AV_SAMPLE_FMT_S16P && ARCH_X86_32 &&
-                    (mm_flags & (AV_CPU_FLAG_MMX2 | AV_CPU_FLAG_SSE2)) == AV_CPU_FLAG_MMX2;
     int64_t max_src_size = (INT64_MAX/2 / c->phase_count) / c->src_incr;
 
     if (c->compensation_distance)
@@ -500,8 +497,7 @@ static int multiple_resample(ResampleContext *c, AudioData *dst, int dst_size, A
         }
     }
 
-    if(need_emms)
-        emms_c();
+    emms_c();
 
     if (c->compensation_distance) {
         c->compensation_distance -= dst_size;
There is an x86-32 MMXEXT implementation for resampling planar 16bit data. multiple_resample() therefore calls emms_c() if it thinks that this is needed. And this is bad:

1. It is a maintenance nightmare because changes to the x86 resample DSP code would necessitate changes to the check whether to call emms_c().
2. The return value of av_get_cpu_flags() does not tell whether the MMX DSP functions are in use, as they could have been overridden by av_force_cpu_flags().
3. The MMX DSP functions will never be overridden in case of an x86-32 build with --disable-sse2. In this scenario lots of resampling tests (like swr-resample_exact_lin_async-s16p-8000-48000) fail because the cpuflags indicate that SSE2 is available (presuming that the test is run on a CPU with SSE2).
4. The check includes a call to av_get_cpu_flags(). This is not optimized away for arches other than x86-32.
5. The check takes about as much time as emms_c() itself, making it pointless.

This commit therefore removes the check and calls emms_c() unconditionally (it is a no-op for non-x86).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

---

The reason I don't add an ARCH_X86_32 check is that I intend to remove this emms_c() again shortly. I have just updated my branch [1] that removes obsolete MMX(EXT) by a commit that removes the MMXEXT resampling functions that are the cause of this issue. A follow-up commit then removes the emms_c() completely.

[1]: https://github.com/mkver/FFmpeg/commits/mmx2

libswresample/resample.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
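To make point 3 concrete, here is a minimal standalone sketch of how the removed check can disagree with the kernel that is actually installed. It does not use FFmpeg's code: the flag names and values (`FLAG_MMXEXT`, `FLAG_SSE2`), the `build_has_sse2` parameter and all helper functions are hypothetical and exist only for illustration. Only the shape of the bitmask test is taken from the diff, and the model of DSP selection is an assumption based on the description in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits. Names and values are placeholders for this
 * sketch, not FFmpeg's AV_CPU_FLAG_* definitions. */
enum {
    FLAG_MMXEXT = 1 << 0,
    FLAG_SSE2   = 1 << 1,
};

/* Same shape as the removed check: "MMXEXT reported and SSE2 not reported". */
static bool old_need_emms(int cpu_flags)
{
    return (cpu_flags & (FLAG_MMXEXT | FLAG_SSE2)) == FLAG_MMXEXT;
}

/* Assumed model of DSP selection: the MMXEXT kernel stays installed unless
 * an SSE2 kernel was compiled in (build_has_sse2) AND the CPU reports SSE2. */
static bool mmx_kernel_in_use(int cpu_flags, bool build_has_sse2)
{
    bool have_mmxext = (cpu_flags & FLAG_MMXEXT) != 0;
    bool have_sse2   = build_has_sse2 && (cpu_flags & FLAG_SSE2) != 0;
    return have_mmxext && !have_sse2;
}

/* True when the old check would skip emms although MMX state was touched. */
static bool old_check_leaks_mmx_state(int cpu_flags, bool build_has_sse2)
{
    return mmx_kernel_in_use(cpu_flags, build_has_sse2) &&
           !old_need_emms(cpu_flags);
}

/* Walks through the scenarios described above; returns 0 if all hold. */
int run_scenarios(void)
{
    int sse2_cpu = FLAG_MMXEXT | FLAG_SSE2;

    /* Normal build on an SSE2 CPU: SSE2 kernel is used, nothing leaks. */
    assert(!old_check_leaks_mmx_state(sse2_cpu, true));

    /* Build without SSE2 on an SSE2 CPU: MMXEXT kernel runs, but the
     * flag-based check says no emms is needed. */
    assert(old_check_leaks_mmx_state(sse2_cpu, false));

    /* MMXEXT-only CPU: the old check happens to be right. */
    assert(!old_check_leaks_mmx_state(FLAG_MMXEXT, false));
    return 0;
}
```

Under this model, the mismatch arises only because the check consults the CPU flags rather than the function that was actually selected. Calling emms_c() unconditionally, as the patch does, sidesteps the question entirely.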