Message ID | GV1SPRMB0033280C6F96AACB191058008FA7A@GV1SPRMB0033.EURP250.PROD.OUTLOOK.COM |
---|---|
State | Accepted |
Commit | 5b85ca5317f28a73eadf10d604d2a7421b84bed8 |
Headers | show |
Series | [FFmpeg-devel,1/4] avutil/x86/pixelutils: Empty MMX state in ff_pixelutils_sad_8x8_mmxext | expand |
Context | Check | Description |
---|---|---|
yinshiyou/make_loongarch64 | success | Make finished |
yinshiyou/make_fate_loongarch64 | success | Make fate finished |
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
On Wed, Nov 1, 2023 at 10:44 AM Andreas Rheinhardt <andreas.rheinhardt@outlook.com> wrote: > libavutil/x86/pixelutils.asm | 1 + > 1 file changed, 1 insertion(+) IIRC the emms instructions is quite slow on many systems, so if this is the only pixelutils function using mmx it's probably better to just rewrite it to use SSE2 instead (even if that means only using the lower half of xmm registers).
Henrik Gramner via ffmpeg-devel: > On Wed, Nov 1, 2023 at 10:44 AM Andreas Rheinhardt > <andreas.rheinhardt@outlook.com> wrote: >> libavutil/x86/pixelutils.asm | 1 + >> 1 file changed, 1 insertion(+) > > IIRC the emms instructions is quite slow on many systems, so if this > is the only pixelutils function using mmx it's probably better to just > rewrite it to use SSE2 instead (even if that means only using the > lower half of xmm registers). Fixing it by rewriting it in a way that avoids MMX altogether is possible, but this doesn't change the fact that all public functions absolutely need to abide by the ABI. - Andreas
Andreas Rheinhardt: > We currently mostly do not empty the MMX state in our MMX > DSP functions; instead we only do so before code that might > be using x87 code. This is a violation of the System V i386 ABI > (and maybe of other ABIs, too): > "The CPU shall be in x87 mode upon entry to a function. Therefore, > every function that uses the MMX registers is required to issue an > emms or femms instruction after using MMX registers, before returning > or calling another function." (See 2.2.1 in [1]) > This patch does not intend to change all these functions to abide > by the ABI; it only does so for ff_pixelutils_sad_8x8_mmxext, as this > function can by called by external users, because it is exported > via the pixelutils API. Without this, the following fragment will > assert (on x86/x64): > uint8_t src1[8 * 8], src2[8 * 8]; > av_pixelutils_sad_fn fn = av_pixelutils_get_sad_fn(3, 3, 0, NULL); > fn(src1, 8, src2, 8); > av_assert0_fpu(); > > [1]: https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/intel386-psABI-1.1.pdf > > Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> > --- > libavutil/x86/pixelutils.asm | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/libavutil/x86/pixelutils.asm b/libavutil/x86/pixelutils.asm > index fbe9b45971..0bcccb51f5 100644 > --- a/libavutil/x86/pixelutils.asm > +++ b/libavutil/x86/pixelutils.asm > @@ -43,6 +43,7 @@ cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, stride2 > lea src2q, [src2q + 2*stride2q] > %endrep > movd eax, m2 > + emms > RET > > ;------------------------------------------------------------------------------- Will apply this patchset tomorrow unless there are objections. - Andreas
diff --git a/libavutil/x86/pixelutils.asm b/libavutil/x86/pixelutils.asm index fbe9b45971..0bcccb51f5 100644 --- a/libavutil/x86/pixelutils.asm +++ b/libavutil/x86/pixelutils.asm @@ -43,6 +43,7 @@ cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, stride2 lea src2q, [src2q + 2*stride2q] %endrep movd eax, m2 + emms RET ;-------------------------------------------------------------------------------
We currently mostly do not empty the MMX state in our MMX DSP functions; instead we only do so before code that might be using x87 code. This is a violation of the System V i386 ABI (and maybe of other ABIs, too): "The CPU shall be in x87 mode upon entry to a function. Therefore, every function that uses the MMX registers is required to issue an emms or femms instruction after using MMX registers, before returning or calling another function." (See 2.2.1 in [1]) This patch does not intend to change all these functions to abide by the ABI; it only does so for ff_pixelutils_sad_8x8_mmxext, as this function can by called by external users, because it is exported via the pixelutils API. Without this, the following fragment will assert (on x86/x64): uint8_t src1[8 * 8], src2[8 * 8]; av_pixelutils_sad_fn fn = av_pixelutils_get_sad_fn(3, 3, 0, NULL); fn(src1, 8, src2, 8); av_assert0_fpu(); [1]: https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/intel386-psABI-1.1.pdf Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> --- libavutil/x86/pixelutils.asm | 1 + 1 file changed, 1 insertion(+)