diff mbox

[FFmpeg-devel] avutil/x86/emms: Document the emms_c() vs alloc/free relation.

Message ID 20161023033725.16499-1-michael@niedermayer.cc
State Accepted
Commit 051517648b00cf55509c0cf370e4067db3bf1ba2
Headers show

Commit Message

Michael Niedermayer Oct. 23, 2016, 3:37 a.m. UTC
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
---
 libavutil/x86/emms.h | 2 ++
 1 file changed, 2 insertions(+)

Comments

Andreas Cadhalpun Oct. 23, 2016, 10:15 a.m. UTC | #1
On 23.10.2016 05:37, Michael Niedermayer wrote:
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavutil/x86/emms.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
> index 6fda6e2..42c18e2 100644
> --- a/libavutil/x86/emms.h
> +++ b/libavutil/x86/emms.h
> @@ -31,6 +31,8 @@ void avpriv_emms_yasm(void);
>   * Empty mmx state.
>   * this must be called between any dsp function and float/double code.
>   * for example sin(); dsp->idct_put(); emms_c(); cos()
> + * Note, *alloc() and *free() also use float code in some libc implementations
> + * thus this also applies to them or any function using them.
>   */
>  static av_always_inline void emms_c(void)
>  {
> 

Seems fine.

Best regards,
Andreas
Michael Niedermayer Oct. 23, 2016, 11:02 a.m. UTC | #2
On Sun, Oct 23, 2016 at 12:15:38PM +0200, Andreas Cadhalpun wrote:
> On 23.10.2016 05:37, Michael Niedermayer wrote:
> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > ---
> >  libavutil/x86/emms.h | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
> > index 6fda6e2..42c18e2 100644
> > --- a/libavutil/x86/emms.h
> > +++ b/libavutil/x86/emms.h
> > @@ -31,6 +31,8 @@ void avpriv_emms_yasm(void);
> >   * Empty mmx state.
> >   * this must be called between any dsp function and float/double code.
> >   * for example sin(); dsp->idct_put(); emms_c(); cos()
> > + * Note, *alloc() and *free() also use float code in some libc implementations
> > + * thus this also applies to them or any function using them.
> >   */
> >  static av_always_inline void emms_c(void)
> >  {
> > 
> 
> Seems fine.

applied

thx

[...]
wm4 Oct. 24, 2016, 7:38 a.m. UTC | #3
On Sun, 23 Oct 2016 05:37:25 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:

> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavutil/x86/emms.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
> index 6fda6e2..42c18e2 100644
> --- a/libavutil/x86/emms.h
> +++ b/libavutil/x86/emms.h
> @@ -31,6 +31,8 @@ void avpriv_emms_yasm(void);
>   * Empty mmx state.
>   * this must be called between any dsp function and float/double code.
>   * for example sin(); dsp->idct_put(); emms_c(); cos()
> + * Note, *alloc() and *free() also use float code in some libc implementations
> + * thus this also applies to them or any function using them.
>   */
>  static av_always_inline void emms_c(void)
>  {

Overly specific and useless information. It's an implementation detail
of 1 specific libc. It could happen to any libc function for any libc
at any time. This just adds noise because of one specific bug.
Michael Niedermayer Oct. 24, 2016, 11:24 a.m. UTC | #4
On Mon, Oct 24, 2016 at 09:38:00AM +0200, wm4 wrote:
> On Sun, 23 Oct 2016 05:37:25 +0200
> Michael Niedermayer <michael@niedermayer.cc> wrote:
> 
> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > ---
> >  libavutil/x86/emms.h | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
> > index 6fda6e2..42c18e2 100644
> > --- a/libavutil/x86/emms.h
> > +++ b/libavutil/x86/emms.h
> > @@ -31,6 +31,8 @@ void avpriv_emms_yasm(void);
> >   * Empty mmx state.
> >   * this must be called between any dsp function and float/double code.
> >   * for example sin(); dsp->idct_put(); emms_c(); cos()
> > + * Note, *alloc() and *free() also use float code in some libc implementations
> > + * thus this also applies to them or any function using them.
> >   */
> >  static av_always_inline void emms_c(void)
> >  {
> 
> Overly specific and useless information. It's an implementation detail

If we place emms_c() so as to ensure that the FPU state is clean
before all calls to *alloc() and *free() (as is done in the posted
patchset) then we need to document this
so others working on the code are aware of it and wont mistakly break
it.

If/when a different or more complete solution is implemented then this
note needs to be adjusted accordingly.

[...]
Ronald S. Bultje Oct. 24, 2016, 11:57 a.m. UTC | #5
Hi,

On Mon, Oct 24, 2016 at 7:24 AM, Michael Niedermayer <michael@niedermayer.cc
> wrote:

> On Mon, Oct 24, 2016 at 09:38:00AM +0200, wm4 wrote:
> > On Sun, 23 Oct 2016 05:37:25 +0200
> > Michael Niedermayer <michael@niedermayer.cc> wrote:
> >
> > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > ---
> > >  libavutil/x86/emms.h | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
> > > index 6fda6e2..42c18e2 100644
> > > --- a/libavutil/x86/emms.h
> > > +++ b/libavutil/x86/emms.h
> > > @@ -31,6 +31,8 @@ void avpriv_emms_yasm(void);
> > >   * Empty mmx state.
> > >   * this must be called between any dsp function and float/double code.
> > >   * for example sin(); dsp->idct_put(); emms_c(); cos()
> > > + * Note, *alloc() and *free() also use float code in some libc
> implementations
> > > + * thus this also applies to them or any function using them.
> > >   */
> > >  static av_always_inline void emms_c(void)
> > >  {
> >
> > Overly specific and useless information. It's an implementation detail
>
> If we place emms_c() so as to ensure that the FPU state is clean
> before all calls to *alloc() and *free() (as is done in the posted
> patchset) then we need to document this
> so others working on the code are aware of it and wont mistakly break
> it.
>
> If/when a different or more complete solution is implemented then this
> note needs to be adjusted accordingly.


I am doubtful that the patches that implement the partial solution
("hack"?) should be pushed, or that we should (by documenting it) advocate
partial solutions ("hacks") in general.

Ronald
wm4 Oct. 24, 2016, 12:36 p.m. UTC | #6
On Mon, 24 Oct 2016 13:24:38 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:

> On Mon, Oct 24, 2016 at 09:38:00AM +0200, wm4 wrote:
> > On Sun, 23 Oct 2016 05:37:25 +0200
> > Michael Niedermayer <michael@niedermayer.cc> wrote:
> >   
> > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > ---
> > >  libavutil/x86/emms.h | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
> > > index 6fda6e2..42c18e2 100644
> > > --- a/libavutil/x86/emms.h
> > > +++ b/libavutil/x86/emms.h
> > > @@ -31,6 +31,8 @@ void avpriv_emms_yasm(void);
> > >   * Empty mmx state.
> > >   * this must be called between any dsp function and float/double code.
> > >   * for example sin(); dsp->idct_put(); emms_c(); cos()
> > > + * Note, *alloc() and *free() also use float code in some libc implementations
> > > + * thus this also applies to them or any function using them.
> > >   */
> > >  static av_always_inline void emms_c(void)
> > >  {  
> > 
> > Overly specific and useless information. It's an implementation detail  
> 
> If we place emms_c() so as to ensure that the FPU state is clean
> before all calls to *alloc() and *free() (as is done in the posted
> patchset) then we need to document this
> so others working on the code are aware of it and wont mistakly break
> it.
> 
> If/when a different or more complete solution is implemented then this
> note needs to be adjusted accordingly.
> 
> [...]

That fixes only the musl-specific issue. It could happen any time again
with a set of different callers. Unless you want to put special effort
into fixing operation with musl (and keeping it working) it won't help
much to fix the correctness issues.

If anything, it should probably say "all external" calls or such.
Michael Niedermayer Oct. 24, 2016, 4:08 p.m. UTC | #7
On Mon, Oct 24, 2016 at 02:36:23PM +0200, wm4 wrote:
> On Mon, 24 Oct 2016 13:24:38 +0200
> Michael Niedermayer <michael@niedermayer.cc> wrote:
> 
> > On Mon, Oct 24, 2016 at 09:38:00AM +0200, wm4 wrote:
> > > On Sun, 23 Oct 2016 05:37:25 +0200
> > > Michael Niedermayer <michael@niedermayer.cc> wrote:
> > >   
> > > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > > ---
> > > >  libavutil/x86/emms.h | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
> > > > index 6fda6e2..42c18e2 100644
> > > > --- a/libavutil/x86/emms.h
> > > > +++ b/libavutil/x86/emms.h
> > > > @@ -31,6 +31,8 @@ void avpriv_emms_yasm(void);
> > > >   * Empty mmx state.
> > > >   * this must be called between any dsp function and float/double code.
> > > >   * for example sin(); dsp->idct_put(); emms_c(); cos()
> > > > + * Note, *alloc() and *free() also use float code in some libc implementations
> > > > + * thus this also applies to them or any function using them.
> > > >   */
> > > >  static av_always_inline void emms_c(void)
> > > >  {  
> > > 
> > > Overly specific and useless information. It's an implementation detail  
> > 
> > If we place emms_c() so as to ensure that the FPU state is clean
> > before all calls to *alloc() and *free() (as is done in the posted
> > patchset) then we need to document this
> > so others working on the code are aware of it and wont mistakly break
> > it.
> > 
> > If/when a different or more complete solution is implemented then this
> > note needs to be adjusted accordingly.
> > 
> > [...]
> 
> That fixes only the musl-specific issue. It could happen any time again
> with a set of different callers.

yes


> Unless you want to put special effort
> into fixing operation with musl (and keeping it working) it won't help
> much to fix the correctness issues.

i did want to do that, but ive a headache atm and the discussion
about emms yesterday makes me not want to touch emms anymore


> 
> If anything, it should probably say "all external" calls or such.

I see this as the long term goal, yes i agree it should be
be documented if theres consensus about it

The way i imagined it was to fix the practically relevant issues and
anything else thats easy or attached (like x86-64 for what is the same
on x86-32) before 3.2 and have 3.2 work with musl on all archs
then aim at fixing the rest over the following months unless some
roadblocks like performance issues are hit.

This incremental approuch has been hit with quite some opposition so
ill leave this to whoever likes to work on this

[...]
wm4 Oct. 24, 2016, 6:09 p.m. UTC | #8
On Mon, 24 Oct 2016 18:08:11 +0200
Michael Niedermayer <michael@niedermayer.cc> wrote:

> On Mon, Oct 24, 2016 at 02:36:23PM +0200, wm4 wrote:
> > On Mon, 24 Oct 2016 13:24:38 +0200
> > Michael Niedermayer <michael@niedermayer.cc> wrote:
> >   
> > > On Mon, Oct 24, 2016 at 09:38:00AM +0200, wm4 wrote:  
> > > > On Sun, 23 Oct 2016 05:37:25 +0200
> > > > Michael Niedermayer <michael@niedermayer.cc> wrote:
> > > >     
> > > > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > > > ---
> > > > >  libavutil/x86/emms.h | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
> > > > > index 6fda6e2..42c18e2 100644
> > > > > --- a/libavutil/x86/emms.h
> > > > > +++ b/libavutil/x86/emms.h
> > > > > @@ -31,6 +31,8 @@ void avpriv_emms_yasm(void);
> > > > >   * Empty mmx state.
> > > > >   * this must be called between any dsp function and float/double code.
> > > > >   * for example sin(); dsp->idct_put(); emms_c(); cos()
> > > > > + * Note, *alloc() and *free() also use float code in some libc implementations
> > > > > + * thus this also applies to them or any function using them.
> > > > >   */
> > > > >  static av_always_inline void emms_c(void)
> > > > >  {    
> > > > 
> > > > Overly specific and useless information. It's an implementation detail    
> > > 
> > > If we place emms_c() so as to ensure that the FPU state is clean
> > > before all calls to *alloc() and *free() (as is done in the posted
> > > patchset) then we need to document this
> > > so others working on the code are aware of it and wont mistakly break
> > > it.
> > > 
> > > If/when a different or more complete solution is implemented then this
> > > note needs to be adjusted accordingly.
> > > 
> > > [...]  
> > 
> > That fixes only the musl-specific issue. It could happen any time again
> > with a set of different callers.  
> 
> yes
> 
> 
> > Unless you want to put special effort
> > into fixing operation with musl (and keeping it working) it won't help
> > much to fix the correctness issues.  
> 
> i did want to do that, but ive a headache atm and the discussion
> about emms yesterday makes me not want to touch emms anymore
> 
> 
> > 
> > If anything, it should probably say "all external" calls or such.  
> 
> I see this as the long term goal, yes i agree it should be
> be documented if theres consensus about it
> 
> The way i imagined it was to fix the practically relevant issues and
> anything else thats easy or attached (like x86-64 for what is the same
> on x86-32) before 3.2 and have 3.2 work with musl on all archs
> then aim at fixing the rest over the following months unless some
> roadblocks like performance issues are hit.
> 
> This incremental approuch has been hit with quite some opposition so
> ill leave this to whoever likes to work on this

Nobody said it's going to be simple. Though personally I wouldn't mind
spamming emms() everywhere to make it somewhat more correct - just
don't pretend it actually fixes the problem.

There are some other "brutal" solutions:
A) require musl users to build with MMX disabled or have them set an
   environment variable that disables MMX at runtime as part of the CPU
   detection
B) add a compile time option that runs emms() unconditionally at the end
   of each mmx asm block

Since musl intentionally evades detection, neither can be enabled
automatically, probably. It would be interesting to see what the speed
impact of B) actually is.
Henrik Gramner Oct. 24, 2016, 6:30 p.m. UTC | #9
On Mon, Oct 24, 2016 at 8:09 PM, wm4 <nfxjfg@googlemail.com> wrote:
> B) add a compile time option that runs emms() unconditionally at the end
>    of each mmx asm block
>
> Since musl intentionally evades detection, neither can be enabled
> automatically, probably. It would be interesting to see what the speed
> impact of B) actually is.

Depends on the function, but in small/simple DSP functions (which most
relevant MMX functions are) adding emms at the end will make the
entire function twice as slow on most Intel CPUs.

So it's certainly not negligible.
u-9iep@aetey.se Oct. 24, 2016, 6:57 p.m. UTC | #10
On Mon, Oct 24, 2016 at 08:09:09PM +0200, wm4 wrote:
> B) add a compile time option that runs emms() unconditionally at the end
>    of each mmx asm block

> It would be interesting to see what the speed
> impact of B) actually is.

+1

Regards,
Rune
diff mbox

Patch

diff --git a/libavutil/x86/emms.h b/libavutil/x86/emms.h
index 6fda6e2..42c18e2 100644
--- a/libavutil/x86/emms.h
+++ b/libavutil/x86/emms.h
@@ -31,6 +31,8 @@  void avpriv_emms_yasm(void);
  * Empty mmx state.
  * this must be called between any dsp function and float/double code.
  * for example sin(); dsp->idct_put(); emms_c(); cos()
+ * Note, *alloc() and *free() also use float code in some libc implementations
+ * thus this also applies to them or any function using them.
  */
 static av_always_inline void emms_c(void)
 {