diff mbox

[FFmpeg-devel] vp9: change order of operations in adapt_prob().

Message ID 1476464497-82840-1-git-send-email-rsbultje@gmail.com
State Accepted
Commit be885da3427c5d9a6fa68229d16318afffe67193
Headers show

Commit Message

Ronald S. Bultje Oct. 14, 2016, 5:01 p.m. UTC
This is intended to workaround bug "665 Integer Divide Instruction May
Cause Unpredictable Behavior" on some early AMD CPUs, which causes a
div-by-zero in this codepath, such as reported in Mozilla bug #1293996.

Note that this isn't guaranteed to fix the bug, since a compiler is free
to reorder instructions that don't depend on each other. However, it
appears to fix the bug in Firefox, and a similar patch was applied to
libvpx also (see Chrome bug #599899).
---
 libavcodec/vp9.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

James Zern Oct. 14, 2016, 6:09 p.m. UTC | #1
Ronald,

On Fri, Oct 14, 2016 at 10:01 AM, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> This is intended to workaround bug "665 Integer Divide Instruction May
> Cause Unpredictable Behavior" on some early AMD CPUs, which causes a
> div-by-zero in this codepath, such as reported in Mozilla bug #1293996.
>
> Note that this isn't guaranteed to fix the bug, since a compiler is free
> to reorder instructions that don't depend on each other. However, it
> appears to fix the bug in Firefox, and a similar patch was applied to
> libvpx also (see Chrome bug #599899).
>

I recently made a few additional changes as this regressed in chrome
[1][2], but just like this change there's no guarantee it won't occur
again.

[1] https://chromium.googlesource.com/webm/libvpx/+/8b4210940ce4183d4cfded42c323612c0c6d1688
[2] https://chromium.googlesource.com/webm/libvpx/+/82ea74223771793628dbd812c2fd50afcfb8183a

> ---
>  libavcodec/vp9.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>

lgtm

> diff --git a/libavcodec/vp9.c b/libavcodec/vp9.c
> index cb2a4a2..3b72149 100644
> --- a/libavcodec/vp9.c
> +++ b/libavcodec/vp9.c
> @@ -3705,11 +3705,10 @@ static av_always_inline void adapt_prob(uint8_t *p, unsigned ct0, unsigned ct1,
>      if (!ct)
>          return;
>
> +    update_factor = FASTDIV(update_factor * FFMIN(ct, max_count), max_count);
>      p1 = *p;
> -    p2 = ((ct0 << 8) + (ct >> 1)) / ct;
> +    p2 = ((((int64_t) ct0) << 8) + (ct >> 1)) / ct;
>      p2 = av_clip(p2, 1, 255);
> -    ct = FFMIN(ct, max_count);
> -    update_factor = FASTDIV(update_factor * ct, max_count);
>
>      // (p1 * (256 - update_factor) + p2 * update_factor + 128) >> 8
>      *p = p1 + (((p2 - p1) * update_factor + 128) >> 8);
> --
> 2.8.1
>
Michael Niedermayer Oct. 14, 2016, 6:29 p.m. UTC | #2
On Fri, Oct 14, 2016 at 11:09:30AM -0700, James Zern wrote:
> Ronald,
> 
> On Fri, Oct 14, 2016 at 10:01 AM, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> > This is intended to workaround bug "665 Integer Divide Instruction May
> > Cause Unpredictable Behavior" on some early AMD CPUs, which causes a
> > div-by-zero in this codepath, such as reported in Mozilla bug #1293996.
> >
> > Note that this isn't guaranteed to fix the bug, since a compiler is free
> > to reorder instructions that don't depend on each other. However, it
> > appears to fix the bug in Firefox, and a similar patch was applied to
> > libvpx also (see Chrome bug #599899).
> >
> 
> I recently made a few additional changes as this regressed in chrome
> [1][2], but just like this change there's no guarantee it won't occur
> again.

maybe you can use empty "asm volatile(:::"memory")" statments to
prevent unwanted instruction reordering by the compiler
never tried something like this so dunno, also it would be specific
to gcc compatible compilers but should not be architecture specific

[...]
Michael Niedermayer Oct. 14, 2016, 6:31 p.m. UTC | #3
On Fri, Oct 14, 2016 at 08:29:37PM +0200, Michael Niedermayer wrote:
> On Fri, Oct 14, 2016 at 11:09:30AM -0700, James Zern wrote:
> > Ronald,
> > 
> > On Fri, Oct 14, 2016 at 10:01 AM, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> > > This is intended to workaround bug "665 Integer Divide Instruction May
> > > Cause Unpredictable Behavior" on some early AMD CPUs, which causes a
> > > div-by-zero in this codepath, such as reported in Mozilla bug #1293996.
> > >
> > > Note that this isn't guaranteed to fix the bug, since a compiler is free
> > > to reorder instructions that don't depend on each other. However, it
> > > appears to fix the bug in Firefox, and a similar patch was applied to
> > > libvpx also (see Chrome bug #599899).
> > >
> > 
> > I recently made a few additional changes as this regressed in chrome
> > [1][2], but just like this change there's no guarantee it won't occur
> > again.
> 
> maybe you can use empty "asm volatile(:::"memory")" statments to
> prevent unwanted instruction reordering by the compiler
> never tried something like this so dunno, also it would be specific
> to gcc compatible compilers but should not be architecture specific

thinking again, why dont you write the function in asm for x86 ?
this would take the compiler out of the equation

[...]
Ronald S. Bultje Oct. 14, 2016, 6:54 p.m. UTC | #4
Hi Michael,

On Fri, Oct 14, 2016 at 2:31 PM, Michael Niedermayer <michael@niedermayer.cc
> wrote:

> On Fri, Oct 14, 2016 at 08:29:37PM +0200, Michael Niedermayer wrote:
> > On Fri, Oct 14, 2016 at 11:09:30AM -0700, James Zern wrote:
> > > Ronald,
> > >
> > > On Fri, Oct 14, 2016 at 10:01 AM, Ronald S. Bultje <rsbultje@gmail.com>
> wrote:
> > > > This is intended to workaround bug "665 Integer Divide Instruction
> May
> > > > Cause Unpredictable Behavior" on some early AMD CPUs, which causes a
> > > > div-by-zero in this codepath, such as reported in Mozilla bug
> #1293996.
> > > >
> > > > Note that this isn't guaranteed to fix the bug, since a compiler is
> free
> > > > to reorder instructions that don't depend on each other. However, it
> > > > appears to fix the bug in Firefox, and a similar patch was applied to
> > > > libvpx also (see Chrome bug #599899).
> > > >
> > >
> > > I recently made a few additional changes as this regressed in chrome
> > > [1][2], but just like this change there's no guarantee it won't occur
> > > again.
> >
> > maybe you can use empty "asm volatile(:::"memory")" statments to
> > prevent unwanted instruction reordering by the compiler
> > never tried something like this so dunno, also it would be specific
> > to gcc compatible compilers but should not be architecture specific
>
> thinking again, why dont you write the function in asm for x86 ?
> this would take the compiler out of the equation


I think the primary reason is that "this seems to work". Don't forget that
the bug is in the hardware, not in the code, so I don't want to make the
code needlessly (well... maybe that's debatable) complicated for a problem
that isn't really ours...

But I guess I'm open to hearing everyone else's opinion on this - if people
want me to make the workaround more persistent I can work on that.

Ronald
James Zern Oct. 14, 2016, 6:55 p.m. UTC | #5
On Fri, Oct 14, 2016 at 11:31 AM, Michael Niedermayer
<michael@niedermayer.cc> wrote:
> On Fri, Oct 14, 2016 at 08:29:37PM +0200, Michael Niedermayer wrote:
>> On Fri, Oct 14, 2016 at 11:09:30AM -0700, James Zern wrote:
>> > Ronald,
>> >
>> > On Fri, Oct 14, 2016 at 10:01 AM, Ronald S. Bultje <rsbultje@gmail.com> wrote:
>> > > This is intended to workaround bug "665 Integer Divide Instruction May
>> > > Cause Unpredictable Behavior" on some early AMD CPUs, which causes a
>> > > div-by-zero in this codepath, such as reported in Mozilla bug #1293996.
>> > >
>> > > Note that this isn't guaranteed to fix the bug, since a compiler is free
>> > > to reorder instructions that don't depend on each other. However, it
>> > > appears to fix the bug in Firefox, and a similar patch was applied to
>> > > libvpx also (see Chrome bug #599899).
>> > >
>> >
>> > I recently made a few additional changes as this regressed in chrome
>> > [1][2], but just like this change there's no guarantee it won't occur
>> > again.
>>
>> maybe you can use empty "asm volatile(:::"memory")" statments to
>> prevent unwanted instruction reordering by the compiler
>> never tried something like this so dunno, also it would be specific
>> to gcc compatible compilers but should not be architecture specific
>
> thinking again, why dont you write the function in asm for x86 ?
> this would take the compiler out of the equation
>

That could work, I started with a portion in assembly before making
the C changes, but didn't follow through with it.

> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> The educated differ from the uneducated as much as the living from the
> dead. -- Aristotle
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
James Zern Oct. 14, 2016, 7:19 p.m. UTC | #6
On Fri, Oct 14, 2016 at 11:54 AM, Ronald S. Bultje <rsbultje@gmail.com> wrote:
> Hi Michael,
>
> On Fri, Oct 14, 2016 at 2:31 PM, Michael Niedermayer <michael@niedermayer.cc
>> wrote:
>
>> On Fri, Oct 14, 2016 at 08:29:37PM +0200, Michael Niedermayer wrote:
>> > On Fri, Oct 14, 2016 at 11:09:30AM -0700, James Zern wrote:
>> > > Ronald,
>> > >
>> > > On Fri, Oct 14, 2016 at 10:01 AM, Ronald S. Bultje <rsbultje@gmail.com>
>> wrote:
>> > > > This is intended to workaround bug "665 Integer Divide Instruction
>> May
>> > > > Cause Unpredictable Behavior" on some early AMD CPUs, which causes a
>> > > > div-by-zero in this codepath, such as reported in Mozilla bug
>> #1293996.
>> > > >
>> > > > Note that this isn't guaranteed to fix the bug, since a compiler is
>> free
>> > > > to reorder instructions that don't depend on each other. However, it
>> > > > appears to fix the bug in Firefox, and a similar patch was applied to
>> > > > libvpx also (see Chrome bug #599899).
>> > > >
>> > >
>> > > I recently made a few additional changes as this regressed in chrome
>> > > [1][2], but just like this change there's no guarantee it won't occur
>> > > again.
>> >
>> > maybe you can use empty "asm volatile(:::"memory")" statments to
>> > prevent unwanted instruction reordering by the compiler
>> > never tried something like this so dunno, also it would be specific
>> > to gcc compatible compilers but should not be architecture specific
>>
>> thinking again, why dont you write the function in asm for x86 ?
>> this would take the compiler out of the equation
>
>
> I think the primary reason is that "this seems to work". Don't forget that
> the bug is in the hardware, not in the code, so I don't want to make the
> code needlessly (well... maybe that's debatable) complicated for a problem
> that isn't really ours...
>
> But I guess I'm open to hearing everyone else's opinion on this - if people
> want me to make the workaround more persistent I can work on that.
>

That ended up being my view mostly due to the fact that I didn't have
access to the hardware. Even with assembly there is a
timing/scheduling element, so it may be less fragile, but I think the
bug could still occur.
Ronald S. Bultje Oct. 24, 2016, 8:10 p.m. UTC | #7
Hi,

On Fri, Oct 14, 2016 at 2:09 PM, James Zern <jzern@google.com> wrote:

> Ronald,
>
> On Fri, Oct 14, 2016 at 10:01 AM, Ronald S. Bultje <rsbultje@gmail.com>
> wrote:
> > This is intended to workaround bug "665 Integer Divide Instruction May
> > Cause Unpredictable Behavior" on some early AMD CPUs, which causes a
> > div-by-zero in this codepath, such as reported in Mozilla bug #1293996.
> >
> > Note that this isn't guaranteed to fix the bug, since a compiler is free
> > to reorder instructions that don't depend on each other. However, it
> > appears to fix the bug in Firefox, and a similar patch was applied to
> > libvpx also (see Chrome bug #599899).
> >
>
> I recently made a few additional changes as this regressed in chrome
> [1][2], but just like this change there's no guarantee it won't occur
> again.
>
> [1] https://chromium.googlesource.com/webm/libvpx/+/
> 8b4210940ce4183d4cfded42c323612c0c6d1688
> [2] https://chromium.googlesource.com/webm/libvpx/+/
> 82ea74223771793628dbd812c2fd50afcfb8183a
>
> > ---
> >  libavcodec/vp9.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
>
> lgtm


(Firefox beta crash reports confirmed that the bug is fixed.)

Pushed.

Ronald
diff mbox

Patch

diff --git a/libavcodec/vp9.c b/libavcodec/vp9.c
index cb2a4a2..3b72149 100644
--- a/libavcodec/vp9.c
+++ b/libavcodec/vp9.c
@@ -3705,11 +3705,10 @@  static av_always_inline void adapt_prob(uint8_t *p, unsigned ct0, unsigned ct1,
     if (!ct)
         return;
 
+    update_factor = FASTDIV(update_factor * FFMIN(ct, max_count), max_count);
     p1 = *p;
-    p2 = ((ct0 << 8) + (ct >> 1)) / ct;
+    p2 = ((((int64_t) ct0) << 8) + (ct >> 1)) / ct;
     p2 = av_clip(p2, 1, 255);
-    ct = FFMIN(ct, max_count);
-    update_factor = FASTDIV(update_factor * ct, max_count);
 
     // (p1 * (256 - update_factor) + p2 * update_factor + 128) >> 8
     *p = p1 + (((p2 - p1) * update_factor + 128) >> 8);