diff mbox series

[FFmpeg-devel,3/3] avcodec/x86/diracdsp: Fix incorrect src addressing in dequant_subband_32()

Message ID 20200129215529.17410-3-michael@niedermayer.cc
State Accepted
Headers show
Series [FFmpeg-devel,1/3] avcodec/screenpresso: Optimize sum_delta_flipped() | expand

Checks

Context Check Description
andriy/ffmpeg-patchwork success Make fate finished

Commit Message

Michael Niedermayer Jan. 29, 2020, 9:55 p.m. UTC
Fixes: Segfault (not reproducable with asm, which made this hard to debug)
Fixes: decoding errors
Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
---
 libavcodec/x86/diracdsp.asm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Paul B Mahol Jan. 30, 2020, 10:30 a.m. UTC | #1
probably ok

On 1/29/20, Michael Niedermayer <michael@niedermayer.cc> wrote:
> Fixes: Segfault (not reproducable with asm, which made this hard to debug)
> Fixes: decoding errors
> Fixes:
> 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168
>
> Found-by: continuous fuzzing process
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavcodec/x86/diracdsp.asm | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
> index cc8a26fca5..a18bda113e 100644
> --- a/libavcodec/x86/diracdsp.asm
> +++ b/libavcodec/x86/diracdsp.asm
> @@ -294,8 +294,9 @@ cglobal dequant_subband_32, 7, 7, 4, src, dst, stride,
> qf, qs, tot_v, tot_h
>
>      add    srcq, mmsize
>      add    dstq, mmsize
> -    sub    tot_hd, 4
> +    sub    tot_hq, 4
>      jg     .loop_h
> +    lea    srcq, [srcq + 4*tot_hq]
>
>      add    r3, strideq
>      dec    tot_vd
> --
> 2.17.1
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Michael Niedermayer Jan. 30, 2020, 6:25 p.m. UTC | #2
On Thu, Jan 30, 2020 at 11:30:03AM +0100, Paul B Mahol wrote:
> probably ok

will apply

thx

[...]
James Almer Jan. 30, 2020, 8:14 p.m. UTC | #3
On 1/29/2020 6:55 PM, Michael Niedermayer wrote:
> Fixes: Segfault (not reproducable with asm, which made this hard to debug)
> Fixes: decoding errors
> Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168
> 
> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavcodec/x86/diracdsp.asm | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
> index cc8a26fca5..a18bda113e 100644
> --- a/libavcodec/x86/diracdsp.asm
> +++ b/libavcodec/x86/diracdsp.asm
> @@ -294,8 +294,9 @@ cglobal dequant_subband_32, 7, 7, 4, src, dst, stride, qf, qs, tot_v, tot_h
>  
>      add    srcq, mmsize
>      add    dstq, mmsize
> -    sub    tot_hd, 4
> +    sub    tot_hq, 4

tot_h comes from stack, and on Windows x86_64 the higher 32 bits will be
garbage. This should remain as tot_hd, so the sub instruction will
implicitly clear said high bits (if the value could be negative, then
you'd have to sign extend it instead).

>      jg     .loop_h
> +    lea    srcq, [srcq + 4*tot_hq]
>  
>      add    r3, strideq
>      dec    tot_vd
>
Michael Niedermayer Jan. 30, 2020, 9:11 p.m. UTC | #4
On Thu, Jan 30, 2020 at 05:14:18PM -0300, James Almer wrote:
> On 1/29/2020 6:55 PM, Michael Niedermayer wrote:
> > Fixes: Segfault (not reproducable with asm, which made this hard to debug)
> > Fixes: decoding errors
> > Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168
> > 
> > Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > ---
> >  libavcodec/x86/diracdsp.asm | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
> > index cc8a26fca5..a18bda113e 100644
> > --- a/libavcodec/x86/diracdsp.asm
> > +++ b/libavcodec/x86/diracdsp.asm
> > @@ -294,8 +294,9 @@ cglobal dequant_subband_32, 7, 7, 4, src, dst, stride, qf, qs, tot_v, tot_h
> >  
> >      add    srcq, mmsize
> >      add    dstq, mmsize
> > -    sub    tot_hd, 4
> > +    sub    tot_hq, 4
> 
> tot_h comes from stack, and on Windows x86_64 the higher 32 bits will be
> garbage. This should remain as tot_hd, so the sub instruction will
> implicitly clear said high bits (if the value could be negative, then
> you'd have to sign extend it instead).

well for this to work we need the tot_h value (which is not a multiple of 4
in the case of the bug) after the subtract to correct for that
non mod 4 == 0 case

is this fix below ok ?

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index a18bda113e..17145baf87 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -274,7 +274,7 @@ cglobal dequant_subband_32, 7, 7, 4, src, dst, stride, qf, qs, tot_v, tot_h
     movd   m3, qsd
     SPLATD m2
     SPLATD m3
-    mov    r4, tot_hq
+    mov    r4d, tot_hd
     mov    r3, dstq
 
     .loop_v:
     

[...]
Carl Eugen Hoyos Jan. 30, 2020, 10:11 p.m. UTC | #5
Am Mi., 29. Jan. 2020 um 22:56 Uhr schrieb Michael Niedermayer
<michael@niedermayer.cc>:
>
> Fixes: Segfault (not reproducable with asm, which made this hard to debug)

without?


> Fixes: decoding errors
> Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168
>
> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavcodec/x86/diracdsp.asm | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
> index cc8a26fca5..a18bda113e 100644
> --- a/libavcodec/x86/diracdsp.asm
> +++ b/libavcodec/x86/diracdsp.asm
> @@ -294,8 +294,9 @@ cglobal dequant_subband_32, 7, 7, 4, src, dst, stride, qf, qs, tot_v, tot_h
>
>      add    srcq, mmsize
>      add    dstq, mmsize
> -    sub    tot_hd, 4
> +    sub    tot_hq, 4
>      jg     .loop_h
> +    lea    srcq, [srcq + 4*tot_hq]
>
>      add    r3, strideq
>      dec    tot_vd

Carl Eugen
Michael Niedermayer Jan. 30, 2020, 10:56 p.m. UTC | #6
On Thu, Jan 30, 2020 at 11:11:27PM +0100, Carl Eugen Hoyos wrote:
> Am Mi., 29. Jan. 2020 um 22:56 Uhr schrieb Michael Niedermayer
> <michael@niedermayer.cc>:
> >
> > Fixes: Segfault (not reproducable with asm, which made this hard to debug)
> 
> without?

wellllll
i had the testcase from ossfuzz but that didnt reproduce any anomaly at all
and the back traces from ossfuzz from the server where it did crash are
just useless when asm is involved.

so with a bit of guessing and guessing wrong (thought first its alignment)
and more guessing and looking at the asm i modified the C 
implementation to behave like the SSE4 that then allowed reproducing the crash
and seperately after that building a valid testfile which triggers this
codepath to allow testing the fix.
And a few more things here and there
And still theres was a bug left in the fix :(


Thanks

[...]
Michael Niedermayer Jan. 30, 2020, 11:05 p.m. UTC | #7
On Thu, Jan 30, 2020 at 10:11:05PM +0100, Michael Niedermayer wrote:
> On Thu, Jan 30, 2020 at 05:14:18PM -0300, James Almer wrote:
> > On 1/29/2020 6:55 PM, Michael Niedermayer wrote:
> > > Fixes: Segfault (not reproducable with asm, which made this hard to debug)
> > > Fixes: decoding errors
> > > Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168
> > > 
> > > Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > ---
> > >  libavcodec/x86/diracdsp.asm | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
> > > index cc8a26fca5..a18bda113e 100644
> > > --- a/libavcodec/x86/diracdsp.asm
> > > +++ b/libavcodec/x86/diracdsp.asm
> > > @@ -294,8 +294,9 @@ cglobal dequant_subband_32, 7, 7, 4, src, dst, stride, qf, qs, tot_v, tot_h
> > >  
> > >      add    srcq, mmsize
> > >      add    dstq, mmsize
> > > -    sub    tot_hd, 4
> > > +    sub    tot_hq, 4
> > 
> > tot_h comes from stack, and on Windows x86_64 the higher 32 bits will be
> > garbage. This should remain as tot_hd, so the sub instruction will
> > implicitly clear said high bits (if the value could be negative, then
> > you'd have to sign extend it instead).
> 
> well for this to work we need the tot_h value (which is not a multiple of 4
> in the case of the bug) after the subtract to correct for that
> non mod 4 == 0 case
> 
> is this fix below ok ?

i succeded to reproduce this and the bugfix below on a netbsd VM, which
unexpectedly seems to be affected by this too

on mingw32/64 no issue is reproducable unless i hack the asm to make one which 
then gets fixed with the code.

i dont want to leave the regression i caused in the tree and go to bed 
so i guess ill apply this. But this issue is cursed and my asm is a bit
rusty so i have a bit an uneasy feeling.
If this fix doesnt fix it for real, dont hesitate to revert it and the
buggy fix before

Thanks

[...]
diff mbox series

Patch

diff --git a/libavcodec/x86/diracdsp.asm b/libavcodec/x86/diracdsp.asm
index cc8a26fca5..a18bda113e 100644
--- a/libavcodec/x86/diracdsp.asm
+++ b/libavcodec/x86/diracdsp.asm
@@ -294,8 +294,9 @@  cglobal dequant_subband_32, 7, 7, 4, src, dst, stride, qf, qs, tot_v, tot_h
 
     add    srcq, mmsize
     add    dstq, mmsize
-    sub    tot_hd, 4
+    sub    tot_hq, 4
     jg     .loop_h
+    lea    srcq, [srcq + 4*tot_hq]
 
     add    r3, strideq
     dec    tot_vd