diff mbox

[FFmpeg-devel,v3] avformat/mp3dec: Fixes misdetection of zYLx.wav

Message ID 20191107113743.24476-1-lance.lmwang@gmail.com
State Superseded
Headers show

Commit Message

Lance Wang Nov. 7, 2019, 11:37 a.m. UTC
From: Limin Wang <lance.lmwang@gmail.com>

Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
---
 libavformat/mp3dec.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Hendrik Leppkes Nov. 7, 2019, 12:16 p.m. UTC | #1
On Thu, Nov 7, 2019 at 12:38 PM <lance.lmwang@gmail.com> wrote:
>
> From: Limin Wang <lance.lmwang@gmail.com>
>
> Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
> ---
>  libavformat/mp3dec.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/libavformat/mp3dec.c b/libavformat/mp3dec.c
> index 258f19174b..f15045dd6f 100644
> --- a/libavformat/mp3dec.c
> +++ b/libavformat/mp3dec.c
> @@ -73,6 +73,7 @@ static int mp3_read_probe(const AVProbeData *p)
>      int frames, ret;
>      int framesizes, max_framesizes;
>      uint32_t header;
> +    uint32_t next;
>      const uint8_t *buf, *buf0, *buf2, *end;
>
>      buf0 = p->buf;
> @@ -93,6 +94,10 @@ static int mp3_read_probe(const AVProbeData *p)
>              ret = avpriv_mpegaudio_decode_header(&h, header);
>              if (ret != 0)
>                  break;
> +            next = AV_RB32(buf2 + 4);
> +            /* detect invalid data after header */
> +            if ((header & 0xfffe0000) == (next & 0xfffe0000))
> +                break;
>              buf2 += h.frame_size;
>              framesizes += h.frame_size;
>          }

This seems like one check designed to fix exactly one file. Thats
really not the kind of checks that seem appropriate here.
I'm not sure you can conclusively proof that the data after the header
cannot have a certain bit-pattern, since audio data immediately starts
here.

- Hendrik
Lance Wang Nov. 7, 2019, 2:40 p.m. UTC | #2
On Thu, Nov 07, 2019 at 01:16:32PM +0100, Hendrik Leppkes wrote:
> On Thu, Nov 7, 2019 at 12:38 PM <lance.lmwang@gmail.com> wrote:
> >
> > From: Limin Wang <lance.lmwang@gmail.com>
> >
> > Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
> > ---
> >  libavformat/mp3dec.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/libavformat/mp3dec.c b/libavformat/mp3dec.c
> > index 258f19174b..f15045dd6f 100644
> > --- a/libavformat/mp3dec.c
> > +++ b/libavformat/mp3dec.c
> > @@ -73,6 +73,7 @@ static int mp3_read_probe(const AVProbeData *p)
> >      int frames, ret;
> >      int framesizes, max_framesizes;
> >      uint32_t header;
> > +    uint32_t next;
> >      const uint8_t *buf, *buf0, *buf2, *end;
> >
> >      buf0 = p->buf;
> > @@ -93,6 +94,10 @@ static int mp3_read_probe(const AVProbeData *p)
> >              ret = avpriv_mpegaudio_decode_header(&h, header);
> >              if (ret != 0)
> >                  break;
> > +            next = AV_RB32(buf2 + 4);
> > +            /* detect invalid data after header */
> > +            if ((header & 0xfffe0000) == (next & 0xfffe0000))
> > +                break;
> >              buf2 += h.frame_size;
> >              framesizes += h.frame_size;
> >          }
> 
> This seems like one check designed to fix exactly one file. Thats
> really not the kind of checks that seem appropriate here.
> I'm not sure you can conclusively proof that the data after the header
> cannot have a certain bit-pattern, since audio data immediately starts
> here.

0xffe is the sync word for mp3(0xfffe is sync word for MPEG Version 1 layerI)
if the next sync word is near to the first sync word, that's means no frame
data between them, so we can consider it's invalid header. For the
current sample is such pattern, so I check the kind of case only.


> 
> - Hendrik
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Hendrik Leppkes Nov. 7, 2019, 4:37 p.m. UTC | #3
On Thu, Nov 7, 2019 at 3:40 PM Limin Wang <lance.lmwang@gmail.com> wrote:
>
> On Thu, Nov 07, 2019 at 01:16:32PM +0100, Hendrik Leppkes wrote:
> > On Thu, Nov 7, 2019 at 12:38 PM <lance.lmwang@gmail.com> wrote:
> > >
> > > From: Limin Wang <lance.lmwang@gmail.com>
> > >
> > > Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
> > > ---
> > >  libavformat/mp3dec.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/libavformat/mp3dec.c b/libavformat/mp3dec.c
> > > index 258f19174b..f15045dd6f 100644
> > > --- a/libavformat/mp3dec.c
> > > +++ b/libavformat/mp3dec.c
> > > @@ -73,6 +73,7 @@ static int mp3_read_probe(const AVProbeData *p)
> > >      int frames, ret;
> > >      int framesizes, max_framesizes;
> > >      uint32_t header;
> > > +    uint32_t next;
> > >      const uint8_t *buf, *buf0, *buf2, *end;
> > >
> > >      buf0 = p->buf;
> > > @@ -93,6 +94,10 @@ static int mp3_read_probe(const AVProbeData *p)
> > >              ret = avpriv_mpegaudio_decode_header(&h, header);
> > >              if (ret != 0)
> > >                  break;
> > > +            next = AV_RB32(buf2 + 4);
> > > +            /* detect invalid data after header */
> > > +            if ((header & 0xfffe0000) == (next & 0xfffe0000))
> > > +                break;
> > >              buf2 += h.frame_size;
> > >              framesizes += h.frame_size;
> > >          }
> >
> > This seems like one check designed to fix exactly one file. Thats
> > really not the kind of checks that seem appropriate here.
> > I'm not sure you can conclusively proof that the data after the header
> > cannot have a certain bit-pattern, since audio data immediately starts
> > here.
>
> 0xffe is the sync word for mp3(0xfffe is sync word for MPEG Version 1 layerI)
> if the next sync word is near to the first sync word, that's means no frame
> data between them, so we can consider it's invalid header. For the
> current sample is such pattern, so I check the kind of case only.
>

That explanation is not sufficient. What guarantees that the audio
data that follows the header cannot actually have this particular
pattern?

You can't fix every of these cases by such bit-pattern checks. PCM is
raw data, it could have any pattern, including a full valid MP3 frame
header. A better way to detect this is to check if there is multiple
MP3 frames at the correct position in the file, which reduces the
likelyness of such a misdetection drastically.

- Hendrik
diff mbox

Patch

diff --git a/libavformat/mp3dec.c b/libavformat/mp3dec.c
index 258f19174b..f15045dd6f 100644
--- a/libavformat/mp3dec.c
+++ b/libavformat/mp3dec.c
@@ -73,6 +73,7 @@  static int mp3_read_probe(const AVProbeData *p)
     int frames, ret;
     int framesizes, max_framesizes;
     uint32_t header;
+    uint32_t next;
     const uint8_t *buf, *buf0, *buf2, *end;
 
     buf0 = p->buf;
@@ -93,6 +94,10 @@  static int mp3_read_probe(const AVProbeData *p)
             ret = avpriv_mpegaudio_decode_header(&h, header);
             if (ret != 0)
                 break;
+            next = AV_RB32(buf2 + 4);
+            /* detect invalid data after header */
+            if ((header & 0xfffe0000) == (next & 0xfffe0000))
+                break;
             buf2 += h.frame_size;
             framesizes += h.frame_size;
         }