diff mbox

[FFmpeg-devel] lavf/mov: ensure only one tkhd per trak

Message ID 20181213120536.GL3501@michaelspb
State Accepted
Headers show

Commit Message

Michael Niedermayer Dec. 13, 2018, 12:05 p.m. UTC
On Wed, Dec 12, 2018 at 08:08:53PM -0800, Baptiste Coudurier wrote:
> Hi,
> 
> > On Dec 12, 2018, at 6:53 PM, chcunningham <chcunningham@chromium.org> wrote:
> > 
> > Chromium fuzzing produced a whacky file with extra tkhds. This caused
> > an AVStream that was already in use to be corrupted by assigning it a
> > new id, which blows up later in mov_read_trun because the
> > MOVFragmentStreamInfo.index_entry now points OOB.
> > ---
> > libavformat/isom.h | 3 ++-
> > libavformat/mov.c  | 7 +++++++
> > 2 files changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libavformat/isom.h b/libavformat/isom.h
> > index e629663949..e14d670f2f 100644
> > --- a/libavformat/isom.h
> > +++ b/libavformat/isom.h
> > @@ -230,7 +230,8 @@ typedef struct MOVStreamContext {
> > 
> >     uint32_t format;
> > 
> > -    int has_sidx;  // If there is an sidx entry for this stream.
> > +    int has_sidx;  ///< If there is a sidx entry for this stream.
> > +    int has_tkhd;  ///< If there is a tkhd entry for this stream.
> >     struct {
> >         struct AVAESCTR* aes_ctr;
> >         unsigned int per_sample_iv_size;  // Either 0, 8, or 16.
> > diff --git a/libavformat/mov.c b/libavformat/mov.c
> > index ec57a05803..47c53d7992 100644
> > --- a/libavformat/mov.c
> > +++ b/libavformat/mov.c
> > @@ -4438,6 +4438,12 @@ static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> >     st = c->fc->streams[c->fc->nb_streams-1];
> >     sc = st->priv_data;
> > 
> > +    // Each stream (trak) should have exactly 1 tkhd. This catches bad files and
> > +    // avoids corrupting AVStreams mapped to an earlier tkhd.
> > +    if (sc->has_tkhd)
> > +        return AVERROR_INVALIDDATA;
> > +    sc->has_tkhd = 1;
> > +
> > 
> 
> Could we just check that st->id is already set ? IIRC 0 is not a valid value

I have at least 2 files which have a id of 0
Iam not sure where they are from so iam not sure i can share them

found with:


Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/michael/videos/qtrle/Earth Spin 1-bit qtrle.mov':
  Metadata:
    creation_time   : 2015-12-20T17:51:06.000000Z
    copyright       : ©Aroborescence, San Francisco All Rights Reserved.
    copyright-eng   : ©Aroborescence, San Francisco All Rights Reserved.
    comment         : Not for reuse or resale
    comment-eng     : Not for reuse or resale
    date            : Monday, May 6, 1991
    date-eng        : Monday, May 6, 1991
    original_format : Digitized
    original_format-eng: Digitized
    director        :    
    director-eng    :    
    producer        :    
    producer-eng    :    
    composer        :    
    composer-eng    :    
    performers      :      
    performers-eng  :      
    original_source :     
    original_source-eng:     
  Duration: 00:00:03.60, start: 0.000000, bitrate: 203 kb/s
    Stream #0:0(eng): Video: qtrle (rle  / 0x20656C72), pal8, 186x186, 197 kb/s, 10 fps, 10 tbr, 600 tbn, 600 tbc (default)
    Metadata:
      rotate          : 0
      creation_time   : 2015-12-20T17:51:06.000000Z
      handler_name    : Apple Alias Data Handler
      encoder         : Animation - RLE
    Side data:
      displaymatrix: rotation of -0.00 degrees



[...]

Comments

Paul B Mahol Dec. 13, 2018, 12:12 p.m. UTC | #1
On 12/13/18, Michael Niedermayer <michael@niedermayer.cc> wrote:
> On Wed, Dec 12, 2018 at 08:08:53PM -0800, Baptiste Coudurier wrote:
>> Hi,
>>
>> > On Dec 12, 2018, at 6:53 PM, chcunningham <chcunningham@chromium.org>
>> > wrote:
>> >
>> > Chromium fuzzing produced a whacky file with extra tkhds. This caused
>> > an AVStream that was already in use to be corrupted by assigning it a
>> > new id, which blows up later in mov_read_trun because the
>> > MOVFragmentStreamInfo.index_entry now points OOB.
>> > ---
>> > libavformat/isom.h | 3 ++-
>> > libavformat/mov.c  | 7 +++++++
>> > 2 files changed, 9 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/libavformat/isom.h b/libavformat/isom.h
>> > index e629663949..e14d670f2f 100644
>> > --- a/libavformat/isom.h
>> > +++ b/libavformat/isom.h
>> > @@ -230,7 +230,8 @@ typedef struct MOVStreamContext {
>> >
>> >     uint32_t format;
>> >
>> > -    int has_sidx;  // If there is an sidx entry for this stream.
>> > +    int has_sidx;  ///< If there is a sidx entry for this stream.
>> > +    int has_tkhd;  ///< If there is a tkhd entry for this stream.
>> >     struct {
>> >         struct AVAESCTR* aes_ctr;
>> >         unsigned int per_sample_iv_size;  // Either 0, 8, or 16.
>> > diff --git a/libavformat/mov.c b/libavformat/mov.c
>> > index ec57a05803..47c53d7992 100644
>> > --- a/libavformat/mov.c
>> > +++ b/libavformat/mov.c
>> > @@ -4438,6 +4438,12 @@ static int mov_read_tkhd(MOVContext *c,
>> > AVIOContext *pb, MOVAtom atom)
>> >     st = c->fc->streams[c->fc->nb_streams-1];
>> >     sc = st->priv_data;
>> >
>> > +    // Each stream (trak) should have exactly 1 tkhd. This catches bad
>> > files and
>> > +    // avoids corrupting AVStreams mapped to an earlier tkhd.
>> > +    if (sc->has_tkhd)
>> > +        return AVERROR_INVALIDDATA;
>> > +    sc->has_tkhd = 1;
>> > +
>> >
>>
>> Could we just check that st->id is already set ? IIRC 0 is not a valid
>> value
>
> I have at least 2 files which have a id of 0
> Iam not sure where they are from so iam not sure i can share them


Haha.


> found with:
> --- a/libavformat/mov.c
> +++ b/libavformat/mov.c
> @@ -4450,6 +4450,7 @@ static int mov_read_tkhd(MOVContext *c, AVIOContext
> *pb, MOVAtom atom)
>          avio_rb32(pb); /* modification time */
>      }
>      st->id = (int)avio_rb32(pb); /* track id (NOT 0 !)*/
> +    av_assert0(st->id);
>      avio_rb32(pb); /* reserved */
>
>      /* highlevel (considering edits) duration in movie timebase */
>
>
> Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/michael/videos/qtrle/Earth
> Spin 1-bit qtrle.mov':
>   Metadata:
>     creation_time   : 2015-12-20T17:51:06.000000Z
>     copyright       : ©Aroborescence, San Francisco All Rights Reserved.
>     copyright-eng   : ©Aroborescence, San Francisco All Rights Reserved.
>     comment         : Not for reuse or resale
>     comment-eng     : Not for reuse or resale
>     date            : Monday, May 6, 1991
>     date-eng        : Monday, May 6, 1991
>     original_format : Digitized
>     original_format-eng: Digitized
>     director        :
>     director-eng    :
>     producer        :
>     producer-eng    :
>     composer        :
>     composer-eng    :
>     performers      :
>     performers-eng  :
>     original_source :
>     original_source-eng:
>   Duration: 00:00:03.60, start: 0.000000, bitrate: 203 kb/s
>     Stream #0:0(eng): Video: qtrle (rle  / 0x20656C72), pal8, 186x186, 197
> kb/s, 10 fps, 10 tbr, 600 tbn, 600 tbc (default)
>     Metadata:
>       rotate          : 0
>       creation_time   : 2015-12-20T17:51:06.000000Z
>       handler_name    : Apple Alias Data Handler
>       encoder         : Animation - RLE
>     Side data:
>       displaymatrix: rotation of -0.00 degrees
>
>
>
> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> When the tyrant has disposed of foreign enemies by conquest or treaty, and
> there is nothing more to fear from them, then he is always stirring up
> some war or other, in order that the people may require a leader. -- Plato
>

Indeed.
chcunningham@chromium.org Dec. 13, 2018, 7:40 p.m. UTC | #2
> I have at least 2 files which have a id of 0
> Iam not sure where they are from so iam not sure i can share them

This was my fear as well. Also, we currently default the ID for a new
stream to be the number of streams now in the list. I worried that some
files may lack a tkhd or could be structured in such a way that they're
dependent on this defaulting and might break if I instead defaulted to
zero.
Baptiste Coudurier Dec. 13, 2018, 8:01 p.m. UTC | #3
Hey guys,

> On Dec 13, 2018, at 11:40 AM, Chris Cunningham <chcunningham@chromium.org> wrote:
> 
>> I have at least 2 files which have a id of 0
>> Iam not sure where they are from so iam not sure i can share them

Does Quicktime play them ? If not they are broken files.

> This was my fear as well. Also, we currently default the ID for a new
> stream to be the number of streams now in the list. I worried that some
> files may lack a tkhd or could be structured in such a way that they're
> dependent on this defaulting and might break if I instead defaulted to
> zero.

Lacking a tkhd is a broken file, and I know that FFmpeg has a policy of trying to play broken files.
Now the good thing is that mp4 is supported everywhere and many other demuxer just refuse to play broken files :)

The main issue with this new field is that it really looks like a hack.
They are also many atoms that are not supposed to appear twice in a file and can break
the code down the flow, so should we add “has_<atom>” for all of them ?

"st->id" is not necessary for demuxing AFAIK, please correct me if Im wrong.
Would an init value to -1 for st->id work ?

Thanks!

— 
Baptiste
chcunningham@chromium.org Dec. 13, 2018, 8:39 p.m. UTC | #4
>
> "st->id" is not necessary for demuxing AFAIK, please correct me if Im
> wrong.
> Would an init value to -1 for st->id work ?
>

st->id does get used here and there. For ex, mov_read_trun reads the id to
decide which stream corresponds to the current fragment. But if the ID
isn't coming from a tkhd (either because none exits, or because you have
truns before the tkhd appears), perhaps we can consider invalid. Using a
value of -1 would make that easier to detect. It risks breaking some bad
files, but I'm fine with that if you'd support it. I'll send a new patch
and we can see if Michael finds new breaks.

Chris
chcunningham@chromium.org Dec. 13, 2018, 8:54 p.m. UTC | #5
> But if the ID isn't coming from a tkhd (either because none exits, or
because you have truns before the tkhd appears), perhaps we can consider
invalid.

Taking a closer look at the spec, I think it actually _is valid_ to have
truns before tkhd. They "strongly recommend" that the header boxes like
tkhd be placed first, but its not a "must". No clue how often ffmpeg
encounters such a file, but today's defaulting of id = the number of
streams probably facilitates playback in such cases because the probability
of tkhd's track id matching that default is reasonably high.

Still, I'll take a stab at the approach, if just for discussion/testing.

On Thu, Dec 13, 2018 at 12:39 PM Chris Cunningham <chcunningham@chromium.org>
wrote:

> "st->id" is not necessary for demuxing AFAIK, please correct me if Im
>> wrong.
>> Would an init value to -1 for st->id work ?
>>
>
> st->id does get used here and there. For ex, mov_read_trun reads the id to
> decide which stream corresponds to the current fragment. But if the ID
> isn't coming from a tkhd (either because none exits, or because you have
> truns before the tkhd appears), perhaps we can consider invalid. Using a
> value of -1 would make that easier to detect. It risks breaking some bad
> files, but I'm fine with that if you'd support it. I'll send a new patch
> and we can see if Michael finds new breaks.
>
> Chris
>
diff mbox

Patch

--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -4450,6 +4450,7 @@  static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
         avio_rb32(pb); /* modification time */
     }
     st->id = (int)avio_rb32(pb); /* track id (NOT 0 !)*/
+    av_assert0(st->id);
     avio_rb32(pb); /* reserved */
 
     /* highlevel (considering edits) duration in movie timebase */