Message ID | 20181213120536.GL3501@michaelspb |
---|---|
State | Accepted |
Headers | show |
On 12/13/18, Michael Niedermayer <michael@niedermayer.cc> wrote: > On Wed, Dec 12, 2018 at 08:08:53PM -0800, Baptiste Coudurier wrote: >> Hi, >> >> > On Dec 12, 2018, at 6:53 PM, chcunningham <chcunningham@chromium.org> >> > wrote: >> > >> > Chromium fuzzing produced a whacky file with extra tkhds. This caused >> > an AVStream that was already in use to be corrupted by assigning it a >> > new id, which blows up later in mov_read_trun because the >> > MOVFragmentStreamInfo.index_entry now points OOB. >> > --- >> > libavformat/isom.h | 3 ++- >> > libavformat/mov.c | 7 +++++++ >> > 2 files changed, 9 insertions(+), 1 deletion(-) >> > >> > diff --git a/libavformat/isom.h b/libavformat/isom.h >> > index e629663949..e14d670f2f 100644 >> > --- a/libavformat/isom.h >> > +++ b/libavformat/isom.h >> > @@ -230,7 +230,8 @@ typedef struct MOVStreamContext { >> > >> > uint32_t format; >> > >> > - int has_sidx; // If there is an sidx entry for this stream. >> > + int has_sidx; ///< If there is a sidx entry for this stream. >> > + int has_tkhd; ///< If there is a tkhd entry for this stream. >> > struct { >> > struct AVAESCTR* aes_ctr; >> > unsigned int per_sample_iv_size; // Either 0, 8, or 16. >> > diff --git a/libavformat/mov.c b/libavformat/mov.c >> > index ec57a05803..47c53d7992 100644 >> > --- a/libavformat/mov.c >> > +++ b/libavformat/mov.c >> > @@ -4438,6 +4438,12 @@ static int mov_read_tkhd(MOVContext *c, >> > AVIOContext *pb, MOVAtom atom) >> > st = c->fc->streams[c->fc->nb_streams-1]; >> > sc = st->priv_data; >> > >> > + // Each stream (trak) should have exactly 1 tkhd. This catches bad >> > files and >> > + // avoids corrupting AVStreams mapped to an earlier tkhd. >> > + if (sc->has_tkhd) >> > + return AVERROR_INVALIDDATA; >> > + sc->has_tkhd = 1; >> > + >> > >> >> Could we just check that st->id is already set ? IIRC 0 is not a valid >> value > > I have at least 2 files which have a id of 0 > Iam not sure where they are from so iam not sure i can share them Haha. > found with: > --- a/libavformat/mov.c > +++ b/libavformat/mov.c > @@ -4450,6 +4450,7 @@ static int mov_read_tkhd(MOVContext *c, AVIOContext > *pb, MOVAtom atom) > avio_rb32(pb); /* modification time */ > } > st->id = (int)avio_rb32(pb); /* track id (NOT 0 !)*/ > + av_assert0(st->id); > avio_rb32(pb); /* reserved */ > > /* highlevel (considering edits) duration in movie timebase */ > > > Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/michael/videos/qtrle/Earth > Spin 1-bit qtrle.mov': > Metadata: > creation_time : 2015-12-20T17:51:06.000000Z > copyright : ©Aroborescence, San Francisco All Rights Reserved. > copyright-eng : ©Aroborescence, San Francisco All Rights Reserved. > comment : Not for reuse or resale > comment-eng : Not for reuse or resale > date : Monday, May 6, 1991 > date-eng : Monday, May 6, 1991 > original_format : Digitized > original_format-eng: Digitized > director : > director-eng : > producer : > producer-eng : > composer : > composer-eng : > performers : > performers-eng : > original_source : > original_source-eng: > Duration: 00:00:03.60, start: 0.000000, bitrate: 203 kb/s > Stream #0:0(eng): Video: qtrle (rle / 0x20656C72), pal8, 186x186, 197 > kb/s, 10 fps, 10 tbr, 600 tbn, 600 tbc (default) > Metadata: > rotate : 0 > creation_time : 2015-12-20T17:51:06.000000Z > handler_name : Apple Alias Data Handler > encoder : Animation - RLE > Side data: > displaymatrix: rotation of -0.00 degrees > > > > [...] > > -- > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB > > When the tyrant has disposed of foreign enemies by conquest or treaty, and > there is nothing more to fear from them, then he is always stirring up > some war or other, in order that the people may require a leader. -- Plato > Indeed.
> I have at least 2 files which have a id of 0 > Iam not sure where they are from so iam not sure i can share them This was my fear as well. Also, we currently default the ID for a new stream to be the number of streams now in the list. I worried that some files may lack a tkhd or could be structured in such a way that they're dependent on this defaulting and might break if I instead defaulted to zero.
Hey guys, > On Dec 13, 2018, at 11:40 AM, Chris Cunningham <chcunningham@chromium.org> wrote: > >> I have at least 2 files which have a id of 0 >> Iam not sure where they are from so iam not sure i can share them Does Quicktime play them ? If not they are broken files. > This was my fear as well. Also, we currently default the ID for a new > stream to be the number of streams now in the list. I worried that some > files may lack a tkhd or could be structured in such a way that they're > dependent on this defaulting and might break if I instead defaulted to > zero. Lacking a tkhd is a broken file, and I know that FFmpeg has a policy of trying to play broken files. Now the good thing is that mp4 is supported everywhere and many other demuxer just refuse to play broken files :) The main issue with this new field is that it really looks like a hack. They are also many atoms that are not supposed to appear twice in a file and can break the code down the flow, so should we add “has_<atom>” for all of them ? "st->id" is not necessary for demuxing AFAIK, please correct me if Im wrong. Would an init value to -1 for st->id work ? Thanks! — Baptiste
> > "st->id" is not necessary for demuxing AFAIK, please correct me if Im > wrong. > Would an init value to -1 for st->id work ? > st->id does get used here and there. For ex, mov_read_trun reads the id to decide which stream corresponds to the current fragment. But if the ID isn't coming from a tkhd (either because none exits, or because you have truns before the tkhd appears), perhaps we can consider invalid. Using a value of -1 would make that easier to detect. It risks breaking some bad files, but I'm fine with that if you'd support it. I'll send a new patch and we can see if Michael finds new breaks. Chris
> But if the ID isn't coming from a tkhd (either because none exits, or because you have truns before the tkhd appears), perhaps we can consider invalid. Taking a closer look at the spec, I think it actually _is valid_ to have truns before tkhd. They "strongly recommend" that the header boxes like tkhd be placed first, but its not a "must". No clue how often ffmpeg encounters such a file, but today's defaulting of id = the number of streams probably facilitates playback in such cases because the probability of tkhd's track id matching that default is reasonably high. Still, I'll take a stab at the approach, if just for discussion/testing. On Thu, Dec 13, 2018 at 12:39 PM Chris Cunningham <chcunningham@chromium.org> wrote: > "st->id" is not necessary for demuxing AFAIK, please correct me if Im >> wrong. >> Would an init value to -1 for st->id work ? >> > > st->id does get used here and there. For ex, mov_read_trun reads the id to > decide which stream corresponds to the current fragment. But if the ID > isn't coming from a tkhd (either because none exits, or because you have > truns before the tkhd appears), perhaps we can consider invalid. Using a > value of -1 would make that easier to detect. It risks breaking some bad > files, but I'm fine with that if you'd support it. I'll send a new patch > and we can see if Michael finds new breaks. > > Chris >
--- a/libavformat/mov.c +++ b/libavformat/mov.c @@ -4450,6 +4450,7 @@ static int mov_read_tkhd(MOVContext *c, AVIOContext *pb, MOVAtom atom) avio_rb32(pb); /* modification time */ } st->id = (int)avio_rb32(pb); /* track id (NOT 0 !)*/ + av_assert0(st->id); avio_rb32(pb); /* reserved */ /* highlevel (considering edits) duration in movie timebase */