Message ID | 20190327111852.3784-14-andreas.rheinhardt@googlemail.com |
---|---|
State | New |
Headers | show |
On Wed, Mar 27, 2019 at 12:18:44PM +0100, Andreas Rheinhardt via ffmpeg-devel wrote: > The earlier code had three flaws: > > 1. The case of an unknown-sized element inside a finite-sized element > (which is against the specifications) was not caught. > > 2. The error message wasn't helpful: It compared the length of the child > with the offset of the end of the parent and claimed that the first > exceeds the latter, although that is not necessarily true. > > 3. Unknown-sized elements that are not parsed can't be skipped. Given > that according to the Matroska specifications only the segment and the > clusters can be of unknown-size, this is handled by not allowing any > other units to have infinite size whereas the earlier code would seek > back by 1 byte upon encountering an infinite-size element that ought > to be skipped. > > Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com> > --- > libavformat/matroskadec.c | 26 ++++++++++++++++++++++---- > 1 file changed, 22 insertions(+), 4 deletions(-) will apply thanks [...]
On 3/27/2019 12:18 PM, Andreas Rheinhardt via ffmpeg-devel wrote: > The earlier code had three flaws: > > 1. The case of an unknown-sized element inside a finite-sized element > (which is against the specifications) was not caught. > > 2. The error message wasn't helpful: It compared the length of the child > with the offset of the end of the parent and claimed that the first > exceeds the latter, although that is not necessarily true. > > 3. Unknown-sized elements that are not parsed can't be skipped. Given > that according to the Matroska specifications only the segment and the > clusters can be of unknown-size, this is handled by not allowing any This restriction is rather new. There might be old files that use infinite sizes in other places. > other units to have infinite size whereas the earlier code would seek > back by 1 byte upon encountering an infinite-size element that ought > to be skipped. > > Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com> > --- > libavformat/matroskadec.c | 26 ++++++++++++++++++++++---- > 1 file changed, 22 insertions(+), 4 deletions(-) > > diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c > index 6fa324c0cc..0179e5426e 100644 > --- a/libavformat/matroskadec.c > +++ b/libavformat/matroskadec.c > @@ -1180,11 +1180,29 @@ static int ebml_parse_elem(MatroskaDemuxContext *matroska, > if (matroska->num_levels > 0) { > MatroskaLevel *level = &matroska->levels[matroska->num_levels - 1]; > int64_t pos = avio_tell(pb); > - if (level->length != EBML_UNKNOWN_LENGTH && > - (pos + length) > (level->start + level->length)) { > + > + if (length != EBML_UNKNOWN_LENGTH && > + level->length != EBML_UNKNOWN_LENGTH) { > + uint64_t elem_end = pos + length, > + level_end = level->start + level->length; > + > + if (level_end < elem_end) { > + av_log(matroska->ctx, AV_LOG_ERROR, > + "Element at 0x%"PRIx64" ending at 0x%"PRIx64" exceeds " > + "containing master element ending at 0x%"PRIx64"\n", > + pos, elem_end, level_end); > + return AVERROR_INVALIDDATA; > + } > + } else if (level->length != EBML_UNKNOWN_LENGTH) { > + av_log(matroska->ctx, AV_LOG_ERROR, "Unknown-sized element " > + "at 0x%"PRIx64" inside parent with finite size\n", pos); > + return AVERROR_INVALIDDATA; IMO there's no reason to return an error here. You can log the error and still parse the data, it should end up fine (if the code is done as such that you can never read past your parents boundaries). At least libebml can deal with that. > + } else if (length == EBML_UNKNOWN_LENGTH && id != MATROSKA_ID_CLUSTER) { > + // According to the specifications only clusters and segments > + // are allowed to be unknown-sized. > av_log(matroska->ctx, AV_LOG_ERROR, > - "Invalid length 0x%"PRIx64" > 0x%"PRIx64" in parent\n", > - length, level->start + level->length); > + "Found unknown-sized element other than a cluster at " > + "0x%"PRIx64". Dropping the invalid element.\n", pos); > return AVERROR_INVALIDDATA; > } > } > -- > 2.19.2 > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Steve Lhomme: > On 3/27/2019 12:18 PM, Andreas Rheinhardt via ffmpeg-devel wrote: >> The earlier code had three flaws: >> >> 1. The case of an unknown-sized element inside a finite-sized element >> (which is against the specifications) was not caught. >> >> 2. The error message wasn't helpful: It compared the length of the >> child >> with the offset of the end of the parent and claimed that the first >> exceeds the latter, although that is not necessarily true. >> >> 3. Unknown-sized elements that are not parsed can't be skipped. Given >> that according to the Matroska specifications only the segment and the >> clusters can be of unknown-size, this is handled by not allowing any > > This restriction is rather new. There might be old files that use > infinite sizes in other places. > Indeed there are. The file mentioned by Dale (https://ffmpeg.org/pipermail/ffmpeg-devel/2019-February/240376.html; https://cs.chromium.org/chromium/src/media/test/data/bear-320x240-live.webm) has cues where every master element (even CueTrackPositions) are written as unknown-sized element (with length fields coded on eight bytes). The Cues aren't referenced in a Seekhead at the beginning though, so that FFmpeg doesn't try to read them. If they were referenced, neither current nor patched FFmpeg would be able to properly parse them and patched FFmpeg would furthermore give an error message. When exactly has this restriction been enacted? I'll see if I find a way to support such elements (although I am not sure whether it is worthwhile). >> other units to have infinite size whereas the earlier code would seek >> back by 1 byte upon encountering an infinite-size element that ought >> to be skipped. >> >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com> >> --- >> libavformat/matroskadec.c | 26 ++++++++++++++++++++++---- >> 1 file changed, 22 insertions(+), 4 deletions(-) >> >> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c >> index 6fa324c0cc..0179e5426e 100644 >> --- a/libavformat/matroskadec.c >> +++ b/libavformat/matroskadec.c >> @@ -1180,11 +1180,29 @@ static int >> ebml_parse_elem(MatroskaDemuxContext *matroska, >> if (matroska->num_levels > 0) { >> MatroskaLevel *level = >> &matroska->levels[matroska->num_levels - 1]; >> int64_t pos = avio_tell(pb); >> - if (level->length != EBML_UNKNOWN_LENGTH && >> - (pos + length) > (level->start + level->length)) { >> + >> + if (length != EBML_UNKNOWN_LENGTH && >> + level->length != EBML_UNKNOWN_LENGTH) { >> + uint64_t elem_end = pos + length, >> + level_end = level->start + level->length; >> + >> + if (level_end < elem_end) { >> + av_log(matroska->ctx, AV_LOG_ERROR, >> + "Element at 0x%"PRIx64" ending at >> 0x%"PRIx64" exceeds " >> + "containing master element ending at >> 0x%"PRIx64"\n", >> + pos, elem_end, level_end); >> + return AVERROR_INVALIDDATA; >> + } >> + } else if (level->length != EBML_UNKNOWN_LENGTH) { >> + av_log(matroska->ctx, AV_LOG_ERROR, "Unknown-sized >> element " >> + "at 0x%"PRIx64" inside parent with finite >> size\n", pos); >> + return AVERROR_INVALIDDATA; > > IMO there's no reason to return an error here. You can log the error > and still parse the data, it should end up fine (if the code is done > as such that you can never read past your parents boundaries). At > least libebml can deal with that. > Ok, treating such elements as if they extended to the end of their master element is easily doable. >> + } else if (length == EBML_UNKNOWN_LENGTH && id != >> MATROSKA_ID_CLUSTER) { >> + // According to the specifications only clusters >> and segments >> + // are allowed to be unknown-sized. >> av_log(matroska->ctx, AV_LOG_ERROR, >> - "Invalid length 0x%"PRIx64" > 0x%"PRIx64" in >> parent\n", >> - length, level->start + level->length); >> + "Found unknown-sized element other than a >> cluster at " >> + "0x%"PRIx64". Dropping the invalid >> element.\n", pos); >> return AVERROR_INVALIDDATA; >> } >> } >> -- >> 2.19.2 >>
> On April 7, 2019 at 12:54 PM Andreas Rheinhardt via ffmpeg-devel <ffmpeg-devel@ffmpeg.org> wrote: > > > Steve Lhomme: > > On 3/27/2019 12:18 PM, Andreas Rheinhardt via ffmpeg-devel wrote: > >> The earlier code had three flaws: > >> > >> 1. The case of an unknown-sized element inside a finite-sized element > >> (which is against the specifications) was not caught. > >> > >> 2. The error message wasn't helpful: It compared the length of the > >> child > >> with the offset of the end of the parent and claimed that the first > >> exceeds the latter, although that is not necessarily true. > >> > >> 3. Unknown-sized elements that are not parsed can't be skipped. Given > >> that according to the Matroska specifications only the segment and the > >> clusters can be of unknown-size, this is handled by not allowing any > > > > This restriction is rather new. There might be old files that use > > infinite sizes in other places. > > > Indeed there are. The file mentioned by Dale > (https://ffmpeg.org/pipermail/ffmpeg-devel/2019-February/240376.html; > https://cs.chromium.org/chromium/src/media/test/data/bear-320x240-live.webm) > has cues where every master element (even CueTrackPositions) are > written as unknown-sized element (with length fields coded on eight > bytes). The Cues aren't referenced in a Seekhead at the beginning > though, so that FFmpeg doesn't try to read them. If they were > referenced, neither current nor patched FFmpeg would be able to > properly parse them and patched FFmpeg would furthermore give an error > message. > > When exactly has this restriction been enacted? That would be 35e0909ef4fc34ca7cb15adba5e7c7ebf5daeacb in the Matroska specifications (2016/10/31). Before that any master element was allowed to use the unknown length feature. Now only Segment and Cluster are. Given WebM is based on the Matroska specifications on matroska.org I would assume they don't have this restriction yet. > I'll see if I find a way to support such elements (although I am not > sure whether it is worthwhile). > > >> other units to have infinite size whereas the earlier code would seek > >> back by 1 byte upon encountering an infinite-size element that ought > >> to be skipped. > >> > >> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com> > >> --- > >> libavformat/matroskadec.c | 26 ++++++++++++++++++++++---- > >> 1 file changed, 22 insertions(+), 4 deletions(-) > >> > >> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c > >> index 6fa324c0cc..0179e5426e 100644 > >> --- a/libavformat/matroskadec.c > >> +++ b/libavformat/matroskadec.c > >> @@ -1180,11 +1180,29 @@ static int > >> ebml_parse_elem(MatroskaDemuxContext *matroska, > >> if (matroska->num_levels > 0) { > >> MatroskaLevel *level = > >> &matroska->levels[matroska->num_levels - 1]; > >> int64_t pos = avio_tell(pb); > >> - if (level->length != EBML_UNKNOWN_LENGTH && > >> - (pos + length) > (level->start + level->length)) { > >> + > >> + if (length != EBML_UNKNOWN_LENGTH && > >> + level->length != EBML_UNKNOWN_LENGTH) { > >> + uint64_t elem_end = pos + length, > >> + level_end = level->start + level->length; > >> + > >> + if (level_end < elem_end) { > >> + av_log(matroska->ctx, AV_LOG_ERROR, > >> + "Element at 0x%"PRIx64" ending at > >> 0x%"PRIx64" exceeds " > >> + "containing master element ending at > >> 0x%"PRIx64"\n", > >> + pos, elem_end, level_end); > >> + return AVERROR_INVALIDDATA; > >> + } > >> + } else if (level->length != EBML_UNKNOWN_LENGTH) { > >> + av_log(matroska->ctx, AV_LOG_ERROR, "Unknown-sized > >> element " > >> + "at 0x%"PRIx64" inside parent with finite > >> size\n", pos); > >> + return AVERROR_INVALIDDATA; > > > > IMO there's no reason to return an error here. You can log the error > > and still parse the data, it should end up fine (if the code is done > > as such that you can never read past your parents boundaries). At > > least libebml can deal with that. > > > Ok, treating such elements as if they extended to the end of their > master element is easily doable. > > >> + } else if (length == EBML_UNKNOWN_LENGTH && id != > >> MATROSKA_ID_CLUSTER) { > >> + // According to the specifications only clusters > >> and segments > >> + // are allowed to be unknown-sized. > >> av_log(matroska->ctx, AV_LOG_ERROR, > >> - "Invalid length 0x%"PRIx64" > 0x%"PRIx64" in > >> parent\n", > >> - length, level->start + level->length); > >> + "Found unknown-sized element other than a > >> cluster at " > >> + "0x%"PRIx64". Dropping the invalid > >> element.\n", pos); > >> return AVERROR_INVALIDDATA; > >> } > >> } > >> -- > >> 2.19.2 > >> > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c index 6fa324c0cc..0179e5426e 100644 --- a/libavformat/matroskadec.c +++ b/libavformat/matroskadec.c @@ -1180,11 +1180,29 @@ static int ebml_parse_elem(MatroskaDemuxContext *matroska, if (matroska->num_levels > 0) { MatroskaLevel *level = &matroska->levels[matroska->num_levels - 1]; int64_t pos = avio_tell(pb); - if (level->length != EBML_UNKNOWN_LENGTH && - (pos + length) > (level->start + level->length)) { + + if (length != EBML_UNKNOWN_LENGTH && + level->length != EBML_UNKNOWN_LENGTH) { + uint64_t elem_end = pos + length, + level_end = level->start + level->length; + + if (level_end < elem_end) { + av_log(matroska->ctx, AV_LOG_ERROR, + "Element at 0x%"PRIx64" ending at 0x%"PRIx64" exceeds " + "containing master element ending at 0x%"PRIx64"\n", + pos, elem_end, level_end); + return AVERROR_INVALIDDATA; + } + } else if (level->length != EBML_UNKNOWN_LENGTH) { + av_log(matroska->ctx, AV_LOG_ERROR, "Unknown-sized element " + "at 0x%"PRIx64" inside parent with finite size\n", pos); + return AVERROR_INVALIDDATA; + } else if (length == EBML_UNKNOWN_LENGTH && id != MATROSKA_ID_CLUSTER) { + // According to the specifications only clusters and segments + // are allowed to be unknown-sized. av_log(matroska->ctx, AV_LOG_ERROR, - "Invalid length 0x%"PRIx64" > 0x%"PRIx64" in parent\n", - length, level->start + level->length); + "Found unknown-sized element other than a cluster at " + "0x%"PRIx64". Dropping the invalid element.\n", pos); return AVERROR_INVALIDDATA; } }
The earlier code had three flaws: 1. The case of an unknown-sized element inside a finite-sized element (which is against the specifications) was not caught. 2. The error message wasn't helpful: It compared the length of the child with the offset of the end of the parent and claimed that the first exceeds the latter, although that is not necessarily true. 3. Unknown-sized elements that are not parsed can't be skipped. Given that according to the Matroska specifications only the segment and the clusters can be of unknown-size, this is handled by not allowing any other units to have infinite size whereas the earlier code would seek back by 1 byte upon encountering an infinite-size element that ought to be skipped. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com> --- libavformat/matroskadec.c | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-)