Message ID | 20220208202353.19554-1-michael@niedermayer.cc |
---|---|
State | Accepted |
Commit | 02699490c14e86105104940c009953081f69432c |
Headers | show |
Series | [FFmpeg-devel,1/2] avformat/demux: Make read_frame_internal() return AVERREOR(EAGAIN) on stuck empty input parser | expand |
Context | Check | Description |
---|---|---|
andriy/make_x86 | success | Make finished |
andriy/make_fate_x86 | success | Make fate finished |
andriy/make_ppc | success | Make finished |
andriy/make_fate_ppc | success | Make fate finished |
On Tue, Feb 08, 2022 at 09:23:52PM +0100, Michael Niedermayer wrote: > Fixes: read_frame_internal() which does not return even though both demuxer and parser do return > Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304 > > Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > --- > libavformat/demux.c | 6 ++++++ > 1 file changed, 6 insertions(+) will apply patchset [...]
On Tue, 8 Feb 2022, Michael Niedermayer wrote: > Fixes: read_frame_internal() which does not return even though both demuxer and parser do return > Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304 > > Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > --- > libavformat/demux.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/libavformat/demux.c b/libavformat/demux.c > index ec34b65288..dd42d32710 100644 > --- a/libavformat/demux.c > +++ b/libavformat/demux.c > @@ -1222,11 +1222,15 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt) > FFFormatContext *const si = ffformatcontext(s); > int ret, got_packet = 0; > AVDictionary *metadata = NULL; > + int empty = 0; > > while (!got_packet && !si->parse_queue.head) { > AVStream *st; > FFStream *sti; > > + if (empty > 1) > + return AVERROR(EAGAIN); > + > /* read next packet */ > ret = ff_read_packet(s, pkt); > if (ret < 0) { > @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt) > } > got_packet = 1; > } else if (st->discard < AVDISCARD_ALL) { > + if (pkt->size == 0) > + empty ++; > if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0) > return ret; > st->codecpar->sample_rate = sti->avctx->sample_rate; > -- > 2.17.1 Sorry, just noticed this patchset, and it is very hackish. For starters the meaning of AVERROR(EAGAIN) for av_read_frame()/read_frame_internal() is not very well defined. Should the user retry immediately? Should the user sleep() sometime and the retry? Can the user expect that a loop of av_read_frame() will eventually return something different than AVERROR(EAGAIN)? I am not sure I understand the original issue entirely, but it looks that instead of fixing the component which returns infinite amount of zero sized packets you implemented a check that makes the user get an EAGAIN() on the second zero-sized packet. And this helps the user how? Instead of a busy CPU loop in the library he either gets a busy CPU loop in its application or a non-busy CPU loop in its application (if he sleeps() on EAGAIN). Is there a file which causes this? Can't the underlying components be fixed instead? Thanks, Marton
On Fri, Jun 17, 2022 at 09:53:13PM +0200, Marton Balint wrote: > > > On Tue, 8 Feb 2022, Michael Niedermayer wrote: > > > Fixes: read_frame_internal() which does not return even though both demuxer and parser do return > > Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304 > > > > Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> > > --- > > libavformat/demux.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/libavformat/demux.c b/libavformat/demux.c > > index ec34b65288..dd42d32710 100644 > > --- a/libavformat/demux.c > > +++ b/libavformat/demux.c > > @@ -1222,11 +1222,15 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt) > > FFFormatContext *const si = ffformatcontext(s); > > int ret, got_packet = 0; > > AVDictionary *metadata = NULL; > > + int empty = 0; > > > > while (!got_packet && !si->parse_queue.head) { > > AVStream *st; > > FFStream *sti; > > > > + if (empty > 1) > > + return AVERROR(EAGAIN); > > + > > /* read next packet */ > > ret = ff_read_packet(s, pkt); > > if (ret < 0) { > > @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt) > > } > > got_packet = 1; > > } else if (st->discard < AVDISCARD_ALL) { > > + if (pkt->size == 0) > > + empty ++; > > if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0) > > return ret; > > st->codecpar->sample_rate = sti->avctx->sample_rate; > > -- > > 2.17.1 > > Sorry, just noticed this patchset, and it is very hackish. yes thats why i waited so many month before i applied it some patchsets i forget but this i kept pushing away > > For starters the meaning of AVERROR(EAGAIN) for > av_read_frame()/read_frame_internal() is not very well defined. Should the > user retry immediately? Should the user sleep() sometime and the retry? Can > the user expect that a loop of av_read_frame() will eventually return > something different than AVERROR(EAGAIN)? In the context of this problem the idea is to give the user an opertunity to do something else in a single threaded environment or error out if its taking too long not produingh anything > > I am not sure I understand the original issue entirely, but it looks that > instead of fixing the component which returns infinite amount of zero sized > packets you implemented a check that makes the user get an EAGAIN() on the > second zero-sized packet. IIRC the problem was that the demuxer produced 0 sized packets and alot of them for a file only a few hundread bytes large. now the zero sized packets go into the parser which has no output because theres no bytes input so no packet is ever produced by the parser and nothing comes out of the read_frame_internal() code, its just stuck in a loop From the point of view of the parser, the parser seems behaving correct From the point of view of the demuxer, the demuxer seems behaving correct if for example it produces a billion empty packets but from the point of view of the user the code is stuck and hangs > > And this helps the user how? Instead of a busy CPU loop in the library he > either gets a busy CPU loop in its application or a non-busy CPU loop in its > application (if he sleeps() on EAGAIN). The advantage for the user is, she can cleanly error out instead of killing the thread / process > > Is there a file which causes this? Can't the underlying components be fixed > instead? i have 2 files it seems, i will mail them to you privatly As said, its not clear if this is fundamentally any component misbehaving maybe something does break some rule somewhere i dont know thx [...]
On Sat, 18 Jun 2022, Michael Niedermayer wrote: > On Fri, Jun 17, 2022 at 09:53:13PM +0200, Marton Balint wrote: >> >> >> On Tue, 8 Feb 2022, Michael Niedermayer wrote: >> >>> Fixes: read_frame_internal() which does not return even though both demuxer and parser do return >>> Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304 >>> >>> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg >>> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> >>> --- >>> libavformat/demux.c | 6 ++++++ >>> 1 file changed, 6 insertions(+) >>> >>> diff --git a/libavformat/demux.c b/libavformat/demux.c >>> index ec34b65288..dd42d32710 100644 >>> --- a/libavformat/demux.c >>> +++ b/libavformat/demux.c >>> @@ -1222,11 +1222,15 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt) >>> FFFormatContext *const si = ffformatcontext(s); >>> int ret, got_packet = 0; >>> AVDictionary *metadata = NULL; >>> + int empty = 0; >>> >>> while (!got_packet && !si->parse_queue.head) { >>> AVStream *st; >>> FFStream *sti; >>> >>> + if (empty > 1) >>> + return AVERROR(EAGAIN); >>> + >>> /* read next packet */ >>> ret = ff_read_packet(s, pkt); >>> if (ret < 0) { >>> @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt) >>> } >>> got_packet = 1; >>> } else if (st->discard < AVDISCARD_ALL) { >>> + if (pkt->size == 0) >>> + empty ++; >>> if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0) >>> return ret; >>> st->codecpar->sample_rate = sti->avctx->sample_rate; >>> -- >>> 2.17.1 >> >> Sorry, just noticed this patchset, and it is very hackish. > > yes thats why i waited so many month before i applied it > some patchsets i forget but this i kept pushing away > >> >> For starters the meaning of AVERROR(EAGAIN) for >> av_read_frame()/read_frame_internal() is not very well defined. Should the >> user retry immediately? Should the user sleep() sometime and the retry? Can >> the user expect that a loop of av_read_frame() will eventually return >> something different than AVERROR(EAGAIN)? > > In the context of this problem the idea is to give the user an opertunity > to do something else in a single threaded environment or error out if > its taking too long not produingh anything > > >> >> I am not sure I understand the original issue entirely, but it looks that >> instead of fixing the component which returns infinite amount of zero sized >> packets you implemented a check that makes the user get an EAGAIN() on the >> second zero-sized packet. > > IIRC the problem was that the demuxer produced 0 sized packets and alot of them > for a file only a few hundread bytes large. AFAIK a demuxer is not supposed to generate 0-sized packets. Or do you think otherwise? E.g. a simple patch like this seems the better approach: --- a/libavformat/mov.c +++ b/libavformat/mov.c @@ -8677,8 +8677,12 @@ static int mov_read_packet(AVFormatContext *s, AVPacket *pkt) if (st->codecpar->codec_id == AV_CODEC_ID_EIA_608 && sample->size > 8) ret = get_eia608_packet(sc->pb, pkt, sample->size); - else + else if (sample->size > 0) { ret = av_get_packet(sc->pb, pkt, sample->size); + } else { + av_log(mov->fc, AV_LOG_ERROR, "Invalid sample size, corrupt index?\n"); + return AVERROR_INVALIDDATA; + } if (ret < 0) { if (should_retry(sc->pb, ret)) { mov_current_sample_dec(sc); Regards, Marton
Marton Balint: > > > On Sat, 18 Jun 2022, Michael Niedermayer wrote: > >> On Fri, Jun 17, 2022 at 09:53:13PM +0200, Marton Balint wrote: >>> >>> >>> On Tue, 8 Feb 2022, Michael Niedermayer wrote: >>> >>>> Fixes: read_frame_internal() which does not return even though both >>>> demuxer and parser do return >>>> Fixes: >>>> 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304 >>>> >>>> >>>> Found-by: continuous fuzzing process >>>> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg >>>> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> >>>> --- >>>> libavformat/demux.c | 6 ++++++ >>>> 1 file changed, 6 insertions(+) >>>> >>>> diff --git a/libavformat/demux.c b/libavformat/demux.c >>>> index ec34b65288..dd42d32710 100644 >>>> --- a/libavformat/demux.c >>>> +++ b/libavformat/demux.c >>>> @@ -1222,11 +1222,15 @@ static int >>>> read_frame_internal(AVFormatContext *s, AVPacket *pkt) >>>> FFFormatContext *const si = ffformatcontext(s); >>>> int ret, got_packet = 0; >>>> AVDictionary *metadata = NULL; >>>> + int empty = 0; >>>> >>>> while (!got_packet && !si->parse_queue.head) { >>>> AVStream *st; >>>> FFStream *sti; >>>> >>>> + if (empty > 1) >>>> + return AVERROR(EAGAIN); >>>> + >>>> /* read next packet */ >>>> ret = ff_read_packet(s, pkt); >>>> if (ret < 0) { >>>> @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext >>>> *s, AVPacket *pkt) >>>> } >>>> got_packet = 1; >>>> } else if (st->discard < AVDISCARD_ALL) { >>>> + if (pkt->size == 0) >>>> + empty ++; >>>> if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0) >>>> return ret; >>>> st->codecpar->sample_rate = sti->avctx->sample_rate; >>>> -- >>>> 2.17.1 >>> >>> Sorry, just noticed this patchset, and it is very hackish. >> >> yes thats why i waited so many month before i applied it >> some patchsets i forget but this i kept pushing away >> >>> >>> For starters the meaning of AVERROR(EAGAIN) for >>> av_read_frame()/read_frame_internal() is not very well defined. >>> Should the >>> user retry immediately? Should the user sleep() sometime and the >>> retry? Can >>> the user expect that a loop of av_read_frame() will eventually return >>> something different than AVERROR(EAGAIN)? >> >> In the context of this problem the idea is to give the user an opertunity >> to do something else in a single threaded environment or error out if >> its taking too long not produingh anything >> >> >>> >>> I am not sure I understand the original issue entirely, but it looks >>> that >>> instead of fixing the component which returns infinite amount of zero >>> sized >>> packets you implemented a check that makes the user get an EAGAIN() >>> on the >>> second zero-sized packet. >> >> IIRC the problem was that the demuxer produced 0 sized packets and >> alot of them >> for a file only a few hundread bytes large. > > AFAIK a demuxer is not supposed to generate 0-sized packets. Or do you > think otherwise? Is it not possible to use zero-sized packets to clear the screen (i.e. convey the end timestamp of subtitles transmitted earlier)? (The subtitle formats without duration currently use packets with a size > 0 for that, but there is no real reason why this need to be so.) Another reason for zero-sized packets were side-data only packets (maybe not now, but it might be used in the future; after all, we are constantly adding new side-data-types). (This does not mean that I am against your proposed patch.) - Andreas PS: The fact that the parsing API is based upon buffers and not AVPackets btw leads to buggy side-data handling: If a parser introduces delay, then the side-data will not be attached to the correct packet, but to the preceding one.
On Mon, Jun 20, 2022 at 01:44:52AM +0200, Andreas Rheinhardt wrote: [...] > PS: The fact that the parsing API is based upon buffers and not > AVPackets btw leads to buggy side-data handling: If a parser introduces > delay, then the side-data will not be attached to the correct packet, > but to the preceding one. The parsers pass timestamps around correctly, sidedata can maybe be handled similarly thx [...]
diff --git a/libavformat/demux.c b/libavformat/demux.c index ec34b65288..dd42d32710 100644 --- a/libavformat/demux.c +++ b/libavformat/demux.c @@ -1222,11 +1222,15 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt) FFFormatContext *const si = ffformatcontext(s); int ret, got_packet = 0; AVDictionary *metadata = NULL; + int empty = 0; while (!got_packet && !si->parse_queue.head) { AVStream *st; FFStream *sti; + if (empty > 1) + return AVERROR(EAGAIN); + /* read next packet */ ret = ff_read_packet(s, pkt); if (ret < 0) { @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt) } got_packet = 1; } else if (st->discard < AVDISCARD_ALL) { + if (pkt->size == 0) + empty ++; if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0) return ret; st->codecpar->sample_rate = sti->avctx->sample_rate;
Fixes: read_frame_internal() which does not return even though both demuxer and parser do return Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc> --- libavformat/demux.c | 6 ++++++ 1 file changed, 6 insertions(+)