diff mbox series

[FFmpeg-devel,1/2] avformat/demux: Make read_frame_internal() return AVERREOR(EAGAIN) on stuck empty input parser

Message ID 20220208202353.19554-1-michael@niedermayer.cc
State Accepted
Commit 02699490c14e86105104940c009953081f69432c
Headers show
Series [FFmpeg-devel,1/2] avformat/demux: Make read_frame_internal() return AVERREOR(EAGAIN) on stuck empty input parser | expand

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished
andriy/make_ppc success Make finished
andriy/make_fate_ppc success Make fate finished

Commit Message

Michael Niedermayer Feb. 8, 2022, 8:23 p.m. UTC
Fixes: read_frame_internal() which does not return even though both demuxer and parser do return
Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
---
 libavformat/demux.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Michael Niedermayer June 16, 2022, 11:50 p.m. UTC | #1
On Tue, Feb 08, 2022 at 09:23:52PM +0100, Michael Niedermayer wrote:
> Fixes: read_frame_internal() which does not return even though both demuxer and parser do return
> Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304
> 
> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavformat/demux.c | 6 ++++++
>  1 file changed, 6 insertions(+)

will apply patchset

[...]
Marton Balint June 17, 2022, 7:53 p.m. UTC | #2
On Tue, 8 Feb 2022, Michael Niedermayer wrote:

> Fixes: read_frame_internal() which does not return even though both demuxer and parser do return
> Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304
>
> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
> libavformat/demux.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/libavformat/demux.c b/libavformat/demux.c
> index ec34b65288..dd42d32710 100644
> --- a/libavformat/demux.c
> +++ b/libavformat/demux.c
> @@ -1222,11 +1222,15 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
>     FFFormatContext *const si = ffformatcontext(s);
>     int ret, got_packet = 0;
>     AVDictionary *metadata = NULL;
> +    int empty = 0;
>
>     while (!got_packet && !si->parse_queue.head) {
>         AVStream *st;
>         FFStream *sti;
>
> +        if (empty > 1)
> +            return AVERROR(EAGAIN);
> +
>         /* read next packet */
>         ret = ff_read_packet(s, pkt);
>         if (ret < 0) {
> @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
>             }
>             got_packet = 1;
>         } else if (st->discard < AVDISCARD_ALL) {
> +            if (pkt->size == 0)
> +                empty ++;
>             if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0)
>                 return ret;
>             st->codecpar->sample_rate = sti->avctx->sample_rate;
> -- 
> 2.17.1

Sorry, just noticed this patchset, and it is very hackish.

For starters the meaning of AVERROR(EAGAIN) for 
av_read_frame()/read_frame_internal() is not very well defined. Should the 
user retry immediately? Should the user sleep() sometime and the retry? 
Can the user expect that a loop of av_read_frame() will eventually return 
something different than AVERROR(EAGAIN)?

I am not sure I understand the original issue entirely, but it looks 
that instead of fixing the component which returns infinite amount 
of zero sized packets you implemented a check that makes the user get an 
EAGAIN() on the second zero-sized packet.

And this helps the user how? Instead of a busy CPU loop in the library he 
either gets a busy CPU loop in its application or a non-busy CPU loop in 
its application (if he sleeps() on EAGAIN).

Is there a file which causes this? Can't the underlying components be 
fixed instead?

Thanks,
Marton
Michael Niedermayer June 18, 2022, 4:30 p.m. UTC | #3
On Fri, Jun 17, 2022 at 09:53:13PM +0200, Marton Balint wrote:
> 
> 
> On Tue, 8 Feb 2022, Michael Niedermayer wrote:
> 
> > Fixes: read_frame_internal() which does not return even though both demuxer and parser do return
> > Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304
> > 
> > Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > ---
> > libavformat/demux.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> > 
> > diff --git a/libavformat/demux.c b/libavformat/demux.c
> > index ec34b65288..dd42d32710 100644
> > --- a/libavformat/demux.c
> > +++ b/libavformat/demux.c
> > @@ -1222,11 +1222,15 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
> >     FFFormatContext *const si = ffformatcontext(s);
> >     int ret, got_packet = 0;
> >     AVDictionary *metadata = NULL;
> > +    int empty = 0;
> > 
> >     while (!got_packet && !si->parse_queue.head) {
> >         AVStream *st;
> >         FFStream *sti;
> > 
> > +        if (empty > 1)
> > +            return AVERROR(EAGAIN);
> > +
> >         /* read next packet */
> >         ret = ff_read_packet(s, pkt);
> >         if (ret < 0) {
> > @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
> >             }
> >             got_packet = 1;
> >         } else if (st->discard < AVDISCARD_ALL) {
> > +            if (pkt->size == 0)
> > +                empty ++;
> >             if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0)
> >                 return ret;
> >             st->codecpar->sample_rate = sti->avctx->sample_rate;
> > -- 
> > 2.17.1
> 
> Sorry, just noticed this patchset, and it is very hackish.

yes thats why i waited so many month before i applied it
some patchsets i forget but this i kept pushing away

> 
> For starters the meaning of AVERROR(EAGAIN) for
> av_read_frame()/read_frame_internal() is not very well defined. Should the
> user retry immediately? Should the user sleep() sometime and the retry? Can
> the user expect that a loop of av_read_frame() will eventually return
> something different than AVERROR(EAGAIN)?

In the context of this problem the idea is to give the user an opertunity
to do something else in a single threaded environment or error out if
its taking too long not produingh anything


> 
> I am not sure I understand the original issue entirely, but it looks that
> instead of fixing the component which returns infinite amount of zero sized
> packets you implemented a check that makes the user get an EAGAIN() on the
> second zero-sized packet.

IIRC the problem was that the demuxer produced 0 sized packets and alot of them
for a file only a few hundread bytes large.
now the zero sized packets go into the parser which has no output because
theres no bytes input so no packet is ever produced by the parser and nothing
comes out of the read_frame_internal() code, its just stuck in a loop

From the point of view of the parser, the parser seems behaving correct
From the point of view of the demuxer, the demuxer seems behaving correct
if for example it produces a billion empty packets
but from the point of view of the user the code is stuck and hangs


> 
> And this helps the user how? Instead of a busy CPU loop in the library he
> either gets a busy CPU loop in its application or a non-busy CPU loop in its
> application (if he sleeps() on EAGAIN).

The advantage for the user is, she can cleanly error out instead of killing
the thread / process


> 
> Is there a file which causes this? Can't the underlying components be fixed
> instead?

i have 2 files it seems, i will mail them to you privatly
As said, its not clear if this is fundamentally any component misbehaving
maybe something does break some rule somewhere i dont know

thx

[...]
Marton Balint June 18, 2022, 9:58 p.m. UTC | #4
On Sat, 18 Jun 2022, Michael Niedermayer wrote:

> On Fri, Jun 17, 2022 at 09:53:13PM +0200, Marton Balint wrote:
>>
>>
>> On Tue, 8 Feb 2022, Michael Niedermayer wrote:
>>
>>> Fixes: read_frame_internal() which does not return even though both demuxer and parser do return
>>> Fixes: 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304
>>>
>>> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>>> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
>>> ---
>>> libavformat/demux.c | 6 ++++++
>>> 1 file changed, 6 insertions(+)
>>>
>>> diff --git a/libavformat/demux.c b/libavformat/demux.c
>>> index ec34b65288..dd42d32710 100644
>>> --- a/libavformat/demux.c
>>> +++ b/libavformat/demux.c
>>> @@ -1222,11 +1222,15 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
>>>     FFFormatContext *const si = ffformatcontext(s);
>>>     int ret, got_packet = 0;
>>>     AVDictionary *metadata = NULL;
>>> +    int empty = 0;
>>>
>>>     while (!got_packet && !si->parse_queue.head) {
>>>         AVStream *st;
>>>         FFStream *sti;
>>>
>>> +        if (empty > 1)
>>> +            return AVERROR(EAGAIN);
>>> +
>>>         /* read next packet */
>>>         ret = ff_read_packet(s, pkt);
>>>         if (ret < 0) {
>>> @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
>>>             }
>>>             got_packet = 1;
>>>         } else if (st->discard < AVDISCARD_ALL) {
>>> +            if (pkt->size == 0)
>>> +                empty ++;
>>>             if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0)
>>>                 return ret;
>>>             st->codecpar->sample_rate = sti->avctx->sample_rate;
>>> --
>>> 2.17.1
>>
>> Sorry, just noticed this patchset, and it is very hackish.
>
> yes thats why i waited so many month before i applied it
> some patchsets i forget but this i kept pushing away
>
>>
>> For starters the meaning of AVERROR(EAGAIN) for
>> av_read_frame()/read_frame_internal() is not very well defined. Should the
>> user retry immediately? Should the user sleep() sometime and the retry? Can
>> the user expect that a loop of av_read_frame() will eventually return
>> something different than AVERROR(EAGAIN)?
>
> In the context of this problem the idea is to give the user an opertunity
> to do something else in a single threaded environment or error out if
> its taking too long not produingh anything
>
>
>>
>> I am not sure I understand the original issue entirely, but it looks that
>> instead of fixing the component which returns infinite amount of zero sized
>> packets you implemented a check that makes the user get an EAGAIN() on the
>> second zero-sized packet.
>
> IIRC the problem was that the demuxer produced 0 sized packets and alot of them
> for a file only a few hundread bytes large.

AFAIK a demuxer is not supposed to generate 0-sized packets. Or do you 
think otherwise?

E.g. a simple patch like this seems the better approach:

--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -8677,8 +8677,12 @@ static int mov_read_packet(AVFormatContext *s, 
AVPacket *pkt)

          if (st->codecpar->codec_id == AV_CODEC_ID_EIA_608 && sample->size > 8)
              ret = get_eia608_packet(sc->pb, pkt, sample->size);
-        else
+        else if (sample->size > 0) {
              ret = av_get_packet(sc->pb, pkt, sample->size);
+        } else {
+            av_log(mov->fc, AV_LOG_ERROR, "Invalid sample size, corrupt index?\n");
+            return AVERROR_INVALIDDATA;
+        }
          if (ret < 0) {
              if (should_retry(sc->pb, ret)) {
                  mov_current_sample_dec(sc);

Regards,
Marton
Andreas Rheinhardt June 19, 2022, 11:44 p.m. UTC | #5
Marton Balint:
> 
> 
> On Sat, 18 Jun 2022, Michael Niedermayer wrote:
> 
>> On Fri, Jun 17, 2022 at 09:53:13PM +0200, Marton Balint wrote:
>>>
>>>
>>> On Tue, 8 Feb 2022, Michael Niedermayer wrote:
>>>
>>>> Fixes: read_frame_internal() which does not return even though both
>>>> demuxer and parser do return
>>>> Fixes:
>>>> 43717/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5206008287330304
>>>>
>>>>
>>>> Found-by: continuous fuzzing process
>>>> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>>>> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
>>>> ---
>>>> libavformat/demux.c | 6 ++++++
>>>> 1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/libavformat/demux.c b/libavformat/demux.c
>>>> index ec34b65288..dd42d32710 100644
>>>> --- a/libavformat/demux.c
>>>> +++ b/libavformat/demux.c
>>>> @@ -1222,11 +1222,15 @@ static int
>>>> read_frame_internal(AVFormatContext *s, AVPacket *pkt)
>>>>     FFFormatContext *const si = ffformatcontext(s);
>>>>     int ret, got_packet = 0;
>>>>     AVDictionary *metadata = NULL;
>>>> +    int empty = 0;
>>>>
>>>>     while (!got_packet && !si->parse_queue.head) {
>>>>         AVStream *st;
>>>>         FFStream *sti;
>>>>
>>>> +        if (empty > 1)
>>>> +            return AVERROR(EAGAIN);
>>>> +
>>>>         /* read next packet */
>>>>         ret = ff_read_packet(s, pkt);
>>>>         if (ret < 0) {
>>>> @@ -1317,6 +1321,8 @@ static int read_frame_internal(AVFormatContext
>>>> *s, AVPacket *pkt)
>>>>             }
>>>>             got_packet = 1;
>>>>         } else if (st->discard < AVDISCARD_ALL) {
>>>> +            if (pkt->size == 0)
>>>> +                empty ++;
>>>>             if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0)
>>>>                 return ret;
>>>>             st->codecpar->sample_rate = sti->avctx->sample_rate;
>>>> -- 
>>>> 2.17.1
>>>
>>> Sorry, just noticed this patchset, and it is very hackish.
>>
>> yes thats why i waited so many month before i applied it
>> some patchsets i forget but this i kept pushing away
>>
>>>
>>> For starters the meaning of AVERROR(EAGAIN) for
>>> av_read_frame()/read_frame_internal() is not very well defined.
>>> Should the
>>> user retry immediately? Should the user sleep() sometime and the
>>> retry? Can
>>> the user expect that a loop of av_read_frame() will eventually return
>>> something different than AVERROR(EAGAIN)?
>>
>> In the context of this problem the idea is to give the user an opertunity
>> to do something else in a single threaded environment or error out if
>> its taking too long not produingh anything
>>
>>
>>>
>>> I am not sure I understand the original issue entirely, but it looks
>>> that
>>> instead of fixing the component which returns infinite amount of zero
>>> sized
>>> packets you implemented a check that makes the user get an EAGAIN()
>>> on the
>>> second zero-sized packet.
>>
>> IIRC the problem was that the demuxer produced 0 sized packets and
>> alot of them
>> for a file only a few hundread bytes large.
> 
> AFAIK a demuxer is not supposed to generate 0-sized packets. Or do you
> think otherwise?

Is it not possible to use zero-sized packets to clear the screen (i.e.
convey the end timestamp of subtitles transmitted earlier)? (The
subtitle formats without duration currently use packets with a size > 0
for that, but there is no real reason why this need to be so.)
Another reason for zero-sized packets were side-data only packets (maybe
not now, but it might be used in the future; after all, we are
constantly adding new side-data-types).

(This does not mean that I am against your proposed patch.)

- Andreas

PS: The fact that the parsing API is based upon buffers and not
AVPackets btw leads to buggy side-data handling: If a parser introduces
delay, then the side-data will not be attached to the correct packet,
but to the preceding one.
Michael Niedermayer June 20, 2022, 11:05 p.m. UTC | #6
On Mon, Jun 20, 2022 at 01:44:52AM +0200, Andreas Rheinhardt wrote:
[...]
> PS: The fact that the parsing API is based upon buffers and not
> AVPackets btw leads to buggy side-data handling: If a parser introduces
> delay, then the side-data will not be attached to the correct packet,
> but to the preceding one.

The parsers pass timestamps around correctly, sidedata can maybe be
handled similarly

thx

[...]
diff mbox series

Patch

diff --git a/libavformat/demux.c b/libavformat/demux.c
index ec34b65288..dd42d32710 100644
--- a/libavformat/demux.c
+++ b/libavformat/demux.c
@@ -1222,11 +1222,15 @@  static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
     FFFormatContext *const si = ffformatcontext(s);
     int ret, got_packet = 0;
     AVDictionary *metadata = NULL;
+    int empty = 0;
 
     while (!got_packet && !si->parse_queue.head) {
         AVStream *st;
         FFStream *sti;
 
+        if (empty > 1)
+            return AVERROR(EAGAIN);
+
         /* read next packet */
         ret = ff_read_packet(s, pkt);
         if (ret < 0) {
@@ -1317,6 +1321,8 @@  static int read_frame_internal(AVFormatContext *s, AVPacket *pkt)
             }
             got_packet = 1;
         } else if (st->discard < AVDISCARD_ALL) {
+            if (pkt->size == 0)
+                empty ++;
             if ((ret = parse_packet(s, pkt, pkt->stream_index, 0)) < 0)
                 return ret;
             st->codecpar->sample_rate = sti->avctx->sample_rate;