diff mbox

[FFmpeg-devel] avcodec/gsm_parser: return -1 on parse error

Message ID 20190130212729.116453-1-chcunningham@chromium.org
State New
Headers show

Commit Message

chcunningham@chromium.org Jan. 30, 2019, 9:27 p.m. UTC
Return replaces an assert0. libfuzzer generated a testcase that
triggered this assert (codec=0), causing a crash of chrome's renderer.
---
 libavcodec/gsm_parser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Carl Eugen Hoyos Jan. 30, 2019, 9:36 p.m. UTC | #1
2019-01-30 22:27 GMT+01:00, chcunningham <chcunningham@chromium.org>:
> Return replaces an assert0. libfuzzer generated a testcase that

> triggered this assert (codec=0), causing a crash of chrome's renderer.

Wouldn't this indicate a bug somewhere else?

Carl Eugen
Nicolas George Jan. 30, 2019, 9:38 p.m. UTC | #2
chcunningham (12019-01-30):
> Return replaces an assert0. libfuzzer generated a testcase that
> triggered this assert (codec=0), causing a crash of chrome's renderer.
> ---
>  libavcodec/gsm_parser.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/gsm_parser.c b/libavcodec/gsm_parser.c
> index 1054a30ca9..5cf2235f73 100644
> --- a/libavcodec/gsm_parser.c
> +++ b/libavcodec/gsm_parser.c
> @@ -56,7 +56,7 @@ static int gsm_parse(AVCodecParserContext *s1, AVCodecContext *avctx,
>              s->duration   = GSM_FRAME_SIZE * 2;
>              break;
>          default:
> -            av_assert0(0);
> +            return -1;
>          }
>      }

-1 is not a correct error code.

Also, an assert() means the code was supposed to be unreachable. If it
is not, that means a bug is lurking somewhere else, it must be found,
not just hidden.

Regards,
James Almer Jan. 30, 2019, 9:39 p.m. UTC | #3
On 1/30/2019 6:27 PM, chcunningham wrote:
> Return replaces an assert0. libfuzzer generated a testcase that
> triggered this assert (codec=0), causing a crash of chrome's renderer.
> ---
>  libavcodec/gsm_parser.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/gsm_parser.c b/libavcodec/gsm_parser.c
> index 1054a30ca9..5cf2235f73 100644
> --- a/libavcodec/gsm_parser.c
> +++ b/libavcodec/gsm_parser.c
> @@ -56,7 +56,7 @@ static int gsm_parse(AVCodecParserContext *s1, AVCodecContext *avctx,
>              s->duration   = GSM_FRAME_SIZE * 2;
>              break;
>          default:
> -            av_assert0(0);
> +            return -1;
>          }
>      }

Parsers can't return negative values, only the output packet size. For
the purpose of errors, they usually return the entire untouched packet size.

And this definitely means there's a bug elsewhere. This parser should
have not been used for codecs ids other than GSM and GSM_MS. That's
precisely what this assert() is making sure of.
chcunningham@chromium.org Jan. 30, 2019, 9:44 p.m. UTC | #4
On Wed, Jan 30, 2019 at 1:40 PM James Almer <jamrial@gmail.com> wrote:

> Parsers can't return negative values, only the output packet size. For
> the purpose of errors, they usually return the entire untouched packet
> size.
>

Understood. Is this documented somewhere? I noted the comment in
av_paser_parse2(), "/* WARNING: the returned index can be negative */", and
guessed that signaled an error.


> And this definitely means there's a bug elsewhere. This parser should
> have not been used for codecs ids other than GSM and GSM_MS. That's
> precisely what this assert() is making sure of.
>

I'll take a closer look at how this parser was selected.

Thanks for the quick reply.
James Almer Jan. 30, 2019, 9:57 p.m. UTC | #5
On 1/30/2019 6:44 PM, Chris Cunningham wrote:
> On Wed, Jan 30, 2019 at 1:40 PM James Almer <jamrial@gmail.com> wrote:
> 
>> Parsers can't return negative values, only the output packet size. For
>> the purpose of errors, they usually return the entire untouched packet
>> size.
>>
> 
> Understood. Is this documented somewhere? I noted the comment in
> av_paser_parse2(), "/* WARNING: the returned index can be negative */", and
> guessed that signaled an error.

    /* This callback never returns an error, a negative value means that
     * the frame start was in a previous packet. */
    int (*parser_parse)(AVCodecParserContext *s,
                        AVCodecContext *avctx,
                        const uint8_t **poutbuf, int *poutbuf_size,
                        const uint8_t *buf, int buf_size);

So i was wrong in that it's not that it can't return negative values,
just not for the purpose of signaling errors.

> 
> 
>> And this definitely means there's a bug elsewhere. This parser should
>> have not been used for codecs ids other than GSM and GSM_MS. That's
>> precisely what this assert() is making sure of.
>>
> 
> I'll take a closer look at how this parser was selected.
> 
> Thanks for the quick reply.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
chcunningham@chromium.org Jan. 31, 2019, 1:20 a.m. UTC | #6
>
> >> And this definitely means there's a bug elsewhere. This parser should
> >> have not been used for codecs ids other than GSM and GSM_MS. That's
> >> precisely what this assert() is making sure of.
> >>
> >
> > I'll take a closer look at how this parser was selected.
>

Ok, here are full details of how this happens:

   1. AV_CODEC_ID_GSM_MS is assigned to on st->codecpar->codec_id
   in ogm_header() (oggparseogm.c:75)
   2. The (deprecated) st->codec->codec_id is not assigned at that time and
   remains 0 (UNKNOWN)
   3. AV_CODEC_ID_GSM_MS from st->codecpar is passed to av_parser_init,
   selecting the gsm parser (in read_frame_internal())
   4. The fuzzer next (only) packet has a size of 0, so the first call to
   parse_packet() (still in read_frame_internal()) does nothing
   5. After this call we assign several members of st->internal->avctx to
   st->codecpar. This leaves codecpar->codec_id = UNKNOWN. Interestingly, we
   do not reset the st->parser at this point (maybe we should?)
   6. Next iteration of the read_frame_internal() loop we get EOF from
   ff_read_packet. This triggers the "flush the parsers" call to
   parse_packet() which finally reaches gsm_parse() to trigger the abort
   (avctx->codec_id is still 0).

Questions (guessing at the fix):

   - At what point should st->codec->codec_id be synchronized with
   st->codecpar->codec_id? It never is in this test.
   - Should we reset st->parser whenever we reset st->codecpar->codec_id
   (step 5 above)?

Thanks,
Chris
James Almer Jan. 31, 2019, 1:43 a.m. UTC | #7
On 1/30/2019 10:20 PM, Chris Cunningham wrote:
>>
>>>> And this definitely means there's a bug elsewhere. This parser should
>>>> have not been used for codecs ids other than GSM and GSM_MS. That's
>>>> precisely what this assert() is making sure of.
>>>>
>>>
>>> I'll take a closer look at how this parser was selected.
>>
> 
> Ok, here are full details of how this happens:
> 
>    1. AV_CODEC_ID_GSM_MS is assigned to on st->codecpar->codec_id
>    in ogm_header() (oggparseogm.c:75)
>    2. The (deprecated) st->codec->codec_id is not assigned at that time and
>    remains 0 (UNKNOWN)
>    3. AV_CODEC_ID_GSM_MS from st->codecpar is passed to av_parser_init,
>    selecting the gsm parser (in read_frame_internal())
>    4. The fuzzer next (only) packet has a size of 0, so the first call to
>    parse_packet() (still in read_frame_internal()) does nothing
>    5. After this call we assign several members of st->internal->avctx to
>    st->codecpar. This leaves codecpar->codec_id = UNKNOWN. Interestingly, we
>    do not reset the st->parser at this point (maybe we should?)

Where does this happen? A call to avcodec_parameters_from_context()
should also copy codec_id.

>    6. Next iteration of the read_frame_internal() loop we get EOF from
>    ff_read_packet. This triggers the "flush the parsers" call to
>    parse_packet() which finally reaches gsm_parse() to trigger the abort
>    (avctx->codec_id is still 0).
> 
> Questions (guessing at the fix):
> 
>    - At what point should st->codec->codec_id be synchronized with
>    st->codecpar->codec_id? It never is in this test.

In avformat_find_stream_info(), i think. In any case, st->codec should
have no effect on parsers. What is used in parse_packet() however is
st->internal->avctx, so that context is the one that evidently contains
codec_id == UNKNOWN when it should be in sync with codecpar.

>    - Should we reset st->parser whenever we reset st->codecpar->codec_id
>    (step 5 above)?
> 
> Thanks,
> Chris
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
chcunningham@chromium.org Jan. 31, 2019, 10:28 p.m. UTC | #8
On Wed, Jan 30, 2019 at 5:43 PM James Almer <jamrial@gmail.com> wrote:

> On 1/30/2019 10:20 PM, Chris Cunningham wrote:
> >>
> >>>> And this definitely means there's a bug elsewhere. This parser should
> >>>> have not been used for codecs ids other than GSM and GSM_MS. That's
> >>>> precisely what this assert() is making sure of.
> >>>>
> >>>
> >>> I'll take a closer look at how this parser was selected.
> >>
> >
> > Ok, here are full details of how this happens:
> >
> >    1. AV_CODEC_ID_GSM_MS is assigned to on st->codecpar->codec_id
> >    in ogm_header() (oggparseogm.c:75)
> >    2. The (deprecated) st->codec->codec_id is not assigned at that time
> and
> >    remains 0 (UNKNOWN)
> >    3. AV_CODEC_ID_GSM_MS from st->codecpar is passed to av_parser_init,
> >    selecting the gsm parser (in read_frame_internal())
> >    4. The fuzzer next (only) packet has a size of 0, so the first call to
> >    parse_packet() (still in read_frame_internal()) does nothing
> >    5. After this call we assign several members of st->internal->avctx to
> >    st->codecpar. This leaves codecpar->codec_id = UNKNOWN.
> Interestingly, we
> >    do not reset the st->parser at this point (maybe we should?)
>
> Where does this happen? A call to avcodec_parameters_from_context()
> should also copy codec_id.
>

Ignore #5 here - I'm not seeing that today - was likely confused.


>
> >    6. Next iteration of the read_frame_internal() loop we get EOF from
> >    ff_read_packet. This triggers the "flush the parsers" call to
> >    parse_packet() which finally reaches gsm_parse() to trigger the abort
> >    (avctx->codec_id is still 0).
> >
> > Questions (guessing at the fix):
> >
> >    - At what point should st->codec->codec_id be synchronized with
> >    st->codecpar->codec_id? It never is in this test.
>
> In avformat_find_stream_info(), i think. In any case, st->codec should
> have no effect on parsers. What is used in parse_packet() however is
> st->internal->avctx, so that context is the one that evidently contains
> codec_id == UNKNOWN when it should be in sync with codecpar.
>

We do call avformat_find_stream_info, and avcodec_parameters_from_context
is called during that process. Everything seems OK when
avformat_find_stream_info is done. The codecpar->codec_id is in sync with
internal->avctx->codec_id for all streams.

But then we start calling av_read_frame as part of normal demuxing.
avcodec_parameters_from_context() does not appear to be called during this
process. Eventually we get this stack:

ogm_header
ogg_packet
ogg_read_packet
ff_read_packet
read_frame_internal
av_read_frame

Inside ogm_header, st->codecpar->codec_id is assigned AV_CODEC_ID_GSM_MS
(previous value was codec NONE). But *st->internal->avctx->codec_id is
never updated. It remains NONE for the rest of this test. *

When this unwinds back to read_frame_internal, st->parser is assigned using
this new codec (GSM_MS)
    st->parser = av_parser_init(st->codecpar->codec_id);

Later on, still inside read_frame_internal's loop, we get EOF and call,
parse_packet (/* flush the parsers */)
parse_packet() calls av_parser_parse2(), passing st->internal->avctx
(codec_id still NONE, while codecpar still says GSM_MS). This ultimately
gets passed to gsm_parse, which triggers the assert0.

The underlined bit above seems like the root cause. Where should we be
updating st->internal->avctx->codec_id?
chcunningham@chromium.org Feb. 5, 2019, 12:33 a.m. UTC | #9
>
> The underlined bit above seems like the root cause. Where should we be
> updating st->internal->avctx->codec_id?
>

Friendly ping for review
Michael Niedermayer Feb. 6, 2019, 4:23 p.m. UTC | #10
On Mon, Feb 04, 2019 at 04:33:37PM -0800, Chris Cunningham wrote:
> >
> > The underlined bit above seems like the root cause. Where should we be
> > updating st->internal->avctx->codec_id?
> >
> 
> Friendly ping for review

I dont have a testcase so i cant test but possibly setting need_context_update
would help

thx

[...]
Hendrik Leppkes Feb. 6, 2019, 4:29 p.m. UTC | #11
On Thu, Jan 31, 2019 at 11:29 PM Chris Cunningham
<chcunningham@chromium.org> wrote:
>
> On Wed, Jan 30, 2019 at 5:43 PM James Almer <jamrial@gmail.com> wrote:
>
> > On 1/30/2019 10:20 PM, Chris Cunningham wrote:
> > >>
> > >>>> And this definitely means there's a bug elsewhere. This parser should
> > >>>> have not been used for codecs ids other than GSM and GSM_MS. That's
> > >>>> precisely what this assert() is making sure of.
> > >>>>
> > >>>
> > >>> I'll take a closer look at how this parser was selected.
> > >>
> > >
> > > Ok, here are full details of how this happens:
> > >
> > >    1. AV_CODEC_ID_GSM_MS is assigned to on st->codecpar->codec_id
> > >    in ogm_header() (oggparseogm.c:75)
> > >    2. The (deprecated) st->codec->codec_id is not assigned at that time
> > and
> > >    remains 0 (UNKNOWN)
> > >    3. AV_CODEC_ID_GSM_MS from st->codecpar is passed to av_parser_init,
> > >    selecting the gsm parser (in read_frame_internal())
> > >    4. The fuzzer next (only) packet has a size of 0, so the first call to
> > >    parse_packet() (still in read_frame_internal()) does nothing
> > >    5. After this call we assign several members of st->internal->avctx to
> > >    st->codecpar. This leaves codecpar->codec_id = UNKNOWN.
> > Interestingly, we
> > >    do not reset the st->parser at this point (maybe we should?)
> >
> > Where does this happen? A call to avcodec_parameters_from_context()
> > should also copy codec_id.
> >
>
> Ignore #5 here - I'm not seeing that today - was likely confused.
>
>
> >
> > >    6. Next iteration of the read_frame_internal() loop we get EOF from
> > >    ff_read_packet. This triggers the "flush the parsers" call to
> > >    parse_packet() which finally reaches gsm_parse() to trigger the abort
> > >    (avctx->codec_id is still 0).
> > >
> > > Questions (guessing at the fix):
> > >
> > >    - At what point should st->codec->codec_id be synchronized with
> > >    st->codecpar->codec_id? It never is in this test.
> >
> > In avformat_find_stream_info(), i think. In any case, st->codec should
> > have no effect on parsers. What is used in parse_packet() however is
> > st->internal->avctx, so that context is the one that evidently contains
> > codec_id == UNKNOWN when it should be in sync with codecpar.
> >
>
> We do call avformat_find_stream_info, and avcodec_parameters_from_context
> is called during that process. Everything seems OK when
> avformat_find_stream_info is done. The codecpar->codec_id is in sync with
> internal->avctx->codec_id for all streams.
>
> But then we start calling av_read_frame as part of normal demuxing.
> avcodec_parameters_from_context() does not appear to be called during this
> process. Eventually we get this stack:
>
> ogm_header
> ogg_packet
> ogg_read_packet
> ff_read_packet
> read_frame_internal
> av_read_frame
>
> Inside ogm_header, st->codecpar->codec_id is assigned AV_CODEC_ID_GSM_MS
> (previous value was codec NONE). But *st->internal->avctx->codec_id is
> never updated. It remains NONE for the rest of this test. *
>
> When this unwinds back to read_frame_internal, st->parser is assigned using
> this new codec (GSM_MS)
>     st->parser = av_parser_init(st->codecpar->codec_id);
>
> Later on, still inside read_frame_internal's loop, we get EOF and call,
> parse_packet (/* flush the parsers */)
> parse_packet() calls av_parser_parse2(), passing st->internal->avctx
> (codec_id still NONE, while codecpar still says GSM_MS). This ultimately
> gets passed to gsm_parse, which triggers the assert0.
>
> The underlined bit above seems like the root cause. Where should we be
> updating st->internal->avctx->codec_id?

We have a flag to set that causes avformat to fix its internal state
if a demuxer changes it after the initial opening.
st->internal->need_context_update = 1

Try setting that at the position where the demuxer changes codecpar
(ie. in ogm_header?), and see if that resolves it.

- Hendrik
chcunningham@chromium.org Feb. 7, 2019, 1:12 a.m. UTC | #12
Thanks, this works great. Stand by for patch.

On Wed, Feb 6, 2019 at 8:38 AM Hendrik Leppkes <h.leppkes@gmail.com> wrote:

> On Thu, Jan 31, 2019 at 11:29 PM Chris Cunningham
> <chcunningham@chromium.org> wrote:
> >
> > On Wed, Jan 30, 2019 at 5:43 PM James Almer <jamrial@gmail.com> wrote:
> >
> > > On 1/30/2019 10:20 PM, Chris Cunningham wrote:
> > > >>
> > > >>>> And this definitely means there's a bug elsewhere. This parser
> should
> > > >>>> have not been used for codecs ids other than GSM and GSM_MS.
> That's
> > > >>>> precisely what this assert() is making sure of.
> > > >>>>
> > > >>>
> > > >>> I'll take a closer look at how this parser was selected.
> > > >>
> > > >
> > > > Ok, here are full details of how this happens:
> > > >
> > > >    1. AV_CODEC_ID_GSM_MS is assigned to on st->codecpar->codec_id
> > > >    in ogm_header() (oggparseogm.c:75)
> > > >    2. The (deprecated) st->codec->codec_id is not assigned at that
> time
> > > and
> > > >    remains 0 (UNKNOWN)
> > > >    3. AV_CODEC_ID_GSM_MS from st->codecpar is passed to
> av_parser_init,
> > > >    selecting the gsm parser (in read_frame_internal())
> > > >    4. The fuzzer next (only) packet has a size of 0, so the first
> call to
> > > >    parse_packet() (still in read_frame_internal()) does nothing
> > > >    5. After this call we assign several members of
> st->internal->avctx to
> > > >    st->codecpar. This leaves codecpar->codec_id = UNKNOWN.
> > > Interestingly, we
> > > >    do not reset the st->parser at this point (maybe we should?)
> > >
> > > Where does this happen? A call to avcodec_parameters_from_context()
> > > should also copy codec_id.
> > >
> >
> > Ignore #5 here - I'm not seeing that today - was likely confused.
> >
> >
> > >
> > > >    6. Next iteration of the read_frame_internal() loop we get EOF
> from
> > > >    ff_read_packet. This triggers the "flush the parsers" call to
> > > >    parse_packet() which finally reaches gsm_parse() to trigger the
> abort
> > > >    (avctx->codec_id is still 0).
> > > >
> > > > Questions (guessing at the fix):
> > > >
> > > >    - At what point should st->codec->codec_id be synchronized with
> > > >    st->codecpar->codec_id? It never is in this test.
> > >
> > > In avformat_find_stream_info(), i think. In any case, st->codec should
> > > have no effect on parsers. What is used in parse_packet() however is
> > > st->internal->avctx, so that context is the one that evidently contains
> > > codec_id == UNKNOWN when it should be in sync with codecpar.
> > >
> >
> > We do call avformat_find_stream_info, and avcodec_parameters_from_context
> > is called during that process. Everything seems OK when
> > avformat_find_stream_info is done. The codecpar->codec_id is in sync with
> > internal->avctx->codec_id for all streams.
> >
> > But then we start calling av_read_frame as part of normal demuxing.
> > avcodec_parameters_from_context() does not appear to be called during
> this
> > process. Eventually we get this stack:
> >
> > ogm_header
> > ogg_packet
> > ogg_read_packet
> > ff_read_packet
> > read_frame_internal
> > av_read_frame
> >
> > Inside ogm_header, st->codecpar->codec_id is assigned AV_CODEC_ID_GSM_MS
> > (previous value was codec NONE). But *st->internal->avctx->codec_id is
> > never updated. It remains NONE for the rest of this test. *
> >
> > When this unwinds back to read_frame_internal, st->parser is assigned
> using
> > this new codec (GSM_MS)
> >     st->parser = av_parser_init(st->codecpar->codec_id);
> >
> > Later on, still inside read_frame_internal's loop, we get EOF and call,
> > parse_packet (/* flush the parsers */)
> > parse_packet() calls av_parser_parse2(), passing st->internal->avctx
> > (codec_id still NONE, while codecpar still says GSM_MS). This ultimately
> > gets passed to gsm_parse, which triggers the assert0.
> >
> > The underlined bit above seems like the root cause. Where should we be
> > updating st->internal->avctx->codec_id?
>
> We have a flag to set that causes avformat to fix its internal state
> if a demuxer changes it after the initial opening.
> st->internal->need_context_update = 1
>
> Try setting that at the position where the demuxer changes codecpar
> (ie. in ogm_header?), and see if that resolves it.
>
> - Hendrik
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
diff mbox

Patch

diff --git a/libavcodec/gsm_parser.c b/libavcodec/gsm_parser.c
index 1054a30ca9..5cf2235f73 100644
--- a/libavcodec/gsm_parser.c
+++ b/libavcodec/gsm_parser.c
@@ -56,7 +56,7 @@  static int gsm_parse(AVCodecParserContext *s1, AVCodecContext *avctx,
             s->duration   = GSM_FRAME_SIZE * 2;
             break;
         default:
-            av_assert0(0);
+            return -1;
         }
     }