diff mbox series

[FFmpeg-devel] avutil/channel_layout: don't error out on truncated strings

Message ID 20220702222532.9609-1-jamrial@gmail.com
State New
Headers show
Series [FFmpeg-devel] avutil/channel_layout: don't error out on truncated strings | expand

Checks

Context Check Description
yinshiyou/make_loongarch64 success Make finished
yinshiyou/make_fate_loongarch64 success Make fate finished
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished
andriy/make_armv7_RPi4 success Make finished
andriy/make_fate_armv7_RPi4 success Make fate finished

Commit Message

James Almer July 2, 2022, 10:25 p.m. UTC
The doxy for av_channel_layout_describe() states that the user should look
at the return value to check if the string was truncated. Returning an error
code in this scenario goes against this and is an API break.

This is a better fix than the one committed in 8154cb7c2f for the timeout
fuzzing issue.

Signed-off-by: James Almer <jamrial@gmail.com>
---
 libavutil/channel_layout.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Andreas Rheinhardt July 3, 2022, 7:19 a.m. UTC | #1
James Almer:
> The doxy for av_channel_layout_describe() states that the user should look
> at the return value to check if the string was truncated. Returning an error
> code in this scenario goes against this and is an API break.
> 
> This is a better fix than the one committed in 8154cb7c2f for the timeout
> fuzzing issue.
> 
> Signed-off-by: James Almer <jamrial@gmail.com>
> ---
>  libavutil/channel_layout.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavutil/channel_layout.c b/libavutil/channel_layout.c
> index 1887987789..d54a0adbe1 100644
> --- a/libavutil/channel_layout.c
> +++ b/libavutil/channel_layout.c
> @@ -759,7 +759,7 @@ int av_channel_layout_describe_bprint(const AVChannelLayout *channel_layout,
>                  av_bprintf(bp, "@%s", channel_layout->u.map[i].name);
>  
>              if (!av_bprint_is_complete(bp))
> -                return AVERROR(ENOMEM);
> +                break;
>  
>          }
>          if (channel_layout->nb_channels) {

Isn't this actually still against the API? av_channel_layout_describe()
will not return the correct number of bytes necessary to write the
string for the channel layout.

- Andreas
Nicolas George July 3, 2022, 10 a.m. UTC | #2
Andreas Rheinhardt (12022-07-03):
> >              if (!av_bprint_is_complete(bp))
> > -                return AVERROR(ENOMEM);
> > +                break;
> >  

> Isn't this actually still against the API? av_channel_layout_describe()
> will not return the correct number of bytes necessary to write the
> string for the channel layout.

You are both right.

BPrint-based APIs are not supposed to check for truncation, because
printing into a bounded buffer to determine the necessary size is a
valid use (see AV_BPRINT_SIZE_COUNT_ONLY).

What is wrong is Michael's original fix:

>> commit 8154cb7c2ff2afcb1a0842de8c215b7714c814d0
>> Author: Michael Niedermayer <michael@niedermayer.cc>
>> Date:   2022-06-30 00:00:32 +0200
>> 
>>     avutil/channel_layout: av_channel_layout_describe_bprint: Check for buffer end
>> 
>>     Fixes: Timeout printing a billion channels
>>     Fixes: 48099/clusterfuzz-testcase-minimized-ffmpeg_dem_MATROSKA_fuzzer-6754782204788736
>> 
>>     Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>>     Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
>> 
>> diff --git a/libavutil/channel_layout.c b/libavutil/channel_layout.c
>> index 21b70173b7..1887987789 100644
>> --- a/libavutil/channel_layout.c
>> +++ b/libavutil/channel_layout.c
>> @@ -757,6 +757,10 @@ int av_channel_layout_describe_bprint(const AVChannelLayout *channel_layout,
>>              if (channel_layout->order == AV_CHANNEL_ORDER_CUSTOM &&

>>                  channel_layout->u.map[i].name[0])

If the channel count is insanely high, then this will cause invalid
reads.

>>                  av_bprintf(bp, "@%s", channel_layout->u.map[i].name);
>> +
>> +            if (!av_bprint_is_complete(bp))
>> +                return AVERROR(ENOMEM);
>> +
>>          }
>>          if (channel_layout->nb_channels) {
>>              av_bprintf(bp, ")");

Obviously, this fuzzer found a case where a demuxer or a decoder
constructs an invalid channel layout in memory without proper
validation. There is a bug, possibly an security-related one, and this
only hides it from the test suite.

The proper fix is to find where the input is improperly validated, and
then revert this change.

I do not know how to exploit the
"48099/clusterfuzz-testcase-minimized-ffmpeg_dem_MATROSKA_fuzzer-6754782204788736"
information to reproduce the bug and investigate.

Regards,
James Almer July 4, 2022, 12:27 a.m. UTC | #3
On 7/3/2022 7:00 AM, Nicolas George wrote:
> Andreas Rheinhardt (12022-07-03):
>>>               if (!av_bprint_is_complete(bp))
>>> -                return AVERROR(ENOMEM);
>>> +                break;
>>>   
> 
>> Isn't this actually still against the API? av_channel_layout_describe()
>> will not return the correct number of bytes necessary to write the
>> string for the channel layout.
> 
> You are both right.
> 
> BPrint-based APIs are not supposed to check for truncation, because
> printing into a bounded buffer to determine the necessary size is a
> valid use (see AV_BPRINT_SIZE_COUNT_ONLY).
> 
> What is wrong is Michael's original fix:
> 
>>> commit 8154cb7c2ff2afcb1a0842de8c215b7714c814d0
>>> Author: Michael Niedermayer <michael@niedermayer.cc>
>>> Date:   2022-06-30 00:00:32 +0200
>>>
>>>      avutil/channel_layout: av_channel_layout_describe_bprint: Check for buffer end
>>>
>>>      Fixes: Timeout printing a billion channels
>>>      Fixes: 48099/clusterfuzz-testcase-minimized-ffmpeg_dem_MATROSKA_fuzzer-6754782204788736
>>>
>>>      Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>>>      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
>>>
>>> diff --git a/libavutil/channel_layout.c b/libavutil/channel_layout.c
>>> index 21b70173b7..1887987789 100644
>>> --- a/libavutil/channel_layout.c
>>> +++ b/libavutil/channel_layout.c
>>> @@ -757,6 +757,10 @@ int av_channel_layout_describe_bprint(const AVChannelLayout *channel_layout,
>>>               if (channel_layout->order == AV_CHANNEL_ORDER_CUSTOM &&
> 
>>>                   channel_layout->u.map[i].name[0])
> 
> If the channel count is insanely high, then this will cause invalid
> reads.
> 
>>>                   av_bprintf(bp, "@%s", channel_layout->u.map[i].name);
>>> +
>>> +            if (!av_bprint_is_complete(bp))
>>> +                return AVERROR(ENOMEM);
>>> +
>>>           }
>>>           if (channel_layout->nb_channels) {
>>>               av_bprintf(bp, ")");
> 
> Obviously, this fuzzer found a case where a demuxer or a decoder
> constructs an invalid channel layout in memory without proper
> validation. There is a bug, possibly an security-related one, and this
> only hides it from the test suite.

The Matroska demuxer could in theory generate a native layout with more 
than 64 channels where popcnt(mask) != channels, and nothing seems to 
validate what a demuxer's read_header() callback returns if you just 
call lavf API functions like target_dem_fuzzer.c seems to do.

Maybe:

> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
> index 73ded761fd..ad7ee390a2 100644
> --- a/libavformat/matroskadec.c
> +++ b/libavformat/matroskadec.c
> @@ -2950,10 +2950,10 @@ static int matroska_parse_tracks(AVFormatContext *s)
>              st->codecpar->codec_tag   = fourcc;
>              st->codecpar->sample_rate = track->audio.out_samplerate;
>              // channel layout may be already set by codec private checks above
> -            if (st->codecpar->ch_layout.order == AV_CHANNEL_ORDER_NATIVE &&
> -                !st->codecpar->ch_layout.u.mask)
> +            if (!av_channel_layout_check(&st->codecpar->ch_layout)) {
>                  st->codecpar->ch_layout.order = AV_CHANNEL_ORDER_UNSPEC;
> -            st->codecpar->ch_layout.nb_channels = track->audio.channels;
> +                st->codecpar->ch_layout.nb_channels = track->audio.channels;
> +            }
>              if (!st->codecpar->bits_per_coded_sample)
>                  st->codecpar->bits_per_coded_sample = track->audio.bitdepth;
>              if (st->codecpar->codec_id == AV_CODEC_ID_MP3 ||

is enough to ensure a valid layout is propagated. This assumes 
parameters set by codec private parsing are correct (only FLAC seem to 
be present right now, and it is).

Assuming i got this right, in this fuzzing sample's case it would still 
have a billion channels (since that's what track->audio.channels 
contains, as read from the container), but using the unspec layout, 
which is technically valid even if nothing will really handle it, and 
av_channel_layout_describe() will print a small string.

Still, i think a check in avformat_open_input() might also be a good 
idea, especially once (and if) demuxers start propagating custom 
layouts, where the map array will be allocated.

> 
> The proper fix is to find where the input is improperly validated, and
> then revert this change.
> 
> I do not know how to exploit the
> "48099/clusterfuzz-testcase-minimized-ffmpeg_dem_MATROSKA_fuzzer-6754782204788736"
> information to reproduce the bug and investigate.

Michael can share the sample, but you need to compile 
tools/target_dem_fuzzer.c to read it.

> 
> Regards,
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
Michael Niedermayer July 4, 2022, 4:55 p.m. UTC | #4
On Sun, Jul 03, 2022 at 09:27:31PM -0300, James Almer wrote:
> On 7/3/2022 7:00 AM, Nicolas George wrote:
> > Andreas Rheinhardt (12022-07-03):
> > > >               if (!av_bprint_is_complete(bp))
> > > > -                return AVERROR(ENOMEM);
> > > > +                break;
> > 
> > > Isn't this actually still against the API? av_channel_layout_describe()
> > > will not return the correct number of bytes necessary to write the
> > > string for the channel layout.
> > 
> > You are both right.
> > 
> > BPrint-based APIs are not supposed to check for truncation, because
> > printing into a bounded buffer to determine the necessary size is a
> > valid use (see AV_BPRINT_SIZE_COUNT_ONLY).
> > 
> > What is wrong is Michael's original fix:
> > 
> > > > commit 8154cb7c2ff2afcb1a0842de8c215b7714c814d0
> > > > Author: Michael Niedermayer <michael@niedermayer.cc>
> > > > Date:   2022-06-30 00:00:32 +0200
> > > > 
> > > >      avutil/channel_layout: av_channel_layout_describe_bprint: Check for buffer end
> > > > 
> > > >      Fixes: Timeout printing a billion channels
> > > >      Fixes: 48099/clusterfuzz-testcase-minimized-ffmpeg_dem_MATROSKA_fuzzer-6754782204788736
> > > > 
> > > >      Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > > >      Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > > 
> > > > diff --git a/libavutil/channel_layout.c b/libavutil/channel_layout.c
> > > > index 21b70173b7..1887987789 100644
> > > > --- a/libavutil/channel_layout.c
> > > > +++ b/libavutil/channel_layout.c
> > > > @@ -757,6 +757,10 @@ int av_channel_layout_describe_bprint(const AVChannelLayout *channel_layout,
> > > >               if (channel_layout->order == AV_CHANNEL_ORDER_CUSTOM &&
> > 
> > > >                   channel_layout->u.map[i].name[0])
> > 
> > If the channel count is insanely high, then this will cause invalid
> > reads.
> > 
> > > >                   av_bprintf(bp, "@%s", channel_layout->u.map[i].name);
> > > > +
> > > > +            if (!av_bprint_is_complete(bp))
> > > > +                return AVERROR(ENOMEM);
> > > > +
> > > >           }
> > > >           if (channel_layout->nb_channels) {
> > > >               av_bprintf(bp, ")");
> > 
> > Obviously, this fuzzer found a case where a demuxer or a decoder
> > constructs an invalid channel layout in memory without proper
> > validation. There is a bug, possibly an security-related one, and this
> > only hides it from the test suite.
> 
> The Matroska demuxer could in theory generate a native layout with more than
> 64 channels where popcnt(mask) != channels, and nothing seems to validate
> what a demuxer's read_header() callback returns if you just call lavf API
> functions like target_dem_fuzzer.c seems to do.
> 
> Maybe:
> 
> > diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
> > index 73ded761fd..ad7ee390a2 100644
> > --- a/libavformat/matroskadec.c
> > +++ b/libavformat/matroskadec.c
> > @@ -2950,10 +2950,10 @@ static int matroska_parse_tracks(AVFormatContext *s)
> >              st->codecpar->codec_tag   = fourcc;
> >              st->codecpar->sample_rate = track->audio.out_samplerate;
> >              // channel layout may be already set by codec private checks above
> > -            if (st->codecpar->ch_layout.order == AV_CHANNEL_ORDER_NATIVE &&
> > -                !st->codecpar->ch_layout.u.mask)
> > +            if (!av_channel_layout_check(&st->codecpar->ch_layout)) {
> >                  st->codecpar->ch_layout.order = AV_CHANNEL_ORDER_UNSPEC;
> > -            st->codecpar->ch_layout.nb_channels = track->audio.channels;
> > +                st->codecpar->ch_layout.nb_channels = track->audio.channels;
> > +            }
> >              if (!st->codecpar->bits_per_coded_sample)
> >                  st->codecpar->bits_per_coded_sample = track->audio.bitdepth;
> >              if (st->codecpar->codec_id == AV_CODEC_ID_MP3 ||
> 
> is enough to ensure a valid layout is propagated. This assumes parameters
> set by codec private parsing are correct (only FLAC seem to be present right
> now, and it is).
> 
> Assuming i got this right, in this fuzzing sample's case it would still have
> a billion channels (since that's what track->audio.channels contains, as
> read from the container), but using the unspec layout, which is technically
> valid even if nothing will really handle it, and
> av_channel_layout_describe() will print a small string.

seems this is fixing it
thanks


> 
> Still, i think a check in avformat_open_input() might also be a good idea,

yes, i agree 


> especially once (and if) demuxers start propagating custom layouts, where
> the map array will be allocated.

[...]
James Almer July 4, 2022, 4:57 p.m. UTC | #5
On 7/4/2022 1:55 PM, Michael Niedermayer wrote:
> On Sun, Jul 03, 2022 at 09:27:31PM -0300, James Almer wrote:
>> On 7/3/2022 7:00 AM, Nicolas George wrote:
>>> Andreas Rheinhardt (12022-07-03):
>>>>>                if (!av_bprint_is_complete(bp))
>>>>> -                return AVERROR(ENOMEM);
>>>>> +                break;
>>>
>>>> Isn't this actually still against the API? av_channel_layout_describe()
>>>> will not return the correct number of bytes necessary to write the
>>>> string for the channel layout.
>>>
>>> You are both right.
>>>
>>> BPrint-based APIs are not supposed to check for truncation, because
>>> printing into a bounded buffer to determine the necessary size is a
>>> valid use (see AV_BPRINT_SIZE_COUNT_ONLY).
>>>
>>> What is wrong is Michael's original fix:
>>>
>>>>> commit 8154cb7c2ff2afcb1a0842de8c215b7714c814d0
>>>>> Author: Michael Niedermayer <michael@niedermayer.cc>
>>>>> Date:   2022-06-30 00:00:32 +0200
>>>>>
>>>>>       avutil/channel_layout: av_channel_layout_describe_bprint: Check for buffer end
>>>>>
>>>>>       Fixes: Timeout printing a billion channels
>>>>>       Fixes: 48099/clusterfuzz-testcase-minimized-ffmpeg_dem_MATROSKA_fuzzer-6754782204788736
>>>>>
>>>>>       Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>>>>>       Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
>>>>>
>>>>> diff --git a/libavutil/channel_layout.c b/libavutil/channel_layout.c
>>>>> index 21b70173b7..1887987789 100644
>>>>> --- a/libavutil/channel_layout.c
>>>>> +++ b/libavutil/channel_layout.c
>>>>> @@ -757,6 +757,10 @@ int av_channel_layout_describe_bprint(const AVChannelLayout *channel_layout,
>>>>>                if (channel_layout->order == AV_CHANNEL_ORDER_CUSTOM &&
>>>
>>>>>                    channel_layout->u.map[i].name[0])
>>>
>>> If the channel count is insanely high, then this will cause invalid
>>> reads.
>>>
>>>>>                    av_bprintf(bp, "@%s", channel_layout->u.map[i].name);
>>>>> +
>>>>> +            if (!av_bprint_is_complete(bp))
>>>>> +                return AVERROR(ENOMEM);
>>>>> +
>>>>>            }
>>>>>            if (channel_layout->nb_channels) {
>>>>>                av_bprintf(bp, ")");
>>>
>>> Obviously, this fuzzer found a case where a demuxer or a decoder
>>> constructs an invalid channel layout in memory without proper
>>> validation. There is a bug, possibly an security-related one, and this
>>> only hides it from the test suite.
>>
>> The Matroska demuxer could in theory generate a native layout with more than
>> 64 channels where popcnt(mask) != channels, and nothing seems to validate
>> what a demuxer's read_header() callback returns if you just call lavf API
>> functions like target_dem_fuzzer.c seems to do.
>>
>> Maybe:
>>
>>> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
>>> index 73ded761fd..ad7ee390a2 100644
>>> --- a/libavformat/matroskadec.c
>>> +++ b/libavformat/matroskadec.c
>>> @@ -2950,10 +2950,10 @@ static int matroska_parse_tracks(AVFormatContext *s)
>>>               st->codecpar->codec_tag   = fourcc;
>>>               st->codecpar->sample_rate = track->audio.out_samplerate;
>>>               // channel layout may be already set by codec private checks above
>>> -            if (st->codecpar->ch_layout.order == AV_CHANNEL_ORDER_NATIVE &&
>>> -                !st->codecpar->ch_layout.u.mask)
>>> +            if (!av_channel_layout_check(&st->codecpar->ch_layout)) {
>>>                   st->codecpar->ch_layout.order = AV_CHANNEL_ORDER_UNSPEC;
>>> -            st->codecpar->ch_layout.nb_channels = track->audio.channels;
>>> +                st->codecpar->ch_layout.nb_channels = track->audio.channels;
>>> +            }
>>>               if (!st->codecpar->bits_per_coded_sample)
>>>                   st->codecpar->bits_per_coded_sample = track->audio.bitdepth;
>>>               if (st->codecpar->codec_id == AV_CODEC_ID_MP3 ||
>>
>> is enough to ensure a valid layout is propagated. This assumes parameters
>> set by codec private parsing are correct (only FLAC seem to be present right
>> now, and it is).
>>
>> Assuming i got this right, in this fuzzing sample's case it would still have
>> a billion channels (since that's what track->audio.channels contains, as
>> read from the container), but using the unspec layout, which is technically
>> valid even if nothing will really handle it, and
>> av_channel_layout_describe() will print a small string.
> 
> seems this is fixing it
> thanks

Ok, will apply it and revert 8154cb7c2f.

> 
> 
>>
>> Still, i think a check in avformat_open_input() might also be a good idea,
> 
> yes, i agree
> 
> 
>> especially once (and if) demuxers start propagating custom layouts, where
>> the map array will be allocated.
> 
> [...]
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request@ffmpeg.org with subject "unsubscribe".
diff mbox series

Patch

diff --git a/libavutil/channel_layout.c b/libavutil/channel_layout.c
index 1887987789..d54a0adbe1 100644
--- a/libavutil/channel_layout.c
+++ b/libavutil/channel_layout.c
@@ -759,7 +759,7 @@  int av_channel_layout_describe_bprint(const AVChannelLayout *channel_layout,
                 av_bprintf(bp, "@%s", channel_layout->u.map[i].name);
 
             if (!av_bprint_is_complete(bp))
-                return AVERROR(ENOMEM);
+                break;
 
         }
         if (channel_layout->nb_channels) {