diff mbox

[FFmpeg-devel,1/3] 4xm: prevent overflow during bit rate calculation

Message ID bb99943c-8114-22d9-630a-44c383c59908@googlemail.com
State Accepted
Commit e558a6348ac10e74c54fb50ffd783ff9b5aec050
Headers show

Commit Message

Andreas Cadhalpun Dec. 14, 2016, 12:57 a.m. UTC
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
---
 libavformat/4xm.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Ronald S. Bultje Dec. 14, 2016, 1:01 a.m. UTC | #1
Hi,

On Tue, Dec 13, 2016 at 7:57 PM, Andreas Cadhalpun <
andreas.cadhalpun@googlemail.com> wrote:

> Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
> ---
>  libavformat/4xm.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/libavformat/4xm.c b/libavformat/4xm.c
> index 8a50778..2758b69 100644
> --- a/libavformat/4xm.c
> +++ b/libavformat/4xm.c
> @@ -163,6 +163,12 @@ static int parse_strk(AVFormatContext *s,
>          return AVERROR_INVALIDDATA;
>      }
>
> +    if (fourxm->tracks[track].sample_rate > INT64_MAX /
> fourxm->tracks[track].bits / fourxm->tracks[track].channels) {
> +        av_log(s, AV_LOG_ERROR, "Overflow during bit rate calculation %d
> * %d * %d\n",
> +               fourxm->tracks[track].sample_rate,
> fourxm->tracks[track].bits, fourxm->tracks[track].channels);
> +        return AVERROR_INVALIDDATA;
> +    }


What is the functional effect of the overflow? Does it crash? Or is there
some other security issue?

Ronald
Andreas Cadhalpun Dec. 14, 2016, 1:21 a.m. UTC | #2
On 14.12.2016 02:01, Ronald S. Bultje wrote:
> Hi,
> 
> On Tue, Dec 13, 2016 at 7:57 PM, Andreas Cadhalpun <
> andreas.cadhalpun@googlemail.com> wrote:
> 
>> Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
>> ---
>>  libavformat/4xm.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavformat/4xm.c b/libavformat/4xm.c
>> index 8a50778..2758b69 100644
>> --- a/libavformat/4xm.c
>> +++ b/libavformat/4xm.c
>> @@ -163,6 +163,12 @@ static int parse_strk(AVFormatContext *s,
>>          return AVERROR_INVALIDDATA;
>>      }
>>
>> +    if (fourxm->tracks[track].sample_rate > INT64_MAX /
>> fourxm->tracks[track].bits / fourxm->tracks[track].channels) {
>> +        av_log(s, AV_LOG_ERROR, "Overflow during bit rate calculation %d
>> * %d * %d\n",
>> +               fourxm->tracks[track].sample_rate,
>> fourxm->tracks[track].bits, fourxm->tracks[track].channels);
>> +        return AVERROR_INVALIDDATA;
>> +    }
> 
> 
> What is the functional effect of the overflow?

It is undefined behavior.

> Does it crash? Or is there some other security issue?

The most likely behavior is that bit_rate will contain a negative value,
which might cause problems later on, but I'm not aware of specific
security issues caused by this.

Best regards,
Andreas
Ronald S. Bultje Dec. 14, 2016, 1:46 a.m. UTC | #3
Hi,

On Tue, Dec 13, 2016 at 8:21 PM, Andreas Cadhalpun <
andreas.cadhalpun@googlemail.com> wrote:

> On 14.12.2016 02:01, Ronald S. Bultje wrote:
> > Hi,
> >
> > On Tue, Dec 13, 2016 at 7:57 PM, Andreas Cadhalpun <
> > andreas.cadhalpun@googlemail.com> wrote:
> >
> >> Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
> >> ---
> >>  libavformat/4xm.c | 8 +++++++-
> >>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/libavformat/4xm.c b/libavformat/4xm.c
> >> index 8a50778..2758b69 100644
> >> --- a/libavformat/4xm.c
> >> +++ b/libavformat/4xm.c
> >> @@ -163,6 +163,12 @@ static int parse_strk(AVFormatContext *s,
> >>          return AVERROR_INVALIDDATA;
> >>      }
> >>
> >> +    if (fourxm->tracks[track].sample_rate > INT64_MAX /
> >> fourxm->tracks[track].bits / fourxm->tracks[track].channels) {
> >> +        av_log(s, AV_LOG_ERROR, "Overflow during bit rate calculation
> %d
> >> * %d * %d\n",
> >> +               fourxm->tracks[track].sample_rate,
> >> fourxm->tracks[track].bits, fourxm->tracks[track].channels);
> >> +        return AVERROR_INVALIDDATA;
> >> +    }
> >
> >
> > What is the functional effect of the overflow?
>
> It is undefined behavior.
>
> > Does it crash? Or is there some other security issue?
>
> The most likely behavior is that bit_rate will contain a negative value,
> which might cause problems later on, but I'm not aware of specific
> security issues caused by this.


Not wanting to discourage you, but I wonder if there's really a point to
this...? I don't see how the user experience changes.

This isn't specifically intended at this patch, but rather at the sort of
rabbit hole this change might lead to, which would cause the code to be
uber-full of such checks, none of which really have any significance. But
maybe others disagree...

Ronald
Michael Niedermayer Dec. 14, 2016, 2:54 a.m. UTC | #4
On Wed, Dec 14, 2016 at 01:57:54AM +0100, Andreas Cadhalpun wrote:
> Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
> ---
>  libavformat/4xm.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)

LGTM

thx

[...]
Andreas Cadhalpun Dec. 15, 2016, 12:11 a.m. UTC | #5
On 14.12.2016 02:46, Ronald S. Bultje wrote:
> Not wanting to discourage you, but I wonder if there's really a point to
> this...?

These patches are prerequisites for enforcing the validity of codec
parameters [1].

> I don't see how the user experience changes.

Users won't see ffmpeg claiming nonsense bit rates like -1184293205235990 kb/s
anymore.

> This isn't specifically intended at this patch, but rather at the sort of
> rabbit hole this change might lead to,

I have a pretty good map of this rabbit hole (i.e. lots of samples triggering
UBSan errors) and one day I might try to dig it up, but for now I'm limiting
myself to the codec parameters.

> which would cause the code to be uber-full of such checks, none of which
> really have any significance. But maybe others disagree...

Not relying on undefined behavior is a significant improvement. And doing
these checks consequently where the values are set makes it possible
for other code to rely on their validity without further checks.

Best regards,
Andreas


1: https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20161026/c0151e90/attachment.patch
Andreas Cadhalpun Dec. 15, 2016, 12:29 a.m. UTC | #6
On 14.12.2016 03:54, Michael Niedermayer wrote:
> On Wed, Dec 14, 2016 at 01:57:54AM +0100, Andreas Cadhalpun wrote:
>> Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
>> ---
>>  libavformat/4xm.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> LGTM

Pushed.

Best regards,
Andreas
Ronald S. Bultje Dec. 15, 2016, 1:02 p.m. UTC | #7
Hi,

On Wed, Dec 14, 2016 at 7:11 PM, Andreas Cadhalpun <
andreas.cadhalpun@googlemail.com> wrote:

> On 14.12.2016 02:46, Ronald S. Bultje wrote:
> > Not wanting to discourage you, but I wonder if there's really a point to
> > this...?
>
> These patches are prerequisites for enforcing the validity of codec
> parameters [1].
>
> > I don't see how the user experience changes.
>
> Users won't see ffmpeg claiming nonsense bit rates like -1184293205235990
> kb/s
> anymore.


I don't think you understand my question.

- does this belong in 4xm.c? (Or in generic code? Or in the app?)
- should it return an error? (Or just clip parameters? Or ignore the
invalid value?)

> This isn't specifically intended at this patch, but rather at the sort of
> > rabbit hole this change might lead to,
>
> I have a pretty good map of this rabbit hole (i.e. lots of samples
> triggering
> UBSan errors) and one day I might try to dig it up, but for now I'm
> limiting
> myself to the codec parameters.


I'm not saying mplayer was great, but one of the principles I believe we
always copied from them was to try to play files to the best of our
abilities and not error out at the first convenience. This isn't just a
theoretical thing, a lot of people credited mplayer with playing utterly
broken AVI files that probably even ffmpeg rejects. What ffmpeg added on
top of that is to make a project maintainable by not being an utter
crapshoot.

This isn't 4xm.c-specific, this is a general philosophical question:
- should we error out?
- should this be in generic code if we're likely to repeat such checks all
over the place?

> which would cause the code to be uber-full of such checks, none of which
> > really have any significance. But maybe others disagree...
>
> Not relying on undefined behavior is a significant improvement. And doing
> these checks consequently where the values are set makes it possible
> for other code to rely on their validity without further checks.


Unless "UB" leads to actual bad behaviour, I personally don't necessarily
care. I'm very scared that when you go beyond codec parameters, and you
want to check for overflows all over the place, we'll never see the end of
it...

I'd appreciate if others could chime in here, I don't want to carry this
argument all by myself if nobody cares.

Ronald
Michael Niedermayer Dec. 15, 2016, 2:28 p.m. UTC | #8
On Thu, Dec 15, 2016 at 08:02:52AM -0500, Ronald S. Bultje wrote:
> Hi,
> 
> On Wed, Dec 14, 2016 at 7:11 PM, Andreas Cadhalpun <
> andreas.cadhalpun@googlemail.com> wrote:
> 
> > On 14.12.2016 02:46, Ronald S. Bultje wrote:
> > > Not wanting to discourage you, but I wonder if there's really a point to
> > > this...?
> >
> > These patches are prerequisites for enforcing the validity of codec
> > parameters [1].
> >
> > > I don't see how the user experience changes.
> >
> > Users won't see ffmpeg claiming nonsense bit rates like -1184293205235990
> > kb/s
> > anymore.
> 
> 
> I don't think you understand my question.
> 
> - does this belong in 4xm.c? (Or in generic code? Or in the app?)
> - should it return an error? (Or just clip parameters? Or ignore the
> invalid value?)
> 
> > This isn't specifically intended at this patch, but rather at the sort of
> > > rabbit hole this change might lead to,
> >
> > I have a pretty good map of this rabbit hole (i.e. lots of samples
> > triggering
> > UBSan errors) and one day I might try to dig it up, but for now I'm
> > limiting
> > myself to the codec parameters.
> 
> 
> I'm not saying mplayer was great, but one of the principles I believe we
> always copied from them was to try to play files to the best of our
> abilities and not error out at the first convenience. This isn't just a
> theoretical thing, a lot of people credited mplayer with playing utterly
> broken AVI files that probably even ffmpeg rejects. What ffmpeg added on
> top of that is to make a project maintainable by not being an utter
> crapshoot.
> 
> This isn't 4xm.c-specific, this is a general philosophical question:
> - should we error out?
> - should this be in generic code if we're likely to repeat such checks all
> over the place?
> 
> > which would cause the code to be uber-full of such checks, none of which
> > > really have any significance. But maybe others disagree...
> >
> > Not relying on undefined behavior is a significant improvement. And doing
> > these checks consequently where the values are set makes it possible
> > for other code to rely on their validity without further checks.
> 
> 
> Unless "UB" leads to actual bad behaviour, I personally don't necessarily
> care. I'm very scared that when you go beyond codec parameters, and you
> want to check for overflows all over the place, we'll never see the end of
> it...
> 

> I'd appreciate if others could chime in here, I don't want to carry this
> argument all by myself if nobody cares.

as you are asking for others oppinion
valid C code must not trigger undefined behavior

undefined behavior means anything can happen and this is not guranteed
to be limited to the value a operation produces.
if you write a*b the compiler can assume that a and b are small
enough for this not to overflow, it can use this to simplify checks
before and after the operation.
A silly example would be a*b/b != a  (with signed a and b) which any
good compiler should replace by 0

We have seen bugs from undefined behavior like strict aliassing
violations were the compiler assumed that things wouldnt be accessed
as they had a different type ...

I think we should fix all instances of UB.
If we find cases that we cannot fix without slowdown or without
making the code hard to maintain by the people who activly work on it
then iam not against making exceptions. But in my oppinion in
general UB should be fixed

[...]
Ronald S. Bultje Dec. 15, 2016, 8:57 p.m. UTC | #9
Hi,

On Thu, Dec 15, 2016 at 9:28 AM, Michael Niedermayer <michael@niedermayer.cc
> wrote:

> On Thu, Dec 15, 2016 at 08:02:52AM -0500, Ronald S. Bultje wrote:
> > Hi,
> >
> > On Wed, Dec 14, 2016 at 7:11 PM, Andreas Cadhalpun <
> > andreas.cadhalpun@googlemail.com> wrote:
> >
> > > On 14.12.2016 02:46, Ronald S. Bultje wrote:
> > > > Not wanting to discourage you, but I wonder if there's really a
> point to
> > > > this...?
> > >
> > > These patches are prerequisites for enforcing the validity of codec
> > > parameters [1].
> > >
> > > > I don't see how the user experience changes.
> > >
> > > Users won't see ffmpeg claiming nonsense bit rates like
> -1184293205235990
> > > kb/s
> > > anymore.
> >
> >
> > I don't think you understand my question.
> >
> > - does this belong in 4xm.c? (Or in generic code? Or in the app?)
> > - should it return an error? (Or just clip parameters? Or ignore the
> > invalid value?)
> >
> > > This isn't specifically intended at this patch, but rather at the sort
> of
> > > > rabbit hole this change might lead to,
> > >
> > > I have a pretty good map of this rabbit hole (i.e. lots of samples
> > > triggering
> > > UBSan errors) and one day I might try to dig it up, but for now I'm
> > > limiting
> > > myself to the codec parameters.
> >
> >
> > I'm not saying mplayer was great, but one of the principles I believe we
> > always copied from them was to try to play files to the best of our
> > abilities and not error out at the first convenience. This isn't just a
> > theoretical thing, a lot of people credited mplayer with playing utterly
> > broken AVI files that probably even ffmpeg rejects. What ffmpeg added on
> > top of that is to make a project maintainable by not being an utter
> > crapshoot.
> >
> > This isn't 4xm.c-specific, this is a general philosophical question:
> > - should we error out?
> > - should this be in generic code if we're likely to repeat such checks
> all
> > over the place?
> >
> > > which would cause the code to be uber-full of such checks, none of
> which
> > > > really have any significance. But maybe others disagree...
> > >
> > > Not relying on undefined behavior is a significant improvement. And
> doing
> > > these checks consequently where the values are set makes it possible
> > > for other code to rely on their validity without further checks.
> >
> >
> > Unless "UB" leads to actual bad behaviour, I personally don't necessarily
> > care. I'm very scared that when you go beyond codec parameters, and you
> > want to check for overflows all over the place, we'll never see the end
> of
> > it...
> >
>
> > I'd appreciate if others could chime in here, I don't want to carry this
> > argument all by myself if nobody cares.
>
> as you are asking for others oppinion
> valid C code must not trigger undefined behavior


So, I asked on IRC, here's 3 suggestions from Roger Combs:

- in case we want to be pedantic (I personally don't care, but I understand
other people are), it might make sense to just make these members
(channels, block_align, bitrate) unsigned. That removes the UB of the
overflow, and it fixes the negative number reporting in client apps for
bitrate, but you can still have positive crazy numbers and it doesn't
return an error.
- if for whatever reason some things cannot be done in generic code or by
changing the type (this should really cover most cases), and we want
specific overflow checks, then maybe we want to have some generic helper
macros that make them one-liners in decoders. This would return an error
along with fixing the UB.
- have overflow-safe multiplication functions instead of checking each
argument before doing a multiply, like __builtin_mul_overflow, and then
return INT64_MAX if it would overflow inside that function. This fixes
display of numbers in client applications and the UB, but without returning
an error.

What I want most - and this comment applies to all patches of this sort -
is to have something that we can all agree is OK for pretty much any
decoder out there without significant overhead in code (source - not
binary) size. This way, you have a template to work from and fix specific
instances without us having to argue over every single time you do a next
run with ubsan. I personally like suggestion (1), unsigned is a pretty good
type for things that shouldn't have negative values. WDYT?

Ronald
Michael Niedermayer Dec. 15, 2016, 11 p.m. UTC | #10
On Thu, Dec 15, 2016 at 03:57:57PM -0500, Ronald S. Bultje wrote:
> Hi,
> 
> On Thu, Dec 15, 2016 at 9:28 AM, Michael Niedermayer <michael@niedermayer.cc
> > wrote:
> 
> > On Thu, Dec 15, 2016 at 08:02:52AM -0500, Ronald S. Bultje wrote:
> > > Hi,
> > >
> > > On Wed, Dec 14, 2016 at 7:11 PM, Andreas Cadhalpun <
> > > andreas.cadhalpun@googlemail.com> wrote:
> > >
> > > > On 14.12.2016 02:46, Ronald S. Bultje wrote:
> > > > > Not wanting to discourage you, but I wonder if there's really a
> > point to
> > > > > this...?
> > > >
> > > > These patches are prerequisites for enforcing the validity of codec
> > > > parameters [1].
> > > >
> > > > > I don't see how the user experience changes.
> > > >
> > > > Users won't see ffmpeg claiming nonsense bit rates like
> > -1184293205235990
> > > > kb/s
> > > > anymore.
> > >
> > >
> > > I don't think you understand my question.
> > >
> > > - does this belong in 4xm.c? (Or in generic code? Or in the app?)
> > > - should it return an error? (Or just clip parameters? Or ignore the
> > > invalid value?)
> > >
> > > > This isn't specifically intended at this patch, but rather at the sort
> > of
> > > > > rabbit hole this change might lead to,
> > > >
> > > > I have a pretty good map of this rabbit hole (i.e. lots of samples
> > > > triggering
> > > > UBSan errors) and one day I might try to dig it up, but for now I'm
> > > > limiting
> > > > myself to the codec parameters.
> > >
> > >
> > > I'm not saying mplayer was great, but one of the principles I believe we
> > > always copied from them was to try to play files to the best of our
> > > abilities and not error out at the first convenience. This isn't just a
> > > theoretical thing, a lot of people credited mplayer with playing utterly
> > > broken AVI files that probably even ffmpeg rejects. What ffmpeg added on
> > > top of that is to make a project maintainable by not being an utter
> > > crapshoot.
> > >
> > > This isn't 4xm.c-specific, this is a general philosophical question:
> > > - should we error out?
> > > - should this be in generic code if we're likely to repeat such checks
> > all
> > > over the place?
> > >
> > > > which would cause the code to be uber-full of such checks, none of
> > which
> > > > > really have any significance. But maybe others disagree...
> > > >
> > > > Not relying on undefined behavior is a significant improvement. And
> > doing
> > > > these checks consequently where the values are set makes it possible
> > > > for other code to rely on their validity without further checks.
> > >
> > >
> > > Unless "UB" leads to actual bad behaviour, I personally don't necessarily
> > > care. I'm very scared that when you go beyond codec parameters, and you
> > > want to check for overflows all over the place, we'll never see the end
> > of
> > > it...
> > >
> >
> > > I'd appreciate if others could chime in here, I don't want to carry this
> > > argument all by myself if nobody cares.
> >
> > as you are asking for others oppinion
> > valid C code must not trigger undefined behavior
> 
> 
> So, I asked on IRC, here's 3 suggestions from Roger Combs:
> 
> - in case we want to be pedantic (I personally don't care, but I understand
> other people are), it might make sense to just make these members
> (channels, block_align, bitrate) unsigned. That removes the UB of the
> overflow, and it fixes the negative number reporting in client apps for
> bitrate, but you can still have positive crazy numbers and it doesn't
> return an error.
> - if for whatever reason some things cannot be done in generic code or by
> changing the type (this should really cover most cases), and we want
> specific overflow checks, then maybe we want to have some generic helper
> macros that make them one-liners in decoders. This would return an error
> along with fixing the UB.
> - have overflow-safe multiplication functions instead of checking each
> argument before doing a multiply, like __builtin_mul_overflow, and then
> return INT64_MAX if it would overflow inside that function. This fixes
> display of numbers in client applications and the UB, but without returning
> an error.
> 
> What I want most - and this comment applies to all patches of this sort -
> is to have something that we can all agree is OK for pretty much any
> decoder out there without significant overhead in code (source - not
> binary) size. This way, you have a template to work from and fix specific
> instances without us having to argue over every single time you do a next
> run with ubsan. I personally like suggestion (1), unsigned is a pretty good
> type for things that shouldn't have negative values. WDYT?

unsigned is not unproblematic.
If you use unsigned values, all computations touched are unsigned (
or a larger data type). If unsigned they cannot be negative. This
requires great care buffer end, index and start computations can become
invalid very easily by changing the signedness of a type involved.

also unsigned makes detecting overflows impossible with existing
tools i know of. So while now if a decoder simply gives bad output with
no clear hints as to why, one can use ubsan, asan, valgrind, ...
and it will show integer overflows if thats the cause. But with
unsigned values nothing would be shown in a case that a affected
computation overflow was involved.

IMO if you compute the bitrate out of channels, bits per sample and
sample rate.
its correct to check that the result is representable in the bitrate
field if prior checks on the inputs are not sufficient.

If computations involve for example array indexing or division
checks cant be skiped as they would crash. Should we treat integer
overflow differntly ?
IMO the professional thing to do is to set the bitrate correctly or
if it is not possible to (not set it / set it to "unknown" / set it
by other means)
setting it to some overflowed value even if it doesnt matter for
the functioning of the demuxer is at least ugly

Anyway, what i think is important really is to understand the
extend of this. Are we dealing with 100, 1000, 10000 individual issues?
i think if its 100 or rather 100 checks would fix all issues then
simply adding checks should not really be a problem.

Considering that iam fixing these kind of issues since a while and
people do not report astronomic numbers of such issues to me it may
be that this whole problem is not of a magnitude that requires
extreem meassures.
But i may be wrong i have not myself tried to analyze how many such
issues exist.

What i do know though is there are some of these integer overflows
in  DSP code with fuzzed samples, and i do not know of a solution
for the DSP case that i like. Its almost as if leaving the DSP code
as is with the overflows is least bad.

[...]
Andreas Cadhalpun Dec. 16, 2016, 1:21 a.m. UTC | #11
On 15.12.2016 14:02, Ronald S. Bultje wrote:
> On Wed, Dec 14, 2016 at 7:11 PM, Andreas Cadhalpun <
> andreas.cadhalpun@googlemail.com> wrote:
>> On 14.12.2016 02:46, Ronald S. Bultje wrote:
>>> Not wanting to discourage you, but I wonder if there's really a point to
>>> this...?
>>
>> These patches are prerequisites for enforcing the validity of codec
>> parameters [1].
>>
>>> I don't see how the user experience changes.
>>
>> Users won't see ffmpeg claiming nonsense bit rates like -1184293205235990
>> kb/s
>> anymore.
> 
> 
> I don't think you understand my question.

Maybe you just didn't understand my answer.

> - does this belong in 4xm.c? (Or in generic code? Or in the app?)

I think it belongs both in 4xm.c and generic code, as I have explained
in detail [1] in the discussion back then.

> - should it return an error? (Or just clip parameters? Or ignore the
> invalid value?)

This was also already discussed [2].

On 15.12.2016 21:57, Ronald S. Bultje wrote:
> So, I asked on IRC, here's 3 suggestions from Roger Combs:
> 
> - in case we want to be pedantic (I personally don't care, but I understand
> other people are), it might make sense to just make these members
> (channels, block_align, bitrate) unsigned. That removes the UB of the
> overflow, and it fixes the negative number reporting in client apps for
> bitrate, but you can still have positive crazy numbers and it doesn't
> return an error.

These are public fields, so changing them is not easy and more importantly
changing them from signed to unsigned is a very bad idea, as it will most
probably cause all kinds of subtle bugs for API users, e.g. when comparing,
subtracting, etc. and not expecting unsigned behavior.

> - if for whatever reason some things cannot be done in generic code or by
> changing the type (this should really cover most cases), and we want
> specific overflow checks, then maybe we want to have some generic helper
> macros that make them one-liners in decoders. This would return an error
> along with fixing the UB.

I don't think the number of overflow checks added justifies the additional
complexity of factoring things out. These checks are also subtly different,
so it's not easy to write a generic helper for that.
However, I plan to do this for the actually common cases when validating
codec parameters, like checking that a parameter is not negative.

> - have overflow-safe multiplication functions instead of checking each
> argument before doing a multiply, like __builtin_mul_overflow, and then
> return INT64_MAX if it would overflow inside that function. This fixes
> display of numbers in client applications and the UB, but without returning
> an error.

I think that would be over-engineered.

> What I want most - and this comment applies to all patches of this sort -
> is to have something that we can all agree is OK for pretty much any
> decoder out there without significant overhead in code (source - not
> binary) size. This way, you have a template to work from and fix specific
> instances without us having to argue over every single time you do a next
> run with ubsan.

UBSan is not only about overflows, undefined shifts are also quite common.
But I haven't really looked into these cases yet, so I don't know what kind
of template, if any, would make sense for them.

Best regards,
Andreas


1: https://ffmpeg.org/pipermail/ffmpeg-devel/2016-October/201742.html
2: https://ffmpeg.org/pipermail/ffmpeg-devel/2016-November/203257.html
Rodger Combs Dec. 16, 2016, 1:29 a.m. UTC | #12
> On Dec 15, 2016, at 19:21, Andreas Cadhalpun <andreas.cadhalpun@googlemail.com> wrote:
> 
> On 15.12.2016 14:02, Ronald S. Bultje wrote:
>> On Wed, Dec 14, 2016 at 7:11 PM, Andreas Cadhalpun <
>> andreas.cadhalpun@googlemail.com> wrote:
>>> On 14.12.2016 02:46, Ronald S. Bultje wrote:
>>>> Not wanting to discourage you, but I wonder if there's really a point to
>>>> this...?
>>> 
>>> These patches are prerequisites for enforcing the validity of codec
>>> parameters [1].
>>> 
>>>> I don't see how the user experience changes.
>>> 
>>> Users won't see ffmpeg claiming nonsense bit rates like -1184293205235990
>>> kb/s
>>> anymore.
>> 
>> 
>> I don't think you understand my question.
> 
> Maybe you just didn't understand my answer.
> 
>> - does this belong in 4xm.c? (Or in generic code? Or in the app?)
> 
> I think it belongs both in 4xm.c and generic code, as I have explained
> in detail [1] in the discussion back then.
> 
>> - should it return an error? (Or just clip parameters? Or ignore the
>> invalid value?)
> 
> This was also already discussed [2].
> 
> On 15.12.2016 21:57, Ronald S. Bultje wrote:
>> So, I asked on IRC, here's 3 suggestions from Roger Combs:
>> 
>> - in case we want to be pedantic (I personally don't care, but I understand
>> other people are), it might make sense to just make these members
>> (channels, block_align, bitrate) unsigned. That removes the UB of the
>> overflow, and it fixes the negative number reporting in client apps for
>> bitrate, but you can still have positive crazy numbers and it doesn't
>> return an error.
> 
> These are public fields, so changing them is not easy and more importantly
> changing them from signed to unsigned is a very bad idea, as it will most
> probably cause all kinds of subtle bugs for API users, e.g. when comparing,
> subtracting, etc. and not expecting unsigned behavior.
> 

Fair enough.

>> - if for whatever reason some things cannot be done in generic code or by
>> changing the type (this should really cover most cases), and we want
>> specific overflow checks, then maybe we want to have some generic helper
>> macros that make them one-liners in decoders. This would return an error
>> along with fixing the UB.
> 
> I don't think the number of overflow checks added justifies the additional
> complexity of factoring things out. These checks are also subtly different,
> so it's not easy to write a generic helper for that.
> However, I plan to do this for the actually common cases when validating
> codec parameters, like checking that a parameter is not negative.
> 

My proposal was for something like:
#define BAIL_ON_OVERFLOW(x) if (x) {av_log(avctx, AV_LOG_ERROR, "Overflow check failed: " #x); return AVERROR_INVALIDDATA;}
Which basically reduces the code overhead down to a simple one-liner. It's hard to get detailed error prints out of this, but if we're saying these cases are so unlikely (fuzzer-only?) that we're comfortable outright failing on them, the level of precision in the message probably doesn't matter much?

>> - have overflow-safe multiplication functions instead of checking each
>> argument before doing a multiply, like __builtin_mul_overflow, and then
>> return INT64_MAX if it would overflow inside that function. This fixes
>> display of numbers in client applications and the UB, but without returning
>> an error.
> 
> I think that would be over-engineered.
> 
>> What I want most - and this comment applies to all patches of this sort -
>> is to have something that we can all agree is OK for pretty much any
>> decoder out there without significant overhead in code (source - not
>> binary) size. This way, you have a template to work from and fix specific
>> instances without us having to argue over every single time you do a next
>> run with ubsan.
> 
> UBSan is not only about overflows, undefined shifts are also quite common.
> But I haven't really looked into these cases yet, so I don't know what kind
> of template, if any, would make sense for them.
> 
> Best regards,
> Andreas
> 
> 
> 1: https://ffmpeg.org/pipermail/ffmpeg-devel/2016-October/201742.html
> 2: https://ffmpeg.org/pipermail/ffmpeg-devel/2016-November/203257.html
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Andreas Cadhalpun Dec. 16, 2016, 2:31 a.m. UTC | #13
On 16.12.2016 02:29, Rodger Combs wrote: 
>> On Dec 15, 2016, at 19:21, Andreas Cadhalpun <andreas.cadhalpun@googlemail.com> wrote:
>> On 15.12.2016 14:02, Ronald S. Bultje wrote:
>>> - if for whatever reason some things cannot be done in generic code or by
>>> changing the type (this should really cover most cases), and we want
>>> specific overflow checks, then maybe we want to have some generic helper
>>> macros that make them one-liners in decoders. This would return an error
>>> along with fixing the UB.
>>
>> I don't think the number of overflow checks added justifies the additional
>> complexity of factoring things out. These checks are also subtly different,
>> so it's not easy to write a generic helper for that.
>> However, I plan to do this for the actually common cases when validating
>> codec parameters, like checking that a parameter is not negative.
>>
> 
> My proposal was for something like:
> #define BAIL_ON_OVERFLOW(x) if (x) {av_log(avctx, AV_LOG_ERROR, "Overflow check failed: " #x); return AVERROR_INVALIDDATA;}
> Which basically reduces the code overhead down to a simple one-liner.

Yeah, that's similar to how I plan to handle the more common cases.

> It's hard to get detailed error prints out of this, but if we're saying these
> cases are so unlikely (fuzzer-only?) that we're comfortable outright failing
> on them, the level of precision in the message probably doesn't matter much?

Agreed, so I've updated the patch series using this approach.

Best regards,
Andreas
diff mbox

Patch

diff --git a/libavformat/4xm.c b/libavformat/4xm.c
index 8a50778..2758b69 100644
--- a/libavformat/4xm.c
+++ b/libavformat/4xm.c
@@ -163,6 +163,12 @@  static int parse_strk(AVFormatContext *s,
         return AVERROR_INVALIDDATA;
     }
 
+    if (fourxm->tracks[track].sample_rate > INT64_MAX / fourxm->tracks[track].bits / fourxm->tracks[track].channels) {
+        av_log(s, AV_LOG_ERROR, "Overflow during bit rate calculation %d * %d * %d\n",
+               fourxm->tracks[track].sample_rate, fourxm->tracks[track].bits, fourxm->tracks[track].channels);
+        return AVERROR_INVALIDDATA;
+    }
+
     /* allocate a new AVStream */
     st = avformat_new_stream(s, NULL);
     if (!st)
@@ -178,7 +184,7 @@  static int parse_strk(AVFormatContext *s,
     st->codecpar->channels              = fourxm->tracks[track].channels;
     st->codecpar->sample_rate           = fourxm->tracks[track].sample_rate;
     st->codecpar->bits_per_coded_sample = fourxm->tracks[track].bits;
-    st->codecpar->bit_rate              = st->codecpar->channels *
+    st->codecpar->bit_rate              = (int64_t)st->codecpar->channels *
                                           st->codecpar->sample_rate *
                                           st->codecpar->bits_per_coded_sample;
     st->codecpar->block_align           = st->codecpar->channels *