Message ID: 20200710134803.4347-1-andreas.rheinhardt@gmail.com
State:      Superseded
Series:     [FFmpeg-devel] avcodec/golomb: Prevent shift by negative number

Checks:
  andriy/default    pending
  andriy/make       success  Make finished
  andriy/make_fate  success  Make fate finished
Jul 10, 2020, 14:48 by andreas.rheinhardt@gmail.com:

> This happened in get_ue_golomb() if the cached bitstream reader was in
> use, because there was no check to handle the case of the read value
> not being in the range 0..8190.
>
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
>  libavcodec/golomb.h | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
> index 7fd46a91bd..5bfcfe085f 100644
> --- a/libavcodec/golomb.h
> +++ b/libavcodec/golomb.h
> @@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
>              return ff_ue_golomb_vlc_code[buf];
>          } else {
>              int log = 2 * av_log2(buf) - 31;
> +            if (log < 0) {
> +                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
> +                return AVERROR_INVALIDDATA;
> +            }
>              buf >>= log;
>              buf--;
>              skip_bits_long(gb, 32 - log);

That's in an extremely hot path. Any alternatives?
On 10/07/2020 15:48, Lynne wrote:
> That's in an extremely hot path. Any alternatives?
Another thing of note is that basically nothing in the entire
codebase checks the return value of this function. Possibly
literally nothing.
- Derek
Lynne:
> Jul 10, 2020, 14:48 by andreas.rheinhardt@gmail.com:
>
>> This happened in get_ue_golomb() if the cached bitstream reader was in
>> use, because there was no check to handle the case of the read value
>> not being in the range 0..8190.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>> ---
>>  libavcodec/golomb.h | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
>> index 7fd46a91bd..5bfcfe085f 100644
>> --- a/libavcodec/golomb.h
>> +++ b/libavcodec/golomb.h
>> @@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
>>              return ff_ue_golomb_vlc_code[buf];
>>          } else {
>>              int log = 2 * av_log2(buf) - 31;
>> +            if (log < 0) {
>> +                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
>> +                return AVERROR_INVALIDDATA;
>> +            }
>>              buf >>= log;
>>              buf--;
>>              skip_bits_long(gb, 32 - log);
>
> That's in an extremely hot path. Any alternatives?

If I knew a better solution, I'd have implemented it; I would also have
done the same to the codepath for the noncached bitstream reader (which
is even hotter). Notice that I actually only copied the check from the
noncached codepath (where one even has to check for log < 7, because one
might only have 25 valid bits).

There is one thing that is improvable, though: the av_log(). In
situations where the return value is checked and a dedicated error
message emitted, it is unnecessary to have an av_log() here (in
particular one that logs to NULL).

- Andreas
fre 2020-07-10 klockan 15:48 +0200 skrev Andreas Rheinhardt:
> This happened in get_ue_golomb() if the cached bitstream reader was in
> use, because there was no check to handle the case of the read value
> not being in the range 0..8190.
>
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
> ---
>  libavcodec/golomb.h | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
> index 7fd46a91bd..5bfcfe085f 100644
> --- a/libavcodec/golomb.h
> +++ b/libavcodec/golomb.h
> @@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
>              return ff_ue_golomb_vlc_code[buf];
>          } else {
>              int log = 2 * av_log2(buf) - 31;
> +            if (log < 0) {
> +                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
> +                return AVERROR_INVALIDDATA;
> +            }
>              buf >>= log;
>              buf--;
>              skip_bits_long(gb, 32 - log);

How come invalid codes can even be present like this? Seems
inefficient. Or is the decoder wrong? Maybe Michael wants to chime in
here, since he wrote the original implementation.

Reading a bit about Exp-Golomb, it seems counting trailing zeroes would
be the way to go. There's even a builtin for it, __builtin_ctz().
Alternatively, a table-driven approach could be used.

/Tomas
Tomas Härdin:
> fre 2020-07-10 klockan 15:48 +0200 skrev Andreas Rheinhardt:
>> This happened in get_ue_golomb() if the cached bitstream reader was in
>> use, because there was no check to handle the case of the read value
>> not being in the range 0..8190.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>> ---
>>  libavcodec/golomb.h | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
>> index 7fd46a91bd..5bfcfe085f 100644
>> --- a/libavcodec/golomb.h
>> +++ b/libavcodec/golomb.h
>> @@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
>>              return ff_ue_golomb_vlc_code[buf];
>>          } else {
>>              int log = 2 * av_log2(buf) - 31;
>> +            if (log < 0) {
>> +                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
>> +                return AVERROR_INVALIDDATA;
>> +            }
>>              buf >>= log;
>>              buf--;
>>              skip_bits_long(gb, 32 - log);
>
> How come invalid codes can even be present like this? Seems
> inefficient. Or is the decoder wrong? Maybe Michael wants to chime in
> here, since he wrote the original implementation.

This function is only supposed to be used when the expected value of the
exp-golomb code is in the range 0..8190 (or, in other words: if there
are not more than 12 leading zeroes). And just because a specification
says that a value must be smaller than that does not mean that it is
actually fulfilled in real files. (Btw: the cached bitstream reader
version of this function can actually handle 16 leading zeroes (and this
patch only errors out if there are more); the limit of 12 is imposed
because the noncached bitstream reader can't handle more.)

> Reading a bit about Exp-Golomb, it seems counting trailing zeroes would
> be the way to go. There's even a builtin for it, __builtin_ctz().
> Alternatively, a table-driven approach could be used.
Here is the code in question for the cached bitstream reader:

    buf = show_bits_long(gb, 32);

    if (buf >= (1 << 27)) {
        buf >>= 32 - 9;
        skip_bits_long(gb, ff_golomb_vlc_len[buf]);

        return ff_ue_golomb_vlc_code[buf];
    } else {
        int log = 2 * av_log2(buf) - 31;
        buf >>= log;
        buf--;
        skip_bits_long(gb, 32 - log);

        return buf;
    }

So you see, we are already using a table-driven approach for small
values (namely for those values where the number of leading zeroes is
<= 4), and if we are outside of that range, an approach using av_log2()
(which on supported systems is implemented using __builtin_clz() (not
trailing zeroes!)) is used.

- Andreas
On Mon, Jul 13, 2020 at 09:04:30PM +0200, Tomas Härdin wrote:
> fre 2020-07-10 klockan 15:48 +0200 skrev Andreas Rheinhardt:
>> This happened in get_ue_golomb() if the cached bitstream reader was in
>> use, because there was no check to handle the case of the read value
>> not being in the range 0..8190.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>> ---
>>  libavcodec/golomb.h | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
>> index 7fd46a91bd..5bfcfe085f 100644
>> --- a/libavcodec/golomb.h
>> +++ b/libavcodec/golomb.h
>> @@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
>>              return ff_ue_golomb_vlc_code[buf];
>>          } else {
>>              int log = 2 * av_log2(buf) - 31;
>> +            if (log < 0) {
>> +                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
>> +                return AVERROR_INVALIDDATA;
>> +            }
>
> How come invalid codes can even be present like this? Seems
> inefficient. Or is the decoder wrong? Maybe Michael wants to chime in
> here, since he wrote the original implementation.

The codepath of the noncached one does a check too. It should be done
consistently.

If the check causes a measurable speedloss, then it can be implemented
without a check, for example by using asm for the shift operation, or
maybe compilers have some nicer function doing a never-undefined shift.
If such a function does not exist, it's something that maybe someone
should add to gcc & clang, because other projects could benefit from it
too in similar situations.

Of course, all such improvements make the code more complex, as a
fallback with a check is needed if none of the nonstandard "safe"
functions exist. So it only makes sense to go that way if this actually
affects speed. It is, after all, not on the main codepath, which is
handled with a table.

thx

[...]
Michael Niedermayer:
> On Mon, Jul 13, 2020 at 09:04:30PM +0200, Tomas Härdin wrote:
>> fre 2020-07-10 klockan 15:48 +0200 skrev Andreas Rheinhardt:
>>> This happened in get_ue_golomb() if the cached bitstream reader was in
>>> use, because there was no check to handle the case of the read value
>>> not being in the range 0..8190.
>>>
>>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>>> ---
>>>  libavcodec/golomb.h | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
>>> index 7fd46a91bd..5bfcfe085f 100644
>>> --- a/libavcodec/golomb.h
>>> +++ b/libavcodec/golomb.h
>>> @@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
>>>              return ff_ue_golomb_vlc_code[buf];
>>>          } else {
>>>              int log = 2 * av_log2(buf) - 31;
>>> +            if (log < 0) {
>>> +                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
>>> +                return AVERROR_INVALIDDATA;
>>> +            }
>>
>> How come invalid codes can even be present like this? Seems
>> inefficient. Or is the decoder wrong? Maybe Michael wants to chime in
>> here, since he wrote the original implementation.
>
> The codepath of the noncached one does a check too. It should be
> done consistently.

Does this imply that you want me to error out for log < 7, although the
cached bitstream reader is able to read the number even when log is in
the range of 0..6? (The reason for this is that one only gets 25 valid
bits in the noncached version.)

> If the check causes a measurable speedloss, then it can be implemented
> without a check, for example by using asm for the shift operation, or
> maybe compilers have some nicer function doing a never-undefined
> shift. If such a function does not exist, it's something that maybe
> someone should add to gcc & clang, because other projects could
> benefit from it too in similar situations.

22e960ad478e568f4094971a58c6ad8f549c0180 (which enabled the check and
added the av_log() in the noncached branch) claims no measurable
speedloss. I have not found any such compiler builtin function for this
purpose; neither GCC nor Clang have one. Notice that if we had such an
always-defined shift, we could use it to make put_bits() write 0-32
bits at once, if this always-defined shift had the property that
0U << 32 equals 0.

- Andreas
On Tue, Jul 14, 2020 at 06:10:39PM +0200, Andreas Rheinhardt wrote:
> Michael Niedermayer:
>> On Mon, Jul 13, 2020 at 09:04:30PM +0200, Tomas Härdin wrote:
>>> fre 2020-07-10 klockan 15:48 +0200 skrev Andreas Rheinhardt:
>>>> This happened in get_ue_golomb() if the cached bitstream reader was
>>>> in use, because there was no check to handle the case of the read
>>>> value not being in the range 0..8190.
>>>>
>>>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>>>> ---
>>>>  libavcodec/golomb.h | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
>>>> index 7fd46a91bd..5bfcfe085f 100644
>>>> --- a/libavcodec/golomb.h
>>>> +++ b/libavcodec/golomb.h
>>>> @@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
>>>>              return ff_ue_golomb_vlc_code[buf];
>>>>          } else {
>>>>              int log = 2 * av_log2(buf) - 31;
>>>> +            if (log < 0) {
>>>> +                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
>>>> +                return AVERROR_INVALIDDATA;
>>>> +            }
>>>
>>> How come invalid codes can even be present like this? Seems
>>> inefficient. Or is the decoder wrong? Maybe Michael wants to chime in
>>> here, since he wrote the original implementation.
>>
>> The codepath of the noncached one does a check too. It should be
>> done consistently.
>
> Does this imply that you want me to error out for log < 7, although the
> cached bitstream reader is able to read the number even when log is in
> the range of 0..6? (The reason for this is that one only gets 25 valid
> bits in the noncached version.)

Is it intended to be valid for this function to be used for longer
codes? If not, it should fail, because otherwise a bug in the code
would be missed. Also, it would result in fewer differences and easier
testing, as both would behave the same.

thx

[...]
Michael Niedermayer:
> On Tue, Jul 14, 2020 at 06:10:39PM +0200, Andreas Rheinhardt wrote:
>> Michael Niedermayer:
>>> On Mon, Jul 13, 2020 at 09:04:30PM +0200, Tomas Härdin wrote:
>>>> fre 2020-07-10 klockan 15:48 +0200 skrev Andreas Rheinhardt:
>>>>> This happened in get_ue_golomb() if the cached bitstream reader was
>>>>> in use, because there was no check to handle the case of the read
>>>>> value not being in the range 0..8190.
>>>>>
>>>>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
>>>>> ---
>>>>>  libavcodec/golomb.h | 4 ++++
>>>>>  1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
>>>>> index 7fd46a91bd..5bfcfe085f 100644
>>>>> --- a/libavcodec/golomb.h
>>>>> +++ b/libavcodec/golomb.h
>>>>> @@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
>>>>>              return ff_ue_golomb_vlc_code[buf];
>>>>>          } else {
>>>>>              int log = 2 * av_log2(buf) - 31;
>>>>> +            if (log < 0) {
>>>>> +                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
>>>>> +                return AVERROR_INVALIDDATA;
>>>>> +            }
>>>>
>>>> How come invalid codes can even be present like this? Seems
>>>> inefficient. Or is the decoder wrong? Maybe Michael wants to chime in
>>>> here, since he wrote the original implementation.
>>>
>>> The codepath of the noncached one does a check too. It should be
>>> done consistently.
>>
>> Does this imply that you want me to error out for log < 7, although
>> the cached bitstream reader is able to read the number even when log
>> is in the range of 0..6? (The reason for this is that one only gets
>> 25 valid bits in the noncached version.)
>
> Is it intended to be valid for this function to be used for longer
> codes? If not, it should fail, because otherwise a bug in the code
> would be missed. Also, it would result in fewer differences and easier
> testing, as both would behave the same.

Ok, will change locally.

- Andreas
diff --git a/libavcodec/golomb.h b/libavcodec/golomb.h
index 7fd46a91bd..5bfcfe085f 100644
--- a/libavcodec/golomb.h
+++ b/libavcodec/golomb.h
@@ -66,6 +66,10 @@ static inline int get_ue_golomb(GetBitContext *gb)
             return ff_ue_golomb_vlc_code[buf];
         } else {
             int log = 2 * av_log2(buf) - 31;
+            if (log < 0) {
+                av_log(NULL, AV_LOG_ERROR, "Invalid UE golomb code\n");
+                return AVERROR_INVALIDDATA;
+            }
             buf >>= log;
             buf--;
             skip_bits_long(gb, 32 - log);
This happened in get_ue_golomb() if the cached bitstream reader was in
use, because there was no check to handle the case of the read value
not being in the range 0..8190.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
---
 libavcodec/golomb.h | 4 ++++
 1 file changed, 4 insertions(+)