Message ID: 20190909201621.14091-2-michael@niedermayer.cc
State:      Accepted
Commit:     61dd2e07be7ca636e1d3d868f90dde1b10985f4c
Mon 2019-09-09 at 22:16 +0200, Michael Niedermayer wrote:
> Fixes: Timeout (100sec -> 0.7sec)
> Fixes: 8668/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_G2M_fuzzer-5174143888130048
>
> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavcodec/g2meet.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/libavcodec/g2meet.c b/libavcodec/g2meet.c
> index 19e1c130ce..731d29a5d4 100644
> --- a/libavcodec/g2meet.c
> +++ b/libavcodec/g2meet.c
> @@ -244,6 +244,9 @@ static int jpg_decode_block(JPGContext *c, GetBitContext *gb,
>      const int is_chroma = !!plane;
>      const uint8_t *qmat = is_chroma ? chroma_quant : luma_quant;
>
> +    if (get_bits_left(gb) < 1)
> +        return AVERROR_INVALIDDATA;
> +
>      c->bdsp.clear_block(block);
>      dc = get_vlc2(gb, c->dc_vlc[is_chroma].table, 9, 3);

Why doesn't the VLC stuff have EOF handling? There's bound to be a
metric bajillion of cases like this strewn across the codebase..

/Tomas
On Mon, Sep 09, 2019 at 11:04:32PM +0200, Tomas Härdin wrote:
> Mon 2019-09-09 at 22:16 +0200, Michael Niedermayer wrote:
> > Fixes: Timeout (100sec -> 0.7sec)
> > Fixes: 8668/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_G2M_fuzzer-5174143888130048
[...]
> > +    if (get_bits_left(gb) < 1)
> > +        return AVERROR_INVALIDDATA;
> > +
> >      c->bdsp.clear_block(block);
> >      dc = get_vlc2(gb, c->dc_vlc[is_chroma].table, 9, 3);
>
> Why doesn't the VLC stuff have EOF handling?

Because it doesn't need it in most cases, and the get_vlc code is
quite speed-critical in some codecs. It would also add an error
return to cases that previously could never fail, which would require
callers to be changed to check for this error in some cases.

> There's bound to be a
> metric bajillion of cases like this strewn across the codebase..

If that were the case, the fuzzer would likely have found more cases.

thx

[...]
Tue 2019-09-10 at 16:16 +0200, Michael Niedermayer wrote:
> On Mon, Sep 09, 2019 at 11:04:32PM +0200, Tomas Härdin wrote:
> [...]
> > Why doesn't the VLC stuff have EOF handling?
>
> Because it doesn't need it in most cases, and the get_vlc code is
> quite speed-critical in some codecs. It would also add an error
> return to cases that previously could never fail, which would
> require callers to be changed to check for this error in some cases.
>
> > There's bound to be a
> > metric bajillion of cases like this strewn across the codebase..
>
> If that were the case, the fuzzer would likely have found more cases.

That's only a matter of time. For some formats it may even be
impossible to know whether there are parser issues lurking in there,
even if you employ formal verification.

I've said multiple times that worrying about these timeout things is
mostly a waste of time, since any serious user will know to put time
limits on jobs. Resources would be better spent gearing the fuzzing
toward finding memory corruption issues, since the harm from those is
far more serious.

/Tomas
On Wed, Sep 11, 2019 at 11:18:47PM +0200, Tomas Härdin wrote:
> Tue 2019-09-10 at 16:16 +0200, Michael Niedermayer wrote:
> [...]
> > > There's bound to be a
> > > metric bajillion of cases like this strewn across the codebase..
> >
> > If that were the case, the fuzzer would likely have found more
> > cases.
>
> That's only a matter of time.

There is something wrong with this argument, because one could say it
about anything that is not static, no matter whether it is true or
false. And there's no practical way to disprove it even if one
waited, because maybe one just didn't wait long enough ...

[...]

> I've said multiple times that worrying about these timeout things is
> mostly a waste of time, since any serious user will know to put time
> limits on jobs.

Everyone probably has time limits on jobs, but these timeouts are
still a big problem. And I think this was discussed before ...

Just think about what timeout to use for each case:
A. a web browser loading image, video and audio files
B. an online video encoding service
C. ...
I think you will find that there is no timeout that works well.
You just do not want 50-byte files, or files with the resolution of
an icon, to take days to decode.

> Resources would be better spent gearing the fuzzing
> toward finding memory corruption issues, since the harm from those
> is far more serious.

Then fixing the timeouts would be a priority, as they hold the fuzzer
up from finding other issues. Time spent waiting for a timeout is
time the fuzzer cannot spend searching for other issues.

Thanks

[...]
Thu 2019-09-12 at 00:21 +0200, Michael Niedermayer wrote:
> On Wed, Sep 11, 2019 at 11:18:47PM +0200, Tomas Härdin wrote:
> [...]
> > > If that were the case, the fuzzer would likely have found more
> > > cases.
> >
> > That's only a matter of time.
>
> There is something wrong with this argument, because one could say
> it about anything that is not static, no matter whether it is true
> or false. And there's no practical way to disprove it even if one
> waited, because maybe one just didn't wait long enough ...

You can actually prove this in many cases, which is something I've
been on about for a while now. The exception is formats that approach
Turing completeness, for obvious reasons. Even context-sensitive
formats can be tricky. The common case of nested length fields can
take O(N²) to parse and validate, if I understand the literature
correctly. Pointer-based formats certainly do, like ZIP.

That is of course not really something we can do anything about;
formats are a given.

> [...]
> I think if you just think about what timeout to use for each case:
> A. a web browser loading image, video and audio files

Presumably browser devs know how to use TBB and friends. They also
don't use g2meet, or cinepak, or anything else that isn't H.26* or
VP* etc. The closest thing is GIF.

> > Resources would be better spent gearing the fuzzing
> > toward finding memory corruption issues, since the harm from those
> > is far more serious.
>
> Then fixing the timeouts would be a priority, as they hold the
> fuzzer up from finding other issues. Time spent waiting for a
> timeout is time the fuzzer cannot spend searching for other issues.

I see this more as a fuzzer tuning thing. When I last did fuzzing
with afl I certainly made sure to give it tiny samples and not a lot
of time per round.

Question: is the fuzzer really allowed to spend 120 seconds on a test
case like this one? Or is that timing just an after-the-fact thing?

/Tomas
On Thu, Sep 12, 2019 at 00:22, Michael Niedermayer
<michael@niedermayer.cc> wrote:

> I think you will find that there is no timeout that works well.
> You just do not want 50-byte files, or files with the resolution
> of an icon, to take days to decode.

+1!

Carl Eugen
On Thu, Sep 12, 2019 at 11:09:16PM +0200, Tomas Härdin wrote:
> Thu 2019-09-12 at 00:21 +0200, Michael Niedermayer wrote:
> [...]
> I see this more as a fuzzer tuning thing. When I last did fuzzing
> with afl I certainly made sure to give it tiny samples and not a lot
> of time per round.
>
> Question: is the fuzzer really allowed to spend 120 seconds on a
> test case like this one? Or is that timing just an after-the-fact
> thing?

The fuzzer has a timeout of 25 seconds, IIRC. That's on the machine
Google runs it on, so local times (which is what would be listed in a
patch) will always differ a bit.

So what shall we do about these 2 patches here?
OK to push, or do you want me to do something else?

Thanks

[...]
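For local reproduction, OSS-Fuzz harnesses are libFuzzer binaries, so the per-input time limit is a command-line flag; a sketch (the binary name is taken from the test-case ID above, and the exact limit values are assumptions):

```sh
# Run one test case with libFuzzer's per-input time limit (seconds)
# and RSS cap; 25 s matches the limit mentioned above.
./ffmpeg_AV_CODEC_ID_G2M_fuzzer -timeout=25 -rss_limit_mb=2560 testcase
```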
Sat 2019-09-28 at 17:47 +0200, Michael Niedermayer wrote:
> On Thu, Sep 12, 2019 at 11:09:16PM +0200, Tomas Härdin wrote:
> [...]
> > Question: is the fuzzer really allowed to spend 120 seconds on a
> > test case like this one? Or is that timing just an after-the-fact
> > thing?
>
> The fuzzer has a timeout of 25 seconds, IIRC. That's on the machine
> Google runs it on, so local times (which is what would be listed in
> a patch) will always differ a bit.

OK, that sounds a bit more reasonable.

> So what shall we do about these 2 patches here?
> OK to push, or do you want me to do something else?

Nah, go ahead. They don't seem to hurt, beyond being a few more lines
of code.

/Tomas
On Mon, Sep 30, 2019 at 08:07:30PM +0200, Tomas Härdin wrote:
> Sat 2019-09-28 at 17:47 +0200, Michael Niedermayer wrote:
> [...]
> > So what shall we do about these 2 patches here?
> > OK to push, or do you want me to do something else?
>
> Nah, go ahead. They don't seem to hurt, beyond being a few more
> lines of code.

OK, will apply.

Thanks

[...]
Fixes: Timeout (100sec -> 0.7sec)
Fixes: 8668/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_G2M_fuzzer-5174143888130048

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
---
 libavcodec/g2meet.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libavcodec/g2meet.c b/libavcodec/g2meet.c
index 19e1c130ce..731d29a5d4 100644
--- a/libavcodec/g2meet.c
+++ b/libavcodec/g2meet.c
@@ -244,6 +244,9 @@ static int jpg_decode_block(JPGContext *c, GetBitContext *gb,
     const int is_chroma = !!plane;
     const uint8_t *qmat = is_chroma ? chroma_quant : luma_quant;
 
+    if (get_bits_left(gb) < 1)
+        return AVERROR_INVALIDDATA;
+
     c->bdsp.clear_block(block);
     dc = get_vlc2(gb, c->dc_vlc[is_chroma].table, 9, 3);
     if (dc < 0)