diff mbox

[FFmpeg-devel] Fix off-by-few crasher in ff_h2645_extract_rbsp function

Message ID 20170306145151.19939-1-mkrasowski@opera.com
State New
Headers show

Commit Message

Michał Krasowski March 6, 2017, 2:51 p.m. UTC
It seems that the loop tried to access the memory regions
beyond allocation, what caused crashes in not-so-rare cases, when
the memory read did not belong to current process.

This change is fixing the out-of-bounds read problem.
Compiling this function with -fsanitize=address and running doesn't
result in sanitizer warning as before.
---
 libavcodec/h2645_parse.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Michael Niedermayer March 6, 2017, 7:53 p.m. UTC | #1
On Mon, Mar 06, 2017 at 03:51:51PM +0100, Michał Krasowski wrote:
> It seems that the loop tried to access the memory regions
> beyond allocation, what caused crashes in not-so-rare cases, when
> the memory read did not belong to current process.
> 
> This change is fixing the out-of-bounds read problem.
> Compiling this function with -fsanitize=address and running doesn't
> result in sanitizer warning as before.
> ---
>  libavcodec/h2645_parse.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

have you seen/read the documentation for AV_INPUT_BUFFER_PADDING_SIZE
?

if not, that may be the cause of the issues you see


[...]
Michał Krasowski March 7, 2017, 10:55 a.m. UTC | #2
@Michael Niedermayer
I have read the documentation, namely

/**
 * @ingroup lavc_decoding
 * Required number of additionally allocated bytes at the end of the input
bitstream for decoding.
 * This is mainly needed because some optimized bitstream readers read
 * 32 or 64 bit at once and could read over the end.<br>
 * Note: If the first 23 bits of the additional bytes are not 0, then
damaged
 * MPEG bitstreams could cause overread and segfault.
 */
#define AV_INPUT_BUFFER_PADDING_SIZE 32

and now it seems to me that you prefer speed (a.k.a. optimization)
over having a self-contained functions.

There are few things that are still not clear to me:
* Why is the 32 bit padding used when the doc says that
  64 bit offset may also be needed?
* Even if I extend my data buffer
  to have those 4 bytes more, is there a risk that some functions
  in ffmpeg will read out-of-bounds?
* How to find such information without reading all bolts and nuts of ffmpeg
source?


On Mon, Mar 6, 2017 at 8:53 PM, Michael Niedermayer <michael@niedermayer.cc>
wrote:

> On Mon, Mar 06, 2017 at 03:51:51PM +0100, Michał Krasowski wrote:
> > It seems that the loop tried to access the memory regions
> > beyond allocation, what caused crashes in not-so-rare cases, when
> > the memory read did not belong to current process.
> >
> > This change is fixing the out-of-bounds read problem.
> > Compiling this function with -fsanitize=address and running doesn't
> > result in sanitizer warning as before.
> > ---
> >  libavcodec/h2645_parse.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
>
> have you seen/read the documentation for AV_INPUT_BUFFER_PADDING_SIZE
> ?
>
> if not, that may be the cause of the issues you see
>
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Avoid a single point of failure, be that a person or equipment.
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>
Hendrik Leppkes March 7, 2017, 11:05 a.m. UTC | #3
On Tue, Mar 7, 2017 at 11:55 AM, Michał Krasowski <mkrasowski@opera.com> wrote:
> @Michael Niedermayer
> I have read the documentation, namely
>
> /**
>  * @ingroup lavc_decoding
>  * Required number of additionally allocated bytes at the end of the input
> bitstream for decoding.
>  * This is mainly needed because some optimized bitstream readers read
>  * 32 or 64 bit at once and could read over the end.<br>
>  * Note: If the first 23 bits of the additional bytes are not 0, then
> damaged
>  * MPEG bitstreams could cause overread and segfault.
>  */
> #define AV_INPUT_BUFFER_PADDING_SIZE 32
>
> and now it seems to me that you prefer speed (a.k.a. optimization)
> over having a self-contained functions.
>
> There are few things that are still not clear to me:
> * Why is the 32 bit padding used when the doc says that
>   64 bit offset may also be needed?
> * Even if I extend my data buffer
>   to have those 4 bytes more, is there a risk that some functions
>   in ffmpeg will read out-of-bounds?

The padding is 32 *bytes*, which is plenty for all optimized bitstream
readers we use.
Just make sure those extra bytes are zeroed to avoid any issues.

> * How to find such information without reading all bolts and nuts of ffmpeg
> source?
>

You don't need to read everything, but the main header would
definitely be benefifical.
Feel free to recommend changes to the documentation if you think
something crucial could be clearer (ideally right in patch form!)

- Hendrik
wm4 March 7, 2017, 11:15 a.m. UTC | #4
On Tue, 7 Mar 2017 11:55:23 +0100
Michał Krasowski <mkrasowski@opera.com> wrote:

> @Michael Niedermayer
> I have read the documentation, namely
> 
> /**
>  * @ingroup lavc_decoding
>  * Required number of additionally allocated bytes at the end of the input
> bitstream for decoding.
>  * This is mainly needed because some optimized bitstream readers read
>  * 32 or 64 bit at once and could read over the end.<br>
>  * Note: If the first 23 bits of the additional bytes are not 0, then
> damaged
>  * MPEG bitstreams could cause overread and segfault.

I think the MPEG comment is misleading - this applies to all codecs.
Also, as someone said, the padding is in bytes, but these comments
above make it confusing by talking about bits.

>  */
> #define AV_INPUT_BUFFER_PADDING_SIZE 32
> 
> and now it seems to me that you prefer speed (a.k.a. optimization)
> over having a self-contained functions.

Unfortunately.

> There are few things that are still not clear to me:
> * Why is the 32 bit padding used when the doc says that
>   64 bit offset may also be needed?
> * Even if I extend my data buffer
>   to have those 4 bytes more, is there a risk that some functions
>   in ffmpeg will read out-of-bounds?
> * How to find such information without reading all bolts and nuts of ffmpeg
> source?

I sort of agree. Normally, the AVPacket functions guarantee the
padding. So you "only" need to be careful when you setup an AVPacket
manually.

I wonder if we could handle this transparently by checking the
AVPacket.buf allocation. We don't really support non-refcounted input
with the new decode/encode APIs anyway, so this would probably be
possible. Of course it'd mean that the data gets copied if the user
doesn't supply the padding.
Carl Eugen Hoyos March 7, 2017, 11:18 a.m. UTC | #5
2017-03-07 11:55 GMT+01:00 Michał Krasowski <mkrasowski@opera.com>:
> @Michael Niedermayer
> I have read the documentation, namely
>
> /**
>  * @ingroup lavc_decoding
>  * Required number of additionally allocated bytes at the end of the input
> bitstream for decoding.
>  * This is mainly needed because some optimized bitstream readers read
>  * 32 or 64 bit at once and could read over the end.<br>
>  * Note: If the first 23 bits of the additional bytes are not 0, then
> damaged
>  * MPEG bitstreams could cause overread and segfault.
>  */
> #define AV_INPUT_BUFFER_PADDING_SIZE 32
>
> and now it seems to me that you prefer speed (a.k.a. optimization)
> over having a self-contained functions.

Luckily:
Video players would be unusable without optimizations.

> There are few things that are still not clear to me:
> * Why is the 32 bit padding used when the doc says that
>   64 bit offset may also be needed?

I don't understand your question but you may want to
send an update for this sentence.

> * Even if I extend my data buffer
>   to have those 4 bytes more, is there a risk that some functions
>   in ffmpeg will read out-of-bounds?

Definitely: Bugs are possible.

[...]

> This email may contain Opera confidential information

Please remove this, Carl Eugen
Michael Niedermayer March 7, 2017, 11:21 a.m. UTC | #6
On Tue, Mar 07, 2017 at 11:55:23AM +0100, Michał Krasowski wrote:
> @Michael Niedermayer
> I have read the documentation, namely
> 
> /**
>  * @ingroup lavc_decoding
>  * Required number of additionally allocated bytes at the end of the input
> bitstream for decoding.
>  * This is mainly needed because some optimized bitstream readers read
>  * 32 or 64 bit at once and could read over the end.<br>
>  * Note: If the first 23 bits of the additional bytes are not 0, then
> damaged
>  * MPEG bitstreams could cause overread and segfault.
>  */
> #define AV_INPUT_BUFFER_PADDING_SIZE 32
> 
> and now it seems to me that you prefer speed (a.k.a. optimization)
> over having a self-contained functions.
> 
> There are few things that are still not clear to me:
> * Why is the 32 bit padding used when the doc says that
>   64 bit offset may also be needed?
> * Even if I extend my data buffer
>   to have those 4 bytes more, is there a risk that some functions
>   in ffmpeg will read out-of-bounds?

> * How to find such information without reading all bolts and nuts of ffmpeg
> source?

Where did you look?
Or said differently where would adding a pointer have helped you to
find this sooner ?
A patch adding such a pointer there is very likely welcome

[...]
Michał Krasowski March 7, 2017, 11:30 a.m. UTC | #7
>> This email may contain Opera confidential information
>
>Please remove this, Carl Eugen
Ah, yes, default footer, sorry.

>> There are few things that are still not clear to me:
>> * Why is the 32 bit padding used when the doc says that
>>   64 bit offset may also be needed?
>
>I don't understand your question but you may want to
>send an update for this sentence.

I mean that the doc says:
> * This is mainly needed because some optimized bitstream readers read
> * 32 or 64 bit at once and could read over the end.<br>
So in case of reading 64 bits at once, may it be the case that 8 bytes padding
is needed?

On Tue, Mar 7, 2017 at 12:18 PM, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
>
> 2017-03-07 11:55 GMT+01:00 Michał Krasowski <mkrasowski@opera.com>:
> > @Michael Niedermayer
> > I have read the documentation, namely
> >
> > /**
> >  * @ingroup lavc_decoding
> >  * Required number of additionally allocated bytes at the end of the input
> > bitstream for decoding.
> >  * This is mainly needed because some optimized bitstream readers read
> >  * 32 or 64 bit at once and could read over the end.<br>
> >  * Note: If the first 23 bits of the additional bytes are not 0, then
> > damaged
> >  * MPEG bitstreams could cause overread and segfault.
> >  */
> > #define AV_INPUT_BUFFER_PADDING_SIZE 32
> >
> > and now it seems to me that you prefer speed (a.k.a. optimization)
> > over having a self-contained functions.
>
> Luckily:
> Video players would be unusable without optimizations.
>
> > There are few things that are still not clear to me:
> > * Why is the 32 bit padding used when the doc says that
> >   64 bit offset may also be needed?
>
> I don't understand your question but you may want to
> send an update for this sentence.
>
> > * Even if I extend my data buffer
> >   to have those 4 bytes more, is there a risk that some functions
> >   in ffmpeg will read out-of-bounds?
>
> Definitely: Bugs are possible.
>
> [...]
>
> > This email may contain Opera confidential information
>
> Please remove this, Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Carl Eugen Hoyos March 7, 2017, 11:37 a.m. UTC | #8
2017-03-07 12:30 GMT+01:00 Michał Krasowski <mkrasowski@opera.com>:

>>> There are few things that are still not clear to me:
>>> * Why is the 32 bit padding used when the doc says that
>>>   64 bit offset may also be needed?
>>
>>I don't understand your question but you may want to
>>send an update for this sentence.
>
> I mean that the doc says:
>> * This is mainly needed because some optimized bitstream readers read
>> * 32 or 64 bit at once and could read over the end.<br>
> So in case of reading 64 bits at once, may it be the case that 8 bytes padding
> is needed?

(No, 32 bytes are - curently - always needed, see Hendrik's email.)

The relevant line is the definition of AV_INPUT_BUFFER_PADDING_SIZE

Your code should look similar to:
uint8_t *ptr = av_malloc(buffer_size + AV_INPUT_BUFFER_PADDING_SIZE);
And a zero-initialization of the padding.

Carl Eugen
diff mbox

Patch

diff --git a/libavcodec/h2645_parse.c b/libavcodec/h2645_parse.c
index c3961a5e90..ccb65eabfe 100644
--- a/libavcodec/h2645_parse.c
+++ b/libavcodec/h2645_parse.c
@@ -52,7 +52,7 @@  int ff_h2645_extract_rbsp(const uint8_t *src, int length,
         while (src[i])                                                  \
             i++
 #if HAVE_FAST_64BIT
-    for (i = 0; i + 1 < length; i += 9) {
+    for (i = 0; i + 8 < length; i += 9) {
         if (!((~AV_RN64A(src + i) &
                (AV_RN64A(src + i) - 0x0100010001000101ULL)) &
               0x8000800080008080ULL))
@@ -62,7 +62,7 @@  int ff_h2645_extract_rbsp(const uint8_t *src, int length,
         i -= 7;
     }
 #else
-    for (i = 0; i + 1 < length; i += 5) {
+    for (i = 0; i + 4 < length; i += 5) {
         if (!((~AV_RN32A(src + i) &
                (AV_RN32A(src + i) - 0x01000101U)) &
               0x80008080U))