diff mbox

[FFmpeg-devel,2/3] avcodec/cbs_h2645: Do not shorten reserved and unspecified NAL units in H264

Message ID 20191211210352.1022-2-michael@niedermayer.cc
State New
Headers show

Commit Message

Michael Niedermayer Dec. 11, 2019, 9:03 p.m. UTC
Its unclear if shortening these NAL units would have no effect on them

Fixes: assertion failure
Fixes: 19286/clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_REDUNDANT_PPS_fuzzer-5707990724509696

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
---
 libavcodec/cbs_h2645.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

James Almer Dec. 11, 2019, 9:38 p.m. UTC | #1
On 12/11/2019 6:03 PM, Michael Niedermayer wrote:
> Its unclear if shortening these NAL units would have no effect on them
> 
> Fixes: assertion failure
> Fixes: 19286/clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_REDUNDANT_PPS_fuzzer-5707990724509696
> 
> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> ---
>  libavcodec/cbs_h2645.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/libavcodec/cbs_h2645.c b/libavcodec/cbs_h2645.c
> index 5f71d80584..b38b1d99ef 100644
> --- a/libavcodec/cbs_h2645.c
> +++ b/libavcodec/cbs_h2645.c
> @@ -564,11 +564,16 @@ static int cbs_h2645_fragment_add_nals(CodedBitstreamContext *ctx,
>          const H2645NAL *nal = &packet->nals[i];
>          AVBufferRef *ref;
>          size_t size = nal->size;
> +        int shorten = 1;
> +
> +        // Do not shorten reserved and unspecified NALs
> +        if (ctx->codec->codec_id == AV_CODEC_ID_H264) {
> +            shorten = nal->type > 0 && nal->type < 23;
> +        }
>  
>          // Remove trailing zeroes.
> -        while (size > 0 && nal->data[size - 1] == 0)
> +        while (shorten && size > 0 && nal->data[size - 1] == 0)
>              --size;
> -        av_assert0(size > 0);

So this is a NAL unit with a bunch of zero bytes? How did it go through
the filter in h2645_parse? It's supposed to skip any NAL like this.

>  
>          ref = (nal->data == nal->raw_data) ? frag->data_ref
>                                             : packet->rbsp.rbsp_buffer_ref;
>
Andreas Rheinhardt Dec. 11, 2019, 9:50 p.m. UTC | #2
On Wed, Dec 11, 2019 at 10:38 PM James Almer <jamrial@gmail.com> wrote:

> On 12/11/2019 6:03 PM, Michael Niedermayer wrote:
> > Its unclear if shortening these NAL units would have no effect on them
> >
> > Fixes: assertion failure
> > Fixes:
> 19286/clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_REDUNDANT_PPS_fuzzer-5707990724509696
> >
> > Found-by: continuous fuzzing process
> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > ---
> >  libavcodec/cbs_h2645.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/libavcodec/cbs_h2645.c b/libavcodec/cbs_h2645.c
> > index 5f71d80584..b38b1d99ef 100644
> > --- a/libavcodec/cbs_h2645.c
> > +++ b/libavcodec/cbs_h2645.c
> > @@ -564,11 +564,16 @@ static int
> cbs_h2645_fragment_add_nals(CodedBitstreamContext *ctx,
> >          const H2645NAL *nal = &packet->nals[i];
> >          AVBufferRef *ref;
> >          size_t size = nal->size;
> > +        int shorten = 1;
> > +
> > +        // Do not shorten reserved and unspecified NALs
> > +        if (ctx->codec->codec_id == AV_CODEC_ID_H264) {
> > +            shorten = nal->type > 0 && nal->type < 23;
> > +        }
> >
> >          // Remove trailing zeroes.
> > -        while (size > 0 && nal->data[size - 1] == 0)
> > +        while (shorten && size > 0 && nal->data[size - 1] == 0)
> >              --size;
> > -        av_assert0(size > 0);
>
> So this is a NAL unit with a bunch of zero bytes? How did it go through
> the filter in h2645_parse? It's supposed to skip any NAL like this.
>
> I guess it triggered this workaround:
        /* see commit 3566042a0 */
        if (bytestream2_get_bytes_left(&bc) >= 4 &&
            bytestream2_peek_be32(&bc) == 0x000001E0)
            skip_trailing_zeros = 0;

If I am not mistaken, then all NAL units except one consisting solely of a
single 0x00 (where the RBSP is empty) have to have a rbsp_stop_one_bit. So
the only instance where stripping zeroes is not good is for such a 0x00
unit. And such a unit should be stripped, because it actually can't be
represented in annex b (which we output) at all.

- Andreas
Michael Niedermayer Dec. 11, 2019, 10:58 p.m. UTC | #3
On Wed, Dec 11, 2019 at 10:50:54PM +0100, Andreas Rheinhardt wrote:
> On Wed, Dec 11, 2019 at 10:38 PM James Almer <jamrial@gmail.com> wrote:
> 
> > On 12/11/2019 6:03 PM, Michael Niedermayer wrote:
> > > Its unclear if shortening these NAL units would have no effect on them
> > >
> > > Fixes: assertion failure
> > > Fixes:
> > 19286/clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_REDUNDANT_PPS_fuzzer-5707990724509696
> > >
> > > Found-by: continuous fuzzing process
> > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > ---
> > >  libavcodec/cbs_h2645.c | 9 +++++++--
> > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/libavcodec/cbs_h2645.c b/libavcodec/cbs_h2645.c
> > > index 5f71d80584..b38b1d99ef 100644
> > > --- a/libavcodec/cbs_h2645.c
> > > +++ b/libavcodec/cbs_h2645.c
> > > @@ -564,11 +564,16 @@ static int
> > cbs_h2645_fragment_add_nals(CodedBitstreamContext *ctx,
> > >          const H2645NAL *nal = &packet->nals[i];
> > >          AVBufferRef *ref;
> > >          size_t size = nal->size;
> > > +        int shorten = 1;
> > > +
> > > +        // Do not shorten reserved and unspecified NALs
> > > +        if (ctx->codec->codec_id == AV_CODEC_ID_H264) {
> > > +            shorten = nal->type > 0 && nal->type < 23;
> > > +        }
> > >
> > >          // Remove trailing zeroes.
> > > -        while (size > 0 && nal->data[size - 1] == 0)
> > > +        while (shorten && size > 0 && nal->data[size - 1] == 0)
> > >              --size;
> > > -        av_assert0(size > 0);
> >
> > So this is a NAL unit with a bunch of zero bytes? How did it go through

yes


> > the filter in h2645_parse? It's supposed to skip any NAL like this.
> >
> > I guess it triggered this workaround:
>         /* see commit 3566042a0 */
>         if (bytestream2_get_bytes_left(&bc) >= 4 &&
>             bytestream2_peek_be32(&bc) == 0x000001E0)
>             skip_trailing_zeros = 0;

yes


> 
> If I am not mistaken, then all NAL units except one consisting solely of a
> single 0x00 (where the RBSP is empty) have to have a rbsp_stop_one_bit. So
> the only instance where stripping zeroes is not good is for such a 0x00
> unit. And such a unit should be stripped, because it actually can't be
> represented in annex b (which we output) at all.

I dont see why it could not be represented.
Also the code you quoted above shows that there are NAL units where the
removal of zeros would damage the units.

I wonder if we should assume that NALs not described in the current H26* specs
follow exactly the RBSP syntax and can saftely be truncated.
My patch went in the direction of rather leaving that alone.
But of course we can do something else or i can try to implement that more
completely also preserving that on the input side but having no real world 
testcase that would not be tested.

What do you prefer? you seem to have a strong oppinion here

Thanks
Andreas Rheinhardt Dec. 11, 2019, 11:48 p.m. UTC | #4
On Wed, Dec 11, 2019 at 10:50 PM Andreas Rheinhardt <
andreas.rheinhardt@gmail.com> wrote:

> On Wed, Dec 11, 2019 at 10:38 PM James Almer <jamrial@gmail.com> wrote:
>
>> On 12/11/2019 6:03 PM, Michael Niedermayer wrote:
>> > Its unclear if shortening these NAL units would have no effect on them
>> >
>> > Fixes: assertion failure
>> > Fixes:
>> 19286/clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_REDUNDANT_PPS_fuzzer-5707990724509696
>> >
>> > Found-by: continuous fuzzing process
>> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>> > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
>> > ---
>> >  libavcodec/cbs_h2645.c | 9 +++++++--
>> >  1 file changed, 7 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/libavcodec/cbs_h2645.c b/libavcodec/cbs_h2645.c
>> > index 5f71d80584..b38b1d99ef 100644
>> > --- a/libavcodec/cbs_h2645.c
>> > +++ b/libavcodec/cbs_h2645.c
>> > @@ -564,11 +564,16 @@ static int
>> cbs_h2645_fragment_add_nals(CodedBitstreamContext *ctx,
>> >          const H2645NAL *nal = &packet->nals[i];
>> >          AVBufferRef *ref;
>> >          size_t size = nal->size;
>> > +        int shorten = 1;
>> > +
>> > +        // Do not shorten reserved and unspecified NALs
>> > +        if (ctx->codec->codec_id == AV_CODEC_ID_H264) {
>> > +            shorten = nal->type > 0 && nal->type < 23;
>> > +        }
>> >
>> >          // Remove trailing zeroes.
>> > -        while (size > 0 && nal->data[size - 1] == 0)
>> > +        while (shorten && size > 0 && nal->data[size - 1] == 0)
>> >              --size;
>> > -        av_assert0(size > 0);
>>
>> So this is a NAL unit with a bunch of zero bytes? How did it go through
>> the filter in h2645_parse? It's supposed to skip any NAL like this.
>>
>> I guess it triggered this workaround:
>         /* see commit 3566042a0 */
>         if (bytestream2_get_bytes_left(&bc) >= 4 &&
>             bytestream2_peek_be32(&bc) == 0x000001E0)
>             skip_trailing_zeros = 0;
>
> If I am not mistaken, then all NAL units except one consisting solely of a
> single 0x00 (where the RBSP is empty) have to have a rbsp_stop_one_bit. So
> the only instance where stripping zeroes is not good is for such a 0x00
> unit. And such a unit should be stripped, because it actually can't be
> represented in annex b (which we output) at all.
>
> I think I am wrong about the very last part. From the H.264 spec (the HEVC
is the same):
"NumBytesInNALunit is set equal to the number of bytes starting with the
byte at the current position in the byte
stream up to and including the last byte that precedes the location of any
of the following:
– A subsequent byte-aligned three-byte sequence equal to 0x000000,
– A subsequent byte-aligned three-byte sequence equal to 0x000001,
– The end of the byte stream, as determined by unspecified means."
My earlier interpretation was that the three bytes following the first
startcode in this sequence 00 00 01 00 00 00 01 xx would mean that
NumBytesinNALunit were zero, because the last byte in the above sentence is
the 0x01 from the first startcode. But upon rereading I think that this is
ruled out by "subsequent", so that the NAL unit would be ended upon
encountering the second startcode (in this case, three bytes long).
But it still shows that if someone used one of the unspecified NAL units
and if these NAL units were allowed to end with zero bytes that are part of
the NAL unit, then said unit can't be represented in annex b (which
cbs_h2645 outputs) anyway. Furthermore, cbs_h2645 currently presumes that
all NAL units are escaped and consequently unescapes them upon reading and
escapes them upon writing. If unspecified NAL units were to deviate from
this, they would not survive this process unharmed (0x03 escapes might be
added). In particular, this applies to the aggregators from ISO/IEC
14496-15:
"The syntax of Aggregators may not follow the NAL unit syntax and the NAL
unit constraints specified in ISO/IEC
14496-10 or ISO/IEC 23008-2. For example, there may exist three continuous
bytes equal to a value in the range of
0x000000 to 0x000010, inclusive. This specifiation disallows the presence
of Aggregators in a video bitstream output
from parsing a file, therefore formal non-compliance with the video
specifications is immaterial as they will never be
presented to a video decoder."
Such units can't be represented in annex b at all, given the potential for
start code emulation. So I think the best we can do is what we already do:
Unescape every unit and escape them later. If any of the unspecified NAL
units uses the typical escaping scheme, it is fine; if not, we can't do
anything anyway.
Furthermore, our code for splitting a packet searches for 0x00 00 01 inside
an mp4/mkv NAL unit and views these as start of a new NAL unit (because
there are invalid files out there doing it that way). This of course has a
potential for start code emulation and so input that may contain start
codes within the NAL units would already be broken at this stage. I don't
think that there is a generic way to deal with this scenario; one would
have to specifically add support for every system that uses these
unspecified units. I think it's safe to say that this won't happen. Any
system that uses the typical escaping scheme works, the rest won't.

- Andreas
Andreas Rheinhardt Dec. 12, 2019, 1:53 a.m. UTC | #5
On Wed, Dec 11, 2019 at 11:58 PM Michael Niedermayer <michael@niedermayer.cc>
wrote:

> On Wed, Dec 11, 2019 at 10:50:54PM +0100, Andreas Rheinhardt wrote:
> > On Wed, Dec 11, 2019 at 10:38 PM James Almer <jamrial@gmail.com> wrote:
> >
> > > On 12/11/2019 6:03 PM, Michael Niedermayer wrote:
> > > > Its unclear if shortening these NAL units would have no effect on
> them
> > > >
> > > > Fixes: assertion failure
> > > > Fixes:
> > >
> 19286/clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_REDUNDANT_PPS_fuzzer-5707990724509696
> > > >
> > > > Found-by: continuous fuzzing process
> > > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> > > > Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
> > > > ---
> > > >  libavcodec/cbs_h2645.c | 9 +++++++--
> > > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/libavcodec/cbs_h2645.c b/libavcodec/cbs_h2645.c
> > > > index 5f71d80584..b38b1d99ef 100644
> > > > --- a/libavcodec/cbs_h2645.c
> > > > +++ b/libavcodec/cbs_h2645.c
> > > > @@ -564,11 +564,16 @@ static int
> > > cbs_h2645_fragment_add_nals(CodedBitstreamContext *ctx,
> > > >          const H2645NAL *nal = &packet->nals[i];
> > > >          AVBufferRef *ref;
> > > >          size_t size = nal->size;
> > > > +        int shorten = 1;
> > > > +
> > > > +        // Do not shorten reserved and unspecified NALs
> > > > +        if (ctx->codec->codec_id == AV_CODEC_ID_H264) {
> > > > +            shorten = nal->type > 0 && nal->type < 23;
> > > > +        }
> > > >
> > > >          // Remove trailing zeroes.
> > > > -        while (size > 0 && nal->data[size - 1] == 0)
> > > > +        while (shorten && size > 0 && nal->data[size - 1] == 0)
> > > >              --size;
> > > > -        av_assert0(size > 0);
> > >
> > > So this is a NAL unit with a bunch of zero bytes? How did it go through
>
> yes
>
>
> > > the filter in h2645_parse? It's supposed to skip any NAL like this.
> > >
> > > I guess it triggered this workaround:
> >         /* see commit 3566042a0 */
> >         if (bytestream2_get_bytes_left(&bc) >= 4 &&
> >             bytestream2_peek_be32(&bc) == 0x000001E0)
> >             skip_trailing_zeros = 0;
>
> yes
>
>
> >
> > If I am not mistaken, then all NAL units except one consisting solely of
> a
> > single 0x00 (where the RBSP is empty) have to have a rbsp_stop_one_bit.
> So
> > the only instance where stripping zeroes is not good is for such a 0x00
> > unit. And such a unit should be stripped, because it actually can't be
> > represented in annex b (which we output) at all.
>
> I dont see why it could not be represented.
>

I think I was wrong.


> Also the code you quoted above shows that there are NAL units where the
> removal of zeros would damage the units.
>
> Yes. But not removing zeroes could also alter the units, because these
zeroes might lead to the addition of 0x03 escape bytes when reassembling.
Btw: Where is the sample that triggered this?


> I wonder if we should assume that NALs not described in the current H26*
> specs
> follow exactly the RBSP syntax and can saftely be truncated.
>

As has been said in my other mail: Anything that may contain start codes
inside their NAL units can't be put into annex b anyway. Unfortunately
systems using such units exist.


> My patch went in the direction of rather leaving that alone.
>

Even with your patch they would not be left completely alone: The
unescaping->escaping process changes things for units initially unescaped;
not to mention the possibility of start code emulation within NAL units for
units that may contain start codes.
(The utmost that could be done to leave these units alone would be to use
the raw data of the unit instead of the unescaped unit in case it is one of
the unspecified units. And upon reassembling, no escaping would have to be
performed for these units, just a straight memcpy. But this would still not
solve the problem that h2645_packet_split searches the units for start
codes and treats them as start of new units. And even if this were solved,
one would have the problem that units that contain start codes can't be put
in annex b anyway. It's just a mess.)


> But of course we can do something else or i can try to implement that more
> completely also preserving that on the input side but having no real world
> testcase that would not be tested.
>
> What do you prefer? you seem to have a strong oppinion here
>
> The current code works with systems that follow the typical RBSP scheme;
supporting anything else would require significant changes to not only
cbs_h2645, but also the H.2645 parsing functions (or cbs_h2645 would have
to abandon them and implement their own). I don't think we have the
manpower, the samples and the motivation for this. So if it works, fine; if
not, then not. (We don't even support the SVC/MVC/3d extensions yet and
these are official.)
While this RBSP scheme would allow a NAL unit with empty RBSP that consists
of a single 0x00, I don't think that this will intentionally happen in
practice. And now that I have found out that I was wrong about such units
not being representable in annex b, I don't have a strong opinion any more
on whether these units should be discarded or kept as size one units. I am
still leaning towards discarding though, because I don't see a real-world
usecase for them and because they might confuse players in general (but I
haven't tested it to back this up, it's just a feeling).
But that's just my opinion. Actually, Mark should decide.

- Andreas
diff mbox

Patch

diff --git a/libavcodec/cbs_h2645.c b/libavcodec/cbs_h2645.c
index 5f71d80584..b38b1d99ef 100644
--- a/libavcodec/cbs_h2645.c
+++ b/libavcodec/cbs_h2645.c
@@ -564,11 +564,16 @@  static int cbs_h2645_fragment_add_nals(CodedBitstreamContext *ctx,
         const H2645NAL *nal = &packet->nals[i];
         AVBufferRef *ref;
         size_t size = nal->size;
+        int shorten = 1;
+
+        // Do not shorten reserved and unspecified NALs
+        if (ctx->codec->codec_id == AV_CODEC_ID_H264) {
+            shorten = nal->type > 0 && nal->type < 23;
+        }
 
         // Remove trailing zeroes.
-        while (size > 0 && nal->data[size - 1] == 0)
+        while (shorten && size > 0 && nal->data[size - 1] == 0)
             --size;
-        av_assert0(size > 0);
 
         ref = (nal->data == nal->raw_data) ? frag->data_ref
                                            : packet->rbsp.rbsp_buffer_ref;