diff mbox

[FFmpeg-devel,1/2] h2645_parse: Propagate NAL header parsing errors

Message ID 20190318154610.237635-2-derek.buitenhuis@gmail.com
State New
Headers show

Commit Message

Derek Buitenhuis March 18, 2019, 3:46 p.m. UTC
If we don't propagate these errors, h264dec and hevcdec can get up to all sorts of
weirdness, especially threaded, while trying to continue on with things they shouldn't.
Can cause stuff like:

    [hevc @ 0x5555577107c0] get_buffer() cannot be called after ff_thread_finish_setup()
    [hevc @ 0x5555577107c0] thread_get_buffer() failed
    [hevc @ 0x5555577107c0] Error parsing NAL unit #5.
    Error while decoding stream #0:0: Cannot allocate memory
    <deadlock>

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
---
 libavcodec/h2645_parse.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

James Almer March 18, 2019, 3:50 p.m. UTC | #1
On 3/18/2019 12:46 PM, Derek Buitenhuis wrote:
> If we don't propagate these errors, h264dec and hevcdec can get up to all sorts of
> weirdness, especially threaded, while trying to continue on with things they shouldn't.
> Can cause stuff like:
> 
>     [hevc @ 0x5555577107c0] get_buffer() cannot be called after ff_thread_finish_setup()
>     [hevc @ 0x5555577107c0] thread_get_buffer() failed
>     [hevc @ 0x5555577107c0] Error parsing NAL unit #5.
>     Error while decoding stream #0:0: Cannot allocate memory
>     <deadlock>
> 
> Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
> ---
>  libavcodec/h2645_parse.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/libavcodec/h2645_parse.c b/libavcodec/h2645_parse.c
> index 942f2c5d71..175c986c71 100644
> --- a/libavcodec/h2645_parse.c
> +++ b/libavcodec/h2645_parse.c
> @@ -503,6 +503,8 @@ int ff_h2645_packet_split(H2645Packet *pkt, const uint8_t *buf, int length,
>                         nal->type);
>              }
>              pkt->nb_nals--;
> +            if (ret < 0)
> +                return ret;

This will abort the splitting process when it's meant to only discard
the faulty NAL. Even the log message stats it's skipping it.
Derek Buitenhuis March 18, 2019, 3:52 p.m. UTC | #2
On 18/03/2019 15:50, James Almer wrote:
> This will abort the splitting process when it's meant to only discard
> the faulty NAL. Even the log message stats it's skipping it.

The log message also claims to be AV_LOG_ERROR.

Also there are no comments indicating why this is right, or any commit I
could find.

If you feel you can fix this in hevcdec.c, feel free... I couldn't figure
out where it was blowing up, specifically.

- Derek
James Almer March 18, 2019, 6:38 p.m. UTC | #3
On 3/18/2019 12:52 PM, Derek Buitenhuis wrote:
> On 18/03/2019 15:50, James Almer wrote:
>> This will abort the splitting process when it's meant to only discard
>> the faulty NAL. Even the log message stats it's skipping it.
> 
> The log message also claims to be AV_LOG_ERROR.
> 
> Also there are no comments indicating why this is right, or any commit I
> could find.
> 
> If you feel you can fix this in hevcdec.c, feel free... I couldn't figure
> out where it was blowing up, specifically.
> 
> - Derek

So, what i'm seeing here is two slice NALs in the same packet (which
means processed in the same decode_nal_units() loop in hevcdec.c)
reporting being the "first slice segment in the pic". And that's
seemingly making the threading logic shit itself.
In between them are two NALs, both with valid starting codes but invalid
data (first bit is 1 when it's a bitstream requirement for it to be 0),
but they ultimately have nothing to do with the issue in this file. Your
patch works around this simply because it aborts the NAL splitting
before it gets to the second slice NAL, and the whole packet gets discarded.

This is among other things a muxing mistake, since if you remux this
into raw hevc (ffmpeg -i nal_header_deadlock.mp4 -c:v copy -an
nal_header_deadlock.hevc) and try to decode that, the hevc parser will
split it into proper packets with one slice/pic NAL each and the
deadlock will not happen (see hevc_find_frame_end() in hevc_parser.c).

The following change fixes this for me by preventing the decoder from
trying to decode more than one "first" slice for the same frame:

> diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c
> index 967f8f1def..9492108c77 100644
> --- a/libavcodec/hevcdec.c
> +++ b/libavcodec/hevcdec.c
> @@ -2927,6 +2927,10 @@ static int decode_nal_unit(HEVCContext *s, const H2645NAL *nal)
>          }
> 
>          if (s->sh.first_slice_in_pic_flag) {
> +            if (s->ref) {
> +                av_log(s->avctx, AV_LOG_ERROR, "Two slices reporting being the first in the picture.\n");
> +                goto fail;
> +            }
>              if (s->max_ra == INT_MAX) {
>                  if (s->nal_unit_type == HEVC_NAL_CRA_NUT || IS_BLA(s)) {
>                      s->max_ra = s->poc;

But it will however discard the second slice, despite its only apparent
problem being showing up in the wrong packet, so it's probably still not
ideal.
Another solution would be to enable the parser for mp4 input, so the
packetization in the input will be ignored, but that'll slow down
demuxing for every single file.

Someone else that knows hevc and/or threading might want to look at this
and fix this in a better way.
Derek Buitenhuis March 18, 2019, 7:17 p.m. UTC | #4
On 18/03/2019 18:38, James Almer wrote:
> So, what i'm seeing here is two slice NALs in the same packet (which
> means processed in the same decode_nal_units() loop in hevcdec.c)
> reporting being the "first slice segment in the pic". And that's
> seemingly making the threading logic shit itself.

[...]

> In between them are two NALs, both with valid starting codes but invalid
> data (first bit is 1 when it's a bitstream requirement for it to be 0),
> but they ultimately have nothing to do with the issue in this file. Your
> patch works around this simply because it aborts the NAL splitting
> before it gets to the second slice NAL, and the whole packet gets discarded.

Yes, this is correct, and arguably also not a wrong solution, just 'less good'
for broken files.

> This is among other things a muxing mistake, since if you remux this
> into raw hevc (ffmpeg -i nal_header_deadlock.mp4 -c:v copy -an
> nal_header_deadlock.hevc) and try to decode that, the hevc parser will
> split it into proper packets with one slice/pic NAL each and the
> deadlock will not happen (see hevc_find_frame_end() in hevc_parser.c).

Yes, very obviously so, but shouldn't explode the parser/decoder. 

> The following change fixes this for me by preventing the decoder from
> trying to decode more than one "first" slice for the same frame:

Fixes it for me, too, and makes sense.

>> diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c
>> index 967f8f1def..9492108c77 100644
>> --- a/libavcodec/hevcdec.c
>> +++ b/libavcodec/hevcdec.c
>> @@ -2927,6 +2927,10 @@ static int decode_nal_unit(HEVCContext *s, const H2645NAL *nal)
>>          }
>>
>>          if (s->sh.first_slice_in_pic_flag) {
>> +            if (s->ref) {
>> +                av_log(s->avctx, AV_LOG_ERROR, "Two slices reporting being the first in the picture.\n");
>> +                goto fail;
>> +            }
>>              if (s->max_ra == INT_MAX) {
>>                  if (s->nal_unit_type == HEVC_NAL_CRA_NUT || IS_BLA(s)) {
>>                      s->max_ra = s->poc;

[...]

> But it will however discard the second slice, despite its only apparent
> problem being showing up in the wrong packet, so it's probably still not
> ideal.

Not deadlocking is more ideal than not. I'm not particularily concerned with making
broken files look as best as possible, myself, but I know this sentiment is not
shared around here.

I'm content with it as-is, unless someone can offer a better one.

> Another solution would be to enable the parser for mp4 input, so the
> packetization in the input will be ignored, but that'll slow down
> demuxing for every single file.

Agree this would be a poor solution...

> Someone else that knows hevc and/or threading might want to look at this
> and fix this in a better way.

Does anyone? I'm not sure anyone still around could be considered an expert on
the HEVC decoder...

- Derel
Michael Niedermayer March 18, 2019, 8:44 p.m. UTC | #5
On Mon, Mar 18, 2019 at 03:46:09PM +0000, Derek Buitenhuis wrote:
> If we don't propagate these errors, h264dec and hevcdec can get up to all sorts of
> weirdness, especially threaded, while trying to continue on with things they shouldn't.
> Can cause stuff like:

do you have a sample for h264 ? (or only teh one for hevc) ?

[...]
Derek Buitenhuis March 18, 2019, 10:38 p.m. UTC | #6
On 18/03/2019 20:44, Michael Niedermayer wrote:
> do you have a sample for h264 ? (or only teh one for hevc) ?

On hand, just HEVC.

- Derek
diff mbox

Patch

diff --git a/libavcodec/h2645_parse.c b/libavcodec/h2645_parse.c
index 942f2c5d71..175c986c71 100644
--- a/libavcodec/h2645_parse.c
+++ b/libavcodec/h2645_parse.c
@@ -503,6 +503,8 @@  int ff_h2645_packet_split(H2645Packet *pkt, const uint8_t *buf, int length,
                        nal->type);
             }
             pkt->nb_nals--;
+            if (ret < 0)
+                return ret;
         }
     }