[FFmpeg-devel] avcodec/h264, videotoolbox: fix use-after-free of AVFrame buffer when VT decoder fails

Submitted by Aman Gupta on Feb. 14, 2017, 2:04 a.m.

Details

Message ID 20170214020410.6054-1-ffmpeg@tmm1.net
State New
Headers show

Commit Message

Aman Gupta Feb. 14, 2017, 2:04 a.m.
From: Aman Gupta <aman@tmm1.net>

The videotoolbox hwaccel returns a dummy frame with a 1-byte buffer from
alloc_frame(). In end_frame(), this buffer is replaced with the actual
decoded data from VTDecompressionSessionDecodeFrame(). This is hacky,
but works well enough, as long as the VT decoder returns a valid frame on
every end_frame() call.

In the case of errors, things get more interesting. Before
9747219958060d8c4f697df62e7f172c2a77e6c7, the dummy 1-byte frame would
accidentally be returned all the way up to the user. After that commit,
the dummy frame was properly unref'd so the user received an error.

However, since that commit, VT hwaccel failures started causing random
segfaults in the h264 decoder. This happened more often on iOS where the
VT implementation is more likely to throw errors on bitstream anomolies.
A recent report of this issue can be see in
http://ffmpeg.org/pipermail/libav-user/2016-November/009831.html

The root cause here is that between the calls to alloc_frame() and
end_frame(), the h264 decoder will reference the dummy 1-byte frame in
its ref_list. When the end_frame() call fails, the dummy frame is
unref'd but still referenced in the h264 decoder. This eventually causes
a NULL deference and segmentation fault.

This patch fixes the issue by properly discarding references to the
dummy frame in the H264Context after the frame has been unref'd.
---
 libavcodec/h264_picture.c |  3 +++
 libavcodec/h264_refs.c    | 20 ++++++++++++++++++--
 libavcodec/h264dec.h      |  1 +
 3 files changed, 22 insertions(+), 2 deletions(-)

Comments

wm4 Feb. 14, 2017, 6:01 a.m.
On Mon, 13 Feb 2017 18:04:10 -0800
Aman Gupta <ffmpeg@tmm1.net> wrote:

> From: Aman Gupta <aman@tmm1.net>
> 
> The videotoolbox hwaccel returns a dummy frame with a 1-byte buffer from
> alloc_frame(). In end_frame(), this buffer is replaced with the actual
> decoded data from VTDecompressionSessionDecodeFrame(). This is hacky,
> but works well enough, as long as the VT decoder returns a valid frame on
> every end_frame() call.
> 
> In the case of errors, things get more interesting. Before
> 9747219958060d8c4f697df62e7f172c2a77e6c7, the dummy 1-byte frame would
> accidentally be returned all the way up to the user. After that commit,
> the dummy frame was properly unref'd so the user received an error.
> 
> However, since that commit, VT hwaccel failures started causing random
> segfaults in the h264 decoder. This happened more often on iOS where the
> VT implementation is more likely to throw errors on bitstream anomolies.
> A recent report of this issue can be see in
> http://ffmpeg.org/pipermail/libav-user/2016-November/009831.html
> 
> The root cause here is that between the calls to alloc_frame() and
> end_frame(), the h264 decoder will reference the dummy 1-byte frame in
> its ref_list. When the end_frame() call fails, the dummy frame is
> unref'd but still referenced in the h264 decoder. This eventually causes
> a NULL deference and segmentation fault.
> 
> This patch fixes the issue by properly discarding references to the
> dummy frame in the H264Context after the frame has been unref'd.
> ---
>  libavcodec/h264_picture.c |  3 +++
>  libavcodec/h264_refs.c    | 20 ++++++++++++++++++--
>  libavcodec/h264dec.h      |  1 +
>  3 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/libavcodec/h264_picture.c b/libavcodec/h264_picture.c
> index f634d2a..702ca12 100644
> --- a/libavcodec/h264_picture.c
> +++ b/libavcodec/h264_picture.c
> @@ -177,6 +177,9 @@ int ff_h264_field_end(H264Context *h, H264SliceContext *sl, int in_setup)
>          if (err < 0)
>              av_log(avctx, AV_LOG_ERROR,
>                     "hardware accelerator failed to decode picture\n");
> +
> +        if (err < 0 && avctx->hwaccel->pix_fmt == AV_PIX_FMT_VIDEOTOOLBOX)
> +            ff_h264_remove_cur_pic_ref(h);
>      }
>  
>  #if FF_API_CAP_VDPAU
> diff --git a/libavcodec/h264_refs.c b/libavcodec/h264_refs.c
> index 97bf588..9c77bc7 100644
> --- a/libavcodec/h264_refs.c
> +++ b/libavcodec/h264_refs.c
> @@ -560,6 +560,23 @@ static H264Picture *remove_long(H264Context *h, int i, int ref_mask)
>      return pic;
>  }
>  
> +void ff_h264_remove_cur_pic_ref(H264Context *h)
> +{
> +    int j;
> +
> +    if (h->short_ref[0] == h->cur_pic_ptr) {
> +        remove_short_at_index(h, 0);
> +    }
> +
> +    if (h->cur_pic_ptr->long_ref) {
> +        for (j = 0; j < FF_ARRAY_ELEMS(h->long_ref); j++) {
> +            if (h->long_ref[j] == h->cur_pic_ptr) {
> +                remove_long(h, j, 0);
> +            }
> +        }
> +    }
> +}
> +
>  void ff_h264_remove_all_refs(H264Context *h)
>  {
>      int i;
> @@ -571,8 +588,7 @@ void ff_h264_remove_all_refs(H264Context *h)
>  
>      if (h->short_ref_count && !h->last_pic_for_ec.f->data[0]) {
>          ff_h264_unref_picture(h, &h->last_pic_for_ec);
> -        if (h->short_ref[0]->f->buf[0])
> -            ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
> +        ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
>      }
>  
>      for (i = 0; i < h->short_ref_count; i++) {
> diff --git a/libavcodec/h264dec.h b/libavcodec/h264dec.h
> index 5f868b7..063afed 100644
> --- a/libavcodec/h264dec.h
> +++ b/libavcodec/h264dec.h
> @@ -569,6 +569,7 @@ int ff_h264_alloc_tables(H264Context *h);
>  int ff_h264_decode_ref_pic_list_reordering(H264SliceContext *sl, void *logctx);
>  int ff_h264_build_ref_list(H264Context *h, H264SliceContext *sl);
>  void ff_h264_remove_all_refs(H264Context *h);
> +void ff_h264_remove_cur_pic_ref(H264Context *h);
>  
>  /**
>   * Execute the reference picture marking (memory management control operations).

Still a bit hacky, but if it improves the crash situation, why not.

I'm wondering if it might be better to just check for a dummy buffer
before returning a frame to the user, though. (Instead of strictly
removing the dummy buffer on the end_frame hwaccel callback.)
Aman Gupta Feb. 21, 2017, 4:57 p.m.
On Mon, Feb 13, 2017 at 6:04 PM, Aman Gupta <ffmpeg@tmm1.net> wrote:

> From: Aman Gupta <aman@tmm1.net>
>
> The videotoolbox hwaccel returns a dummy frame with a 1-byte buffer from
> alloc_frame(). In end_frame(), this buffer is replaced with the actual
> decoded data from VTDecompressionSessionDecodeFrame(). This is hacky,
> but works well enough, as long as the VT decoder returns a valid frame on
> every end_frame() call.
>
> In the case of errors, things get more interesting. Before
> 9747219958060d8c4f697df62e7f172c2a77e6c7, the dummy 1-byte frame would
> accidentally be returned all the way up to the user. After that commit,
> the dummy frame was properly unref'd so the user received an error.
>
> However, since that commit, VT hwaccel failures started causing random
> segfaults in the h264 decoder. This happened more often on iOS where the
> VT implementation is more likely to throw errors on bitstream anomolies.
> A recent report of this issue can be see in
> http://ffmpeg.org/pipermail/libav-user/2016-November/009831.html
>
> The root cause here is that between the calls to alloc_frame() and
> end_frame(), the h264 decoder will reference the dummy 1-byte frame in
> its ref_list. When the end_frame() call fails, the dummy frame is
> unref'd but still referenced in the h264 decoder. This eventually causes
> a NULL deference and segmentation fault.
>
> This patch fixes the issue by properly discarding references to the
> dummy frame in the H264Context after the frame has been unref'd.
> ---
>  libavcodec/h264_picture.c |  3 +++
>  libavcodec/h264_refs.c    | 20 ++++++++++++++++++--
>  libavcodec/h264dec.h      |  1 +
>  3 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/h264_picture.c b/libavcodec/h264_picture.c
> index f634d2a..702ca12 100644
> --- a/libavcodec/h264_picture.c
> +++ b/libavcodec/h264_picture.c
> @@ -177,6 +177,9 @@ int ff_h264_field_end(H264Context *h, H264SliceContext
> *sl, int in_setup)
>          if (err < 0)
>              av_log(avctx, AV_LOG_ERROR,
>                     "hardware accelerator failed to decode picture\n");
> +
> +        if (err < 0 && avctx->hwaccel->pix_fmt == AV_PIX_FMT_VIDEOTOOLBOX)
> +            ff_h264_remove_cur_pic_ref(h);
>

This patch fixed the crash I was seeing pretty often, but now another
assert() can trip (albeit less frequently):

Assertion h->cur_pic_ptr->f->buf[0] failed at
src/libavcodec/h264_slice.c:1340

I will try wm4's suggestion of leaving the dummy 1-byte frame in-place, but
ignoring it when it's time to return a frame to the user.


>      }
>
>  #if FF_API_CAP_VDPAU
> diff --git a/libavcodec/h264_refs.c b/libavcodec/h264_refs.c
> index 97bf588..9c77bc7 100644
> --- a/libavcodec/h264_refs.c
> +++ b/libavcodec/h264_refs.c
> @@ -560,6 +560,23 @@ static H264Picture *remove_long(H264Context *h, int
> i, int ref_mask)
>      return pic;
>  }
>
> +void ff_h264_remove_cur_pic_ref(H264Context *h)
> +{
> +    int j;
> +
> +    if (h->short_ref[0] == h->cur_pic_ptr) {
> +        remove_short_at_index(h, 0);
> +    }
> +
> +    if (h->cur_pic_ptr->long_ref) {
> +        for (j = 0; j < FF_ARRAY_ELEMS(h->long_ref); j++) {
> +            if (h->long_ref[j] == h->cur_pic_ptr) {
> +                remove_long(h, j, 0);
> +            }
> +        }
> +    }
> +}
> +
>  void ff_h264_remove_all_refs(H264Context *h)
>  {
>      int i;
> @@ -571,8 +588,7 @@ void ff_h264_remove_all_refs(H264Context *h)
>
>      if (h->short_ref_count && !h->last_pic_for_ec.f->data[0]) {
>          ff_h264_unref_picture(h, &h->last_pic_for_ec);
> -        if (h->short_ref[0]->f->buf[0])
> -            ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
> +        ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
>      }
>
>      for (i = 0; i < h->short_ref_count; i++) {
> diff --git a/libavcodec/h264dec.h b/libavcodec/h264dec.h
> index 5f868b7..063afed 100644
> --- a/libavcodec/h264dec.h
> +++ b/libavcodec/h264dec.h
> @@ -569,6 +569,7 @@ int ff_h264_alloc_tables(H264Context *h);
>  int ff_h264_decode_ref_pic_list_reordering(H264SliceContext *sl, void
> *logctx);
>  int ff_h264_build_ref_list(H264Context *h, H264SliceContext *sl);
>  void ff_h264_remove_all_refs(H264Context *h);
> +void ff_h264_remove_cur_pic_ref(H264Context *h);
>
>  /**
>   * Execute the reference picture marking (memory management control
> operations).
> --
> 2.10.1 (Apple Git-78)
>
>

Patch hide | download patch | download mbox

diff --git a/libavcodec/h264_picture.c b/libavcodec/h264_picture.c
index f634d2a..702ca12 100644
--- a/libavcodec/h264_picture.c
+++ b/libavcodec/h264_picture.c
@@ -177,6 +177,9 @@  int ff_h264_field_end(H264Context *h, H264SliceContext *sl, int in_setup)
         if (err < 0)
             av_log(avctx, AV_LOG_ERROR,
                    "hardware accelerator failed to decode picture\n");
+
+        if (err < 0 && avctx->hwaccel->pix_fmt == AV_PIX_FMT_VIDEOTOOLBOX)
+            ff_h264_remove_cur_pic_ref(h);
     }
 
 #if FF_API_CAP_VDPAU
diff --git a/libavcodec/h264_refs.c b/libavcodec/h264_refs.c
index 97bf588..9c77bc7 100644
--- a/libavcodec/h264_refs.c
+++ b/libavcodec/h264_refs.c
@@ -560,6 +560,23 @@  static H264Picture *remove_long(H264Context *h, int i, int ref_mask)
     return pic;
 }
 
+void ff_h264_remove_cur_pic_ref(H264Context *h)
+{
+    int j;
+
+    if (h->short_ref[0] == h->cur_pic_ptr) {
+        remove_short_at_index(h, 0);
+    }
+
+    if (h->cur_pic_ptr->long_ref) {
+        for (j = 0; j < FF_ARRAY_ELEMS(h->long_ref); j++) {
+            if (h->long_ref[j] == h->cur_pic_ptr) {
+                remove_long(h, j, 0);
+            }
+        }
+    }
+}
+
 void ff_h264_remove_all_refs(H264Context *h)
 {
     int i;
@@ -571,8 +588,7 @@  void ff_h264_remove_all_refs(H264Context *h)
 
     if (h->short_ref_count && !h->last_pic_for_ec.f->data[0]) {
         ff_h264_unref_picture(h, &h->last_pic_for_ec);
-        if (h->short_ref[0]->f->buf[0])
-            ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
+        ff_h264_ref_picture(h, &h->last_pic_for_ec, h->short_ref[0]);
     }
 
     for (i = 0; i < h->short_ref_count; i++) {
diff --git a/libavcodec/h264dec.h b/libavcodec/h264dec.h
index 5f868b7..063afed 100644
--- a/libavcodec/h264dec.h
+++ b/libavcodec/h264dec.h
@@ -569,6 +569,7 @@  int ff_h264_alloc_tables(H264Context *h);
 int ff_h264_decode_ref_pic_list_reordering(H264SliceContext *sl, void *logctx);
 int ff_h264_build_ref_list(H264Context *h, H264SliceContext *sl);
 void ff_h264_remove_all_refs(H264Context *h);
+void ff_h264_remove_cur_pic_ref(H264Context *h);
 
 /**
  * Execute the reference picture marking (memory management control operations).