[FFmpeg-devel,v3] lavc/pthread_frame: update context in child thread in multi-thread mode

Submitted by Linjie Fu on June 27, 2019, 11:54 p.m.

Details

Message ID 1561679654-23336-1-git-send-email-linjie.fu@intel.com
State New
Headers show

Commit Message

Linjie Fu June 27, 2019, 11:54 p.m.
Currently in ff_thread_decode_frame, context is updated from child thread
to user thread, and user thread releases the context in avcodec_close()
when decode finishes.

However, when resolution/format changes, ff_get_format is called, and
hwaccel_uninit() and hwaccel_init will be used to destroy and re-create
the context. Due to the async between user-thread and child-thread,
user-thread updates its context from child earlier than the context
is refreshed in child-thread. And it will lead to:
    1. memory leak in child-thread.
    2. double free in user-thread while calling avcodec_close().

Can be reproduced with a resolution change case, and use -vframes
to terminate the decode between the dynamic resolution changing frames:

ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -v
verbose -i ./test2360_1672_4980.ivf -pix_fmt p010le -f rawvideo -vsync
passthrough -vframes 6 -y out.yuv

ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -v verbose -i
./reinit-large_420_8-to-small_420_8.h264 -pix_fmt nv12 -f rawvideo
-vsync passthrough -vframes 45 -y out.yuv

Move update_context_from_thread from ff_thread_decode_frame(user thread)
to frame_worker_thread(child thread), update the context in child thread
right after the context is refreshed to avoid the async issue.

Signed-off-by: Linjie Fu <linjie.fu@intel.com>
---
 libavcodec/internal.h      |  7 +++++++
 libavcodec/pthread_frame.c | 21 ++++++++++++++++++---
 2 files changed, 25 insertions(+), 3 deletions(-)

Comments

Carl Eugen Hoyos June 27, 2019, 12:32 p.m.
Am Do., 27. Juni 2019 um 13:56 Uhr schrieb Linjie Fu <linjie.fu@intel.com>:

>  libavcodec/internal.h      |  7 +++++++
>  libavcodec/pthread_frame.c | 21 ++++++++++++++++++---
>  2 files changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/libavcodec/internal.h b/libavcodec/internal.h
> index 5096ffa..a85ffff 100644
> --- a/libavcodec/internal.h
> +++ b/libavcodec/internal.h
> @@ -162,6 +162,13 @@ typedef struct AVCodecInternal {
>
>      void *thread_ctx;
>
> +    /**
> +     * User thread AVCodecContext pointer and
> +     * context mutex
> +     */
> +    void *user_avctx;

Sorry if this was already answered:
Why is this not an AVCodecContext* ?

Carl Eugen
Hendrik Leppkes June 28, 2019, 12:56 a.m.
On Thu, Jun 27, 2019 at 1:56 PM Linjie Fu <linjie.fu@intel.com> wrote:
>
> Currently in ff_thread_decode_frame, context is updated from child thread
> to user thread, and user thread releases the context in avcodec_close()
> when decode finishes.
>
> However, when resolution/format changes, ff_get_format is called, and
> hwaccel_uninit() and hwaccel_init will be used to destroy and re-create
> the context. Due to the async between user-thread and child-thread,
> user-thread updates its context from child earlier than the context
> is refreshed in child-thread. And it will lead to:
>     1. memory leak in child-thread.
>     2. double free in user-thread while calling avcodec_close().
>
> Can be reproduced with a resolution change case, and use -vframes
> to terminate the decode between the dynamic resolution changing frames:
>
> ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -v
> verbose -i ./test2360_1672_4980.ivf -pix_fmt p010le -f rawvideo -vsync
> passthrough -vframes 6 -y out.yuv
>
> ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -v verbose -i
> ./reinit-large_420_8-to-small_420_8.h264 -pix_fmt nv12 -f rawvideo
> -vsync passthrough -vframes 45 -y out.yuv
>
> Move update_context_from_thread from ff_thread_decode_frame(user thread)
> to frame_worker_thread(child thread), update the context in child thread
> right after the context is refreshed to avoid the async issue.
>

I think the undelying issue that Michael mentioned remains - you are
changing the user context from a worker thread, at which point the
user might be accessing his context simultaneously. You cannot prevent
that with a mutex, since the user does not use your mutex.

Additionally, the user context should reflect the state of the last
frame that was output to the user, if a worker thread changes it
immediately as it sees the size change, wouldn't it be possible for
frames of the old size to be output after the context was already
updated? That sounds like trouble to me.

- Hendrik
Linjie Fu June 28, 2019, 1:57 a.m.
> -----Original Message-----

> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf

> Of Carl Eugen Hoyos

> Sent: Thursday, June 27, 2019 20:32

> To: FFmpeg development discussions and patches <ffmpeg-

> devel@ffmpeg.org>

> Subject: Re: [FFmpeg-devel] [PATCH, v3] lavc/pthread_frame: update

> context in child thread in multi-thread mode

> 

> Am Do., 27. Juni 2019 um 13:56 Uhr schrieb Linjie Fu <linjie.fu@intel.com>:

> 

> >  libavcodec/internal.h      |  7 +++++++

> >  libavcodec/pthread_frame.c | 21 ++++++++++++++++++---

> >  2 files changed, 25 insertions(+), 3 deletions(-)

> >

> > diff --git a/libavcodec/internal.h b/libavcodec/internal.h

> > index 5096ffa..a85ffff 100644

> > --- a/libavcodec/internal.h

> > +++ b/libavcodec/internal.h

> > @@ -162,6 +162,13 @@ typedef struct AVCodecInternal {

> >

> >      void *thread_ctx;

> >

> > +    /**

> > +     * User thread AVCodecContext pointer and

> > +     * context mutex

> > +     */

> > +    void *user_avctx;

> 

> Sorry if this was already answered:

> Why is this not an AVCodecContext* ?


Followed the behavior of :
void *thread_ctx;
void *frame_thread_encoder

Since it will only be used as AVCodecContext* (not like void *hwaccel_priv_data)
AVCodecContext *user_avctx seems to be more directly.
I'm not strongly insist on this unless there are some other concerns or differents.
Linjie Fu July 11, 2019, 9:11 a.m.
> -----Original Message-----

> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf

> Of Hendrik Leppkes

> Sent: Friday, June 28, 2019 08:56

> To: FFmpeg development discussions and patches <ffmpeg-

> devel@ffmpeg.org>

> Subject: Re: [FFmpeg-devel] [PATCH, v3] lavc/pthread_frame: update

> context in child thread in multi-thread mode

> 

> On Thu, Jun 27, 2019 at 1:56 PM Linjie Fu <linjie.fu@intel.com> wrote:

> >

> > Currently in ff_thread_decode_frame, context is updated from child

> thread

> > to user thread, and user thread releases the context in avcodec_close()

> > when decode finishes.

> >

> > However, when resolution/format changes, ff_get_format is called, and

> > hwaccel_uninit() and hwaccel_init will be used to destroy and re-create

> > the context. Due to the async between user-thread and child-thread,

> > user-thread updates its context from child earlier than the context

> > is refreshed in child-thread. And it will lead to:

> >     1. memory leak in child-thread.

> >     2. double free in user-thread while calling avcodec_close().

> >

> > Can be reproduced with a resolution change case, and use -vframes

> > to terminate the decode between the dynamic resolution changing frames:

> >

> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -v

> > verbose -i ./test2360_1672_4980.ivf -pix_fmt p010le -f rawvideo -vsync

> > passthrough -vframes 6 -y out.yuv

> >

> > ffmpeg -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 -v verbose -i

> > ./reinit-large_420_8-to-small_420_8.h264 -pix_fmt nv12 -f rawvideo

> > -vsync passthrough -vframes 45 -y out.yuv

> >

> > Move update_context_from_thread from ff_thread_decode_frame(user

> thread)

> > to frame_worker_thread(child thread), update the context in child thread

> > right after the context is refreshed to avoid the async issue.

> >

> 

> I think the undelying issue that Michael mentioned remains - you are

> changing the user context from a worker thread, at which point the

> user might be accessing his context simultaneously. You cannot prevent

> that with a mutex, since the user does not use your mutex.


User context could be accessed everywhere in the user thread, so changing from
worker thread will always lead to a race condition.
Seems I didn't fully understood this before.

Async feature is designed on purpose, it's hard to capture the changes in worker
thread in time and get the context updated for user thread correctly.

As the async issue exists commonly for resolution changing, it is in need for fixing.

> Additionally, the user context should reflect the state of the last

> frame that was output to the user, if a worker thread changes it

> immediately as it sees the size change, wouldn't it be possible for

> frames of the old size to be output after the context was already

> updated? That sounds like trouble to me.


I didn't quite understand it, but as another patch for vp9 shows, output frame whose size
Is different from context can still be correct.

Anyway, since it's not easy(or possible) to modify in worker thread, this won't be a problem. 

Thanks,
- linjie

Patch hide | download patch | download mbox

diff --git a/libavcodec/internal.h b/libavcodec/internal.h
index 5096ffa..a85ffff 100644
--- a/libavcodec/internal.h
+++ b/libavcodec/internal.h
@@ -162,6 +162,13 @@  typedef struct AVCodecInternal {
 
     void *thread_ctx;
 
+    /**
+     * User thread AVCodecContext pointer and
+     * context mutex
+     */
+    void *user_avctx;
+    pthread_mutex_t context_mutex;
+
     DecodeSimpleContext ds;
     DecodeFilterContext filter;
 
diff --git a/libavcodec/pthread_frame.c b/libavcodec/pthread_frame.c
index 36ac0ac..60110f2 100644
--- a/libavcodec/pthread_frame.c
+++ b/libavcodec/pthread_frame.c
@@ -140,6 +140,8 @@  typedef struct FrameThreadContext {
 #define THREAD_SAFE_CALLBACKS(avctx) \
 ((avctx)->thread_safe_callbacks || (avctx)->get_buffer2 == avcodec_default_get_buffer2)
 
+static int update_context_from_thread(AVCodecContext *dst, AVCodecContext *src, int for_user);
+
 static void async_lock(FrameThreadContext *fctx)
 {
     pthread_mutex_lock(&fctx->async_mutex);
@@ -157,7 +159,6 @@  static void async_unlock(FrameThreadContext *fctx)
     pthread_cond_broadcast(&fctx->async_cond);
     pthread_mutex_unlock(&fctx->async_mutex);
 }
-
 /**
  * Codec worker thread.
  *
@@ -169,6 +170,7 @@  static attribute_align_arg void *frame_worker_thread(void *arg)
 {
     PerThreadContext *p = arg;
     AVCodecContext *avctx = p->avctx;
+    AVCodecContext *user_avctx = p->avctx->internal->user_avctx;
     const AVCodec *codec = avctx->codec;
 
     pthread_mutex_lock(&p->mutex);
@@ -200,6 +202,12 @@  static attribute_align_arg void *frame_worker_thread(void *arg)
         p->got_frame = 0;
         p->result = codec->decode(avctx, p->frame, &p->got_frame, &p->avpkt);
 
+        if (user_avctx) {
+            pthread_mutex_lock(&user_avctx->internal->context_mutex);
+            update_context_from_thread(user_avctx, p->avctx, 1);
+            pthread_mutex_unlock(&user_avctx->internal->context_mutex);
+        }
+
         if ((p->result < 0 || !p->got_frame) && p->frame->buf[0]) {
             if (avctx->internal->allocate_progress)
                 av_log(avctx, AV_LOG_ERROR, "A frame threaded decoder did not "
@@ -390,7 +398,9 @@  static int submit_packet(PerThreadContext *p, AVCodecContext *user_avctx,
 
     pthread_mutex_lock(&p->mutex);
 
+    pthread_mutex_lock(&user_avctx->internal->context_mutex);
     ret = update_context_from_user(p->avctx, user_avctx);
+    pthread_mutex_unlock(&user_avctx->internal->context_mutex);
     if (ret) {
         pthread_mutex_unlock(&p->mutex);
         return ret;
@@ -540,8 +550,6 @@  int ff_thread_decode_frame(AVCodecContext *avctx,
         if (finished >= avctx->thread_count) finished = 0;
     } while (!avpkt->size && !*got_picture_ptr && err >= 0 && finished != fctx->next_finished);
 
-    update_context_from_thread(avctx, p->avctx, 1);
-
     if (fctx->next_decoding >= avctx->thread_count) fctx->next_decoding = 0;
 
     fctx->next_finished = finished;
@@ -713,6 +721,8 @@  void ff_frame_thread_free(AVCodecContext *avctx, int thread_count)
     pthread_mutex_destroy(&fctx->async_mutex);
     pthread_cond_destroy(&fctx->async_cond);
 
+    pthread_mutex_destroy(&avctx->internal->context_mutex);
+
     av_freep(&avctx->internal->thread_ctx);
 
     if (avctx->priv_data && avctx->codec && avctx->codec->priv_class)
@@ -728,6 +738,8 @@  int ff_frame_thread_init(AVCodecContext *avctx)
     FrameThreadContext *fctx;
     int i, err = 0;
 
+    avctx->internal->user_avctx = avctx;
+
     if (!thread_count) {
         int nb_cpus = av_cpu_count();
 #if FF_API_DEBUG_MV
@@ -761,6 +773,8 @@  int ff_frame_thread_init(AVCodecContext *avctx)
     pthread_mutex_init(&fctx->async_mutex, NULL);
     pthread_cond_init(&fctx->async_cond, NULL);
 
+    pthread_mutex_init(&avctx->internal->context_mutex, NULL);
+
     fctx->async_lock = 1;
     fctx->delaying = 1;
 
@@ -800,6 +814,7 @@  int ff_frame_thread_init(AVCodecContext *avctx)
         *copy->internal = *src->internal;
         copy->internal->thread_ctx = p;
         copy->internal->last_pkt_props = &p->avpkt;
+        copy->internal->user_avctx = avctx;
 
         if (!i) {
             src = copy;