Message ID | 1491229482-25479-1-git-send-email-rsbultje@gmail.com |
---|---|
State | Accepted |
Headers | show |
Hi, On Mon, Apr 3, 2017 at 10:24 AM, Ronald S. Bultje <rsbultje@gmail.com> wrote: > This tries to handle cases where separate invocations of decode_frame() > (each running in separate threads) write to respective fields in the > same AVFrame->data[]. Having per-field owners makes interaction between > readers (the referencing thread) and writers (the decoding thread) > slightly more optimal if both accesses are field-based, since they will > use the respective producer's thread objects (mutex/cond) instead of > sharing the thread objects of the first field's producer. > > In practice, this fixes the following tsan-warning in fate-h264: > > WARNING: ThreadSanitizer: data race (pid=21615) > Read of size 4 at 0x7d640000d9fc by thread T2 (mutexes: write M1006): > #0 ff_thread_report_progress pthread_frame.c:569 > (ffmpeg:x86_64+0x100f7cf54) > [..] > Previous write of size 4 at 0x7d640000d9fc by main thread (mutexes: > write M1004): > #0 update_context_from_user pthread_frame.c:335 > (ffmpeg:x86_64+0x100f81abb) > --- > libavcodec/h264_slice.c | 8 +++++--- > libavcodec/pthread_frame.c | 18 ++++++++++-------- > libavcodec/thread.h | 2 +- > libavcodec/utils.c | 7 ++++--- > 4 files changed, 20 insertions(+), 15 deletions(-) > Ping. (I realize this patch is probably very hard to review, but I'd really appreciate a second set of eyes/brains on my proposed solution, exactly because it's hard - and also because it touches a very fundamental core element of frame-mt design.) Ronald
Hi Ronald, I have a question about this patch. On Mon, Apr 3, 2017 at 7:24 AM, Ronald S. Bultje <rsbultje@gmail.com> wrote: > This tries to handle cases where separate invocations of decode_frame() > (each running in separate threads) write to respective fields in the > same AVFrame->data[]. Having per-field owners makes interaction between > readers (the referencing thread) and writers (the decoding thread) > slightly more optimal if both accesses are field-based, since they will > use the respective producer's thread objects (mutex/cond) instead of > sharing the thread objects of the first field's producer. > > In practice, this fixes the following tsan-warning in fate-h264: > > WARNING: ThreadSanitizer: data race (pid=21615) > Read of size 4 at 0x7d640000d9fc by thread T2 (mutexes: write M1006): > #0 ff_thread_report_progress pthread_frame.c:569 (ffmpeg:x86_64+0x100f7cf54) > [..] > Previous write of size 4 at 0x7d640000d9fc by main thread (mutexes: write M1004): > #0 update_context_from_user pthread_frame.c:335 (ffmpeg:x86_64+0x100f81abb) > --- > libavcodec/h264_slice.c | 8 +++++--- > libavcodec/pthread_frame.c | 18 ++++++++++-------- > libavcodec/thread.h | 2 +- > libavcodec/utils.c | 7 ++++--- > 4 files changed, 20 insertions(+), 15 deletions(-) > > diff --git a/libavcodec/h264_slice.c b/libavcodec/h264_slice.c > index fa1e9ae..d4d31cc 100644 > --- a/libavcodec/h264_slice.c > +++ b/libavcodec/h264_slice.c > @@ -1423,14 +1423,14 @@ static int h264_field_start(H264Context *h, const H264SliceContext *sl, > * We have to do that before the "dummy" in-between frame allocation, > * since that can modify h->cur_pic_ptr. */ > if (h->first_field) { > + int last_field = last_pic_structure == PICT_BOTTOM_FIELD; > av_assert0(h->cur_pic_ptr); > av_assert0(h->cur_pic_ptr->f->buf[0]); > assert(h->cur_pic_ptr->reference != DELAYED_PIC_REF); > > /* Mark old field/frame as completed */ > - if (h->cur_pic_ptr->tf.owner == h->avctx) { > - ff_thread_report_progress(&h->cur_pic_ptr->tf, INT_MAX, > - last_pic_structure == PICT_BOTTOM_FIELD); > + if (h->cur_pic_ptr->tf.owner[last_field] == h->avctx) { > + ff_thread_report_progress(&h->cur_pic_ptr->tf, INT_MAX, last_field); > } > > /* figure out if we have a complementary field pair */ > @@ -1568,7 +1568,9 @@ static int h264_field_start(H264Context *h, const H264SliceContext *sl, > return AVERROR_INVALIDDATA; > } > } else { > + int field = h->picture_structure == PICT_BOTTOM_FIELD; > release_unused_pictures(h, 0); > + h->cur_pic_ptr->tf.owner[field] = h->avctx; > } > /* Some macroblocks can be accessed before they're available in case > * of lost slices, MBAFF or threading. */ Note: I have to admit I don't understand the changes to libavcodec/h264_slice.c. The changes to the other files are straightforward, except for the one issue I ask about below. > diff --git a/libavcodec/utils.c b/libavcodec/utils.c > index 3e8677d..0c68836 100644 > --- a/libavcodec/utils.c > +++ b/libavcodec/utils.c > @@ -3971,7 +3971,8 @@ int ff_thread_ref_frame(ThreadFrame *dst, ThreadFrame *src) > { > int ret; > > - dst->owner = src->owner; > + dst->owner[0] = src->owner[0]; > + dst->owner[1] = src->owner[1]; > > ret = av_frame_ref(dst->f, src->f); > if (ret < 0) > @@ -3981,7 +3982,7 @@ int ff_thread_ref_frame(ThreadFrame *dst, ThreadFrame *src) > > if (src->progress && > !(dst->progress = av_buffer_ref(src->progress))) { > - ff_thread_release_buffer(dst->owner, dst); > + ff_thread_release_buffer(dst->owner[0], dst); > return AVERROR(ENOMEM); > } > [...] I don't understand why we pass dst->owner[0], not dst->owner[1], to the ff_thread_release_buffer() call. Does this assume dst->owner[0] == dst->owner[1]? Although dst->owner[0] and dst->owner[1] are initialized to the same value, the changes to libavcodec/h264_slice.c seem to imply dst->owner[0] and dst->owner[1] may become different. Thanks, Wan-Teh Chang
Hi Wan-Teh, On Tue, Jul 11, 2017 at 4:53 PM, Wan-Teh Chang <wtc@google.com> wrote: > Hi Ronald, > > I have a question about this patch. > > On Mon, Apr 3, 2017 at 7:24 AM, Ronald S. Bultje <rsbultje@gmail.com> > wrote: > > This tries to handle cases where separate invocations of decode_frame() > > (each running in separate threads) write to respective fields in the > > same AVFrame->data[]. Having per-field owners makes interaction between > > readers (the referencing thread) and writers (the decoding thread) > > slightly more optimal if both accesses are field-based, since they will > > use the respective producer's thread objects (mutex/cond) instead of > > sharing the thread objects of the first field's producer. > > > > In practice, this fixes the following tsan-warning in fate-h264: > > > > WARNING: ThreadSanitizer: data race (pid=21615) > > Read of size 4 at 0x7d640000d9fc by thread T2 (mutexes: write M1006): > > #0 ff_thread_report_progress pthread_frame.c:569 > (ffmpeg:x86_64+0x100f7cf54) > > [..] > > Previous write of size 4 at 0x7d640000d9fc by main thread (mutexes: > write M1004): > > #0 update_context_from_user pthread_frame.c:335 > (ffmpeg:x86_64+0x100f81abb) > > --- > > libavcodec/h264_slice.c | 8 +++++--- > > libavcodec/pthread_frame.c | 18 ++++++++++-------- > > libavcodec/thread.h | 2 +- > > libavcodec/utils.c | 7 ++++--- > > 4 files changed, 20 insertions(+), 15 deletions(-) > > > > diff --git a/libavcodec/h264_slice.c b/libavcodec/h264_slice.c > > index fa1e9ae..d4d31cc 100644 > > --- a/libavcodec/h264_slice.c > > +++ b/libavcodec/h264_slice.c > > @@ -1423,14 +1423,14 @@ static int h264_field_start(H264Context *h, > const H264SliceContext *sl, > > * We have to do that before the "dummy" in-between frame > allocation, > > * since that can modify h->cur_pic_ptr. */ > > if (h->first_field) { > > + int last_field = last_pic_structure == PICT_BOTTOM_FIELD; > > av_assert0(h->cur_pic_ptr); > > av_assert0(h->cur_pic_ptr->f->buf[0]); > > assert(h->cur_pic_ptr->reference != DELAYED_PIC_REF); > > > > /* Mark old field/frame as completed */ > > - if (h->cur_pic_ptr->tf.owner == h->avctx) { > > - ff_thread_report_progress(&h->cur_pic_ptr->tf, INT_MAX, > > - last_pic_structure == > PICT_BOTTOM_FIELD); > > + if (h->cur_pic_ptr->tf.owner[last_field] == h->avctx) { > > + ff_thread_report_progress(&h->cur_pic_ptr->tf, INT_MAX, > last_field); > > } > > > > /* figure out if we have a complementary field pair */ > > @@ -1568,7 +1568,9 @@ static int h264_field_start(H264Context *h, const > H264SliceContext *sl, > > return AVERROR_INVALIDDATA; > > } > > } else { > > + int field = h->picture_structure == PICT_BOTTOM_FIELD; > > release_unused_pictures(h, 0); > > + h->cur_pic_ptr->tf.owner[field] = h->avctx; > > } > > /* Some macroblocks can be accessed before they're available in case > > * of lost slices, MBAFF or threading. */ > > Note: I have to admit I don't understand the changes to > libavcodec/h264_slice.c. The changes to the other files are > straightforward, except for the one issue I ask about below. > > > diff --git a/libavcodec/utils.c b/libavcodec/utils.c > > index 3e8677d..0c68836 100644 > > --- a/libavcodec/utils.c > > +++ b/libavcodec/utils.c > > @@ -3971,7 +3971,8 @@ int ff_thread_ref_frame(ThreadFrame *dst, > ThreadFrame *src) > > { > > int ret; > > > > - dst->owner = src->owner; > > + dst->owner[0] = src->owner[0]; > > + dst->owner[1] = src->owner[1]; > > > > ret = av_frame_ref(dst->f, src->f); > > if (ret < 0) > > @@ -3981,7 +3982,7 @@ int ff_thread_ref_frame(ThreadFrame *dst, > ThreadFrame *src) > > > > if (src->progress && > > !(dst->progress = av_buffer_ref(src->progress))) { > > - ff_thread_release_buffer(dst->owner, dst); > > + ff_thread_release_buffer(dst->owner[0], dst); > > return AVERROR(ENOMEM); > > } > > > [...] > > I don't understand why we pass dst->owner[0], not dst->owner[1], to > the ff_thread_release_buffer() call. Does this assume dst->owner[0] == > dst->owner[1]? > Neither is perfect, ideally (from an API programming PoV) you'd want to use both. But that's not possible... In practice, I don't think it matters much, since all it does is decide which release_buffer-queue it'll be appended to (if we do delayed free()'ing). Are there problems with this? Although dst->owner[0] and dst->owner[1] are initialized to the same > value, the changes to libavcodec/h264_slice.c seem to imply > dst->owner[0] and dst->owner[1] may become different. Right, if each slice is decoded in a separate (frame-)worker thread, then they will be different. Ronald
Hi Ronald, Thank you for the quick reply. On Tue, Jul 11, 2017 at 2:09 PM, Ronald S. Bultje <rsbultje@gmail.com> wrote: > Hi Wan-Teh, > > On Tue, Jul 11, 2017 at 4:53 PM, Wan-Teh Chang <wtc@google.com> wrote: >> >> Hi Ronald, >> >> I have a question about this patch. >> >> On Mon, Apr 3, 2017 at 7:24 AM, Ronald S. Bultje <rsbultje@gmail.com> >> wrote: [...] >> > diff --git a/libavcodec/utils.c b/libavcodec/utils.c >> > index 3e8677d..0c68836 100644 >> > --- a/libavcodec/utils.c >> > +++ b/libavcodec/utils.c >> > @@ -3971,7 +3971,8 @@ int ff_thread_ref_frame(ThreadFrame *dst, >> > ThreadFrame *src) >> > { >> > int ret; >> > >> > - dst->owner = src->owner; >> > + dst->owner[0] = src->owner[0]; >> > + dst->owner[1] = src->owner[1]; >> > >> > ret = av_frame_ref(dst->f, src->f); >> > if (ret < 0) >> > @@ -3981,7 +3982,7 @@ int ff_thread_ref_frame(ThreadFrame *dst, >> > ThreadFrame *src) >> > >> > if (src->progress && >> > !(dst->progress = av_buffer_ref(src->progress))) { >> > - ff_thread_release_buffer(dst->owner, dst); >> > + ff_thread_release_buffer(dst->owner[0], dst); >> > return AVERROR(ENOMEM); >> > } >> > >> [...] >> >> I don't understand why we pass dst->owner[0], not dst->owner[1], to >> the ff_thread_release_buffer() call. Does this assume dst->owner[0] == >> dst->owner[1]? > > Neither is perfect, ideally (from an API programming PoV) you'd want to use > both. But that's not possible... Can ff_thread_release_buffer() use f->owner[0] and f->owner[1] and ignore the avctx argument? This should work for the ff_thread_release_buffer() call in ff_thread_ref_frame(), but I don't know if this is correct in general. > In practice, I don't think it matters much, > since all it does is decide which release_buffer-queue it'll be appended to > (if we do delayed free()'ing). Are there problems with this? No, there aren't. With your explanation, I roughly understand the ff_thread_release_buffer() code now. Thanks. Note: I am cherry-picking tsan warning fixes to our copy of ffmpeg, which is several months behind the tip. As due diligence, I review the fixes before I merge them. Wan-Teh Chang
Hi Wan-Teh, On Tue, Jul 11, 2017 at 6:30 PM, Wan-Teh Chang <wtc@google.com> wrote: > Hi Ronald, > > Thank you for the quick reply. > > On Tue, Jul 11, 2017 at 2:09 PM, Ronald S. Bultje <rsbultje@gmail.com> > wrote: > > Hi Wan-Teh, > > > > On Tue, Jul 11, 2017 at 4:53 PM, Wan-Teh Chang <wtc@google.com> wrote: > >> > >> Hi Ronald, > >> > >> I have a question about this patch. > >> > >> On Mon, Apr 3, 2017 at 7:24 AM, Ronald S. Bultje <rsbultje@gmail.com> > >> wrote: > [...] > >> > diff --git a/libavcodec/utils.c b/libavcodec/utils.c > >> > index 3e8677d..0c68836 100644 > >> > --- a/libavcodec/utils.c > >> > +++ b/libavcodec/utils.c > >> > @@ -3971,7 +3971,8 @@ int ff_thread_ref_frame(ThreadFrame *dst, > >> > ThreadFrame *src) > >> > { > >> > int ret; > >> > > >> > - dst->owner = src->owner; > >> > + dst->owner[0] = src->owner[0]; > >> > + dst->owner[1] = src->owner[1]; > >> > > >> > ret = av_frame_ref(dst->f, src->f); > >> > if (ret < 0) > >> > @@ -3981,7 +3982,7 @@ int ff_thread_ref_frame(ThreadFrame *dst, > >> > ThreadFrame *src) > >> > > >> > if (src->progress && > >> > !(dst->progress = av_buffer_ref(src->progress))) { > >> > - ff_thread_release_buffer(dst->owner, dst); > >> > + ff_thread_release_buffer(dst->owner[0], dst); > >> > return AVERROR(ENOMEM); > >> > } > >> > > >> [...] > >> > >> I don't understand why we pass dst->owner[0], not dst->owner[1], to > >> the ff_thread_release_buffer() call. Does this assume dst->owner[0] == > >> dst->owner[1]? > > > > Neither is perfect, ideally (from an API programming PoV) you'd want to > use > > both. But that's not possible... > > Can ff_thread_release_buffer() use f->owner[0] and f->owner[1] and > ignore the avctx argument? > > This should work for the ff_thread_release_buffer() call in > ff_thread_ref_frame(), but I don't know if this is correct in general. I guess. I'm guessing that the prototype was modeled after the get_buffer2() (and release_buffer(), before refcounting) callback prototype in AVCodecContext. I wouldn't object to a patch to fix that. Ronald
diff --git a/libavcodec/h264_slice.c b/libavcodec/h264_slice.c index fa1e9ae..d4d31cc 100644 --- a/libavcodec/h264_slice.c +++ b/libavcodec/h264_slice.c @@ -1423,14 +1423,14 @@ static int h264_field_start(H264Context *h, const H264SliceContext *sl, * We have to do that before the "dummy" in-between frame allocation, * since that can modify h->cur_pic_ptr. */ if (h->first_field) { + int last_field = last_pic_structure == PICT_BOTTOM_FIELD; av_assert0(h->cur_pic_ptr); av_assert0(h->cur_pic_ptr->f->buf[0]); assert(h->cur_pic_ptr->reference != DELAYED_PIC_REF); /* Mark old field/frame as completed */ - if (h->cur_pic_ptr->tf.owner == h->avctx) { - ff_thread_report_progress(&h->cur_pic_ptr->tf, INT_MAX, - last_pic_structure == PICT_BOTTOM_FIELD); + if (h->cur_pic_ptr->tf.owner[last_field] == h->avctx) { + ff_thread_report_progress(&h->cur_pic_ptr->tf, INT_MAX, last_field); } /* figure out if we have a complementary field pair */ @@ -1568,7 +1568,9 @@ static int h264_field_start(H264Context *h, const H264SliceContext *sl, return AVERROR_INVALIDDATA; } } else { + int field = h->picture_structure == PICT_BOTTOM_FIELD; release_unused_pictures(h, 0); + h->cur_pic_ptr->tf.owner[field] = h->avctx; } /* Some macroblocks can be accessed before they're available in case * of lost slices, MBAFF or threading. */ diff --git a/libavcodec/pthread_frame.c b/libavcodec/pthread_frame.c index 9a6b83a..c246c2f 100644 --- a/libavcodec/pthread_frame.c +++ b/libavcodec/pthread_frame.c @@ -564,10 +564,11 @@ void ff_thread_report_progress(ThreadFrame *f, int n, int field) atomic_load_explicit(&progress[field], memory_order_relaxed) >= n) return; - p = f->owner->internal->thread_ctx; + p = f->owner[field]->internal->thread_ctx; - if (f->owner->debug&FF_DEBUG_THREADS) - av_log(f->owner, AV_LOG_DEBUG, "%p finished %d field %d\n", progress, n, field); + if (f->owner[field]->debug&FF_DEBUG_THREADS) + av_log(f->owner[field], AV_LOG_DEBUG, + "%p finished %d field %d\n", progress, n, field); pthread_mutex_lock(&p->progress_mutex); @@ -586,10 +587,11 @@ void ff_thread_await_progress(ThreadFrame *f, int n, int field) atomic_load_explicit(&progress[field], memory_order_acquire) >= n) return; - p = f->owner->internal->thread_ctx; + p = f->owner[field]->internal->thread_ctx; - if (f->owner->debug&FF_DEBUG_THREADS) - av_log(f->owner, AV_LOG_DEBUG, "thread awaiting %d field %d from %p\n", n, field, progress); + if (f->owner[field]->debug&FF_DEBUG_THREADS) + av_log(f->owner[field], AV_LOG_DEBUG, + "thread awaiting %d field %d from %p\n", n, field, progress); pthread_mutex_lock(&p->progress_mutex); while (atomic_load_explicit(&progress[field], memory_order_relaxed) < n) @@ -882,7 +884,7 @@ static int thread_get_buffer_internal(AVCodecContext *avctx, ThreadFrame *f, int PerThreadContext *p = avctx->internal->thread_ctx; int err; - f->owner = avctx; + f->owner[0] = f->owner[1] = avctx; ff_init_buffer_info(avctx, f->f); @@ -986,7 +988,7 @@ void ff_thread_release_buffer(AVCodecContext *avctx, ThreadFrame *f) av_log(avctx, AV_LOG_DEBUG, "thread_release_buffer called on pic %p\n", f); av_buffer_unref(&f->progress); - f->owner = NULL; + f->owner[0] = f->owner[1] = NULL; if (can_direct_free) { av_frame_unref(f->f); diff --git a/libavcodec/thread.h b/libavcodec/thread.h index c848d7a..90864b5 100644 --- a/libavcodec/thread.h +++ b/libavcodec/thread.h @@ -34,7 +34,7 @@ typedef struct ThreadFrame { AVFrame *f; - AVCodecContext *owner; + AVCodecContext *owner[2]; // progress->data is an array of 2 ints holding progress for top/bottom // fields AVBufferRef *progress; diff --git a/libavcodec/utils.c b/libavcodec/utils.c index 3e8677d..0c68836 100644 --- a/libavcodec/utils.c +++ b/libavcodec/utils.c @@ -3971,7 +3971,8 @@ int ff_thread_ref_frame(ThreadFrame *dst, ThreadFrame *src) { int ret; - dst->owner = src->owner; + dst->owner[0] = src->owner[0]; + dst->owner[1] = src->owner[1]; ret = av_frame_ref(dst->f, src->f); if (ret < 0) @@ -3981,7 +3982,7 @@ int ff_thread_ref_frame(ThreadFrame *dst, ThreadFrame *src) if (src->progress && !(dst->progress = av_buffer_ref(src->progress))) { - ff_thread_release_buffer(dst->owner, dst); + ff_thread_release_buffer(dst->owner[0], dst); return AVERROR(ENOMEM); } @@ -3997,7 +3998,7 @@ enum AVPixelFormat ff_thread_get_format(AVCodecContext *avctx, const enum AVPixe int ff_thread_get_buffer(AVCodecContext *avctx, ThreadFrame *f, int flags) { - f->owner = avctx; + f->owner[0] = f->owner[1] = avctx; return ff_get_buffer(avctx, f->f, flags); }