[FFmpeg-devel] idctdsp: make ff_add/put_pixels_clamped function pointers atomic.

Submitted by Ronald S. Bultje on April 4, 2017, 1:09 p.m.

Details

Message ID 1491311376-4130-1-git-send-email-rsbultje@gmail.com
State New
Headers show

Commit Message

Ronald S. Bultje April 4, 2017, 1:09 p.m.
This fixes the following tsan warnings when running fate-dnxhr-parse:

WARNING: ThreadSanitizer: data race (pid=29917)
  Write of size 8 at 0x0000025b12d8 by thread T2 (mutexes: write M1543):
    #0 ff_idctdsp_init src/libavcodec/idctdsp.c:313 (ffmpeg+0x00000044b68e)
[..]
  Previous write of size 8 at 0x0000025b12d8 by thread T1 (mutexes: write M1541):
    #0 ff_idctdsp_init src/libavcodec/idctdsp.c:313 (ffmpeg+0x00000044b68e)
---
 libavcodec/idctdsp.c |  8 ++++----
 libavcodec/idctdsp.h | 17 +++++++++++++++--
 2 files changed, 19 insertions(+), 6 deletions(-)

Comments

Michael Niedermayer April 4, 2017, 2:07 p.m.
On Tue, Apr 04, 2017 at 09:09:36AM -0400, Ronald S. Bultje wrote:
> This fixes the following tsan warnings when running fate-dnxhr-parse:
> 
> WARNING: ThreadSanitizer: data race (pid=29917)
>   Write of size 8 at 0x0000025b12d8 by thread T2 (mutexes: write M1543):
>     #0 ff_idctdsp_init src/libavcodec/idctdsp.c:313 (ffmpeg+0x00000044b68e)
> [..]
>   Previous write of size 8 at 0x0000025b12d8 by thread T1 (mutexes: write M1541):
>     #0 ff_idctdsp_init src/libavcodec/idctdsp.c:313 (ffmpeg+0x00000044b68e)
> ---
>  libavcodec/idctdsp.c |  8 ++++----
>  libavcodec/idctdsp.h | 17 +++++++++++++++--
>  2 files changed, 19 insertions(+), 6 deletions(-)

does this have any performance effect and why do 2 threads write into
the same variable ?
It seems rather suspicious that 2 diferent threads write into the same
pointer


[...]
Ronald S. Bultje April 4, 2017, 2:10 p.m.
Hi,

On Tue, Apr 4, 2017 at 10:07 AM, Michael Niedermayer <michael@niedermayer.cc
> wrote:

> On Tue, Apr 04, 2017 at 09:09:36AM -0400, Ronald S. Bultje wrote:
> > This fixes the following tsan warnings when running fate-dnxhr-parse:
> >
> > WARNING: ThreadSanitizer: data race (pid=29917)
> >   Write of size 8 at 0x0000025b12d8 by thread T2 (mutexes: write M1543):
> >     #0 ff_idctdsp_init src/libavcodec/idctdsp.c:313
> (ffmpeg+0x00000044b68e)
> > [..]
> >   Previous write of size 8 at 0x0000025b12d8 by thread T1 (mutexes:
> write M1541):
> >     #0 ff_idctdsp_init src/libavcodec/idctdsp.c:313
> (ffmpeg+0x00000044b68e)
> > ---
> >  libavcodec/idctdsp.c |  8 ++++----
> >  libavcodec/idctdsp.h | 17 +++++++++++++++--
> >  2 files changed, 19 insertions(+), 6 deletions(-)
>
> does this have any performance effect and why do 2 threads write into
> the same variable ?
> It seems rather suspicious that 2 diferent threads write into the same
> pointer


In dnxhd, the initial context is uninitialized and is initialized at
runtime during header parsing (or when bpp changes). So each thread calls
idctdsp_init() once, each of which writes into the same global variable
exactly once.

This patch is a total hack, I openly admit that, I'm just exploring the
bugs and am happy to discuss alternate designs. Some people have expressed
the wish to get rid of the global function pointers which is probably a
good longer-term goal...

Ronald

Patch hide | download patch | download mbox

diff --git a/libavcodec/idctdsp.c b/libavcodec/idctdsp.c
index 84dd645..9bcf8ae 100644
--- a/libavcodec/idctdsp.c
+++ b/libavcodec/idctdsp.c
@@ -80,8 +80,8 @@  av_cold void ff_init_scantable_permutation(uint8_t *idct_permutation,
     }
 }
 
-void (*ff_put_pixels_clamped)(const int16_t *block, uint8_t *pixels, ptrdiff_t line_size);
-void (*ff_add_pixels_clamped)(const int16_t *block, uint8_t *pixels, ptrdiff_t line_size);
+atomic_uintptr_t ff_put_pixels_clamped_fn_ptr;
+atomic_uintptr_t ff_add_pixels_clamped_fn_ptr;
 
 static void put_pixels_clamped_c(const int16_t *block, uint8_t *av_restrict pixels,
                                  ptrdiff_t line_size)
@@ -310,8 +310,8 @@  av_cold void ff_idctdsp_init(IDCTDSPContext *c, AVCodecContext *avctx)
     if (ARCH_MIPS)
         ff_idctdsp_init_mips(c, avctx, high_bit_depth);
 
-    ff_put_pixels_clamped = c->put_pixels_clamped;
-    ff_add_pixels_clamped = c->add_pixels_clamped;
+    atomic_store(&ff_put_pixels_clamped_fn_ptr, (uintptr_t) c->put_pixels_clamped);
+    atomic_store(&ff_add_pixels_clamped_fn_ptr, (uintptr_t) c->add_pixels_clamped);
 
     ff_init_scantable_permutation(c->idct_permutation,
                                   c->perm_type);
diff --git a/libavcodec/idctdsp.h b/libavcodec/idctdsp.h
index f9ba6c3..16595fc 100644
--- a/libavcodec/idctdsp.h
+++ b/libavcodec/idctdsp.h
@@ -19,6 +19,7 @@ 
 #ifndef AVCODEC_IDCTDSP_H
 #define AVCODEC_IDCTDSP_H
 
+#include <stdatomic.h>
 #include <stdint.h>
 
 #include "config.h"
@@ -97,8 +98,20 @@  typedef struct IDCTDSPContext {
     enum idct_permutation_type perm_type;
 } IDCTDSPContext;
 
-extern void (*ff_put_pixels_clamped)(const int16_t *block, uint8_t *pixels, ptrdiff_t line_size);
-extern void (*ff_add_pixels_clamped)(const int16_t *block, uint8_t *pixels, ptrdiff_t line_size);
+typedef void (*px_clamp_fn)(const int16_t *block, uint8_t *pixels,
+                            ptrdiff_t line_size);
+
+#define ff_put_pixels_clamped ff_put_pixels_clamped_get_fn_ptr()
+extern atomic_uintptr_t ff_put_pixels_clamped_fn_ptr;
+static av_always_inline px_clamp_fn ff_put_pixels_clamped_get_fn_ptr(void) {
+    return (px_clamp_fn) atomic_load(&ff_put_pixels_clamped_fn_ptr);
+}
+
+#define ff_add_pixels_clamped ff_add_pixels_clamped_get_fn_ptr()
+extern atomic_uintptr_t ff_add_pixels_clamped_fn_ptr;
+static av_always_inline px_clamp_fn ff_add_pixels_clamped_get_fn_ptr(void) {
+    return (px_clamp_fn) atomic_load(&ff_add_pixels_clamped_fn_ptr);
+}
 
 void ff_idctdsp_init(IDCTDSPContext *c, AVCodecContext *avctx);