From patchwork Tue Nov 22 19:30:32 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wan-Teh Chang X-Patchwork-Id: 1530 Delivered-To: ffmpegpatchwork@gmail.com Received: by 10.103.90.1 with SMTP id o1csp2365596vsb; Tue, 22 Nov 2016 11:32:21 -0800 (PST) X-Received: by 10.194.22.170 with SMTP id e10mr14964629wjf.125.1479843141208; Tue, 22 Nov 2016 11:32:21 -0800 (PST) Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id j16si4108703wmd.116.2016.11.22.11.32.20; Tue, 22 Nov 2016 11:32:21 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id DE5E1687ECC; Tue, 22 Nov 2016 21:32:09 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pg0-f44.google.com (mail-pg0-f44.google.com [74.125.83.44]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C2833680D49 for ; Tue, 22 Nov 2016 21:32:02 +0200 (EET) Received: by mail-pg0-f44.google.com with SMTP id p66so10559604pga.2 for ; Tue, 22 Nov 2016 11:32:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=from:to:subject:date:message-id:in-reply-to:references; bh=50fxXcRbWPEaMdOlfSGwQQgzXfO2ikOvTdgkQO8dK0c=; b=Q6LmUxHv/mn7jrhkCQhSBlnMMyohzwwNPgHm6+tk2lt0m89KxnqOMmFHIaKYdWRoip tZADtbefPtG7oDrbDlWCZLbhMb7uMgSDXhaqt8lMIGReavHH/U1QJaHrZef9cnzRQDcn FXCLezDEWmzwucZ4uPAKeBCs5jMxZ7uRrPyGdS+C0LTiNBWIxSyNxmwqcT+bVT2uAdeb VcoHpY7sfy3RUrIsYHcKWp1LvCvFmiMoEVraVzgM/J0nXUBNa/7XAx0dWwp0ezBc4Eh4 PgyuM/ECo6mtziTL8/qV9WLKh9YETV6GeiCcTdhtVCmKmiRsPgKKnWdg8WCY//yXRfjv MSeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=50fxXcRbWPEaMdOlfSGwQQgzXfO2ikOvTdgkQO8dK0c=; b=O7GMnKoJIoTZmlspjW+8RyOyiemn8nv/vYZrIzIPpM5suQ1ER5Ke/v+9THS800S2LJ /nxhCxcRJ5qc4vXZbqhNpoWD/IeHT7HcYfebL+OmAPXI6mNnjT31uJZp6pryuMyDbVTk 3C98ZyGiuMNOwcmOl+5k7stKBVUhAjDs4Vxs7r4XIJhiklG2XkUZdt9rVlcyme/ny/vU heS9Jr2CjC4VQteId2GQee5awFDvndp4UlFpoAQvCU9NfnBQrtJhnPfrVrJxVGzECTlU 6qyHZjWVpprlG4kmfE632fc0r+cIAdgJUZ34iMVx9RMleNfC7lBRzeYF3Dxq4UELlQ9k cCfA== X-Gm-Message-State: AKaTC02E2PyO0qeIhyxX/GJ6I5/eYD4a03fD7epE8ECXwCSTUZTav1/mVpWxRFwmqDvX1P5N X-Received: by 10.99.151.1 with SMTP id n1mr46863072pge.122.1479843124132; Tue, 22 Nov 2016 11:32:04 -0800 (PST) Received: from wtc-desktop.mtv.corp.google.com ([100.98.5.56]) by smtp.googlemail.com with ESMTPSA id n25sm47034440pfi.33.2016.11.22.11.32.03 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 22 Nov 2016 11:32:03 -0800 (PST) From: Wan-Teh Chang To: ffmpeg-devel@ffmpeg.org Date: Tue, 22 Nov 2016 11:30:32 -0800 Message-Id: <1479843032-96639-2-git-send-email-wtc@google.com> X-Mailer: git-send-email 2.8.0.rc3.226.g39d4020 In-Reply-To: <1479843032-96639-1-git-send-email-wtc@google.com> References: <1479771469-60379-1-git-send-email-wtc@google.com> <1479843032-96639-1-git-send-email-wtc@google.com> Subject: [FFmpeg-devel] [PATCH] Make the one-time initialization in av_get_cpu_flags() thread-safe. X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches MIME-Version: 1.0 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" The static variables |flags| and |checked| in libavutil/cpu.c are read and written using normal load and store instructions. These are considered as data races. The fix is to use atomic load and store instructions. The fix also eliminates the |checked| variable because the invalid value of -1 for |flags| can be used to indicate the same condition. The fix requires a new atomic integer compare and swap function. Since the fix does not need memory barriers, "relaxed" variants of avpriv_atomic_int_get() and avpriv_atomic_int_set() are also added. Add a test program libavutil/tests/cpu_init.c to verify the one-time initialization in av_get_cpu_flags() is thread-safe. Co-author: Dmitry Vyukov of Google, which suggested the data race fix. Signed-off-by: Wan-Teh Chang --- libavutil/Makefile | 1 + libavutil/atomic.c | 40 ++++++++++++++++++++++++++ libavutil/atomic.h | 34 ++++++++++++++++++++-- libavutil/atomic_gcc.h | 33 +++++++++++++++++++++ libavutil/atomic_suncc.h | 19 ++++++++++++ libavutil/atomic_win32.h | 21 ++++++++++++++ libavutil/cpu.c | 41 ++++++++++++++------------ libavutil/cpu.h | 2 -- libavutil/tests/atomic.c | 13 +++++++++ libavutil/tests/cpu_init.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++ 10 files changed, 253 insertions(+), 23 deletions(-) create mode 100644 libavutil/tests/cpu_init.c diff --git a/libavutil/Makefile b/libavutil/Makefile index 0fa90fe..732961a 100644 --- a/libavutil/Makefile +++ b/libavutil/Makefile @@ -187,6 +187,7 @@ TESTPROGS = adler32 \ camellia \ color_utils \ cpu \ + cpu_init \ crc \ des \ dict \ diff --git a/libavutil/atomic.c b/libavutil/atomic.c index 64cff25..7bce013 100644 --- a/libavutil/atomic.c +++ b/libavutil/atomic.c @@ -40,6 +40,11 @@ int avpriv_atomic_int_get(volatile int *ptr) return res; } +int avpriv_atomic_int_get_relaxed(volatile int *ptr) +{ + return avpriv_atomic_int_get(ptr); +} + void avpriv_atomic_int_set(volatile int *ptr, int val) { pthread_mutex_lock(&atomic_lock); @@ -47,6 +52,11 @@ void avpriv_atomic_int_set(volatile int *ptr, int val) pthread_mutex_unlock(&atomic_lock); } +void avpriv_atomic_int_set_relaxed(volatile int *ptr, int val) +{ + avpriv_atomic_int_set(ptr, val); +} + int avpriv_atomic_int_add_and_fetch(volatile int *ptr, int inc) { int res; @@ -59,6 +69,17 @@ int avpriv_atomic_int_add_and_fetch(volatile int *ptr, int inc) return res; } +int avpriv_atomic_int_cas_relaxed(volatile int *ptr, int oldval, int newval) +{ + int ret; + pthread_mutex_lock(&atomic_lock); + ret = *ptr; + if (ret == oldval) + *ptr = newval; + pthread_mutex_unlock(&atomic_lock); + return ret; +} + void *avpriv_atomic_ptr_cas(void * volatile *ptr, void *oldval, void *newval) { void *ret; @@ -77,17 +98,36 @@ int avpriv_atomic_int_get(volatile int *ptr) return *ptr; } +int avpriv_atomic_int_get_relaxed(volatile int *ptr) +{ + return avpriv_atomic_int_get(ptr); +} + void avpriv_atomic_int_set(volatile int *ptr, int val) { *ptr = val; } +void avpriv_atomic_int_set_relaxed(volatile int *ptr, int val) +{ + avpriv_atomic_int_set(ptr, val); +} + int avpriv_atomic_int_add_and_fetch(volatile int *ptr, int inc) { *ptr += inc; return *ptr; } +int avpriv_atomic_int_cas_relaxed(volatile int *ptr, int oldval, int newval) +{ + if (*ptr == oldval) { + *ptr = newval; + return oldval; + } + return *ptr; +} + void *avpriv_atomic_ptr_cas(void * volatile *ptr, void *oldval, void *newval) { if (*ptr == oldval) { diff --git a/libavutil/atomic.h b/libavutil/atomic.h index 15906d2..c5aa1a4 100644 --- a/libavutil/atomic.h +++ b/libavutil/atomic.h @@ -40,20 +40,38 @@ * * @param ptr atomic integer * @return the current value of the atomic integer - * @note This acts as a memory barrier. + * @note This acts as a memory barrier with sequentially-consistent ordering. */ int avpriv_atomic_int_get(volatile int *ptr); /** + * Load the current value stored in an atomic integer. + * + * @param ptr atomic integer + * @return the current value of the atomic integer + * @note This does NOT act as a memory barrier. + */ +int avpriv_atomic_int_get_relaxed(volatile int *ptr); + +/** * Store a new value in an atomic integer. * * @param ptr atomic integer * @param val the value to store in the atomic integer - * @note This acts as a memory barrier. + * @note This acts as a memory barrier with sequentially-consistent ordering. */ void avpriv_atomic_int_set(volatile int *ptr, int val); /** + * Store a new value in an atomic integer. + * + * @param ptr atomic integer + * @param val the value to store in the atomic integer + * @note This does NOT act as a memory barrier. + */ +void avpriv_atomic_int_set_relaxed(volatile int *ptr, int val); + +/** * Add a value to an atomic integer. * * @param ptr atomic integer @@ -65,12 +83,24 @@ void avpriv_atomic_int_set(volatile int *ptr, int val); int avpriv_atomic_int_add_and_fetch(volatile int *ptr, int inc); /** + * Atomic integer compare and swap. + * + * @param ptr pointer to the integer to operate on + * @param oldval do the swap if the current value of *ptr equals to oldval + * @param newval value to replace *ptr with + * @return the value of *ptr before comparison + * @note This does NOT act as a memory barrier. + */ +int avpriv_atomic_int_cas_relaxed(volatile int *ptr, int oldval, int newval); + +/** * Atomic pointer compare and swap. * * @param ptr pointer to the pointer to operate on * @param oldval do the swap if the current value of *ptr equals to oldval * @param newval value to replace *ptr with * @return the value of *ptr before comparison + * @note This acts as a memory barrier with sequentially-consistent ordering. */ void *avpriv_atomic_ptr_cas(void * volatile *ptr, void *oldval, void *newval); diff --git a/libavutil/atomic_gcc.h b/libavutil/atomic_gcc.h index 5f9fc49..15dd508 100644 --- a/libavutil/atomic_gcc.h +++ b/libavutil/atomic_gcc.h @@ -36,6 +36,16 @@ static inline int atomic_int_get_gcc(volatile int *ptr) #endif } +#define avpriv_atomic_int_get_relaxed atomic_int_get_relaxed_gcc +static inline int atomic_int_get_relaxed_gcc(volatile int *ptr) +{ +#if HAVE_ATOMIC_COMPARE_EXCHANGE + return __atomic_load_n(ptr, __ATOMIC_RELAXED); +#else + return *ptr; +#endif +} + #define avpriv_atomic_int_set atomic_int_set_gcc static inline void atomic_int_set_gcc(volatile int *ptr, int val) { @@ -47,6 +57,16 @@ static inline void atomic_int_set_gcc(volatile int *ptr, int val) #endif } +#define avpriv_atomic_int_set_relaxed atomic_int_set_relaxed_gcc +static inline void atomic_int_set_relaxed_gcc(volatile int *ptr, int val) +{ +#if HAVE_ATOMIC_COMPARE_EXCHANGE + __atomic_store_n(ptr, val, __ATOMIC_RELAXED); +#else + *ptr = val; +#endif +} + #define avpriv_atomic_int_add_and_fetch atomic_int_add_and_fetch_gcc static inline int atomic_int_add_and_fetch_gcc(volatile int *ptr, int inc) { @@ -57,6 +77,19 @@ static inline int atomic_int_add_and_fetch_gcc(volatile int *ptr, int inc) #endif } +#define avpriv_atomic_int_cas_relaxed atomic_int_cas_relaxed_gcc +static inline int atomic_int_cas_relaxed_gcc(volatile int *ptr, + int oldval, int newval) +{ +#if HAVE_SYNC_VAL_COMPARE_AND_SWAP + return __sync_val_compare_and_swap(ptr, oldval, newval); +#else + __atomic_compare_exchange_n(ptr, &oldval, newval, 0, __ATOMIC_RELAXED, + __ATOMIC_RELAXED); + return oldval; +#endif +} + #define avpriv_atomic_ptr_cas atomic_ptr_cas_gcc static inline void *atomic_ptr_cas_gcc(void * volatile *ptr, void *oldval, void *newval) diff --git a/libavutil/atomic_suncc.h b/libavutil/atomic_suncc.h index a75a37b..0997f23 100644 --- a/libavutil/atomic_suncc.h +++ b/libavutil/atomic_suncc.h @@ -31,6 +31,12 @@ static inline int atomic_int_get_suncc(volatile int *ptr) return *ptr; } +#define avpriv_atomic_int_get_relaxed atomic_int_get_relaxed_suncc +static inline int atomic_int_get_relaxed_suncc(volatile int *ptr) +{ + return *ptr; +} + #define avpriv_atomic_int_set atomic_int_set_suncc static inline void atomic_int_set_suncc(volatile int *ptr, int val) { @@ -38,12 +44,25 @@ static inline void atomic_int_set_suncc(volatile int *ptr, int val) __machine_rw_barrier(); } +#define avpriv_atomic_int_set_relaxed atomic_int_set_relaxed_suncc +static inline void atomic_int_set_relaxed_suncc(volatile int *ptr, int val) +{ + *ptr = val; +} + #define avpriv_atomic_int_add_and_fetch atomic_int_add_and_fetch_suncc static inline int atomic_int_add_and_fetch_suncc(volatile int *ptr, int inc) { return atomic_add_int_nv(ptr, inc); } +#define avpriv_atomic_int_cas_relaxed atomic_int_cas_relaxed_suncc +static inline int atomic_int_cas_relaxed_suncc(volatile int *ptr, + int oldval, int newval) +{ + return atomic_cas_uint((volatile uint_t *)ptr, oldval, newval); +} + #define avpriv_atomic_ptr_cas atomic_ptr_cas_suncc static inline void *atomic_ptr_cas_suncc(void * volatile *ptr, void *oldval, void *newval) diff --git a/libavutil/atomic_win32.h b/libavutil/atomic_win32.h index f729933..7ea0a56 100644 --- a/libavutil/atomic_win32.h +++ b/libavutil/atomic_win32.h @@ -31,6 +31,12 @@ static inline int atomic_int_get_win32(volatile int *ptr) return *ptr; } +#define avpriv_atomic_int_get_relaxed atomic_int_get_relaxed_win32 +static inline int atomic_int_get_relaxed_win32(volatile int *ptr) +{ + return *ptr; +} + #define avpriv_atomic_int_set atomic_int_set_win32 static inline void atomic_int_set_win32(volatile int *ptr, int val) { @@ -38,12 +44,27 @@ static inline void atomic_int_set_win32(volatile int *ptr, int val) MemoryBarrier(); } +#define avpriv_atomic_int_set_relaxed atomic_int_set_relaxed_win32 +static inline void atomic_int_set_relaxed_win32(volatile int *ptr, int val) +{ + *ptr = val; +} + #define avpriv_atomic_int_add_and_fetch atomic_int_add_and_fetch_win32 static inline int atomic_int_add_and_fetch_win32(volatile int *ptr, int inc) { return inc + InterlockedExchangeAdd(ptr, inc); } +#define avpriv_atomic_int_cas_relaxed atomic_int_cas_relaxed_win32 +static inline int atomic_int_cas_relaxed_win32(volatile int *ptr, + int oldval, int newval) +{ + /* InterlockedCompareExchangeNoFence is available on Windows 8 or later + * only. */ + return InterlockedCompareExchange((volatile LONG *)ptr, newval, oldval); +} + #define avpriv_atomic_ptr_cas atomic_ptr_cas_win32 static inline void *atomic_ptr_cas_win32(void * volatile *ptr, void *oldval, void *newval) diff --git a/libavutil/cpu.c b/libavutil/cpu.c index f5785fc..70b61de 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -23,6 +23,7 @@ #include "config.h" #include "opt.h" #include "common.h" +#include "atomic.h" #if HAVE_SCHED_GETAFFINITY #ifndef _GNU_SOURCE @@ -44,7 +45,20 @@ #include #endif -static int flags, checked; +static int cpu_flags = -1; + +static int get_cpu_flags(void) +{ + if (ARCH_AARCH64) + return ff_get_cpu_flags_aarch64(); + if (ARCH_ARM) + return ff_get_cpu_flags_arm(); + if (ARCH_PPC) + return ff_get_cpu_flags_ppc(); + if (ARCH_X86) + return ff_get_cpu_flags_x86(); + return 0; +} void av_force_cpu_flags(int arg){ if ( (arg & ( AV_CPU_FLAG_3DNOW | @@ -69,33 +83,22 @@ void av_force_cpu_flags(int arg){ arg |= AV_CPU_FLAG_MMX; } - flags = arg; - checked = arg != -1; + avpriv_atomic_int_set_relaxed(&cpu_flags, arg); } int av_get_cpu_flags(void) { - if (checked) - return flags; - - if (ARCH_AARCH64) - flags = ff_get_cpu_flags_aarch64(); - if (ARCH_ARM) - flags = ff_get_cpu_flags_arm(); - if (ARCH_PPC) - flags = ff_get_cpu_flags_ppc(); - if (ARCH_X86) - flags = ff_get_cpu_flags_x86(); - - checked = 1; + int flags = avpriv_atomic_int_get_relaxed(&cpu_flags); + if (flags == -1) { + flags = get_cpu_flags(); + avpriv_atomic_int_cas_relaxed(&cpu_flags, -1, flags); + } return flags; } void av_set_cpu_flags_mask(int mask) { - checked = 0; - flags = av_get_cpu_flags() & mask; - checked = 1; + avpriv_atomic_int_set_relaxed(&cpu_flags, get_cpu_flags() & mask); } int av_parse_cpu_flags(const char *s) diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 4bff167..8499f0e 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -85,8 +85,6 @@ void av_force_cpu_flags(int flags); * Set a mask on flags returned by av_get_cpu_flags(). * This function is mainly useful for testing. * Please use av_force_cpu_flags() and av_get_cpu_flags() instead which are more flexible - * - * @warning this function is not thread safe. */ attribute_deprecated void av_set_cpu_flags_mask(int mask); diff --git a/libavutil/tests/atomic.c b/libavutil/tests/atomic.c index c92f220..5a12439 100644 --- a/libavutil/tests/atomic.c +++ b/libavutil/tests/atomic.c @@ -26,9 +26,22 @@ int main(void) res = avpriv_atomic_int_add_and_fetch(&val, 1); av_assert0(res == 2); + res = avpriv_atomic_int_add_and_fetch(&val, -1); + av_assert0(res == 1); avpriv_atomic_int_set(&val, 3); res = avpriv_atomic_int_get(&val); av_assert0(res == 3); + avpriv_atomic_int_set_relaxed(&val, 4); + res = avpriv_atomic_int_get_relaxed(&val); + av_assert0(res == 4); + res = avpriv_atomic_int_cas_relaxed(&val, 3, 5); + av_assert0(res == 4); + res = avpriv_atomic_int_get_relaxed(&val); + av_assert0(res == 4); + res = avpriv_atomic_int_cas_relaxed(&val, 4, 5); + av_assert0(res == 4); + res = avpriv_atomic_int_get_relaxed(&val); + av_assert0(res == 5); return 0; } diff --git a/libavutil/tests/cpu_init.c b/libavutil/tests/cpu_init.c new file mode 100644 index 0000000..402b6a8 --- /dev/null +++ b/libavutil/tests/cpu_init.c @@ -0,0 +1,72 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* + * This test program verifies that the one-time initialization in + * av_get_cpu_flags() is thread-safe. + */ + +#include +#include + +#include "config.h" + +#include "libavutil/cpu.h" +#include "libavutil/thread.h" + +#if HAVE_PTHREADS +static void *thread_main(void *arg) +{ + int *flags = arg; + + *flags = av_get_cpu_flags(); + return NULL; +} +#endif + + +int main(int argc, char **argv) +{ +#if HAVE_PTHREADS + int cpu_flags1; + int cpu_flags2; + int ret; + pthread_t thread1; + pthread_t thread2; + + if ((ret = pthread_create(&thread1, NULL, thread_main, &cpu_flags1))) { + fprintf(stderr, "pthread_create failed: %s.\n", strerror(ret)); + return 1; + } + if ((ret = pthread_create(&thread2, NULL, thread_main, &cpu_flags2))) { + fprintf(stderr, "pthread_create failed: %s.\n", strerror(ret)); + return 1; + } + pthread_join(thread1, NULL); + pthread_join(thread2, NULL); + + if (cpu_flags1 < 0) + return 2; + if (cpu_flags2 < 0) + return 2; + if (cpu_flags1 != cpu_flags2) + return 3; +#endif + + return 0; +}