diff mbox series

[FFmpeg-devel] compat/atomics/win32: improve similarity to stdatomic.h

Message ID 20230405152629.35740-1-post@frankplowman.com
State New
Headers show
Series [FFmpeg-devel] compat/atomics/win32: improve similarity to stdatomic.h | expand

Checks

Context Check Description
andriy/commit_msg_x86 warning Please wrap lines in the body of the commit message between 60 and 72 characters.
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Frank Plowman April 5, 2023, 3:26 p.m. UTC
Some preliminary info:
* The win32 atomic compatibility header is based on VLC's (c91e72ed52). This patch makes the header more in line with what it was before they got rid of this way of doing things: https://code.videolan.org/videolan/vlc/-/blob/ce150f3849cebe97bc7fc028674d3c7f8c73f64d/include/vlc_atomic.h
* The Windows API does not support atomic operations on 8-bit types and only has functions for atomic operations on 16-bit types on Windows Desktop.
* There is nowhere in the FFmpeg codebase which currently uses 8- or 16-bit atomic types

This patch:
* Makes atomic types the same size as their non-atomic counterparts. Previously, all atomic types were intptr_t to get around the lack of 8- and 16-bit atomic operations. This could lead to overreads. Now, each atomic type is the same size as its non-atomic counterpart, in line with the C11 specification.
* Dispatches Windows API functions based on operand size to avoid overwrites. Now there are a variety of sizes of atomic types, the correct Windows API function must be selected based on the object's sizes. Feedback is appreciated on how best to handle the failure case in the atomic_helper macro.
* Removes 8- and 16-bit types not supported by all versions of winnt.h. None of these were being used anywhere in FFmpeg. As these cannot be operated on atomically, they have been removed. Alternatives to this recommended.

(PS: This is my first submitted patch, please be nice although I'm sure you will)

Signed-off-by: Frank Plowman <post@frankplowman.com>

---
 compat/atomics/win32/stdatomic.h | 104 +++++++++++--------------------
 1 file changed, 36 insertions(+), 68 deletions(-)

Comments

Rémi Denis-Courmont April 9, 2023, 8:59 a.m. UTC | #1
Le mercredi 5 avril 2023, 18:26:29 EEST Frank Plowman a écrit :
> Some preliminary info:
> * The win32 atomic compatibility header is based on VLC's (c91e72ed52). This
> patch makes the header more in line with what it was before they got rid of
> this way of doing things:
> https://code.videolan.org/videolan/vlc/-/blob/ce150f3849cebe97bc7fc028674d3
> c7f8c73f64d/include/vlc_atomic.h
> * The Windows API does not support atomic
> operations on 8-bit types and only has functions for atomic operations on
> 16-bit types on Windows Desktop.

FWIW, Windows atomic wait/notify system call functions support 8- and 16-bit 
types, so I would expect that there is a mean to perform atomic load/store/
exchange/compare-exchange too.

> * Makes atomic types the same size as their non-atomic counterparts.
> Previously, all atomic types were intptr_t to get around the lack of 8- and
> 16-bit atomic operations. This could lead to overreads. Now, each atomic
> type is the same size as its non-atomic counterpart, in line with the C11
> specification.

There are no requirements in C11 that atomic variables have the same size as 
their non-atomic-qualified equivalent. It is not clear what you mean here.

I also don't know what you mean by overread. If you don't want values to 
overflow their specified boundaries, the "correct" approach is to mask the 
excess bits (like how compilers implement small atomics on RISC-V).
Frank Plowman April 10, 2023, 9:54 a.m. UTC | #2
> On 9 Apr 2023, at 09:59, Rémi Denis-Courmont <remi@remlab.net> wrote:
> 
> Le mercredi 5 avril 2023, 18:26:29 EEST Frank Plowman a écrit :
>> Some preliminary info:
>> * The win32 atomic compatibility header is based on VLC's (c91e72ed52). This
>> patch makes the header more in line with what it was before they got rid of
>> this way of doing things:
>> https://code.videolan.org/videolan/vlc/-/blob/ce150f3849cebe97bc7fc028674d3
>> c7f8c73f64d/include/vlc_atomic.h
>> * The Windows API does not support atomic
>> operations on 8-bit types and only has functions for atomic operations on
>> 16-bit types on Windows Desktop.
> 
> FWIW, Windows atomic wait/notify system call functions support 8- and 16-bit 
> types, so I would expect that there is a mean to perform atomic load/store/
> exchange/compare-exchange too.

The functions for 16-bit atomic operations do exist and have been supported by Windows Desktop and Server versions since around 2003, however support for non-Desktop/Server platforms seems to only have been introduced with UWP in 2019. I have since found there are also 8-bit interlocked variable access functions (the links to them in the documentation were missing). These were only added in Windows Desktop 8 and Windows Server 2012 however. These could be used, but at the cost of dropping support for some old/obscure systems.

> 
>> * Makes atomic types the same size as their non-atomic counterparts.
>> Previously, all atomic types were intptr_t to get around the lack of 8- and
>> 16-bit atomic operations. This could lead to overreads. Now, each atomic
>> type is the same size as its non-atomic counterpart, in line with the C11
>> specification.
> 
> There are no requirements in C11 that atomic variables have the same size as 
> their non-atomic-qualified equivalent. It is not clear what you mean here.

While not a requirement, the specification does say atomic types and their corresponding regular types "should have the same size whenever possible”. This is an alternative way I see we could support 8- and 16-bit atomic types though.

> I also don't know what you mean by overread. If you don't want values to 
> overflow their specified boundaries, the "correct" approach is to mask the 
> excess bits (like how compilers implement small atomics on RISC-V).

I think I was a bit confused writing this to be honest. The functions where this was a potential problem are atomic_compare_exchange_* as they take a pointer to a non-atomic type. This point was nullified however as atomic_compare_exchange_strong was defined as a function rather than a macro. The patch allows atomic_compare_exchange_strong to be defined as a macro taking various sized types, making it easier to use. Previously, the following code

atomic_int atomic = 0;
int regular = 0;
atomic_compare_exchange_strong(&atomic, &regular, 0);

would produce a compilation error when compiled with 64-bit MSVC (where ints are 32 bits). The snippet above is how I would expect most developers to try to use the function - having to define regular as an intptr_t seems unintuitive and other typical implementations of stdatomic.h require an int.

Thanks for your feedback. I will produce a second version of the patch including support for 8- and 16-bit atomic types. I see two ways of doing this:
1. Defining the 8- and 16-bit atomic types as 32-bit types and implementing their operations using the 32-bit interlocked variable access functions.
2. Defining the 8- and 16-bit atomic types as 8- and 16-bit types respectively, and implementing their operations using the 8- and 16-bit interlocked variable access functions.
I lean towards 1. as 2. would require dropping support for Windows Desktop 7/Windows Server 2008 and earlier. The disadvantage of approach 1. is that atomic_compare_exchange_* won't work for 8- and 16-bit types.
diff mbox series

Patch

diff --git a/compat/atomics/win32/stdatomic.h b/compat/atomics/win32/stdatomic.h
index 28a627bfd3..a0475fda7d 100644
--- a/compat/atomics/win32/stdatomic.h
+++ b/compat/atomics/win32/stdatomic.h
@@ -43,42 +43,26 @@  do {                            \
 
 #define atomic_is_lock_free(obj) 0
 
-typedef intptr_t atomic_flag;
-typedef intptr_t atomic_bool;
-typedef intptr_t atomic_char;
-typedef intptr_t atomic_schar;
-typedef intptr_t atomic_uchar;
-typedef intptr_t atomic_short;
-typedef intptr_t atomic_ushort;
-typedef intptr_t atomic_int;
-typedef intptr_t atomic_uint;
-typedef intptr_t atomic_long;
-typedef intptr_t atomic_ulong;
-typedef intptr_t atomic_llong;
-typedef intptr_t atomic_ullong;
-typedef intptr_t atomic_wchar_t;
-typedef intptr_t atomic_int_least8_t;
-typedef intptr_t atomic_uint_least8_t;
-typedef intptr_t atomic_int_least16_t;
-typedef intptr_t atomic_uint_least16_t;
-typedef intptr_t atomic_int_least32_t;
-typedef intptr_t atomic_uint_least32_t;
-typedef intptr_t atomic_int_least64_t;
-typedef intptr_t atomic_uint_least64_t;
-typedef intptr_t atomic_int_fast8_t;
-typedef intptr_t atomic_uint_fast8_t;
-typedef intptr_t atomic_int_fast16_t;
-typedef intptr_t atomic_uint_fast16_t;
-typedef intptr_t atomic_int_fast32_t;
-typedef intptr_t atomic_uint_fast32_t;
-typedef intptr_t atomic_int_fast64_t;
-typedef intptr_t atomic_uint_fast64_t;
-typedef intptr_t atomic_intptr_t;
-typedef intptr_t atomic_uintptr_t;
-typedef intptr_t atomic_size_t;
-typedef intptr_t atomic_ptrdiff_t;
-typedef intptr_t atomic_intmax_t;
-typedef intptr_t atomic_uintmax_t;
+typedef          int       atomic_int;
+typedef unsigned int       atomic_uint;
+typedef          long      atomic_long;
+typedef unsigned long      atomic_ulong;
+typedef          long long atomic_llong;
+typedef unsigned long long atomic_ullong;
+typedef      int_least32_t atomic_int_least32_t;
+typedef     uint_least32_t atomic_uint_least32_t;
+typedef      int_least64_t atomic_int_least64_t;
+typedef     uint_least64_t atomic_uint_least64_t;
+typedef       int_fast32_t atomic_int_fast32_t;
+typedef      uint_fast32_t atomic_uint_fast32_t;
+typedef       int_fast64_t atomic_int_fast64_t;
+typedef      uint_fast64_t atomic_uint_fast64_t;
+typedef           intptr_t atomic_intptr_t;
+typedef          uintptr_t atomic_uintptr_t;
+typedef             size_t atomic_size_t;
+typedef          ptrdiff_t atomic_ptrdiff_t;
+typedef           intmax_t atomic_intmax_t;
+typedef          uintmax_t atomic_uintmax_t;
 
 #define atomic_store(object, desired)   \
 do {                                    \
@@ -95,20 +79,21 @@  do {                                    \
 #define atomic_load_explicit(object, order) \
     atomic_load(object)
 
+#define atomic_helper(operation, object, ...)                  \
+    (sizeof(*object) == 4 ?                                    \
+        operation((volatile LONG *) object, __VA_ARGS__)       \
+    : sizeof(*object) == 8 ?                                   \
+        operation##64((volatile LONG64 *) object, __VA_ARGS__) \
+    : (abort(), 0))
+
 #define atomic_exchange(object, desired) \
-    InterlockedExchangePointer((PVOID volatile *)object, (PVOID)desired)
+    atomic_helper(InterlockedExchange, object, desired)
 
 #define atomic_exchange_explicit(object, desired, order) \
     atomic_exchange(object, desired)
 
-static inline int atomic_compare_exchange_strong(intptr_t *object, intptr_t *expected,
-                                                 intptr_t desired)
-{
-    intptr_t old = *expected;
-    *expected = (intptr_t)InterlockedCompareExchangePointer(
-        (PVOID *)object, (PVOID)desired, (PVOID)old);
-    return *expected == old;
-}
+#define atomic_compare_exchange_strong(object, expected, desired) \
+    atomic_helper(InterlockedCompareExchange, object, desired, *expected)
 
 #define atomic_compare_exchange_strong_explicit(object, expected, desired, success, failure) \
     atomic_compare_exchange_strong(object, expected, desired)
@@ -119,37 +104,20 @@  static inline int atomic_compare_exchange_strong(intptr_t *object, intptr_t *exp
 #define atomic_compare_exchange_weak_explicit(object, expected, desired, success, failure) \
     atomic_compare_exchange_weak(object, expected, desired)
 
-#ifdef _WIN64
-#define atomic_fetch_add(object, operand) \
-    InterlockedExchangeAdd64(object, operand)
-
-#define atomic_fetch_sub(object, operand) \
-    InterlockedExchangeAdd64(object, -(operand))
-
-#define atomic_fetch_or(object, operand) \
-    InterlockedOr64(object, operand)
-
-#define atomic_fetch_xor(object, operand) \
-    InterlockedXor64(object, operand)
-
-#define atomic_fetch_and(object, operand) \
-    InterlockedAnd64(object, operand)
-#else
 #define atomic_fetch_add(object, operand) \
-    InterlockedExchangeAdd(object, operand)
+    atomic_helper(InterlockedExchangeAdd, object, operand)
 
-#define atomic_fetch_sub(object, operand) \
-    InterlockedExchangeAdd(object, -(operand))
+#define atomic_fetch_sub(object, operand)    \
+    atomic_fetch_add(object, -(operand))
 
 #define atomic_fetch_or(object, operand) \
-    InterlockedOr(object, operand)
+    atomic_helper(InterlockedOr, object, operand)
 
 #define atomic_fetch_xor(object, operand) \
-    InterlockedXor(object, operand)
+    atomic_helper(InterlockedXor, object, operand)
 
 #define atomic_fetch_and(object, operand) \
-    InterlockedAnd(object, operand)
-#endif /* _WIN64 */
+    atomic_helper(InterlockedAnd, object, operand)
 
 #define atomic_fetch_add_explicit(object, operand, order) \
     atomic_fetch_add(object, operand)