diff mbox series

[FFmpeg-devel,4/5] lavu/intmath.h: Fix UB in ff_ctz_c() and ff_ctzll_c()

Message ID 0c19df65a8142e2cf96fbc07a92bf44bb32f1107.camel@haerdin.se
State New
Headers show
Series [FFmpeg-devel,1/5] lavu/common.h: Fix UB in av_clipl_int32_c() | expand

Checks

Context Check Description
yinshiyou/configure_loongarch64 warning Failed to apply patch
andriy/configure_x86 warning Failed to apply patch

Commit Message

Tomas Härdin May 29, 2024, 10:14 p.m. UTC

Comments

Rémi Denis-Courmont May 30, 2024, 7:54 a.m. UTC | #1
Can't we just use the compiler built-ins here? AFAIK, they (GCC, LLVM) use the same algorithm if the CPU doesn't support native CTZ. And they will pick the right instruction if CPU does have CTZ.

I get it that maybe it wasn't working so well 20 years ago, but we've increased compiler version requirements since then.
Tomas Härdin May 30, 2024, 9:50 a.m. UTC | #2
tor 2024-05-30 klockan 10:54 +0300 skrev Rémi Denis-Courmont:
> Can't we just use the compiler built-ins here? AFAIK, they (GCC,
> LLVM) use the same algorithm if the CPU doesn't support native CTZ.
> And they will pick the right instruction if CPU does have CTZ.
> 
> I get it that maybe it wasn't working so well 20 years ago, but we've
> increased compiler version requirements since then.

I think we still support MSVC, but maybe we shouldn't? It's possible to
cross-compile for Windows either way.

/Tomas
Hendrik Leppkes May 30, 2024, 12:29 p.m. UTC | #3
On Thu, May 30, 2024 at 11:50 AM Tomas Härdin <git@haerdin.se> wrote:
>
> tor 2024-05-30 klockan 10:54 +0300 skrev Rémi Denis-Courmont:
> > Can't we just use the compiler built-ins here? AFAIK, they (GCC,
> > LLVM) use the same algorithm if the CPU doesn't support native CTZ.
> > And they will pick the right instruction if CPU does have CTZ.
> >
> > I get it that maybe it wasn't working so well 20 years ago, but we've
> > increased compiler version requirements since then.
>
> I think we still support MSVC, but maybe we shouldn't? It's possible to
> cross-compile for Windows either way.
>

This is not going to happen.

- Hendrik
Rémi Denis-Courmont May 30, 2024, 1:06 p.m. UTC | #4
Le 30 mai 2024 12:50:05 GMT+03:00, "Tomas Härdin" <git@haerdin.se> a écrit :
>tor 2024-05-30 klockan 10:54 +0300 skrev Rémi Denis-Courmont:
>> Can't we just use the compiler built-ins here? AFAIK, they (GCC,
>> LLVM) use the same algorithm if the CPU doesn't support native CTZ.
>> And they will pick the right instruction if CPU does have CTZ.
>> 
>> I get it that maybe it wasn't working so well 20 years ago, but we've
>> increased compiler version requirements since then.
>
>I think we still support MSVC, but maybe we shouldn't? It's possible to
>cross-compile for Windows either way.

I don't get how that prevents using the GCC and Clang builtins (on GCC and Clang).
Tomas Härdin May 30, 2024, 2:03 p.m. UTC | #5
tor 2024-05-30 klockan 16:06 +0300 skrev Rémi Denis-Courmont:
> 
> 
> Le 30 mai 2024 12:50:05 GMT+03:00, "Tomas Härdin" <git@haerdin.se> a
> écrit :
> > tor 2024-05-30 klockan 10:54 +0300 skrev Rémi Denis-Courmont:
> > > Can't we just use the compiler built-ins here? AFAIK, they (GCC,
> > > LLVM) use the same algorithm if the CPU doesn't support native
> > > CTZ.
> > > And they will pick the right instruction if CPU does have CTZ.
> > > 
> > > I get it that maybe it wasn't working so well 20 years ago, but
> > > we've
> > > increased compiler version requirements since then.
> > 
> > I think we still support MSVC, but maybe we shouldn't? It's
> > possible to
> > cross-compile for Windows either way.
> 
> I don't get how that prevents using the GCC and Clang builtins (on
> GCC and Clang).

Does MSVC have builtins for these? Do all compilers we support?

/Tomas
Rémi Denis-Courmont May 30, 2024, 2:32 p.m. UTC | #6
Le 30 mai 2024 17:03:09 GMT+03:00, "Tomas Härdin" <git@haerdin.se> a écrit :
>> I don't get how that prevents using the GCC and Clang builtins (on
>> GCC and Clang).
>
>Does MSVC have builtins for these?

I don't know, but insofar as MSVC is used for x86, it should use x86 instructions rather than the complex fallback algo anyway, be it via built-ins, or assembler.

Either way, I don't see how that detracts from using the built-ins on compilers that do have them.

> Do all compilers we support?

No, unless all other compilers are C23 (CTZ and CLZ are in stdbit.h). Again, that's hardly a reason not to use built-ins where available.
Michael Niedermayer May 31, 2024, 12:31 a.m. UTC | #7
On Thu, May 30, 2024 at 12:14:23AM +0200, Tomas Härdin wrote:
> 

>  intmath.h |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 5ae92c415924602aeca4e0fcb3bf886a4c1911f1  0004-lavu-intmath.h-Fix-UB-in-ff_ctz_c-and-ff_ctzll_c.patch
> From f9a12089bc98dde0ccc2487d1442ec6ddb7705f6 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= <git@haerdin.se>
> Date: Thu, 16 May 2024 18:10:58 +0200
> Subject: [PATCH 4/5] lavu/intmath.h: Fix UB in ff_ctz_c() and ff_ctzll_c()
> 
> Found by value analysis
> ---
>  libavutil/intmath.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

LGTM

thx

[...]
Michael Niedermayer May 31, 2024, 12:41 a.m. UTC | #8
On Thu, May 30, 2024 at 10:54:46AM +0300, Rémi Denis-Courmont wrote:
> Can't we just use the compiler built-ins here? AFAIK, they (GCC, LLVM) use the same algorithm if the CPU doesn't support native CTZ. And they will pick the right instruction if CPU does have CTZ.
> 
> I get it that maybe it wasn't working so well 20 years ago, but we've increased compiler version requirements since then.

ffmpeg is written in C not GNU-C nor LLVM-C
so we need to have non buggy C code

A modern compiler should turn a built-in into a efficient piece of code
but so should it recognize that efficient piece of code and turn it into
a single instruction if such exist.

In the end with a modern compiler it shouldnt matter how you write this
a loop, some optimized standard implementation or a built in
the disavantage of a builtin is that it requires #ifs and checks
for each compiler.
So we should go with the simplest/cleanest that reliably produces optimal code
across all supported platforms (and also consider potential future platforms
so we dont have a maintaince nightmare but something that we can write once
and expect it will be close to optimal forever)

Above though is off topic in this thread, which was a bugfix that we need anyway
(also for backporting)

thx

[...]
Ronald S. Bultje May 31, 2024, 12:43 a.m. UTC | #9
Hi,

On Thu, May 30, 2024 at 10:03 AM Tomas Härdin <git@haerdin.se> wrote:

> tor 2024-05-30 klockan 16:06 +0300 skrev Rémi Denis-Courmont:
> >
> >
> > Le 30 mai 2024 12:50:05 GMT+03:00, "Tomas Härdin" <git@haerdin.se> a
> > écrit :
> > > tor 2024-05-30 klockan 10:54 +0300 skrev Rémi Denis-Courmont:
> > > > Can't we just use the compiler built-ins here? AFAIK, they (GCC,
> > > > LLVM) use the same algorithm if the CPU doesn't support native
> > > > CTZ.
> > > > And they will pick the right instruction if CPU does have CTZ.
> > > >
> > > > I get it that maybe it wasn't working so well 20 years ago, but
> > > > we've
> > > > increased compiler version requirements since then.
> > >
> > > I think we still support MSVC, but maybe we shouldn't? It's
> > > possible to
> > > cross-compile for Windows either way.
> >
> > I don't get how that prevents using the GCC and Clang builtins (on
> > GCC and Clang).
>
> Does MSVC have builtins for these? Do all compilers we support?
>

I think what Remi is suggesting is that someone (maybe Remi himself, maybe
you, maybe me, maybe someone else) should send a patch to use the built-ins
when available. Where not, we would continue using the C versions.

All of this is unrelated to this patch, which would continue to be useful
for the C fallback.

Ronald
Rémi Denis-Courmont May 31, 2024, 5:48 a.m. UTC | #10
Le 31 mai 2024 03:41:40 GMT+03:00, Michael Niedermayer <michael@niedermayer.cc> a écrit :
>On Thu, May 30, 2024 at 10:54:46AM +0300, Rémi Denis-Courmont wrote:
>> Can't we just use the compiler built-ins here? AFAIK, they (GCC, LLVM) use the same algorithm if the CPU doesn't support native CTZ. And they will pick the right instruction if CPU does have CTZ.
>> 
>> I get it that maybe it wasn't working so well 20 years ago, but we've increased compiler version requirements since then.
>
>ffmpeg is written in C not GNU-C nor LLVM-C
>so we need to have non buggy C code

What does that mean here?

We can put compilers in 3 sets:
1) GCC+Clang
2) MSVC+ICC
3) others

Note that ICC and others are *not* tested (at least in FATE). MSVC and ICC are using intrinsics (in x86/intmath.h).

The problem is, GCC and Clang use intrinsics only if fast_clz is manually selected. This is silly. That's just requiring more platform-specific code added for no reasons.

If you want to test this code, DO write a test that calls the C version always. DO NOT force all unknown or unoptimised FFmpeg platforms to use the C just because we need to keep supporting hypothetical C11-conforming C compilers.

Note that I never said that we should remove the standard code.

>A modern compiler should turn a built-in into a efficient piece of code
>but so should it recognize that efficient piece of code and turn it into
>a single instruction if such exist.

I doubt any compiler will detect that the Debujin algorithm can be replaced by a CTZ.  The *official* standard solution is to use <stdbit.h> (C23-only unfortunately), so there are no reasons why compilers should even care to deal with this by now.

>In the end with a modern compiler it shouldnt matter how you write this
>a loop, some optimized standard implementation or a built in
>the disavantage of a builtin is that it requires #ifs and checks
>for each compiler.

A modern compiler supports STDC bit-wise functions from C23, and therefore doesn't care about this.

>So we should go with the simplest/cleanest that reliably produces optimal code
>across all supported platforms (and also consider potential future platforms
>so we dont have a maintaince nightmare but something that we can write once
>and expect it will be close to optimal forever)

So what do you do about the existing x86 special cases and the fast_clz case?

AFAICT, the only way to (arguably) do that is to switch to the C23 bit manipulation functions, and provide fallback for old compilers.

That's clearly out of the scope of this patch, and I bet some people would object anyway because they'll find the names of the new functions ugly and/or too long.
diff mbox series

Patch

From f9a12089bc98dde0ccc2487d1442ec6ddb7705f6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= <git@haerdin.se>
Date: Thu, 16 May 2024 18:10:58 +0200
Subject: [PATCH 4/5] lavu/intmath.h: Fix UB in ff_ctz_c() and ff_ctzll_c()

Found by value analysis
---
 libavutil/intmath.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavutil/intmath.h b/libavutil/intmath.h
index c54d23b7bf..52e11a8d5f 100644
--- a/libavutil/intmath.h
+++ b/libavutil/intmath.h
@@ -119,7 +119,7 @@  static av_always_inline av_const int ff_ctz_c(int v)
         0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
         31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
     };
-    return debruijn_ctz32[(uint32_t)((v & -v) * 0x077CB531U) >> 27];
+    return debruijn_ctz32[(uint32_t)((v & -(uint32_t)v) * 0x077CB531U) >> 27];
 }
 #endif
 
@@ -135,7 +135,7 @@  static av_always_inline av_const int ff_ctzll_c(long long v)
         63, 52, 6, 26, 37, 40, 33, 47, 61, 45, 43, 21, 23, 58, 17, 10,
         51, 25, 36, 32, 60, 20, 57, 16, 50, 31, 19, 15, 30, 14, 13, 12
     };
-    return debruijn_ctz64[(uint64_t)((v & -v) * 0x022FDD63CC95386DU) >> 58];
+    return debruijn_ctz64[(uint64_t)((v & -(uint64_t)v) * 0x022FDD63CC95386DU) >> 58];
 }
 #endif
 
-- 
2.39.2