[FFmpeg-devel,1/5] configure: Add option for enabling LC3/LC3plus wrapper

Message ID 20240326164739.153011-1-asoulier@google.com
State New
Series [FFmpeg-devel,1/5] configure: Add option for enabling LC3/LC3plus wrapper

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 success Make fate finished

Commit Message

Antoine Soulier March 26, 2024, 4:47 p.m. UTC
Signed-off-by: Antoine Soulier <asoulier@google.com>
Signed-off-by: Antoine SOULIER <asoulier@google.com>
---
 configure | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Paul B Mahol March 26, 2024, 4:58 p.m. UTC | #1
Isn't this using sub-optimal MDCT implementation?
Antoine Soulier March 26, 2024, 5:07 p.m. UTC | #2
What do you mean by sub-optimal?
It's stacked by prime factors, and unrolled for FFT3 and FFT5.
The butterfly implementations of FFT3 and FFT5 give me slightly slower
computation. FFT5 is done first, so it takes advantage of sin()/cos()
values of 0 or 1.
There are also no reordering steps (this stage is completely removed), but
it cannot run in-place.
Benchmarks I made show that it runs slightly faster.

Stefano Sabatini March 26, 2024, 5:31 p.m. UTC | #3
On date Tuesday 2024-03-26 16:47:35 +0000, ffmpeg-devel Mailing List wrote:

> Signed-off-by: Antoine Soulier <asoulier@google.com>
> Signed-off-by: Antoine SOULIER <asoulier@google.com>

why the double sign-off?

[...]

LGTM.
Antoine Soulier March 26, 2024, 5:35 p.m. UTC | #4
Arf, sorry for that. I used `git send-email -s`; perhaps that is the source
of the double sign-off.

Paul B Mahol March 26, 2024, 5:44 p.m. UTC | #5
On Tue, Mar 26, 2024 at 6:07 PM Antoine Soulier <asoulier@google.com> wrote:

> What do you mean by sub-optimal?
> It's stacked by prime factors, and unrolled for FFT3 and FFT5.
> The butterfly implementations of FFT3 and FFT5 give me slightly slower
> computation. FFT5 is done first, so it takes advantage of sin()/cos()
> values of 0 or 1.
> There are also no reordering steps (this stage is completely removed), but
> it cannot run in-place.
> Benchmarks I made show that it runs slightly faster.
>

Compared with what?
Where is the SIMD for that MDCT, at least for x86?


Antoine Soulier March 26, 2024, 5:57 p.m. UTC | #6
Compared with the C implementation of KissFFT (it's the only one I tested
on ARM M4).
Yes, there is no SIMD on x86. This was not the main target.
It was mainly made for ARM M4 (for BLE devices, Nordic Semi / Zephyr) and
ARM Neon (Android).
By the way, this does not change a lot: on powerful CPUs the FFT/MDCT cost
is marginal compared to reading/writing the arithmetically coded bitstream.
We could perhaps plug in the FFmpeg implementation, but it would probably
miss 2 things:
- Some transform sizes are not a multiple of 15, but only 5 * 2^n. I guess
FFmpeg only has a base 15 implementation.
- It uses asymmetric windowing, to reduce algorithmic delay. Some
coefficients are zeroed. Not important, but without a specific
implementation it will need a larger coefficient table and a bunch of
multiplications by 0.
So I think it will need some work.


Patch

diff --git a/configure b/configure
index 343edb38ab..eb8ff81a11 100755
--- a/configure
+++ b/configure
@@ -244,6 +244,7 @@  External library support:
   --enable-libjxl          enable JPEG XL de/encoding via libjxl [no]
   --enable-libklvanc       enable Kernel Labs VANC processing [no]
   --enable-libkvazaar      enable HEVC encoding via libkvazaar [no]
+  --enable-liblc3          enable LC3 de/encoding via liblc3 [no]
   --enable-liblensfun      enable lensfun lens correction [no]
   --enable-libmodplug      enable ModPlug via libmodplug [no]
   --enable-libmp3lame      enable MP3 encoding via libmp3lame [no]
@@ -1926,6 +1927,7 @@  EXTERNAL_LIBRARY_LIST="
     libjxl
     libklvanc
     libkvazaar
+    liblc3
     libmodplug
     libmp3lame
     libmysofa
@@ -3501,6 +3503,10 @@  libilbc_encoder_deps="libilbc"
 libjxl_decoder_deps="libjxl libjxl_threads"
 libjxl_encoder_deps="libjxl libjxl_threads"
 libkvazaar_encoder_deps="libkvazaar"
+liblc3_lc3_decoder_deps="liblc3"
+liblc3_lc3plus_decoder_deps="liblc3"
+liblc3_encoder_deps="liblc3"
+liblc3_encoder_select="audio_frame_queue"
 libmodplug_demuxer_deps="libmodplug"
 libmp3lame_encoder_deps="libmp3lame"
 libmp3lame_encoder_select="audio_frame_queue mpegaudioheader"
@@ -6858,6 +6864,7 @@  enabled libjxl            && require_pkg_config libjxl "libjxl >= 0.7.0" jxl/dec
                              require_pkg_config libjxl_threads "libjxl_threads >= 0.7.0" jxl/thread_parallel_runner.h JxlThreadParallelRunner
 enabled libklvanc         && require libklvanc libklvanc/vanc.h klvanc_context_create -lklvanc
 enabled libkvazaar        && require_pkg_config libkvazaar "kvazaar >= 2.0.0" kvazaar.h kvz_api_get
+enabled liblc3            && require_pkg_config liblc3 "lc3 >= 1.1.0" lc3.h lc3_hr_setup_encoder
 enabled liblensfun        && require_pkg_config liblensfun lensfun lensfun.h lf_db_create
 
 if enabled libmfx && enabled libvpl; then