Message ID | MQrn3cw--3-2@lynne.ee |
---|---|
State | Accepted |
Headers | show |
Series | [FFmpeg-devel,v2,1/5] ac3enc_fixed: convert to 32-bit sample format | expand |
Context | Check | Description |
---|---|---|
andriy/x86_make | success | Make finished |
andriy/x86_make_fate | fail | Make fate failed |
andriy/PPC64_make | success | Make finished |
andriy/PPC64_make_fate | warning | Make fate failed |
Lynne: > The AC3 encoder used to be a separate library called "Aften", which > got merged into libavcodec (literally, SVN commits and all). > The merge preserved as much features from the library as possible. > > The code had two versions - a fixed point version and a floating > point version. FFmpeg had floating point DSP code used by other > codecs, the AC3 decoder including, so the floating-point DSP was > simply replaced with FFmpeg's own functions. > However, FFmpeg had no fixed-point audio code at that point. So > the encoder brought along its own fixed-point DSP functions, > including a fixed-point MDCT. > > The fixed-point MDCT itself is trivially just a float MDCT with a > different type and each multiply being a fixed-point multiply. > So over time, it got refactored, and the FFT used for all other codecs > was templated. > > Due to design decisions at the time, the fixed-point version of the > encoder operates at 16-bits of precision. Although convenient, this, > even at the time, was inadequate and inefficient. The encoder is noisy, > does not produce output comparable to the float encoder, and even > rings at higher frequencies due to the badly approximated winow function. > > Enter MIPS (owned by Imagination Technologies at the time). They wanted > quick fixed-point decoding on their FPUless cores. So they contributed > patches to template the AC3 decoder so it had both a fixed-point > and a floating-point version. They also did the same for the AAC decoder. > They however, used 32-bit samples. Not 16-bits. And we did not have > 32-bit fixed-point DSP functions, including an MDCT. But instead of > templating our MDCT to output 3 versions (float, 32-bit fixed and 16-bit fixed), > they simply copy-pasted their own MDCT into ours, and completely > ifdeffed our own MDCT code out if a 32-bit fixed point MDCT was selected. > > This is also the status quo nowadays - 2 separate MDCTs, one which > produces floating point and 16-bit fixed point versions, and one > sort-of integrated which produces 32-bit MDCT. > > MIPS weren't all that interested in encoding, so they left the encoder > as-is, and they didn't care much about the ifdeffery, mess or quality - it's > not their problem. > > So the MDCT/FFT code has always been a thorn in anyone looking to clean up > code's eye. > > Backstory over. Internally AC3 operates on 25-bit fixed-point coefficients. > So for the floating point version, the encoder simply runs the float MDCT, > and converts the resulting coefficients to 25-bit fixed-point, as AC3 is inherently > a fixed-point codec. For the fixed-point version, the input is 16-bit samples, > so to maximize precision the frame samples are analyzed and the highest set > bit is detected via ac3_max_msb_abs_int16(), and the coefficients are then > scaled up via ac3_lshift_int16(), so the input for the FFT is always at least 14 bits, > computed in normalize_samples(). After FFT, the coefficients are scaled up to 25 bits. > > This patch simply changes the encoder to accept 32-bit samples, reusing > the already well-optimized 32-bit MDCT code, allowing us to clean up and drop > a large part of a very messy code of ours, as well as prepare for the future lavu/tx > conversion. The coefficients are simply scaled down to 25 bits during windowing, > skipping 2 separate scalings, as the hacks to extend precision are simply no longer > necessary. There's no point in running the MDCT always at 32 bits when you're > going to drop 6 bits off anyway, the headroom is plenty, and the MDCT rounds > properly. > > This also makes the encoder even slightly more accurate over the float version, > as there's no coefficient conversion step necessary. > > SIZE SAVINGS: > ARM32: > HARDCODED TABLES: > BASE - 10709590 > DROP DSP - 10702872 - diff: -6.56KiB > DROP MDCT - 10667932 - diff: -34.12KiB - both: -40.68KiB > DROP FFT - 10336652 - diff: -323.52KiB - all: -364.20KiB > SOFTCODED TABLES: > BASE - 9685096 > DROP DSP - 9678378 - diff: -6.56KiB > DROP MDCT - 9643466 - diff: -34.09KiB - both: -40.65KiB > DROP FFT - 9573918 - diff: -67.92KiB - all: -108.57KiB > > ARM64: > HARDCODED TABLES: > BASE - 14641112 > DROP DSP - 14633806 - diff: -7.13KiB > DROP MDCT - 14604812 - diff: -28.31KiB - both: -35.45KiB > DROP FFT - 14286826 - diff: -310.53KiB - all: -345.98KiB > SOFTCODED TABLES: > BASE - 13636238 > DROP DSP - 13628932 - diff: -7.13KiB > DROP MDCT - 13599866 - diff: -28.38KiB - both: -35.52KiB > DROP FFT - 13542080 - diff: -56.43KiB - all: -91.95KiB > > x86: > HARDCODED TABLES: > BASE - 12367336 > DROP DSP - 12354698 - diff: -12.34KiB > DROP MDCT - 12331024 - diff: -23.12KiB - both: -35.46KiB > DROP FFT - 12029788 - diff: -294.18KiB - all: -329.64KiB > SOFTCODED TABLES: > BASE - 11358094 > DROP DSP - 11345456 - diff: -12.34KiB > DROP MDCT - 11321742 - diff: -23.16KiB - both: -35.50KiB > DROP FFT - 11276946 - diff: -43.75KiB - all: -79.25KiB > > PERFORMANCE (10min random s32le): > ARM32 - before - 39.9x - 0m15.046s > ARM32 - after - 28.2x - 0m21.525s > Speed: -30% > > ARM64 - before - 36.1x - 0m16.637s > ARM64 - after - 36.0x - 0m16.727s > Speed: -0.5% > > x86 - before - 184x - 0m3.277s > x86 - after - 190x - 0m3.187s > Speed: +3% > > New patch attached. > FATE still fails with patches 1 and 2 on patchwork. - Andreas
Jan 12, 2021, 22:24 by andreas.rheinhardt@gmail.com: > Lynne: > >> The AC3 encoder used to be a separate library called "Aften", which >> got merged into libavcodec (literally, SVN commits and all). >> The merge preserved as much features from the library as possible. >> >> The code had two versions - a fixed point version and a floating >> point version. FFmpeg had floating point DSP code used by other >> codecs, the AC3 decoder including, so the floating-point DSP was >> simply replaced with FFmpeg's own functions. >> However, FFmpeg had no fixed-point audio code at that point. So >> the encoder brought along its own fixed-point DSP functions, >> including a fixed-point MDCT. >> >> The fixed-point MDCT itself is trivially just a float MDCT with a >> different type and each multiply being a fixed-point multiply. >> So over time, it got refactored, and the FFT used for all other codecs >> was templated. >> >> Due to design decisions at the time, the fixed-point version of the >> encoder operates at 16-bits of precision. Although convenient, this, >> even at the time, was inadequate and inefficient. The encoder is noisy, >> does not produce output comparable to the float encoder, and even >> rings at higher frequencies due to the badly approximated winow function. >> >> Enter MIPS (owned by Imagination Technologies at the time). They wanted >> quick fixed-point decoding on their FPUless cores. So they contributed >> patches to template the AC3 decoder so it had both a fixed-point >> and a floating-point version. They also did the same for the AAC decoder. >> They however, used 32-bit samples. Not 16-bits. And we did not have >> 32-bit fixed-point DSP functions, including an MDCT. But instead of >> templating our MDCT to output 3 versions (float, 32-bit fixed and 16-bit fixed), >> they simply copy-pasted their own MDCT into ours, and completely >> ifdeffed our own MDCT code out if a 32-bit fixed point MDCT was selected. >> >> This is also the status quo nowadays - 2 separate MDCTs, one which >> produces floating point and 16-bit fixed point versions, and one >> sort-of integrated which produces 32-bit MDCT. >> >> MIPS weren't all that interested in encoding, so they left the encoder >> as-is, and they didn't care much about the ifdeffery, mess or quality - it's >> not their problem. >> >> So the MDCT/FFT code has always been a thorn in anyone looking to clean up >> code's eye. >> >> Backstory over. Internally AC3 operates on 25-bit fixed-point coefficients. >> So for the floating point version, the encoder simply runs the float MDCT, >> and converts the resulting coefficients to 25-bit fixed-point, as AC3 is inherently >> a fixed-point codec. For the fixed-point version, the input is 16-bit samples, >> so to maximize precision the frame samples are analyzed and the highest set >> bit is detected via ac3_max_msb_abs_int16(), and the coefficients are then >> scaled up via ac3_lshift_int16(), so the input for the FFT is always at least 14 bits, >> computed in normalize_samples(). After FFT, the coefficients are scaled up to 25 bits. >> >> This patch simply changes the encoder to accept 32-bit samples, reusing >> the already well-optimized 32-bit MDCT code, allowing us to clean up and drop >> a large part of a very messy code of ours, as well as prepare for the future lavu/tx >> conversion. The coefficients are simply scaled down to 25 bits during windowing, >> skipping 2 separate scalings, as the hacks to extend precision are simply no longer >> necessary. There's no point in running the MDCT always at 32 bits when you're >> going to drop 6 bits off anyway, the headroom is plenty, and the MDCT rounds >> properly. >> >> This also makes the encoder even slightly more accurate over the float version, >> as there's no coefficient conversion step necessary. >> >> SIZE SAVINGS: >> ARM32: >> HARDCODED TABLES: >> BASE - 10709590 >> DROP DSP - 10702872 - diff: -6.56KiB >> DROP MDCT - 10667932 - diff: -34.12KiB - both: -40.68KiB >> DROP FFT - 10336652 - diff: -323.52KiB - all: -364.20KiB >> SOFTCODED TABLES: >> BASE - 9685096 >> DROP DSP - 9678378 - diff: -6.56KiB >> DROP MDCT - 9643466 - diff: -34.09KiB - both: -40.65KiB >> DROP FFT - 9573918 - diff: -67.92KiB - all: -108.57KiB >> >> ARM64: >> HARDCODED TABLES: >> BASE - 14641112 >> DROP DSP - 14633806 - diff: -7.13KiB >> DROP MDCT - 14604812 - diff: -28.31KiB - both: -35.45KiB >> DROP FFT - 14286826 - diff: -310.53KiB - all: -345.98KiB >> SOFTCODED TABLES: >> BASE - 13636238 >> DROP DSP - 13628932 - diff: -7.13KiB >> DROP MDCT - 13599866 - diff: -28.38KiB - both: -35.52KiB >> DROP FFT - 13542080 - diff: -56.43KiB - all: -91.95KiB >> >> x86: >> HARDCODED TABLES: >> BASE - 12367336 >> DROP DSP - 12354698 - diff: -12.34KiB >> DROP MDCT - 12331024 - diff: -23.12KiB - both: -35.46KiB >> DROP FFT - 12029788 - diff: -294.18KiB - all: -329.64KiB >> SOFTCODED TABLES: >> BASE - 11358094 >> DROP DSP - 11345456 - diff: -12.34KiB >> DROP MDCT - 11321742 - diff: -23.16KiB - both: -35.50KiB >> DROP FFT - 11276946 - diff: -43.75KiB - all: -79.25KiB >> >> PERFORMANCE (10min random s32le): >> ARM32 - before - 39.9x - 0m15.046s >> ARM32 - after - 28.2x - 0m21.525s >> Speed: -30% >> >> ARM64 - before - 36.1x - 0m16.637s >> ARM64 - after - 36.0x - 0m16.727s >> Speed: -0.5% >> >> x86 - before - 184x - 0m3.277s >> x86 - after - 190x - 0m3.187s >> Speed: +3% >> >> New patch attached. >> > FATE still fails with patches 1 and 2 on patchwork. > Fixed. Forgot a line. New patches attached.
Jan 13, 2021, 00:55 by dev@lynne.ee: > Jan 12, 2021, 22:24 by andreas.rheinhardt@gmail.com: > >> Lynne: >> >>> The AC3 encoder used to be a separate library called "Aften", which >>> got merged into libavcodec (literally, SVN commits and all). >>> The merge preserved as much features from the library as possible. >>> >>> The code had two versions - a fixed point version and a floating >>> point version. FFmpeg had floating point DSP code used by other >>> codecs, the AC3 decoder including, so the floating-point DSP was >>> simply replaced with FFmpeg's own functions. >>> However, FFmpeg had no fixed-point audio code at that point. So >>> the encoder brought along its own fixed-point DSP functions, >>> including a fixed-point MDCT. >>> >>> The fixed-point MDCT itself is trivially just a float MDCT with a >>> different type and each multiply being a fixed-point multiply. >>> So over time, it got refactored, and the FFT used for all other codecs >>> was templated. >>> >>> Due to design decisions at the time, the fixed-point version of the >>> encoder operates at 16-bits of precision. Although convenient, this, >>> even at the time, was inadequate and inefficient. The encoder is noisy, >>> does not produce output comparable to the float encoder, and even >>> rings at higher frequencies due to the badly approximated winow function. >>> >>> Enter MIPS (owned by Imagination Technologies at the time). They wanted >>> quick fixed-point decoding on their FPUless cores. So they contributed >>> patches to template the AC3 decoder so it had both a fixed-point >>> and a floating-point version. They also did the same for the AAC decoder. >>> They however, used 32-bit samples. Not 16-bits. And we did not have >>> 32-bit fixed-point DSP functions, including an MDCT. But instead of >>> templating our MDCT to output 3 versions (float, 32-bit fixed and 16-bit fixed), >>> they simply copy-pasted their own MDCT into ours, and completely >>> ifdeffed our own MDCT code out if a 32-bit fixed point MDCT was selected. >>> >>> This is also the status quo nowadays - 2 separate MDCTs, one which >>> produces floating point and 16-bit fixed point versions, and one >>> sort-of integrated which produces 32-bit MDCT. >>> >>> MIPS weren't all that interested in encoding, so they left the encoder >>> as-is, and they didn't care much about the ifdeffery, mess or quality - it's >>> not their problem. >>> >>> So the MDCT/FFT code has always been a thorn in anyone looking to clean up >>> code's eye. >>> >>> Backstory over. Internally AC3 operates on 25-bit fixed-point coefficients. >>> So for the floating point version, the encoder simply runs the float MDCT, >>> and converts the resulting coefficients to 25-bit fixed-point, as AC3 is inherently >>> a fixed-point codec. For the fixed-point version, the input is 16-bit samples, >>> so to maximize precision the frame samples are analyzed and the highest set >>> bit is detected via ac3_max_msb_abs_int16(), and the coefficients are then >>> scaled up via ac3_lshift_int16(), so the input for the FFT is always at least 14 bits, >>> computed in normalize_samples(). After FFT, the coefficients are scaled up to 25 bits. >>> >>> This patch simply changes the encoder to accept 32-bit samples, reusing >>> the already well-optimized 32-bit MDCT code, allowing us to clean up and drop >>> a large part of a very messy code of ours, as well as prepare for the future lavu/tx >>> conversion. The coefficients are simply scaled down to 25 bits during windowing, >>> skipping 2 separate scalings, as the hacks to extend precision are simply no longer >>> necessary. There's no point in running the MDCT always at 32 bits when you're >>> going to drop 6 bits off anyway, the headroom is plenty, and the MDCT rounds >>> properly. >>> >>> This also makes the encoder even slightly more accurate over the float version, >>> as there's no coefficient conversion step necessary. >>> >>> SIZE SAVINGS: >>> ARM32: >>> HARDCODED TABLES: >>> BASE - 10709590 >>> DROP DSP - 10702872 - diff: -6.56KiB >>> DROP MDCT - 10667932 - diff: -34.12KiB - both: -40.68KiB >>> DROP FFT - 10336652 - diff: -323.52KiB - all: -364.20KiB >>> SOFTCODED TABLES: >>> BASE - 9685096 >>> DROP DSP - 9678378 - diff: -6.56KiB >>> DROP MDCT - 9643466 - diff: -34.09KiB - both: -40.65KiB >>> DROP FFT - 9573918 - diff: -67.92KiB - all: -108.57KiB >>> >>> ARM64: >>> HARDCODED TABLES: >>> BASE - 14641112 >>> DROP DSP - 14633806 - diff: -7.13KiB >>> DROP MDCT - 14604812 - diff: -28.31KiB - both: -35.45KiB >>> DROP FFT - 14286826 - diff: -310.53KiB - all: -345.98KiB >>> SOFTCODED TABLES: >>> BASE - 13636238 >>> DROP DSP - 13628932 - diff: -7.13KiB >>> DROP MDCT - 13599866 - diff: -28.38KiB - both: -35.52KiB >>> DROP FFT - 13542080 - diff: -56.43KiB - all: -91.95KiB >>> >>> x86: >>> HARDCODED TABLES: >>> BASE - 12367336 >>> DROP DSP - 12354698 - diff: -12.34KiB >>> DROP MDCT - 12331024 - diff: -23.12KiB - both: -35.46KiB >>> DROP FFT - 12029788 - diff: -294.18KiB - all: -329.64KiB >>> SOFTCODED TABLES: >>> BASE - 11358094 >>> DROP DSP - 11345456 - diff: -12.34KiB >>> DROP MDCT - 11321742 - diff: -23.16KiB - both: -35.50KiB >>> DROP FFT - 11276946 - diff: -43.75KiB - all: -79.25KiB >>> >>> PERFORMANCE (10min random s32le): >>> ARM32 - before - 39.9x - 0m15.046s >>> ARM32 - after - 28.2x - 0m21.525s >>> Speed: -30% >>> >>> ARM64 - before - 36.1x - 0m16.637s >>> ARM64 - after - 36.0x - 0m16.727s >>> Speed: -0.5% >>> >>> x86 - before - 184x - 0m3.277s >>> x86 - after - 190x - 0m3.187s >>> Speed: +3% >>> >>> New patch attached. >>> >> FATE still fails with patches 1 and 2 on patchwork. >> > > Fixed. Forgot a line. New patches attached. > Pushed. Thanks to all who commented. Next up will likely be the mdct15 removal, which will be around 80KiB on all platforms. Fun fact: commit 4/5 removed MMX code. Not something you see every day.
diff --git a/doc/encoders.texi b/doc/encoders.texi index 0b1c69e982..60e763a704 100644 --- a/doc/encoders.texi +++ b/doc/encoders.texi @@ -151,10 +151,9 @@ the undocumented RealAudio 3 (a.k.a. dnet). The @var{ac3} encoder uses floating-point math, while the @var{ac3_fixed} encoder only uses fixed-point integer math. This does not mean that one is always faster, just that one or the other may be better suited to a -particular system. The floating-point encoder will generally produce better -quality audio for a given bitrate. The @var{ac3_fixed} encoder is not the -default codec for any of the output formats, so it must be specified explicitly -using the option @code{-acodec ac3_fixed} in order to use it. +particular system. The @var{ac3_fixed} encoder is not the default codec for +any of the output formats, so it must be specified explicitly using the option +@code{-acodec ac3_fixed} in order to use it. @subsection AC-3 Metadata diff --git a/libavcodec/Makefile b/libavcodec/Makefile index 35318f4f4d..0546e6f6c5 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -181,7 +181,7 @@ OBJS-$(CONFIG_AC3_DECODER) += ac3dec_float.o ac3dec_data.o ac3.o kbd OBJS-$(CONFIG_AC3_FIXED_DECODER) += ac3dec_fixed.o ac3dec_data.o ac3.o kbdwin.o ac3tab.o OBJS-$(CONFIG_AC3_ENCODER) += ac3enc_float.o ac3enc.o ac3tab.o \ ac3.o kbdwin.o -OBJS-$(CONFIG_AC3_FIXED_ENCODER) += ac3enc_fixed.o ac3enc.o ac3tab.o ac3.o +OBJS-$(CONFIG_AC3_FIXED_ENCODER) += ac3enc_fixed.o ac3enc.o ac3tab.o ac3.o kbdwin.o OBJS-$(CONFIG_AC3_MF_ENCODER) += mfenc.o mf_utils.o OBJS-$(CONFIG_ACELP_KELVIN_DECODER) += g729dec.o lsp.o celp_math.o celp_filters.o acelp_filters.o acelp_pitch_delay.o acelp_vectors.o g729postfilter.o OBJS-$(CONFIG_AGM_DECODER) += agm.o diff --git a/libavcodec/ac3enc.c b/libavcodec/ac3enc.c index b2e3b2bb4b..9dafe0ef55 100644 --- a/libavcodec/ac3enc.c +++ b/libavcodec/ac3enc.c @@ -2047,6 +2047,7 @@ av_cold int ff_ac3_encode_close(AVCodecContext *avctx) int blk, ch; AC3EncodeContext *s = avctx->priv_data; + av_freep(&s->mdct_window); av_freep(&s->windowed_samples); if (s->planar_samples) for (ch = 0; ch < s->channels; ch++) diff --git a/libavcodec/ac3enc.h b/libavcodec/ac3enc.h index 044564ecb4..ba62891371 100644 --- a/libavcodec/ac3enc.h +++ b/libavcodec/ac3enc.h @@ -30,8 +30,6 @@ #include <stdint.h> -#include "libavutil/float_dsp.h" - #include "ac3.h" #include "ac3dsp.h" #include "avcodec.h" @@ -53,6 +51,7 @@ #define AC3ENC_TYPE_EAC3 2 #if AC3ENC_FLOAT +#include "libavutil/float_dsp.h" #define AC3_NAME(x) ff_ac3_float_ ## x #define MAC_COEF(d,a,b) ((d)+=(a)*(b)) #define COEF_MIN (-16777215.0/16777216.0) @@ -62,12 +61,13 @@ typedef float SampleType; typedef float CoefType; typedef float CoefSumType; #else +#include "libavutil/fixed_dsp.h" #define AC3_NAME(x) ff_ac3_fixed_ ## x #define MAC_COEF(d,a,b) MAC64(d,a,b) #define COEF_MIN -16777215 #define COEF_MAX 16777215 #define NEW_CPL_COORD_THRESHOLD 503317 -typedef int16_t SampleType; +typedef int32_t SampleType; typedef int32_t CoefType; typedef int64_t CoefSumType; #endif @@ -141,7 +141,6 @@ typedef struct AC3Block { uint16_t **qmant; ///< quantized mantissas uint8_t **cpl_coord_exp; ///< coupling coord exponents (cplcoexp) uint8_t **cpl_coord_mant; ///< coupling coord mantissas (cplcomant) - uint8_t coeff_shift[AC3_MAX_CHANNELS]; ///< fixed-point coefficient shift values uint8_t new_rematrixing_strategy; ///< send new rematrixing flags in this block int num_rematrixing_bands; ///< number of rematrixing bands uint8_t rematrixing_flags[4]; ///< rematrixing flags @@ -165,7 +164,11 @@ typedef struct AC3EncodeContext { AVCodecContext *avctx; ///< parent AVCodecContext PutBitContext pb; ///< bitstream writer context AudioDSPContext adsp; +#if AC3ENC_FLOAT AVFloatDSPContext *fdsp; +#else + AVFixedDSPContext *fdsp; +#endif MECmpContext mecc; AC3DSPContext ac3dsp; ///< AC-3 optimized functions FFTContext mdct; ///< FFT context for MDCT calculation diff --git a/libavcodec/ac3enc_fixed.c b/libavcodec/ac3enc_fixed.c index 7818dd8c35..3b302d40df 100644 --- a/libavcodec/ac3enc_fixed.c +++ b/libavcodec/ac3enc_fixed.c @@ -26,12 +26,14 @@ * fixed-point AC-3 encoder. */ -#define FFT_FLOAT 0 #define AC3ENC_FLOAT 0 +#define FFT_FLOAT 0 +#define FFT_FIXED_32 1 #include "internal.h" #include "audiodsp.h" #include "ac3enc.h" #include "eac3enc.h" +#include "kbdwin.h" #define AC3ENC_TYPE AC3ENC_TYPE_AC3_FIXED #include "ac3enc_opts_template.c" @@ -43,37 +45,6 @@ static const AVClass ac3enc_class = { .version = LIBAVUTIL_VERSION_INT, }; -/* - * Normalize the input samples to use the maximum available precision. - * This assumes signed 16-bit input samples. - */ -static int normalize_samples(AC3EncodeContext *s) -{ - int v = s->ac3dsp.ac3_max_msb_abs_int16(s->windowed_samples, AC3_WINDOW_SIZE); - v = 14 - av_log2(v); - if (v > 0) - s->ac3dsp.ac3_lshift_int16(s->windowed_samples, AC3_WINDOW_SIZE, v); - /* +6 to right-shift from 31-bit to 25-bit */ - return v + 6; -} - - -/* - * Scale MDCT coefficients to 25-bit signed fixed-point. - */ -static void scale_coefficients(AC3EncodeContext *s) -{ - int blk, ch; - - for (blk = 0; blk < s->num_blocks; blk++) { - AC3Block *block = &s->blocks[blk]; - for (ch = 1; ch <= s->channels; ch++) { - s->ac3dsp.ac3_rshift_int32(block->mdct_coef[ch], AC3_MAX_COEFS, - block->coeff_shift[ch]); - } - } -} - static void sum_square_butterfly(AC3EncodeContext *s, int64_t sum[4], const int32_t *coef0, const int32_t *coef1, int len) @@ -120,7 +91,6 @@ static av_cold void ac3_fixed_mdct_end(AC3EncodeContext *s) ff_mdct_end(&s->mdct); } - /** * Initialize MDCT tables. * @@ -129,9 +99,24 @@ static av_cold void ac3_fixed_mdct_end(AC3EncodeContext *s) */ static av_cold int ac3_fixed_mdct_init(AC3EncodeContext *s) { - int ret = ff_mdct_init(&s->mdct, 9, 0, -1.0); - s->mdct_window = ff_ac3_window; - return ret; + float fwin[AC3_BLOCK_SIZE]; + + int32_t *iwin = av_malloc_array(AC3_WINDOW_SIZE, sizeof(*iwin)); + if (!iwin) + return AVERROR(ENOMEM); + + ff_kbd_window_init(fwin, 5.0, AC3_WINDOW_SIZE/2); + + for (int i = 0; i < AC3_WINDOW_SIZE/2; i++) + iwin[AC3_WINDOW_SIZE-1-i] = lrintf(fwin[i] * (1 << 22)); + + s->mdct_window = iwin; + + s->fdsp = avpriv_alloc_fixed_dsp(s->avctx->flags & AV_CODEC_FLAG_BITEXACT); + if (!s->fdsp) + return AVERROR(ENOMEM); + + return ff_mdct_init(&s->mdct, 9, 0, -1.0); } @@ -155,7 +140,7 @@ AVCodec ff_ac3_fixed_encoder = { .init = ac3_fixed_encode_init, .encode2 = ff_ac3_fixed_encode_frame, .close = ff_ac3_encode_close, - .sample_fmts = (const enum AVSampleFormat[]){ AV_SAMPLE_FMT_S16P, + .sample_fmts = (const enum AVSampleFormat[]){ AV_SAMPLE_FMT_S32P, AV_SAMPLE_FMT_NONE }, .priv_class = &ac3enc_class, .caps_internal = FF_CODEC_CAP_INIT_THREADSAFE | FF_CODEC_CAP_INIT_CLEANUP, diff --git a/libavcodec/ac3enc_float.c b/libavcodec/ac3enc_float.c index 45bfed34f9..b17b3a2365 100644 --- a/libavcodec/ac3enc_float.c +++ b/libavcodec/ac3enc_float.c @@ -97,7 +97,6 @@ static void sum_square_butterfly(AC3EncodeContext *s, float sum[4], static av_cold void ac3_float_mdct_end(AC3EncodeContext *s) { ff_mdct_end(&s->mdct); - av_freep(&s->mdct_window); } diff --git a/libavcodec/ac3enc_template.c b/libavcodec/ac3enc_template.c index 0fdc95b968..de6eba71d8 100644 --- a/libavcodec/ac3enc_template.c +++ b/libavcodec/ac3enc_template.c @@ -91,18 +91,11 @@ static void apply_mdct(AC3EncodeContext *s) AC3Block *block = &s->blocks[blk]; const SampleType *input_samples = &s->planar_samples[ch][blk * AC3_BLOCK_SIZE]; -#if AC3ENC_FLOAT s->fdsp->vector_fmul(s->windowed_samples, input_samples, - s->mdct_window, AC3_WINDOW_SIZE); -#else - s->ac3dsp.apply_window_int16(s->windowed_samples, input_samples, - s->mdct_window, AC3_WINDOW_SIZE); - - block->coeff_shift[ch + 1] = normalize_samples(s); -#endif + s->mdct_window, AC3_WINDOW_SIZE); - s->mdct.mdct_calcw(&s->mdct, block->mdct_coef[ch+1], - s->windowed_samples); + s->mdct.mdct_calc(&s->mdct, block->mdct_coef[ch+1], + s->windowed_samples); } } } @@ -390,9 +383,6 @@ int AC3_NAME(encode_frame)(AVCodecContext *avctx, AVPacket *avpkt, apply_mdct(s); - if (!AC3ENC_FLOAT) - scale_coefficients(s); - clip_coefficients(&s->adsp, s->blocks[0].mdct_coef[1], AC3_MAX_COEFS * s->num_blocks * s->channels); @@ -404,8 +394,9 @@ int AC3_NAME(encode_frame)(AVCodecContext *avctx, AVPacket *avpkt, compute_rematrixing_strategy(s); - if (AC3ENC_FLOAT) - scale_coefficients(s); +#if AC3ENC_FLOAT + scale_coefficients(s); +#endif return ff_ac3_encode_frame_common_end(avctx, avpkt, frame, got_packet_ptr); } diff --git a/libavcodec/version.h b/libavcodec/version.h index 1420439044..cd871f0fa0 100644 --- a/libavcodec/version.h +++ b/libavcodec/version.h @@ -28,7 +28,7 @@ #include "libavutil/version.h" #define LIBAVCODEC_VERSION_MAJOR 58 -#define LIBAVCODEC_VERSION_MINOR 116 +#define LIBAVCODEC_VERSION_MINOR 117 #define LIBAVCODEC_VERSION_MICRO 100 #define LIBAVCODEC_VERSION_INT AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \ diff --git a/tests/fate/ac3.mak b/tests/fate/ac3.mak index 757cd51cf2..d76e22bade 100644 --- a/tests/fate/ac3.mak +++ b/tests/fate/ac3.mak @@ -90,7 +90,7 @@ fate-ac3-fixed-encode: tests/data/asynth-44100-2.wav fate-ac3-fixed-encode: SRC = $(TARGET_PATH)/tests/data/asynth-44100-2.wav fate-ac3-fixed-encode: CMD = md5 -i $(SRC) -c ac3_fixed -ab 128k -f ac3 -flags +bitexact -af aresample fate-ac3-fixed-encode: CMP = oneline -fate-ac3-fixed-encode: REF = a1d1fc116463b771abf5aef7ed37d7b1 +fate-ac3-fixed-encode: REF = 1f548175e11a95e62ce20e442fcc8d08 FATE_EAC3-$(call ALLYES, EAC3_DEMUXER EAC3_MUXER EAC3_CORE_BSF) += fate-eac3-core-bsf fate-eac3-core-bsf: CMD = md5pipe -i $(TARGET_SAMPLES)/eac3/the_great_wall_7.1.eac3 -c:a copy -bsf:a eac3_core -fflags +bitexact -f eac3 diff --git a/tests/fate/ffmpeg.mak b/tests/fate/ffmpeg.mak index c6d8dc2e5c..4dfb77d250 100644 --- a/tests/fate/ffmpeg.mak +++ b/tests/fate/ffmpeg.mak @@ -83,7 +83,7 @@ fate-unknown_layout-pcm: CMD = md5 \ FATE_FFMPEG-$(call ALLYES, PCM_S16LE_DEMUXER AC3_MUXER PCM_S16LE_DECODER AC3_FIXED_ENCODER) += fate-unknown_layout-ac3 fate-unknown_layout-ac3: $(AREF) fate-unknown_layout-ac3: CMD = md5 -auto_conversion_filters \ - -guess_layout_max 0 -f s16le -ac 1 -ar 44100 -i $(TARGET_PATH)/$(AREF) \ + -guess_layout_max 0 -f s32le -ac 1 -ar 44100 -i $(TARGET_PATH)/$(AREF) \ -f ac3 -flags +bitexact -c ac3_fixed diff --git a/tests/ref/fate/unknown_layout-ac3 b/tests/ref/fate/unknown_layout-ac3 index d332efcec4..719a44aacf 100644 --- a/tests/ref/fate/unknown_layout-ac3 +++ b/tests/ref/fate/unknown_layout-ac3 @@ -1 +1 @@ -bbb7550d6d93973c10f4ee13c87cf799 +febdb165cfd6cba375aa086195e61213 diff --git a/tests/ref/lavf/rm b/tests/ref/lavf/rm index 43ea4c7897..fc2a6564a2 100644 --- a/tests/ref/lavf/rm +++ b/tests/ref/lavf/rm @@ -1,2 +1,2 @@ -e30681d05d6f3d24108d3614600bf116 *tests/data/lavf/lavf.rm +8dfb8d4556d61d3615e0d0012ffe540c *tests/data/lavf/lavf.rm 346424 tests/data/lavf/lavf.rm