From patchwork Wed Jan  6 23:13:05 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
X-Patchwork-Id: 24814
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by ffaux.localdomain (Postfix) with ESMTP id 91F7C44A498
	for <patchwork@ffaux-bg.ffmpeg.org>; Thu,  7 Jan 2021 01:13:55 +0200 (EET)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 76B3268A2CA;
	Thu,  7 Jan 2021 01:13:55 +0200 (EET)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mail-ed1-f41.google.com (mail-ed1-f41.google.com
 [209.85.208.41])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 3E3C268A18D
 for <ffmpeg-devel@ffmpeg.org>; Thu,  7 Jan 2021 01:13:47 +0200 (EET)
Received: by mail-ed1-f41.google.com with SMTP id g24so5916700edw.9
 for <ffmpeg-devel@ffmpeg.org>; Wed, 06 Jan 2021 15:13:47 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references:reply-to
 :mime-version:content-transfer-encoding;
 bh=4zzgHJdDT3mkc3mX+H1gAOZw28DQkKcVOShtm2nDqYo=;
 b=BQrN/9EiSUfmqvHebyakApssATAvsAVZnus+73WPo3upGVeiynhMq5eR2r++FD/mpm
 DL82I9IIIDvyyVpuj457wREGXUbyBwN7+YuJfIoj5Zke++9S0lrSwDQmNW6/CDhSxG/T
 cNkAFww4T4AfpK5qgRhqprvCB0YMDZ+zmz2Czg/wp7Os+8c9VHCAEWGEOfnWCjLEKLom
 KmdcuML6YN7MsLtzu/uKs5GTFrrvYxb6JENIcE8Wqs326HnmRlfx0+bl29zDwxfT/hKA
 YBy+mWLjwAqVri/t1Q1LUi7JbGTJp+fhd5VDg72tTpzK0v8gdyZPhrnbeE3uhaMp23BY
 F1Jw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references:reply-to:mime-version:content-transfer-encoding;
 bh=4zzgHJdDT3mkc3mX+H1gAOZw28DQkKcVOShtm2nDqYo=;
 b=MKLDcyhzY4dGBal52S50eOXmLxF9JiQeP2iPDD6S3NhLeyPgHf6xZSmDnApGscJTUH
 yljVXn6NduAqyp4qun0Y4AkUaFcmDeaB/iabdQ2Mrtu7lruVkbGKEGh4UrsMdXH5jw4K
 IF5H7hPhuvfNpkL4rFU2GxJavYttGUswTIVxYnQWaANVNicuHGthoDDjUaJf7b+zIAyy
 7C1HXmde+0WKzooElTaX/LL+CsLQ/vbQtIc9UwCUzY6EQwWETKtG7wq6/QFh+Twg/lui
 jMaOzd4UthDTuT/GrC0r2VK/TlztDSWMUOlJtEOaC2gwBrCcOeZP3ckfWGpEmQhz3vCS
 mRqQ==
X-Gm-Message-State: AOAM5339na1DV8No/tHNbUlBT7kiCw2dGekUSDGt8SVM5enwQHuOzyzE
 JXlc7/4smKa1t5gcQJigzfQCaJD0e7U=
X-Google-Smtp-Source: 
 ABdhPJzXlU6Gb2SxHhy/2ArjTbGL2c+er1oto9WzQsG6yUXkHTqrD80YAY7C+/KthGoRJScDGcoifw==
X-Received: by 2002:a05:6402:171a:: with SMTP id
 y26mr5636430edu.371.1609974826352;
 Wed, 06 Jan 2021 15:13:46 -0800 (PST)
Received: from sblaptop.fritz.box (ipbcc1aa4b.dynamic.kabel-deutschland.de.
 [188.193.170.75])
 by smtp.gmail.com with ESMTPSA id b7sm1794295ejz.4.2021.01.06.15.13.45
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 06 Jan 2021 15:13:45 -0800 (PST)
From: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Date: Thu,  7 Jan 2021 00:13:05 +0100
Message-Id: <20210106231308.2952217-2-andreas.rheinhardt@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210106231308.2952217-1-andreas.rheinhardt@gmail.com>
References: <20210106231308.2952217-1-andreas.rheinhardt@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 2/5] avcodec/fft_template: Remove unused
	fixed-point cosine tables
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

There are three types of FFTs: floating-point, 32-bit fixed-point and
16-bit fixed-point. The last of these has exactly one user: the
fixed-point AC-3 encoder, which only uses cosine tables for up to seven
bits. The tables corresponding to eight to seventeen bits are unused, as
are the FFT functions for these bit counts.

Therefore this commit removes these tables and functions. This is
especially beneficial when using hardcoded tables, as the removed tables
take up more than 255 KiB. But even without hardcoded tables one saves
the unused functions as well as the entries in the corresponding tables
(which also saves relocations).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
---
The changes to the ARM assembly are honestly untested. I hope someone can
test them. Btw: It seems that the ARM assembly code wouldn't be able to
deal with an FFT with more than 16 bits (no function for this has been
defined), which only worked because no one ever used that many bits with
the fixed-point FFT.
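
For reference, the "more than 255 KiB" figure from the commit message can
be reproduced with a small sketch like the one below. It is not part of
the patch. It assumes that a 16-bit fixed-point sample is an int16_t and
that the cosine table for an FFT of 2^bits points holds 2^bits / 2
entries; the names fixed_sample, cos_table_bytes and cos_tables_bytes are
hypothetical and only used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the 16-bit fixed-point FFT stores each sample as int16_t. */
typedef int16_t fixed_sample;

/* Assumption: the cosine table for 2^bits points has 2^bits / 2 entries. */
static size_t cos_table_bytes(int bits)
{
    assert(bits >= 1 && bits < 31);
    size_t points = (size_t)1 << bits;
    return points / 2 * sizeof(fixed_sample);
}

/* Sum of the table sizes for all bit counts in [first_bits, last_bits]. */
static size_t cos_tables_bytes(int first_bits, int last_bits)
{
    size_t total = 0;
    for (int bits = first_bits; bits <= last_bits; bits++)
        total += cos_table_bytes(bits);
    return total;
}
```

Under these assumptions, cos_tables_bytes(8, 17) gives 261888 bytes
(255.75 KiB) for the removed tables, while the tables kept for 4 to 7
bits sum to only 240 bytes.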

 libavcodec/arm/fft_fixed_neon.S | 18 ------------------
 libavcodec/cos_tablegen.c       |  4 ++--
 libavcodec/fft.h                |  4 +++-
 libavcodec/fft_fixed.c          |  1 +
 libavcodec/fft_template.c       | 31 +++++++++++++++++++++++--------
 tests/fate/fft.mak              |  8 ++++++--
 6 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/libavcodec/arm/fft_fixed_neon.S b/libavcodec/arm/fft_fixed_neon.S
index 2651607544..c94da56f80 100644
--- a/libavcodec/arm/fft_fixed_neon.S
+++ b/libavcodec/arm/fft_fixed_neon.S
@@ -223,15 +223,6 @@ endfunc
         def_fft    32,    16,     8
         def_fft    64,    32,    16
         def_fft   128,    64,    32
-        def_fft   256,   128,    64
-        def_fft   512,   256,   128
-        def_fft  1024,   512,   256
-        def_fft  2048,  1024,   512
-        def_fft  4096,  2048,  1024
-        def_fft  8192,  4096,  2048
-        def_fft 16384,  8192,  4096
-        def_fft 32768, 16384,  8192
-        def_fft 65536, 32768, 16384
 
 function ff_fft_fixed_calc_neon, export=1
         ldr             r2,  [r0]
@@ -249,13 +240,4 @@ const   fft_fixed_tab_neon, relocate=1
         .word fft32_neon
         .word fft64_neon
         .word fft128_neon
-        .word fft256_neon
-        .word fft512_neon
-        .word fft1024_neon
-        .word fft2048_neon
-        .word fft4096_neon
-        .word fft8192_neon
-        .word fft16384_neon
-        .word fft32768_neon
-        .word fft65536_neon
 endconst
diff --git a/libavcodec/cos_tablegen.c b/libavcodec/cos_tablegen.c
index 7206aad5dd..5929c29e1a 100644
--- a/libavcodec/cos_tablegen.c
+++ b/libavcodec/cos_tablegen.c
@@ -26,7 +26,6 @@
 
 #include "libavutil/mathematics.h"
 
-#define BITS 17
 #define FLOATFMT "%.18e"
 #define FIXEDFMT "%6d"
 
@@ -56,12 +55,13 @@ int main(int argc, char *argv[])
     int i, j;
     int do_sin = argc > 1 && !strcmp(argv[1], "sin");
     int fixed  = argc > 1 &&  strstr(argv[1], "fixed");
+    int bits   = fixed ? 7 : 17;
     double (*func)(double) = do_sin ? sin : cos;
 
     printf("/* This file was automatically generated. */\n");
     printf("#define FFT_FLOAT %d\n", !fixed);
     printf("#include \"libavcodec/%s\"\n", do_sin ? "rdft.h" : "fft.h");
-    for (i = 4; i <= BITS; i++) {
+    for (i = 4; i <= bits; i++) {
         int m = 1 << i;
         double freq = 2*M_PI/m;
         printf("%s(%i) = {\n   ", do_sin ? "SINTABLE" : "COSTABLE", m);
diff --git a/libavcodec/fft.h b/libavcodec/fft.h
index 5f67b61f06..fedc0c5ef0 100644
--- a/libavcodec/fft.h
+++ b/libavcodec/fft.h
@@ -127,6 +127,7 @@ extern COSTABLE(16);
 extern COSTABLE(32);
 extern COSTABLE(64);
 extern COSTABLE(128);
+#if FFT_FLOAT
 extern COSTABLE(256);
 extern COSTABLE(512);
 extern COSTABLE(1024);
@@ -137,7 +138,8 @@ extern COSTABLE(16384);
 extern COSTABLE(32768);
 extern COSTABLE(65536);
 extern COSTABLE(131072);
-extern COSTABLE_CONST FFTSample* const FFT_NAME(ff_cos_tabs)[18];
+#endif /* FFT_FLOAT */
+extern COSTABLE_CONST FFTSample* const FFT_NAME(ff_cos_tabs)[];
 
 #define ff_init_ff_cos_tabs FFT_NAME(ff_init_ff_cos_tabs)
 
diff --git a/libavcodec/fft_fixed.c b/libavcodec/fft_fixed.c
index 3d3bd2fca6..52d225ee09 100644
--- a/libavcodec/fft_fixed.c
+++ b/libavcodec/fft_fixed.c
@@ -18,4 +18,5 @@
 
 #define FFT_FLOAT 0
 #define FFT_FIXED_32 0
+#define MAX_BITS 7
 #include "fft_template.c"
diff --git a/libavcodec/fft_template.c b/libavcodec/fft_template.c
index 8825e39f79..7a7d51a6b4 100644
--- a/libavcodec/fft_template.c
+++ b/libavcodec/fft_template.c
@@ -33,6 +33,10 @@
 #include "fft.h"
 #include "fft-internal.h"
 
+#ifndef MAX_BITS
+#define MAX_BITS 17
+#endif
+
 #if FFT_FIXED_32
 #include "fft_table.h"
 #else /* FFT_FIXED_32 */
@@ -43,6 +47,7 @@ COSTABLE(16);
 COSTABLE(32);
 COSTABLE(64);
 COSTABLE(128);
+#if FFT_FLOAT
 COSTABLE(256);
 COSTABLE(512);
 COSTABLE(1024);
@@ -53,6 +58,7 @@ COSTABLE(16384);
 COSTABLE(32768);
 COSTABLE(65536);
 COSTABLE(131072);
+#endif /* FFT_FLOAT */
 
 static av_cold void init_ff_cos_tabs(int index)
 {
@@ -81,6 +87,7 @@ INIT_FF_COS_TABS_FUNC(4, 16)
 INIT_FF_COS_TABS_FUNC(5, 32)
 INIT_FF_COS_TABS_FUNC(6, 64)
 INIT_FF_COS_TABS_FUNC(7, 128)
+#if FFT_FLOAT
 INIT_FF_COS_TABS_FUNC(8, 256)
 INIT_FF_COS_TABS_FUNC(9, 512)
 INIT_FF_COS_TABS_FUNC(10, 1024)
@@ -91,6 +98,7 @@ INIT_FF_COS_TABS_FUNC(14, 16384)
 INIT_FF_COS_TABS_FUNC(15, 32768)
 INIT_FF_COS_TABS_FUNC(16, 65536)
 INIT_FF_COS_TABS_FUNC(17, 131072)
+#endif /* FFT_FLOAT */
 
 static CosTabsInitOnce cos_tabs_init_once[] = {
     { NULL },
@@ -101,6 +109,7 @@ static CosTabsInitOnce cos_tabs_init_once[] = {
     { init_ff_cos_tabs_32, AV_ONCE_INIT },
     { init_ff_cos_tabs_64, AV_ONCE_INIT },
     { init_ff_cos_tabs_128, AV_ONCE_INIT },
+#if FFT_FLOAT
     { init_ff_cos_tabs_256, AV_ONCE_INIT },
     { init_ff_cos_tabs_512, AV_ONCE_INIT },
     { init_ff_cos_tabs_1024, AV_ONCE_INIT },
@@ -111,6 +120,7 @@ static CosTabsInitOnce cos_tabs_init_once[] = {
     { init_ff_cos_tabs_32768, AV_ONCE_INIT },
     { init_ff_cos_tabs_65536, AV_ONCE_INIT },
     { init_ff_cos_tabs_131072, AV_ONCE_INIT },
+#endif /* FFT_FLOAT */
 };
 
 #endif
@@ -120,6 +130,7 @@ COSTABLE_CONST FFTSample * const FFT_NAME(ff_cos_tabs)[] = {
     FFT_NAME(ff_cos_32),
     FFT_NAME(ff_cos_64),
     FFT_NAME(ff_cos_128),
+#if FFT_FLOAT
     FFT_NAME(ff_cos_256),
     FFT_NAME(ff_cos_512),
     FFT_NAME(ff_cos_1024),
@@ -130,6 +141,7 @@ COSTABLE_CONST FFTSample * const FFT_NAME(ff_cos_tabs)[] = {
     FFT_NAME(ff_cos_32768),
     FFT_NAME(ff_cos_65536),
     FFT_NAME(ff_cos_131072),
+#endif /* FFT_FLOAT */
 };
 
 #endif /* FFT_FIXED_32 */
@@ -200,7 +212,7 @@ av_cold int ff_fft_init(FFTContext *s, int nbits, int inverse)
     s->revtab = NULL;
     s->revtab32 = NULL;
 
-    if (nbits < 2 || nbits > 17)
+    if (nbits < 2 || nbits > MAX_BITS)
         goto fail;
     s->nbits = nbits;
     n = 1 << nbits;
@@ -537,11 +549,6 @@ static void name(FFTComplex *z, const FFTSample *wre, unsigned int n)\
 }
 
 PASS(pass)
-#if !CONFIG_SMALL
-#undef BUTTERFLIES
-#define BUTTERFLIES BUTTERFLIES_BIG
-PASS(pass_big)
-#endif
 
 #define DECL_FFT(n,n2,n4)\
 static void fft##n(FFTComplex *z)\
@@ -603,9 +610,13 @@ DECL_FFT(16,8,4)
 DECL_FFT(32,16,8)
 DECL_FFT(64,32,16)
 DECL_FFT(128,64,32)
+#if FFT_FLOAT
 DECL_FFT(256,128,64)
 DECL_FFT(512,256,128)
 #if !CONFIG_SMALL
+#undef BUTTERFLIES
+#define BUTTERFLIES BUTTERFLIES_BIG
+PASS(pass_big)
 #define pass pass_big
 #endif
 DECL_FFT(1024,512,256)
@@ -616,10 +627,14 @@ DECL_FFT(16384,8192,4096)
 DECL_FFT(32768,16384,8192)
 DECL_FFT(65536,32768,16384)
 DECL_FFT(131072,65536,32768)
+#endif /* FFT_FLOAT */
 
 static void (* const fft_dispatch[])(FFTComplex*) = {
-    fft4, fft8, fft16, fft32, fft64, fft128, fft256, fft512, fft1024,
-    fft2048, fft4096, fft8192, fft16384, fft32768, fft65536, fft131072
+    fft4, fft8, fft16, fft32, fft64, fft128,
+#if FFT_FLOAT
+    fft256, fft512, fft1024, fft2048, fft4096,
+    fft8192, fft16384, fft32768, fft65536, fft131072
+#endif /* FFT_FLOAT */
 };
 
 static void fft_calc_c(FFTContext *s, FFTComplex *z)
diff --git a/tests/fate/fft.mak b/tests/fate/fft.mak
index 5da6e687ec..3eb8450d94 100644
--- a/tests/fate/fft.mak
+++ b/tests/fate/fft.mak
@@ -28,15 +28,19 @@ $(FATE_FFT_ALL): CMD = run libavcodec/tests/fft$(EXESUF) $(CPUFLAGS:%=-c%) $(ARG
 
 define DEF_FFT_FIXED
 FATE_FFT_FIXED-$(CONFIG_FFT)   += fate-fft-fixed-$(1)  fate-ifft-fixed-$(1)
-FATE_MDCT_FIXED-$(CONFIG_MDCT) += fate-mdct-fixed-$(1) fate-imdct-fixed-$(1)
 
 fate-fft-fixed-$(1):   ARGS = -n$(1)
 fate-ifft-fixed-$(1):  ARGS = -n$(1) -i
+endef
+define DEF_MDCT_FIXED
+FATE_MDCT_FIXED-$(CONFIG_MDCT) += fate-mdct-fixed-$(1) fate-imdct-fixed-$(1)
+
 fate-mdct-fixed-$(1):  ARGS = -n$(1) -m
 fate-imdct-fixed-$(1): ARGS = -n$(1) -m -i
 endef
 
-$(foreach N, 4 5 6 7 8 9 10 11 12, $(eval $(call DEF_FFT_FIXED,$(N))))
+$(foreach N, 4 5 6 7,     $(eval $(call DEF_FFT_FIXED,$(N))))
+$(foreach N, 4 5 6 7 8 9, $(eval $(call DEF_MDCT_FIXED,$(N))))
 
 fate-fft-fixed: $(FATE_FFT_FIXED-yes)
 fate-mdct-fixed: $(FATE_MDCT_FIXED-yes)