From patchwork Wed Jan 6 23:13:05 2021
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 24814
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Cc: Andreas Rheinhardt
Date: Thu, 7 Jan 2021 00:13:05 +0100
Message-Id: <20210106231308.2952217-2-andreas.rheinhardt@gmail.com>
In-Reply-To: <20210106231308.2952217-1-andreas.rheinhardt@gmail.com>
References: <20210106231308.2952217-1-andreas.rheinhardt@gmail.com>
Subject: [FFmpeg-devel] [PATCH 2/5] avcodec/fft_template: Remove unused fixed-point cosine tables

There are three types of FFTs: floating-point, 32-bit fixed-point and
16-bit fixed-point. The last of these has exactly one user: the
fixed-point AC-3 encoder, whose cosine tables use at most seven bits.
The tables corresponding to eight to seventeen bits are unused, as are
the FFT functions for these bits. Therefore this commit removes these
tables and functions.
This is especially beneficial when using hardcoded tables, as the
removed tables take up more than 255 KiB. But even without hardcoded
tables one saves the unused functions as well as entries in the
corresponding tables (which also saves relocations).

Signed-off-by: Andreas Rheinhardt
---
These changes to the ARM assembly are honestly untested. I hope someone
can test them.
Btw: It seems that the ARM assembly code wouldn't be able to deal with
an FFT with more than 16 bits (no function for this has been defined),
which only worked because no one ever used that many bits with the
fixed-point FFT.

 libavcodec/arm/fft_fixed_neon.S | 18 ------------------
 libavcodec/cos_tablegen.c       |  4 ++--
 libavcodec/fft.h                |  4 +++-
 libavcodec/fft_fixed.c          |  1 +
 libavcodec/fft_template.c       | 31 +++++++++++++++++++++++--------
 tests/fate/fft.mak              |  8 ++++++--
 6 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/libavcodec/arm/fft_fixed_neon.S b/libavcodec/arm/fft_fixed_neon.S
index 2651607544..c94da56f80 100644
--- a/libavcodec/arm/fft_fixed_neon.S
+++ b/libavcodec/arm/fft_fixed_neon.S
@@ -223,15 +223,6 @@ endfunc
         def_fft 32, 16, 8
         def_fft 64, 32, 16
         def_fft 128, 64, 32
-        def_fft 256, 128, 64
-        def_fft 512, 256, 128
-        def_fft 1024, 512, 256
-        def_fft 2048, 1024, 512
-        def_fft 4096, 2048, 1024
-        def_fft 8192, 4096, 2048
-        def_fft 16384, 8192, 4096
-        def_fft 32768, 16384, 8192
-        def_fft 65536, 32768, 16384

 function ff_fft_fixed_calc_neon, export=1
         ldr r2, [r0]
@@ -249,13 +240,4 @@ const fft_fixed_tab_neon, relocate=1
         .word fft32_neon
         .word fft64_neon
         .word fft128_neon
-        .word fft256_neon
-        .word fft512_neon
-        .word fft1024_neon
-        .word fft2048_neon
-        .word fft4096_neon
-        .word fft8192_neon
-        .word fft16384_neon
-        .word fft32768_neon
-        .word fft65536_neon
 endconst
diff --git a/libavcodec/cos_tablegen.c b/libavcodec/cos_tablegen.c
index 7206aad5dd..5929c29e1a 100644
--- a/libavcodec/cos_tablegen.c
+++ b/libavcodec/cos_tablegen.c
@@ -26,7 +26,6 @@

 #include "libavutil/mathematics.h"

-#define BITS 17
 #define FLOATFMT "%.18e"
 #define FIXEDFMT "%6d"

@@ -56,12 +55,13 @@ int main(int argc, char *argv[])
     int i, j;
     int do_sin = argc > 1 && !strcmp(argv[1], "sin");
     int fixed = argc > 1 && strstr(argv[1], "fixed");
+    int bits = fixed ? 7 : 17;
     double (*func)(double) = do_sin ? sin : cos;

     printf("/* This file was automatically generated. */\n");
     printf("#define FFT_FLOAT %d\n", !fixed);
     printf("#include \"libavcodec/%s\"\n", do_sin ? "rdft.h" : "fft.h");
-    for (i = 4; i <= BITS; i++) {
+    for (i = 4; i <= bits; i++) {
         int m = 1 << i;
         double freq = 2*M_PI/m;
         printf("%s(%i) = {\n ", do_sin ? "SINTABLE" : "COSTABLE", m);
diff --git a/libavcodec/fft.h b/libavcodec/fft.h
index 5f67b61f06..fedc0c5ef0 100644
--- a/libavcodec/fft.h
+++ b/libavcodec/fft.h
@@ -127,6 +127,7 @@ extern COSTABLE(16);
 extern COSTABLE(32);
 extern COSTABLE(64);
 extern COSTABLE(128);
+#if FFT_FLOAT
 extern COSTABLE(256);
 extern COSTABLE(512);
 extern COSTABLE(1024);
@@ -137,7 +138,8 @@ extern COSTABLE(16384);
 extern COSTABLE(32768);
 extern COSTABLE(65536);
 extern COSTABLE(131072);
-extern COSTABLE_CONST FFTSample* const FFT_NAME(ff_cos_tabs)[18];
+#endif /* FFT_FLOAT */
+extern COSTABLE_CONST FFTSample* const FFT_NAME(ff_cos_tabs)[];

 #define ff_init_ff_cos_tabs FFT_NAME(ff_init_ff_cos_tabs)

diff --git a/libavcodec/fft_fixed.c b/libavcodec/fft_fixed.c
index 3d3bd2fca6..52d225ee09 100644
--- a/libavcodec/fft_fixed.c
+++ b/libavcodec/fft_fixed.c
@@ -18,4 +18,5 @@

 #define FFT_FLOAT 0
 #define FFT_FIXED_32 0
+#define MAX_BITS 7
 #include "fft_template.c"
diff --git a/libavcodec/fft_template.c b/libavcodec/fft_template.c
index 8825e39f79..7a7d51a6b4 100644
--- a/libavcodec/fft_template.c
+++ b/libavcodec/fft_template.c
@@ -33,6 +33,10 @@
 #include "fft.h"
 #include "fft-internal.h"

+#ifndef MAX_BITS
+#define MAX_BITS 17
+#endif
+
 #if FFT_FIXED_32
 #include "fft_table.h"
 #else /* FFT_FIXED_32 */
@@ -43,6 +47,7 @@ COSTABLE(16);
 COSTABLE(32);
 COSTABLE(64);
 COSTABLE(128);
+#if FFT_FLOAT
 COSTABLE(256);
 COSTABLE(512);
 COSTABLE(1024);
@@ -53,6 +58,7 @@ COSTABLE(16384);
 COSTABLE(32768);
 COSTABLE(65536);
 COSTABLE(131072);
+#endif /* FFT_FLOAT */

 static av_cold void init_ff_cos_tabs(int index)
 {
@@ -81,6 +87,7 @@ INIT_FF_COS_TABS_FUNC(4, 16)
 INIT_FF_COS_TABS_FUNC(5, 32)
 INIT_FF_COS_TABS_FUNC(6, 64)
 INIT_FF_COS_TABS_FUNC(7, 128)
+#if FFT_FLOAT
 INIT_FF_COS_TABS_FUNC(8, 256)
 INIT_FF_COS_TABS_FUNC(9, 512)
 INIT_FF_COS_TABS_FUNC(10, 1024)
@@ -91,6 +98,7 @@ INIT_FF_COS_TABS_FUNC(14, 16384)
 INIT_FF_COS_TABS_FUNC(15, 32768)
 INIT_FF_COS_TABS_FUNC(16, 65536)
 INIT_FF_COS_TABS_FUNC(17, 131072)
+#endif /* FFT_FLOAT */

 static CosTabsInitOnce cos_tabs_init_once[] = {
     { NULL },
@@ -101,6 +109,7 @@ static CosTabsInitOnce cos_tabs_init_once[] = {
     { init_ff_cos_tabs_32, AV_ONCE_INIT },
     { init_ff_cos_tabs_64, AV_ONCE_INIT },
     { init_ff_cos_tabs_128, AV_ONCE_INIT },
+#if FFT_FLOAT
     { init_ff_cos_tabs_256, AV_ONCE_INIT },
     { init_ff_cos_tabs_512, AV_ONCE_INIT },
     { init_ff_cos_tabs_1024, AV_ONCE_INIT },
@@ -111,6 +120,7 @@ static CosTabsInitOnce cos_tabs_init_once[] = {
     { init_ff_cos_tabs_32768, AV_ONCE_INIT },
     { init_ff_cos_tabs_65536, AV_ONCE_INIT },
     { init_ff_cos_tabs_131072, AV_ONCE_INIT },
+#endif /* FFT_FLOAT */
 };

 #endif
@@ -120,6 +130,7 @@ COSTABLE_CONST FFTSample * const FFT_NAME(ff_cos_tabs)[] = {
     FFT_NAME(ff_cos_32),
     FFT_NAME(ff_cos_64),
     FFT_NAME(ff_cos_128),
+#if FFT_FLOAT
     FFT_NAME(ff_cos_256),
     FFT_NAME(ff_cos_512),
     FFT_NAME(ff_cos_1024),
@@ -130,6 +141,7 @@ COSTABLE_CONST FFTSample * const FFT_NAME(ff_cos_tabs)[] = {
     FFT_NAME(ff_cos_32768),
     FFT_NAME(ff_cos_65536),
     FFT_NAME(ff_cos_131072),
+#endif /* FFT_FLOAT */
 };

 #endif /* FFT_FIXED_32 */
@@ -200,7 +212,7 @@ av_cold int ff_fft_init(FFTContext *s, int nbits, int inverse)
     s->revtab = NULL;
     s->revtab32 = NULL;

-    if (nbits < 2 || nbits > 17)
+    if (nbits < 2 || nbits > MAX_BITS)
         goto fail;
     s->nbits = nbits;
     n = 1 << nbits;
@@ -537,11 +549,6 @@ static void name(FFTComplex *z, const FFTSample *wre, unsigned int n)\
 }

 PASS(pass)
-#if !CONFIG_SMALL
-#undef BUTTERFLIES
-#define BUTTERFLIES BUTTERFLIES_BIG
-PASS(pass_big)
-#endif

 #define DECL_FFT(n,n2,n4)\
 static void fft##n(FFTComplex *z)\
@@ -603,9 +610,13 @@ DECL_FFT(16,8,4)
 DECL_FFT(32,16,8)
 DECL_FFT(64,32,16)
 DECL_FFT(128,64,32)
+#if FFT_FLOAT
 DECL_FFT(256,128,64)
 DECL_FFT(512,256,128)
 #if !CONFIG_SMALL
+#undef BUTTERFLIES
+#define BUTTERFLIES BUTTERFLIES_BIG
+PASS(pass_big)
 #define pass pass_big
 #endif
 DECL_FFT(1024,512,256)
@@ -616,10 +627,14 @@ DECL_FFT(16384,8192,4096)
 DECL_FFT(32768,16384,8192)
 DECL_FFT(65536,32768,16384)
 DECL_FFT(131072,65536,32768)
+#endif /* FFT_FLOAT */

 static void (* const fft_dispatch[])(FFTComplex*) = {
-    fft4, fft8, fft16, fft32, fft64, fft128, fft256, fft512, fft1024,
-    fft2048, fft4096, fft8192, fft16384, fft32768, fft65536, fft131072
+    fft4, fft8, fft16, fft32, fft64, fft128,
+#if FFT_FLOAT
+    fft256, fft512, fft1024, fft2048, fft4096,
+    fft8192, fft16384, fft32768, fft65536, fft131072
+#endif /* FFT_FLOAT */
 };

 static void fft_calc_c(FFTContext *s, FFTComplex *z)
diff --git a/tests/fate/fft.mak b/tests/fate/fft.mak
index 5da6e687ec..3eb8450d94 100644
--- a/tests/fate/fft.mak
+++ b/tests/fate/fft.mak
@@ -28,15 +28,19 @@ $(FATE_FFT_ALL): CMD = run libavcodec/tests/fft$(EXESUF) $(CPUFLAGS:%=-c%) $(ARG

 define DEF_FFT_FIXED
 FATE_FFT_FIXED-$(CONFIG_FFT) += fate-fft-fixed-$(1) fate-ifft-fixed-$(1)
-FATE_MDCT_FIXED-$(CONFIG_MDCT) += fate-mdct-fixed-$(1) fate-imdct-fixed-$(1)

 fate-fft-fixed-$(1): ARGS = -n$(1)
 fate-ifft-fixed-$(1): ARGS = -n$(1) -i
+endef
+define DEF_MDCT_FIXED
+FATE_MDCT_FIXED-$(CONFIG_MDCT) += fate-mdct-fixed-$(1) fate-imdct-fixed-$(1)
+
 fate-mdct-fixed-$(1): ARGS = -n$(1) -m
 fate-imdct-fixed-$(1): ARGS = -n$(1) -m -i
 endef

-$(foreach N, 4 5 6 7 8 9 10 11 12, $(eval $(call DEF_FFT_FIXED,$(N))))
+$(foreach N, 4 5 6 7, $(eval $(call DEF_FFT_FIXED,$(N))))
+$(foreach N, 4 5 6 7 8 9, $(eval $(call DEF_MDCT_FIXED,$(N))))

 fate-fft-fixed: $(FATE_FFT_FIXED-yes)
 fate-mdct-fixed: $(FATE_MDCT_FIXED-yes)
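
Note on the size figure: the "more than 255 KiB" quoted in the commit
message can be checked with a few lines of arithmetic. The sketch below
is not part of the patch. It assumes that COSTABLE(n) declares an array
of n/2 samples and that a 16-bit fixed-point FFTSample occupies two
bytes; the helper names cos_table_entries and cos_tables_bytes are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: COSTABLE(n) with n = 1 << bits holds n/2 samples. */
static size_t cos_table_entries(int bits)
{
    return ((size_t)1 << bits) / 2;
}

/* Total size in bytes of the cosine tables for first_bits..last_bits
 * (inclusive), given the size of one sample. Hypothetical helper. */
static size_t cos_tables_bytes(int first_bits, int last_bits,
                               size_t sample_size)
{
    size_t total = 0;

    assert(first_bits >= 1 && first_bits <= last_bits && last_bits < 31);
    for (int bits = first_bits; bits <= last_bits; bits++)
        total += cos_table_entries(bits) * sample_size;
    return total;
}
```

Under these assumptions, cos_tables_bytes(8, 17, sizeof(int16_t))
evaluates to 261888 bytes, i.e. 255.75 KiB, which is consistent with the
commit message. The tables kept for 4 to 7 bits add up to 240 bytes.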