From patchwork Wed Jan 6 23:13:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andreas Rheinhardt X-Patchwork-Id: 24815 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 751DD44A498 for ; Thu, 7 Jan 2021 01:13:56 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6022968A606; Thu, 7 Jan 2021 01:13:56 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-ej1-f44.google.com (mail-ej1-f44.google.com [209.85.218.44]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1AAFE68A295 for ; Thu, 7 Jan 2021 01:13:48 +0200 (EET) Received: by mail-ej1-f44.google.com with SMTP id x16so7273205ejj.7 for ; Wed, 06 Jan 2021 15:13:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references:reply-to :mime-version:content-transfer-encoding; bh=T0jDfEhquxZV1ezYY0QJ2Mj679zJS96Pi7TlYA6qXWQ=; b=W2rC8hfsp9X4cuCE6EJ1zoyMmFWg612RK9BvMyZbGI1q/HCqKefJdKgPQ75c6JH4W7 gi6IGpm88AWqRbi0hp+jn7vztoYjtrWfqf76wkoLfE0cyri+n5YsfvX1nEWn46BXEU5g HKRKCN7hEC4aS6y79GSdQfQk1WME23FZSzbBU3+2cnUjSr/qL1K7H+bmzb1p8+2tVnvx Bqk9Gd3luze72leDiNUSZnIVnFKiyhSt/cLxNxn3byJexQdHoFab/yP8h9BmtRQPX06h 2mwHG9YaHqqWJtFJorYYdgrbkGk1kq2m8UGq+zePEDHlx5igNuAZYzC3TV5AcDTtVNs4 8BkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:reply-to:mime-version:content-transfer-encoding; bh=T0jDfEhquxZV1ezYY0QJ2Mj679zJS96Pi7TlYA6qXWQ=; b=o+t7oVUtQmTKtnft/CGJbU2rHlq9HDtPTXM0GzORTXsShry2AqMLiijkq4iC+U2uK4 qxxjRDgx5/xflOwK/rXCKYvEwAktOZM2hpABtKZB9B9fa8SpCER0jbUbnvp0Cxb81WGR zcmj+MCAIvnRFlsAD8v8QBWfbgpQzRuq4Ru5fiPdMbiHd798q9DYPwweFCdQyO0B/6Nt bucMtW/rjkfngQ4o9/i45vZT5JddwG5Rpb81leUysPmPUAtk2uO/KvAkmkZ0j+l4gqDF QcKZM/w9J+X9k3RRyLD4vmFrhfSuxBWWv5gfxfSXYChmclzi8aWHEkUz53bsoSRrsF/r xSyg== X-Gm-Message-State: AOAM531l22ouKkxkhhY6P6ZedjokuCmHe0AzPArlaK3FyLI9+Cyd7I3P DdQw64yx49k1rTgZC7NOIMtLBtgX7Gk= X-Google-Smtp-Source: ABdhPJxCA/l7vADDp6LfH8/Rvhk3NoF8q0jgedni3uJZhWAyxK5ycRRDygE8y3jnfiIBziJT9QEggw== X-Received: by 2002:a17:906:3b55:: with SMTP id h21mr4484936ejf.184.1609974827276; Wed, 06 Jan 2021 15:13:47 -0800 (PST) Received: from sblaptop.fritz.box (ipbcc1aa4b.dynamic.kabel-deutschland.de. [188.193.170.75]) by smtp.gmail.com with ESMTPSA id b7sm1794295ejz.4.2021.01.06.15.13.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Jan 2021 15:13:46 -0800 (PST) From: Andreas Rheinhardt To: ffmpeg-devel@ffmpeg.org Date: Thu, 7 Jan 2021 00:13:06 +0100 Message-Id: <20210106231308.2952217-3-andreas.rheinhardt@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210106231308.2952217-1-andreas.rheinhardt@gmail.com> References: <20210106231308.2952217-1-andreas.rheinhardt@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 3/5] avcodec/fft_fixed: Hardcode cosine tables to save space X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Andreas Rheinhardt Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" The tables that are used take 256B; the code to initialize them uses 281B here (GCC 9.3, x64, -O3, but in av_cold functions). On top of that, removing this code also allows to remove the array of AVOnce used to guard the cosine tables against multiple initializations; this also removes relocations. Signed-off-by: Andreas Rheinhardt --- libavcodec/Makefile | 2 +- libavcodec/fft.h | 19 +++++++++--------- libavcodec/fft_fixed.c | 42 +++++++++++++++++++++++++++++++++++++++ libavcodec/fft_template.c | 27 +++++++++---------------- 4 files changed, 62 insertions(+), 28 deletions(-) diff --git a/libavcodec/Makefile b/libavcodec/Makefile index fea37ef3c9..8e03feb7d1 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -83,7 +83,7 @@ OBJS-$(CONFIG_EXIF) += exif.o tiff_common.o OBJS-$(CONFIG_FAANDCT) += faandct.o OBJS-$(CONFIG_FAANIDCT) += faanidct.o OBJS-$(CONFIG_FDCTDSP) += fdctdsp.o jfdctfst.o jfdctint.o -FFT-OBJS-$(CONFIG_HARDCODED_TABLES) += cos_tables.o cos_fixed_tables.o +FFT-OBJS-$(CONFIG_HARDCODED_TABLES) += cos_tables.o OBJS-$(CONFIG_FFT) += avfft.o fft_fixed.o fft_float.o \ fft_fixed_32.o fft_init_table.o \ $(FFT-OBJS-yes) diff --git a/libavcodec/fft.h b/libavcodec/fft.h index fedc0c5ef0..4b54265b7f 100644 --- a/libavcodec/fft.h +++ b/libavcodec/fft.h @@ -114,10 +114,18 @@ struct FFTContext { uint32_t *revtab32; }; -#if CONFIG_HARDCODED_TABLES +#if CONFIG_HARDCODED_TABLES || !FFT_FLOAT #define COSTABLE_CONST const +#define ff_init_ff_cos_tabs(...) #else #define COSTABLE_CONST +#define ff_init_ff_cos_tabs FFT_NAME(ff_init_ff_cos_tabs) + +/** + * Initialize the cosine table in ff_cos_tabs[index] + * @param index index in ff_cos_tabs array of the table to initialize + */ +void ff_init_ff_cos_tabs(int index); #endif #define COSTABLE(size) \ @@ -138,16 +146,9 @@ extern COSTABLE(16384); extern COSTABLE(32768); extern COSTABLE(65536); extern COSTABLE(131072); -#endif /* FFT_FLOAT */ extern COSTABLE_CONST FFTSample* const FFT_NAME(ff_cos_tabs)[]; -#define ff_init_ff_cos_tabs FFT_NAME(ff_init_ff_cos_tabs) - -/** - * Initialize the cosine table in ff_cos_tabs[index] - * @param index index in ff_cos_tabs array of the table to initialize - */ -void ff_init_ff_cos_tabs(int index); +#endif /* FFT_FLOAT */ #define ff_fft_init FFT_NAME(ff_fft_init) #define ff_fft_end FFT_NAME(ff_fft_end) diff --git a/libavcodec/fft_fixed.c b/libavcodec/fft_fixed.c index 52d225ee09..ce52dec7fd 100644 --- a/libavcodec/fft_fixed.c +++ b/libavcodec/fft_fixed.c @@ -19,4 +19,46 @@ #define FFT_FLOAT 0 #define FFT_FIXED_32 0 #define MAX_BITS 7 + +#include "fft.h" + +COSTABLE(16) = { + 32767, 30274, 23170, 12540, + 0, 12540, 23170, 30274, +}; +COSTABLE(32) = { + 32767, 32138, 30274, 27246, + 23170, 18205, 12540, 6393, + 0, 6393, 12540, 18205, + 23170, 27246, 30274, 32138, +}; +COSTABLE(64) = { + 32767, 32610, 32138, 31357, + 30274, 28899, 27246, 25330, + 23170, 20788, 18205, 15447, + 12540, 9512, 6393, 3212, + 0, 3212, 6393, 9512, + 12540, 15447, 18205, 20788, + 23170, 25330, 27246, 28899, + 30274, 31357, 32138, 32610, +}; +COSTABLE(128) = { + 32767, 32729, 32610, 32413, + 32138, 31786, 31357, 30853, + 30274, 29622, 28899, 28106, + 27246, 26320, 25330, 24279, + 23170, 22006, 20788, 19520, + 18205, 16846, 15447, 14010, + 12540, 11039, 9512, 7962, + 6393, 4808, 3212, 1608, + 0, 1608, 3212, 4808, + 6393, 7962, 9512, 11039, + 12540, 14010, 15447, 16846, + 18205, 19520, 20788, 22006, + 23170, 24279, 25330, 26320, + 27246, 28106, 28899, 29622, + 30274, 30853, 31357, 31786, + 32138, 32413, 32610, 32729, +}; + #include "fft_template.c" diff --git a/libavcodec/fft_template.c b/libavcodec/fft_template.c index 7a7d51a6b4..9d125de073 100644 --- a/libavcodec/fft_template.c +++ b/libavcodec/fft_template.c @@ -42,12 +42,12 @@ #else /* FFT_FIXED_32 */ /* cos(2*pi*x/n) for 0<=x<=n/4, followed by its reverse */ +#if FFT_FLOAT #if !CONFIG_HARDCODED_TABLES COSTABLE(16); COSTABLE(32); COSTABLE(64); COSTABLE(128); -#if FFT_FLOAT COSTABLE(256); COSTABLE(512); COSTABLE(1024); @@ -58,7 +58,6 @@ COSTABLE(16384); COSTABLE(32768); COSTABLE(65536); COSTABLE(131072); -#endif /* FFT_FLOAT */ static av_cold void init_ff_cos_tabs(int index) { @@ -87,7 +86,6 @@ INIT_FF_COS_TABS_FUNC(4, 16) INIT_FF_COS_TABS_FUNC(5, 32) INIT_FF_COS_TABS_FUNC(6, 64) INIT_FF_COS_TABS_FUNC(7, 128) -#if FFT_FLOAT INIT_FF_COS_TABS_FUNC(8, 256) INIT_FF_COS_TABS_FUNC(9, 512) INIT_FF_COS_TABS_FUNC(10, 1024) @@ -98,7 +96,6 @@ INIT_FF_COS_TABS_FUNC(14, 16384) INIT_FF_COS_TABS_FUNC(15, 32768) INIT_FF_COS_TABS_FUNC(16, 65536) INIT_FF_COS_TABS_FUNC(17, 131072) -#endif /* FFT_FLOAT */ static CosTabsInitOnce cos_tabs_init_once[] = { { NULL }, @@ -109,7 +106,6 @@ static CosTabsInitOnce cos_tabs_init_once[] = { { init_ff_cos_tabs_32, AV_ONCE_INIT }, { init_ff_cos_tabs_64, AV_ONCE_INIT }, { init_ff_cos_tabs_128, AV_ONCE_INIT }, -#if FFT_FLOAT { init_ff_cos_tabs_256, AV_ONCE_INIT }, { init_ff_cos_tabs_512, AV_ONCE_INIT }, { init_ff_cos_tabs_1024, AV_ONCE_INIT }, @@ -120,17 +116,20 @@ static CosTabsInitOnce cos_tabs_init_once[] = { { init_ff_cos_tabs_32768, AV_ONCE_INIT }, { init_ff_cos_tabs_65536, AV_ONCE_INIT }, { init_ff_cos_tabs_131072, AV_ONCE_INIT }, -#endif /* FFT_FLOAT */ }; +av_cold void ff_init_ff_cos_tabs(int index) +{ + ff_thread_once(&cos_tabs_init_once[index].control, cos_tabs_init_once[index].func); +} #endif + COSTABLE_CONST FFTSample * const FFT_NAME(ff_cos_tabs)[] = { NULL, NULL, NULL, NULL, FFT_NAME(ff_cos_16), FFT_NAME(ff_cos_32), FFT_NAME(ff_cos_64), FFT_NAME(ff_cos_128), -#if FFT_FLOAT FFT_NAME(ff_cos_256), FFT_NAME(ff_cos_512), FFT_NAME(ff_cos_1024), @@ -141,8 +140,8 @@ COSTABLE_CONST FFTSample * const FFT_NAME(ff_cos_tabs)[] = { FFT_NAME(ff_cos_32768), FFT_NAME(ff_cos_65536), FFT_NAME(ff_cos_131072), -#endif /* FFT_FLOAT */ }; +#endif /* FFT_FLOAT */ #endif /* FFT_FIXED_32 */ @@ -160,13 +159,6 @@ static int split_radix_permutation(int i, int n, int inverse) else return split_radix_permutation(i, m, inverse)*4 - 1; } -av_cold void ff_init_ff_cos_tabs(int index) -{ -#if (!CONFIG_HARDCODED_TABLES) && (!FFT_FIXED_32) - ff_thread_once(&cos_tabs_init_once[index].control, cos_tabs_init_once[index].func); -#endif -} - static const int avx_tab[] = { 0, 4, 1, 5, 8, 12, 9, 13, 2, 6, 3, 7, 10, 14, 11, 15 }; @@ -250,13 +242,12 @@ av_cold int ff_fft_init(FFTContext *s, int nbits, int inverse) if (ARCH_X86) ff_fft_init_x86(s); if (CONFIG_MDCT) s->mdct_calcw = s->mdct_calc; if (HAVE_MIPSFPU) ff_fft_init_mips(s); + for (j = 4; !CONFIG_HARDCODED_TABLES && j <= nbits; j++) + ff_init_ff_cos_tabs(j); #else if (CONFIG_MDCT) s->mdct_calcw = ff_mdct_calcw_c; if (ARCH_ARM) ff_fft_fixed_init_arm(s); #endif - for(j=4; j<=nbits; j++) { - ff_init_ff_cos_tabs(j); - } #endif /* FFT_FIXED_32 */