From patchwork Fri Jul 7 07:50:47 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muhammad Faiz
X-Patchwork-Id: 4255
From: Muhammad Faiz
To: ffmpeg-devel@ffmpeg.org
Cc: Muhammad Faiz
Date: Fri, 7 Jul 2017 14:50:47 +0700
Message-Id: <20170707075048.8151-1-mfcc64@gmail.com>
X-Mailer: git-send-email 2.9.3
Subject: [FFmpeg-devel] [PATCH v3] avcodec/rdft: remove sintable

It is redundant with costable. The first half of sintable is identical
to the second half of costable, and the second half of sintable is the
negation of its first half.

The computation is changed to handle the sign of the sin values, in both
the C code and the ARM assembly code.

Signed-off-by: Muhammad Faiz
---
 libavcodec/Makefile        |  3 +-
 libavcodec/arm/rdft_neon.S | 13 ++++++---
 libavcodec/rdft.c          | 68 ++++++++++++++++------------------------------
 libavcodec/rdft.h          | 26 ++----------------
 4 files changed, 36 insertions(+), 74 deletions(-)

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index b440a00..59029a8 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -122,8 +122,7 @@ OBJS-$(CONFIG_QSV) += qsv.o
 OBJS-$(CONFIG_QSVDEC) += qsvdec.o
 OBJS-$(CONFIG_QSVENC) += qsvenc.o
 OBJS-$(CONFIG_RANGECODER) += rangecoder.o
-RDFT-OBJS-$(CONFIG_HARDCODED_TABLES) += sin_tables.o
-OBJS-$(CONFIG_RDFT) += rdft.o $(RDFT-OBJS-yes)
+OBJS-$(CONFIG_RDFT) += rdft.o
 OBJS-$(CONFIG_RV34DSP) += rv34dsp.o
 OBJS-$(CONFIG_SHARED) += log2_tab.o reverse.o
 OBJS-$(CONFIG_SINEWIN) += sinewin.o sinewin_fixed.o
diff --git a/libavcodec/arm/rdft_neon.S b/libavcodec/arm/rdft_neon.S
index 781d976..eabb92b 100644
--- a/libavcodec/arm/rdft_neon.S
+++ b/libavcodec/arm/rdft_neon.S
@@ -30,18 +30,21 @@ function ff_rdft_calc_neon, export=1
         lsls r6, r6, #31
         bne 1f
-        add r0, r4, #20
+        add r0, r4, #24
         bl X(ff_fft_permute_neon)
-        add r0, r4, #20
+        add r0, r4, #24
         mov r1, r5
         bl X(ff_fft_calc_neon)
 1:
         ldr r12, [r4, #0] @ nbits
         mov r2, #1
+        ldr r8, [r4, #20] @ negative_sin
         lsl r12, r2, r12
         add r0, r5, #8
+        lsl r8, r8, #31
         add r1, r5, r12, lsl #2
         lsr r12, r12, #2
+        vdup.32 d26, r8
         ldr r2, [r4, #12] @ tcos
         sub r12, r12, #2
         ldr r3, [r4, #16] @ tsin
@@ -55,6 +58,7 @@ function ff_rdft_calc_neon, export=1
         vld1.32 {d5}, [r3,:64]! @ tsin[i]
         vmov.f32 d18, #0.5 @ k1
         vdup.32 d19, r6
+        veor d5, d26, d5
         pld [r0, #32]
         veor d19, d18, d19 @ k2
         vmov.i32 d16, #0
@@ -90,6 +94,7 @@ function ff_rdft_calc_neon, export=1
         vld1.32 {d5}, [r3,:64]! @ tsin[i]
         veor d24, d22, d17 @ ev.re,-ev.im
         vrev64.32 d3, d23 @ od.re, od.im
+        veor d5, d26, d5
         pld [r2, #32]
         veor d2, d3, d16 @ -od.re, od.im
         pld [r3, #32]
@@ -140,10 +145,10 @@ function ff_rdft_calc_neon, export=1
         vmul.f32 d22, d22, d18
         vst1.32 {d22}, [r5,:64]
-        add r0, r4, #20
+        add r0, r4, #24
         mov r1, r5
         bl X(ff_fft_permute_neon)
-        add r0, r4, #20
+        add r0, r4, #24
         mov r1, r5
         pop {r4-r8,lr}
         b X(ff_fft_calc_neon)
diff --git a/libavcodec/rdft.c b/libavcodec/rdft.c
index c318aa8..194e0bc 100644
--- a/libavcodec/rdft.c
+++ b/libavcodec/rdft.c
@@ -28,28 +28,6 @@
  * (Inverse) Real Discrete Fourier Transforms.
  */
-/* sin(2*pi*x/n) for 0<=x>2); i++) {
-        i1 = 2*i;
-        i2 = n-i1;
-        /* Separate even and odd FFTs */
-        ev.re = k1*(data[i1 ]+data[i2 ]);
-        od.im = -k2*(data[i1 ]-data[i2 ]);
-        ev.im = k1*(data[i1+1]-data[i2+1]);
-        od.re = k2*(data[i1+1]+data[i2+1]);
-        /* Apply twiddle factors to the odd FFT and add to the even FFT */
-        data[i1 ] = ev.re + od.re*tcos[i] - od.im*tsin[i];
-        data[i1+1] = ev.im + od.im*tcos[i] + od.re*tsin[i];
-        data[i2 ] = ev.re - od.re*tcos[i] + od.im*tsin[i];
-        data[i2+1] = -ev.im + od.im*tcos[i] + od.re*tsin[i];
+
+#define RDFT_UNMANGLE(sign0, sign1) \
+    for (i = 1; i < (n>>2); i++) { \
+        i1 = 2*i; \
+        i2 = n-i1; \
+        /* Separate even and odd FFTs */ \
+        ev.re = k1*(data[i1 ]+data[i2 ]); \
+        od.im = -k2*(data[i1 ]-data[i2 ]); \
+        ev.im = k1*(data[i1+1]-data[i2+1]); \
+        od.re = k2*(data[i1+1]+data[i2+1]); \
+        /* Apply twiddle factors to the odd FFT and add to the even FFT */ \
+        data[i1 ] = ev.re + od.re*tcos[i] sign0 od.im*tsin[i]; \
+        data[i1+1] = ev.im + od.im*tcos[i] sign1 od.re*tsin[i]; \
+        data[i2 ] = ev.re - od.re*tcos[i] sign1 od.im*tsin[i]; \
+        data[i2+1] = -ev.im + od.im*tcos[i] sign1 od.re*tsin[i]; \
+    }
+
+    if (s->negative_sin) {
+        RDFT_UNMANGLE(+,-)
+    } else {
+        RDFT_UNMANGLE(-,+)
     }
+
     data[2*i+1]=s->sign_convention*data[2*i+1];
     if (s->inverse) {
         data[0] *= k1;
@@ -104,6 +91,7 @@ av_cold int ff_rdft_init(RDFTContext *s, int nbits, enum RDFTransformType trans)
     s->nbits = nbits;
     s->inverse = trans == IDFT_C2R || trans == DFT_C2R;
     s->sign_convention = trans == IDFT_R2C || trans == DFT_C2R ? 1 : -1;
+    s->negative_sin = trans == DFT_C2R || trans == DFT_R2C;
     if (nbits < 4 || nbits > 16)
         return AVERROR(EINVAL);
@@ -113,15 +101,7 @@ av_cold int ff_rdft_init(RDFTContext *s, int nbits, enum RDFTransformType trans)
     ff_init_ff_cos_tabs(nbits);
     s->tcos = ff_cos_tabs[nbits];
-    s->tsin = ff_sin_tabs[nbits]+(trans == DFT_R2C || trans == DFT_C2R)*(n>>2);
-#if !CONFIG_HARDCODED_TABLES
-    {
-        int i;
-        const double theta = (trans == DFT_R2C || trans == DFT_C2R ? -1 : 1) * 2 * M_PI / n;
-        for (i = 0; i < (n >> 2); i++)
-            s->tsin[i] = sin(i * theta);
-    }
-#endif
+    s->tsin = ff_cos_tabs[nbits] + (n >> 2);
     s->rdft_calc = rdft_calc_c;
     if (ARCH_ARM) ff_rdft_init_arm(s);
diff --git a/libavcodec/rdft.h b/libavcodec/rdft.h
index 37c40e7..ffafca7 100644
--- a/libavcodec/rdft.h
+++ b/libavcodec/rdft.h
@@ -25,29 +25,6 @@
 #include "config.h"
 #include "fft.h"
-#if CONFIG_HARDCODED_TABLES
-# define SINTABLE_CONST const
-#else
-# define SINTABLE_CONST
-#endif
-
-#define SINTABLE(size) \
-    SINTABLE_CONST DECLARE_ALIGNED(16, FFTSample, ff_sin_##size)[size/2]
-
-extern SINTABLE(16);
-extern SINTABLE(32);
-extern SINTABLE(64);
-extern SINTABLE(128);
-extern SINTABLE(256);
-extern SINTABLE(512);
-extern SINTABLE(1024);
-extern SINTABLE(2048);
-extern SINTABLE(4096);
-extern SINTABLE(8192);
-extern SINTABLE(16384);
-extern SINTABLE(32768);
-extern SINTABLE(65536);
-
 struct RDFTContext {
     int nbits;
     int inverse;
@@ -55,7 +32,8 @@ struct RDFTContext {
     /* pre/post rotation tables */
     const FFTSample *tcos;
-    SINTABLE_CONST FFTSample *tsin;
+    const FFTSample *tsin;
+    int negative_sin;
     FFTContext fft;
     void (*rdft_calc)(struct RDFTContext *s, FFTSample *z);
 };