From patchwork Thu Jul  6 09:18:16 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muhammad Faiz
X-Patchwork-Id: 4223
From: Muhammad Faiz
To: ffmpeg-devel@ffmpeg.org
Cc: Muhammad Faiz
Date: Thu, 6 Jul 2017 16:18:16 +0700
Message-Id: <20170706091816.21798-1-mfcc64@gmail.com>
X-Mailer: git-send-email 2.9.3
Subject: [FFmpeg-devel] [PATCH v2] avcodec/rdft: remove sintable

It is redundant with costable. The first half of sintable is
identical to the second half of costable. The second half of
sintable is the negation of its first half.

The computation is changed to handle the sign of the sin values.

Signed-off-by: Muhammad Faiz
---
 libavcodec/Makefile        |  3 +-
 libavcodec/arm/rdft_neon.S | 20 ++++++++------
 libavcodec/rdft.c          | 68 ++++++++++++++++------------------------------
 libavcodec/rdft.h          | 26 ++----------------
 4 files changed, 39 insertions(+), 78 deletions(-)

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index b440a00..59029a8 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -122,8 +122,7 @@ OBJS-$(CONFIG_QSV)                     += qsv.o
 OBJS-$(CONFIG_QSVDEC)                  += qsvdec.o
 OBJS-$(CONFIG_QSVENC)                  += qsvenc.o
 OBJS-$(CONFIG_RANGECODER)              += rangecoder.o
-RDFT-OBJS-$(CONFIG_HARDCODED_TABLES)   += sin_tables.o
-OBJS-$(CONFIG_RDFT)                    += rdft.o $(RDFT-OBJS-yes)
+OBJS-$(CONFIG_RDFT)                    += rdft.o
 OBJS-$(CONFIG_RV34DSP)                 += rv34dsp.o
 OBJS-$(CONFIG_SHARED)                  += log2_tab.o reverse.o
 OBJS-$(CONFIG_SINEWIN)                 += sinewin.o sinewin_fixed.o
diff --git a/libavcodec/arm/rdft_neon.S b/libavcodec/arm/rdft_neon.S
index 781d976..3bea8b4 100644
--- a/libavcodec/arm/rdft_neon.S
+++ b/libavcodec/arm/rdft_neon.S
@@ -22,7 +22,7 @@
 #include "libavutil/arm/asm.S"
 
 function ff_rdft_calc_neon, export=1
-        push            {r4-r8,lr}
+        push            {r4-r9,lr}
 
         ldr             r6,  [r0, #4]           @ inverse
         mov             r4,  r0
@@ -30,9 +30,9 @@ function ff_rdft_calc_neon, export=1
 
         lsls            r6,  r6,  #31
         bne             1f
-        add             r0,  r4,  #20
+        add             r0,  r4,  #24
         bl              X(ff_fft_permute_neon)
-        add             r0,  r4,  #20
+        add             r0,  r4,  #24
         mov             r1,  r5
         bl              X(ff_fft_calc_neon)
 1:
@@ -46,8 +46,10 @@ function ff_rdft_calc_neon, export=1
         sub             r12, r12, #2
         ldr             r3,  [r4, #16]          @ tsin
         mov             r7,  r0
+        ldr             r9,  [r4, #20]          @ negative_sin
         sub             r1,  r1,  #8
         mov             lr,  r1
+        lsl             r9,  r9,  #31
         mov             r8,  #-8
         vld1.32         {d0},     [r0,:64]!     @ d1[0,1]
         vld1.32         {d1},     [r1,:64], r8  @ d2[0,1]
@@ -61,8 +63,10 @@ function ff_rdft_calc_neon, export=1
         vmov.i32        d17, #1<<31
         pld             [r1, #-32]
         vtrn.32         d16, d17
+        vdup.32         d16, r9
         pld             [r2, #32]
-        vrev64.32       d16, d16                @ d16=1,0 d17=0,1
+        veor            d17, d16, d17
+        vrev64.32       d16, d17                @ negative_sin ? d16=0,1 d17=1,0 : d16=1,0 d17=0,1
         pld             [r3, #32]
 2:
         veor            q1,  q0,  q8            @ -d1[0],d1[1], d2[0],-d2[1]
@@ -136,15 +140,15 @@ function ff_rdft_calc_neon, export=1
 
         cmp             r6,  #0
         it              eq
-        popeq           {r4-r8,pc}
+        popeq           {r4-r9,pc}
 
         vmul.f32        d22, d22, d18
         vst1.32         {d22},    [r5,:64]
-        add             r0,  r4,  #20
+        add             r0,  r4,  #24
         mov             r1,  r5
         bl              X(ff_fft_permute_neon)
-        add             r0,  r4,  #20
+        add             r0,  r4,  #24
         mov             r1,  r5
-        pop             {r4-r8,lr}
+        pop             {r4-r9,lr}
         b               X(ff_fft_calc_neon)
 endfunc
diff --git a/libavcodec/rdft.c b/libavcodec/rdft.c
index c318aa8..194e0bc 100644
--- a/libavcodec/rdft.c
+++ b/libavcodec/rdft.c
@@ -28,28 +28,6 @@
  * (Inverse) Real Discrete Fourier Transforms.
  */
 
-/* sin(2*pi*x/n) for 0<=x>2); i++) {
-        i1 = 2*i;
-        i2 = n-i1;
-        /* Separate even and odd FFTs */
-        ev.re =  k1*(data[i1  ]+data[i2  ]);
-        od.im = -k2*(data[i1  ]-data[i2  ]);
-        ev.im =  k1*(data[i1+1]-data[i2+1]);
-        od.re =  k2*(data[i1+1]+data[i2+1]);
-        /* Apply twiddle factors to the odd FFT and add to the even FFT */
-        data[i1  ] =  ev.re + od.re*tcos[i] - od.im*tsin[i];
-        data[i1+1] =  ev.im + od.im*tcos[i] + od.re*tsin[i];
-        data[i2  ] =  ev.re - od.re*tcos[i] + od.im*tsin[i];
-        data[i2+1] = -ev.im + od.im*tcos[i] + od.re*tsin[i];
+
+#define RDFT_UNMANGLE(sign0, sign1)                                         \
+    for (i = 1; i < (n>>2); i++) {                                          \
+        i1 = 2*i;                                                           \
+        i2 = n-i1;                                                          \
+        /* Separate even and odd FFTs */                                    \
+        ev.re =  k1*(data[i1  ]+data[i2  ]);                                \
+        od.im = -k2*(data[i1  ]-data[i2  ]);                                \
+        ev.im =  k1*(data[i1+1]-data[i2+1]);                                \
+        od.re =  k2*(data[i1+1]+data[i2+1]);                                \
+        /* Apply twiddle factors to the odd FFT and add to the even FFT */  \
+        data[i1  ] =  ev.re + od.re*tcos[i] sign0 od.im*tsin[i];            \
+        data[i1+1] =  ev.im + od.im*tcos[i] sign1 od.re*tsin[i];            \
+        data[i2  ] =  ev.re - od.re*tcos[i] sign1 od.im*tsin[i];            \
+        data[i2+1] = -ev.im + od.im*tcos[i] sign1 od.re*tsin[i];            \
+    }
+
+    if (s->negative_sin) {
+        RDFT_UNMANGLE(+,-)
+    } else {
+        RDFT_UNMANGLE(-,+)
     }
+
     data[2*i+1]=s->sign_convention*data[2*i+1];
     if (s->inverse) {
         data[0] *= k1;
@@ -104,6 +91,7 @@ av_cold int ff_rdft_init(RDFTContext *s, int nbits, enum RDFTransformType trans)
     s->nbits           = nbits;
     s->inverse         = trans == IDFT_C2R || trans == DFT_C2R;
     s->sign_convention = trans == IDFT_R2C || trans == DFT_C2R ? 1 : -1;
+    s->negative_sin    = trans == DFT_C2R || trans == DFT_R2C;
 
     if (nbits < 4 || nbits > 16)
         return AVERROR(EINVAL);
@@ -113,15 +101,7 @@ av_cold int ff_rdft_init(RDFTContext *s, int nbits, enum RDFTransformType trans)
 
     ff_init_ff_cos_tabs(nbits);
     s->tcos = ff_cos_tabs[nbits];
-    s->tsin = ff_sin_tabs[nbits]+(trans == DFT_R2C || trans == DFT_C2R)*(n>>2);
-#if !CONFIG_HARDCODED_TABLES
-    {
-        int i;
-        const double theta = (trans == DFT_R2C || trans == DFT_C2R ? -1 : 1) * 2 * M_PI / n;
-        for (i = 0; i < (n >> 2); i++)
-            s->tsin[i] = sin(i * theta);
-    }
-#endif
+    s->tsin = ff_cos_tabs[nbits] + (n >> 2);
     s->rdft_calc   = rdft_calc_c;
 
     if (ARCH_ARM) ff_rdft_init_arm(s);
diff --git a/libavcodec/rdft.h b/libavcodec/rdft.h
index 37c40e7..ffafca7 100644
--- a/libavcodec/rdft.h
+++ b/libavcodec/rdft.h
@@ -25,29 +25,6 @@
 #include "config.h"
 #include "fft.h"
 
-#if CONFIG_HARDCODED_TABLES
-#   define SINTABLE_CONST const
-#else
-#   define SINTABLE_CONST
-#endif
-
-#define SINTABLE(size) \
-    SINTABLE_CONST DECLARE_ALIGNED(16, FFTSample, ff_sin_##size)[size/2]
-
-extern SINTABLE(16);
-extern SINTABLE(32);
-extern SINTABLE(64);
-extern SINTABLE(128);
-extern SINTABLE(256);
-extern SINTABLE(512);
-extern SINTABLE(1024);
-extern SINTABLE(2048);
-extern SINTABLE(4096);
-extern SINTABLE(8192);
-extern SINTABLE(16384);
-extern SINTABLE(32768);
-extern SINTABLE(65536);
-
 struct RDFTContext {
     int nbits;
     int inverse;
@@ -55,7 +32,8 @@ struct RDFTContext {
 
     /* pre/post rotation tables */
     const FFTSample *tcos;
-    SINTABLE_CONST FFTSample *tsin;
+    const FFTSample *tsin;
+    int negative_sin;
     FFTContext fft;
     void (*rdft_calc)(struct RDFTContext *s, FFTSample *z);
 };