From patchwork Thu Nov 24 08:52:28 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muhammad Faiz
X-Patchwork-Id: 1545
From: Muhammad Faiz
To: ffmpeg-devel@ffmpeg.org
Cc: Muhammad Faiz
Date: Thu, 24 Nov 2016 15:52:28 +0700
Message-Id: <1479977548-2621-2-git-send-email-mfcc64@gmail.com>
In-Reply-To: <1479977548-2621-1-git-send-email-mfcc64@gmail.com>
References: <1479977548-2621-1-git-send-email-mfcc64@gmail.com>
X-Mailer: git-send-email 2.5.0
List-Id: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 2/2] swresample/resample: optimize
 exact_rational=on:linear_interp=on case

Separate dsp.resample into dsp.resample_common and dsp.resample_linear,
and call the faster resample_common even when linear_interp=on, as long
as c->frac and c->dst_incr_mod are both zero.

This speeds up resampling when exact_rational and linear_interp are both
enabled, because exact_rational forces c->frac and c->dst_incr_mod to be
zero when soft compensation does not happen.

benchmark on exact_rational=on:linear_interp=on
        old         new
real    8.432s      5.097s
user    7.679s      4.989s
sys     0.125s      0.107s

Signed-off-by: Muhammad Faiz
---
 libswresample/arm/resample_init.c |  6 ++----
 libswresample/resample.c          |  7 ++++++-
 libswresample/resample.h          |  6 ++++--
 libswresample/resample_dsp.c      | 12 ++++++++----
 libswresample/x86/resample_init.c | 32 ++++++++++++++++----------------
 5 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/libswresample/arm/resample_init.c b/libswresample/arm/resample_init.c
index 003fafd..e334a27 100644
--- a/libswresample/arm/resample_init.c
+++ b/libswresample/arm/resample_init.c
@@ -111,12 +111,10 @@ av_cold void swri_resample_dsp_arm_init(ResampleContext *c)
 
     switch(c->format) {
     case AV_SAMPLE_FMT_FLTP:
-        if (!c->linear)
-            c->dsp.resample = ff_resample_common_float_neon;
+        c->dsp.resample_common = ff_resample_common_float_neon;
         break;
     case AV_SAMPLE_FMT_S16P:
-        if (!c->linear)
-            c->dsp.resample = ff_resample_common_s16_neon;
+        c->dsp.resample_common = ff_resample_common_s16_neon;
         break;
     }
 }
diff --git a/libswresample/resample.c b/libswresample/resample.c
index 8635bf1..e65a57a 100644
--- a/libswresample/resample.c
+++ b/libswresample/resample.c
@@ -496,7 +496,12 @@ static int swri_resample(ResampleContext *c,
 
         dst_size = FFMIN(dst_size, delta_n);
         if (dst_size > 0) {
-            *consumed = c->dsp.resample(c, dst, src, dst_size, update_ctx);
+            /* resample_linear and resample_common should have same behavior
+             * when frac and dst_incr_mod are zero */
+            if (c->linear && (c->frac || c->dst_incr_mod))
+                *consumed = c->dsp.resample_linear(c, dst, src, dst_size, update_ctx);
+            else
+                *consumed = c->dsp.resample_common(c, dst, src, dst_size, update_ctx);
         } else {
             *consumed = 0;
         }
diff --git a/libswresample/resample.h b/libswresample/resample.h
index 7fe9b97..946f5cc 100644
--- a/libswresample/resample.h
+++ b/libswresample/resample.h
@@ -53,8 +53,10 @@ typedef struct ResampleContext {
     struct {
         void (*resample_one)(void *dst, const void *src,
                              int n, int64_t index, int64_t incr);
-        int (*resample)(struct ResampleContext *c, void *dst,
-                        const void *src, int n, int update_ctx);
+        int (*resample_common)(struct ResampleContext *c, void *dst,
+                               const void *src, int n, int update_ctx);
+        int (*resample_linear)(struct ResampleContext *c, void *dst,
+                               const void *src, int n, int update_ctx);
     } dsp;
 } ResampleContext;
 
diff --git a/libswresample/resample_dsp.c b/libswresample/resample_dsp.c
index 41369f3..6ffbb87 100644
--- a/libswresample/resample_dsp.c
+++ b/libswresample/resample_dsp.c
@@ -48,19 +48,23 @@ void swri_resample_dsp_init(ResampleContext *c)
     switch(c->format){
     case AV_SAMPLE_FMT_S16P:
         c->dsp.resample_one = resample_one_int16;
-        c->dsp.resample = c->linear ? resample_linear_int16 : resample_common_int16;
+        c->dsp.resample_common = resample_common_int16;
+        c->dsp.resample_linear = resample_linear_int16;
         break;
     case AV_SAMPLE_FMT_S32P:
         c->dsp.resample_one = resample_one_int32;
-        c->dsp.resample = c->linear ? resample_linear_int32 : resample_common_int32;
+        c->dsp.resample_common = resample_common_int32;
+        c->dsp.resample_linear = resample_linear_int32;
         break;
     case AV_SAMPLE_FMT_FLTP:
         c->dsp.resample_one = resample_one_float;
-        c->dsp.resample = c->linear ? resample_linear_float : resample_common_float;
+        c->dsp.resample_common = resample_common_float;
+        c->dsp.resample_linear = resample_linear_float;
         break;
     case AV_SAMPLE_FMT_DBLP:
         c->dsp.resample_one = resample_one_double;
-        c->dsp.resample = c->linear ? resample_linear_double : resample_common_double;
+        c->dsp.resample_common = resample_common_double;
+        c->dsp.resample_linear = resample_linear_double;
         break;
     }
 
diff --git a/libswresample/x86/resample_init.c b/libswresample/x86/resample_init.c
index 9d7d5cf..e515762 100644
--- a/libswresample/x86/resample_init.c
+++ b/libswresample/x86/resample_init.c
@@ -50,40 +50,40 @@ av_cold void swri_resample_dsp_x86_init(ResampleContext *c)
     switch(c->format){
     case AV_SAMPLE_FMT_S16P:
         if (ARCH_X86_32 && EXTERNAL_MMXEXT(mm_flags)) {
-            c->dsp.resample = c->linear ? ff_resample_linear_int16_mmxext
-                                        : ff_resample_common_int16_mmxext;
+            c->dsp.resample_linear = ff_resample_linear_int16_mmxext;
+            c->dsp.resample_common = ff_resample_common_int16_mmxext;
         }
         if (EXTERNAL_SSE2(mm_flags)) {
-            c->dsp.resample = c->linear ? ff_resample_linear_int16_sse2
-                                        : ff_resample_common_int16_sse2;
+            c->dsp.resample_linear = ff_resample_linear_int16_sse2;
+            c->dsp.resample_common = ff_resample_common_int16_sse2;
         }
         if (EXTERNAL_XOP(mm_flags)) {
-            c->dsp.resample = c->linear ? ff_resample_linear_int16_xop
-                                        : ff_resample_common_int16_xop;
+            c->dsp.resample_linear = ff_resample_linear_int16_xop;
+            c->dsp.resample_common = ff_resample_common_int16_xop;
         }
         break;
     case AV_SAMPLE_FMT_FLTP:
         if (EXTERNAL_SSE(mm_flags)) {
-            c->dsp.resample = c->linear ? ff_resample_linear_float_sse
-                                        : ff_resample_common_float_sse;
+            c->dsp.resample_linear = ff_resample_linear_float_sse;
+            c->dsp.resample_common = ff_resample_common_float_sse;
         }
         if (EXTERNAL_AVX_FAST(mm_flags)) {
-            c->dsp.resample = c->linear ? ff_resample_linear_float_avx
-                                        : ff_resample_common_float_avx;
+            c->dsp.resample_linear = ff_resample_linear_float_avx;
+            c->dsp.resample_common = ff_resample_common_float_avx;
         }
         if (EXTERNAL_FMA3_FAST(mm_flags)) {
-            c->dsp.resample = c->linear ? ff_resample_linear_float_fma3
-                                        : ff_resample_common_float_fma3;
+            c->dsp.resample_linear = ff_resample_linear_float_fma3;
+            c->dsp.resample_common = ff_resample_common_float_fma3;
         }
         if (EXTERNAL_FMA4(mm_flags)) {
-            c->dsp.resample = c->linear ? ff_resample_linear_float_fma4
-                                        : ff_resample_common_float_fma4;
+            c->dsp.resample_linear = ff_resample_linear_float_fma4;
+            c->dsp.resample_common = ff_resample_common_float_fma4;
         }
         break;
     case AV_SAMPLE_FMT_DBLP:
         if (EXTERNAL_SSE2(mm_flags)) {
-            c->dsp.resample = c->linear ? ff_resample_linear_double_sse2
-                                        : ff_resample_common_double_sse2;
+            c->dsp.resample_linear = ff_resample_linear_double_sse2;
+            c->dsp.resample_common = ff_resample_common_double_sse2;
         }
         break;
     }
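
Not part of the patch: a minimal standalone sketch of the per-call dispatch
that the resample.c hunk introduces, for readers who want to try the
condition in isolation. SketchContext, the stub kernels, sketch_init() and
sketch_dispatch() are hypothetical names made up for this illustration; only
the fields linear, frac and dst_incr_mod and the if-condition are taken from
the diff. The stubs return a tag instead of resampling anything.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for ResampleContext: only the fields
 * that the dispatch condition in the patch reads. */
typedef struct SketchContext {
    int linear;        /* linear_interp option */
    int frac;          /* fractional position, zero in the exact_rational case */
    int dst_incr_mod;  /* fractional increment, zero in the exact_rational case */
    struct {
        int (*resample_common)(struct SketchContext *c, int n);
        int (*resample_linear)(struct SketchContext *c, int n);
    } dsp;
} SketchContext;

enum { PATH_COMMON = 1, PATH_LINEAR = 2 };

/* Stub kernels (assumption): they only report which path was taken. */
static int stub_common(SketchContext *c, int n) { (void)c; (void)n; return PATH_COMMON; }
static int stub_linear(SketchContext *c, int n) { (void)c; (void)n; return PATH_LINEAR; }

/* Both pointers are always filled in, instead of choosing one at init time. */
static void sketch_init(SketchContext *c, int linear, int frac, int dst_incr_mod)
{
    c->linear              = linear;
    c->frac                = frac;
    c->dst_incr_mod        = dst_incr_mod;
    c->dsp.resample_common = stub_common;
    c->dsp.resample_linear = stub_linear;
}

/* Same condition as the patched swri_resample(): the linear kernel is used
 * only when linear interpolation is requested and there is a fractional
 * part left to interpolate over. */
static int sketch_dispatch(SketchContext *c, int n)
{
    assert(c->dsp.resample_common && c->dsp.resample_linear);
    if (c->linear && (c->frac || c->dst_incr_mod))
        return c->dsp.resample_linear(c, n);
    return c->dsp.resample_common(c, n);
}
```

With linear=1 and both frac and dst_incr_mod equal to zero, sketch_dispatch()
takes the common path; setting either of the two to a non-zero value selects
the linear path. With linear=0 the common path is always taken.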