From patchwork Thu Mar 16 04:37:48 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muhammad Faiz
X-Patchwork-Id: 2949
From: Muhammad Faiz
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 16 Mar 2017 11:37:48 +0700
Message-Id: <20170316043748.21058-1-mfcc64@gmail.com>
X-Mailer: git-send-email 2.9.3
Subject: [FFmpeg-devel] [PATCH] swresample/x86/resample: extend
 resample_double to support avx and fma3
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Muhammad Faiz

benchmark:
sse2 10.670s
avx   8.763s
fma3  8.380s

Signed-off-by: Muhammad Faiz
---
 libswresample/x86/resample.asm    | 15 ++++++++++++---
 libswresample/x86/resample_init.c | 10 ++++++++++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/libswresample/x86/resample.asm b/libswresample/x86/resample.asm
index 4163df1..7107cf9 100644
--- a/libswresample/x86/resample.asm
+++ b/libswresample/x86/resample.asm
@@ -203,7 +203,7 @@ cglobal resample_common_%1, 1, 7, 2, ctx, phase_count, dst, frac, \
     ; horizontal sum & store
 %if mmsize == 32
     vextractf128 xm1, m0, 0x1
-    addps xm0, xm1
+    addp%4 xm0, xm1
 %endif
     movhlps xm1, xm0
 %ifidn %1, float
@@ -489,8 +489,8 @@ cglobal resample_linear_%1, 1, 7, 5, ctx, min_filter_length_x4, filter2, \
 %if mmsize == 32
     vextractf128 xm1, m0, 0x1
     vextractf128 xm3, m2, 0x1
-    addps xm0, xm1
-    addps xm2, xm3
+    addp%4 xm0, xm1
+    addp%4 xm2, xm3
 %endif
     cvtsi2s%4 xm1, fracd
     subp%4 xm2, xm0
@@ -608,3 +608,12 @@ RESAMPLE_FNS int16, 2, 1
 
 INIT_XMM sse2
 RESAMPLE_FNS double, 8, 3, d, pdbl_1
+
+%if HAVE_AVX_EXTERNAL
+INIT_YMM avx
+RESAMPLE_FNS double, 8, 3, d, pdbl_1
+%endif
+%if HAVE_FMA3_EXTERNAL
+INIT_YMM fma3
+RESAMPLE_FNS double, 8, 3, d, pdbl_1
+%endif
diff --git a/libswresample/x86/resample_init.c b/libswresample/x86/resample_init.c
index e515762..c6b2a36 100644
--- a/libswresample/x86/resample_init.c
+++ b/libswresample/x86/resample_init.c
@@ -42,6 +42,8 @@ RESAMPLE_FUNCS(float, avx);
 RESAMPLE_FUNCS(float, fma3);
 RESAMPLE_FUNCS(float, fma4);
 RESAMPLE_FUNCS(double, sse2);
+RESAMPLE_FUNCS(double, avx);
+RESAMPLE_FUNCS(double, fma3);
 
 av_cold void swri_resample_dsp_x86_init(ResampleContext *c)
 {
@@ -85,6 +87,14 @@ av_cold void swri_resample_dsp_x86_init(ResampleContext *c)
             c->dsp.resample_linear = ff_resample_linear_double_sse2;
             c->dsp.resample_common = ff_resample_common_double_sse2;
         }
+        if (EXTERNAL_AVX_FAST(mm_flags)) {
+            c->dsp.resample_linear = ff_resample_linear_double_avx;
+            c->dsp.resample_common = ff_resample_common_double_avx;
+        }
+        if (EXTERNAL_FMA3_FAST(mm_flags)) {
+            c->dsp.resample_linear = ff_resample_linear_double_fma3;
+            c->dsp.resample_common = ff_resample_common_double_fma3;
+        }
         break;
     }
 }
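
Note on the init change: the new checks are placed after the SSE2 one and are plain `if` statements rather than `else if`, so each later match overwrites the pointers set by an earlier one and the most capable available kernel wins. The following is only a minimal sketch of that ordering, not the FFmpeg code. Every name in it (the `DEMO_*` flags, `demo_dsp`, the stand-in kernels) is hypothetical, and it tests raw flag bits instead of using the `EXTERNAL_*` macros from the patch.

```c
/* Hypothetical feature bits for illustration; these are NOT FFmpeg's
 * AV_CPU_FLAG_* values. */
enum {
    DEMO_CPU_SSE2 = 1 << 0,
    DEMO_CPU_AVX  = 1 << 1,
    DEMO_CPU_FMA3 = 1 << 2
};

/* Each stand-in kernel returns its own id so the caller can see
 * which implementation ended up selected. */
enum demo_impl { DEMO_IMPL_C, DEMO_IMPL_SSE2, DEMO_IMPL_AVX, DEMO_IMPL_FMA3 };

typedef enum demo_impl (*demo_resample_fn)(void);

enum demo_impl demo_resample_c(void)    { return DEMO_IMPL_C; }
enum demo_impl demo_resample_sse2(void) { return DEMO_IMPL_SSE2; }
enum demo_impl demo_resample_avx(void)  { return DEMO_IMPL_AVX; }
enum demo_impl demo_resample_fma3(void) { return DEMO_IMPL_FMA3; }

/* Hypothetical stand-in for the function-pointer table in the context. */
struct demo_dsp {
    demo_resample_fn resample_common;
};

/* Start from the portable fallback, then let every supported
 * instruction set overwrite the pointer in increasing order of
 * preference. The checks are independent, so the last match wins. */
void demo_dsp_init(struct demo_dsp *dsp, int cpu_flags)
{
    dsp->resample_common = demo_resample_c;

    if (cpu_flags & DEMO_CPU_SSE2)
        dsp->resample_common = demo_resample_sse2;
    if (cpu_flags & DEMO_CPU_AVX)
        dsp->resample_common = demo_resample_avx;
    if (cpu_flags & DEMO_CPU_FMA3)
        dsp->resample_common = demo_resample_fma3;
}
```

With this layout a CPU reporting SSE2 and AVX but not FMA3 ends up on the AVX kernel, and a CPU reporting none of the three keeps the fallback. The sketch does not model the "fast" qualifier that the patch's macros carry.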