From patchwork Tue May 23 19:01:18 2017
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 3720
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 23 May 2017 16:01:18 -0300
Message-Id: <20170523190118.3524-2-jamrial@gmail.com>
In-Reply-To: <20170523190118.3524-1-jamrial@gmail.com>
References: <20170523190118.3524-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 2/2] x86/aacps: add ff_ps_stereo_interpolate_ipdopd_sse3()

About 2x faster than the C version.

Signed-off-by: James Almer
---
 libavcodec/x86/aacpsdsp.asm    | 51 ++++++++++++++++++++++++++++++++++++++++++
 libavcodec/x86/aacpsdsp_init.c |  4 ++++
 2 files changed, 55 insertions(+)

diff --git a/libavcodec/x86/aacpsdsp.asm b/libavcodec/x86/aacpsdsp.asm
index e92cbbce08..bb8a7f5df0 100644
--- a/libavcodec/x86/aacpsdsp.asm
+++ b/libavcodec/x86/aacpsdsp.asm
@@ -117,6 +117,57 @@ align 16
 .ret:
     REP_RET
 
+;***************************************************************************
+;void ps_stereo_interpolate_ipdopd_sse3(float (*l)[2], float (*r)[2],
+;                                       float h[2][4], float h_step[2][4],
+;                                       int len);
+;***************************************************************************
+INIT_XMM sse3
+cglobal ps_stereo_interpolate_ipdopd, 5, 5, 10, l, r, h, h_step, n
+    cmp       nd, 0
+    jle       .ret
+    movaps    m0, [hq]
+    movaps    m1, [hq+mmsize]
+%if ARCH_X86_64
+    movaps    m8, [h_stepq]
+    movaps    m9, [h_stepq+mmsize]
+    %define   H_STEP0 m8
+    %define   H_STEP1 m9
+%else
+    %define   H_STEP0 [h_stepq]
+    %define   H_STEP1 [h_stepq+mmsize]
+%endif
+    shl       nd, 3
+    add       lq, nq
+    add       rq, nq
+    neg       nq
+
+align 16
+.loop:
+    addps     m0, H_STEP0
+    addps     m1, H_STEP1
+    movddup   m2, [lq+nq]
+    movddup   m3, [rq+nq]
+    shufps    m4, m2, m2, q2301
+    shufps    m5, m3, m3, q2301
+    unpcklps  m6, m0, m0
+    unpckhps  m7, m0, m0
+    mulps     m2, m6
+    mulps     m3, m7
+    unpcklps  m6, m1, m1
+    unpckhps  m7, m1, m1
+    mulps     m4, m6
+    mulps     m5, m7
+    addps     m2, m3
+    addsubps  m4, m5
+    addsubps  m2, m4
+    movsd     [lq+nq], m2
+    movhps    [rq+nq], m2
+    add       nq, 8
+    jl .loop
+.ret:
+    REP_RET
+
 ;*******************************************************************
 ;void ff_ps_hybrid_analysis_(float (*out)[2], float (*in)[2],
 ;                            const float (*filter)[8][2],
diff --git a/libavcodec/x86/aacpsdsp_init.c b/libavcodec/x86/aacpsdsp_init.c
index f6d6c039c3..767ae6588e 100644
--- a/libavcodec/x86/aacpsdsp_init.c
+++ b/libavcodec/x86/aacpsdsp_init.c
@@ -37,6 +37,9 @@ void ff_ps_hybrid_analysis_sse3(float (*out)[2], float (*in)[2],
 void ff_ps_stereo_interpolate_sse3(float (*l)[2], float (*r)[2],
                                    float h[2][4], float h_step[2][4],
                                    int len);
+void ff_ps_stereo_interpolate_ipdopd_sse3(float (*l)[2], float (*r)[2],
+                                          float h[2][4], float h_step[2][4],
+                                          int len);
 
 av_cold void ff_psdsp_init_x86(PSDSPContext *s)
 {
@@ -50,6 +53,7 @@ av_cold void ff_psdsp_init_x86(PSDSPContext *s)
     if (EXTERNAL_SSE3(cpu_flags)) {
         s->add_squares = ff_ps_add_squares_sse3;
         s->stereo_interpolate[0] = ff_ps_stereo_interpolate_sse3;
+        s->stereo_interpolate[1] = ff_ps_stereo_interpolate_ipdopd_sse3;
         s->hybrid_analysis = ff_ps_hybrid_analysis_sse3;
     }
 }
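
For comparing the asm against scalar code, the per-sample operation can be sketched in C as below. This is a reading of the loop in this patch as a complex 2x2 mix with linearly interpolated coefficients, not a copy of the C version in the tree. The name `stereo_interpolate_ipdopd_ref` is made up for illustration, and treating `h[0]` as the real parts and `h[1]` as the imaginary parts of the four coefficients is an assumption inferred from how the asm pairs them with the plain and swapped samples.

```c
#include <assert.h>

/* Hypothetical scalar sketch (illustrative name, not an FFmpeg symbol).
 * Assumption: h[0][k] is the real part and h[1][k] the imaginary part
 * of coefficient k; l[n] and r[n] are complex samples {re, im}. */
static void stereo_interpolate_ipdopd_ref(float (*l)[2], float (*r)[2],
                                          float h[2][4], float h_step[2][4],
                                          int len)
{
    float hr[4], hi[4];
    int k, n;

    assert(l && r && h && h_step);

    /* Work on local copies, like the asm keeps h in registers. */
    for (k = 0; k < 4; k++) {
        hr[k] = h[0][k];
        hi[k] = h[1][k];
    }

    for (n = 0; n < len; n++) {
        float l_re = l[n][0], l_im = l[n][1];
        float r_re = r[n][0], r_im = r[n][1];

        /* Coefficients are stepped before each sample is mixed. */
        for (k = 0; k < 4; k++) {
            hr[k] += h_step[0][k];
            hi[k] += h_step[1][k];
        }

        /* l' = c0 * l + c2 * r, r' = c1 * l + c3 * r, with c = hr + i*hi */
        l[n][0] = hr[0] * l_re + hr[2] * r_re - hi[0] * l_im - hi[2] * r_im;
        l[n][1] = hr[0] * l_im + hr[2] * r_im + hi[0] * l_re + hi[2] * r_re;
        r[n][0] = hr[1] * l_re + hr[3] * r_re - hi[1] * l_im - hi[3] * r_im;
        r[n][1] = hr[1] * l_im + hr[3] * r_im + hi[1] * l_re + hi[3] * r_re;
    }
}
```

One detail worth checking lane by lane when comparing: `addsubps m4, m5` already subtracts in the even lanes, and `addsubps m2, m4` then subtracts that difference, so as posted the `h[1][2]` and `h[1][3]` products enter the real parts with a plus sign, while this sketch subtracts them.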