From patchwork Thu Nov 4 04:18:40 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wu Jianhua
X-Patchwork-Id: 31287
From: Wu Jianhua
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 4 Nov 2021 12:18:40 +0800
Message-Id: <20211104041841.95318-2-jianhua.wu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20211104041841.95318-1-jianhua.wu@intel.com>
References: <20211104041841.95318-1-jianhua.wu@intel.com>
Subject: [FFmpeg-devel] [PATCH v2 2/3] avfilter/x86/vf_exposure: add ff_exposure_avx2
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Wu Jianhua

Performance data (less is better):
    exposure_sse: 500491
    exposure_avx2: 449122

Signed-off-by: Wu Jianhua
---
 libavfilter/x86/vf_exposure.asm    | 15 +++++++++++++++
 libavfilter/x86/vf_exposure_init.c |  6 ++++++
 2 files changed, 21 insertions(+)

diff --git a/libavfilter/x86/vf_exposure.asm b/libavfilter/x86/vf_exposure.asm
index 3351c6fb3b..f271167805 100644
--- a/libavfilter/x86/vf_exposure.asm
+++ b/libavfilter/x86/vf_exposure.asm
@@ -36,11 +36,21 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
     VBROADCASTSS m1, xmm1
 %endif
 
+%if cpuflag(fma3) || cpuflag(fma4)
+    mulps m0, m0, m1 ; black * scale
+%endif
+
 .loop:
+%if cpuflag(fma3) || cpuflag(fma4)
+    mova m2, m0
+    vfmsub231ps m2, m1, [ptrq]
+    movu [ptrq], m2
+%else
     movu m2, [ptrq]
     subps m2, m2, m0
     mulps m2, m2, m1
     movu [ptrq], m2
+%endif
     add ptrq, mmsize
     sub lengthq, mmsize/4
 
@@ -52,4 +62,9 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
 %if ARCH_X86_64
 INIT_XMM sse
 EXPOSURE
+
+%if HAVE_AVX2_EXTERNAL
+INIT_YMM avx2
+EXPOSURE
+%endif
 %endif
diff --git a/libavfilter/x86/vf_exposure_init.c b/libavfilter/x86/vf_exposure_init.c
index de1b360f6c..80dae6164e 100644
--- a/libavfilter/x86/vf_exposure_init.c
+++ b/libavfilter/x86/vf_exposure_init.c
@@ -24,6 +24,7 @@
 #include "libavfilter/exposure.h"
 
 void ff_exposure_sse(float *ptr, int length, float black, float scale);
+void ff_exposure_avx2(float *ptr, int length, float black, float scale);
 
 av_cold void ff_exposure_init_x86(ExposureContext *s)
 {
@@ -32,5 +33,10 @@ av_cold void ff_exposure_init_x86(ExposureContext *s)
 #if ARCH_X86_64
     if (EXTERNAL_SSE(cpu_flags))
         s->exposure_func = ff_exposure_sse;
+
+#if HAVE_AVX2_EXTERNAL
+    if (EXTERNAL_AVX2_FAST(cpu_flags))
+        s->exposure_func = ff_exposure_avx2;
+#endif
 #endif
 }
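
Note on the FMA path: the existing loop computes (x - black) * scale, while the new path precomputes black * scale once and then evaluates x * scale - (black * scale) per element, which is what vfmsub231ps does with m2 preloaded from m0. A minimal scalar sketch of the two forms in C follows. The function names exposure_ref and exposure_rearranged are hypothetical and not part of FFmpeg; the sketch only illustrates the algebra, not the SIMD code. The two forms are algebraically equal but may round differently in the last bit for general inputs.

```c
#include <stddef.h>

/* Hypothetical scalar reference, mirroring the non-FMA path:
 * subtract black, then multiply by scale. */
static void exposure_ref(float *ptr, size_t length, float black, float scale)
{
    for (size_t i = 0; i < length; i++)
        ptr[i] = (ptr[i] - black) * scale;
}

/* Hypothetical rearranged form, mirroring the FMA path:
 * black * scale is hoisted out of the loop, leaving one
 * multiply-subtract per element. */
static void exposure_rearranged(float *ptr, size_t length, float black, float scale)
{
    const float bs = black * scale; /* corresponds to "mulps m0, m0, m1" */

    for (size_t i = 0; i < length; i++)
        ptr[i] = ptr[i] * scale - bs; /* corresponds to vfmsub231ps */
}
```

With inputs that are exactly representable (for example black = 0.5 and scale = 2.0), both functions give identical results, which makes for an easy sanity check before comparing the SIMD versions.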