From patchwork Mon Nov 22 08:08:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wu Jianhua
X-Patchwork-Id: 31524
From: Wu Jianhua
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 22 Nov 2021 16:08:47 +0800
Message-Id: <20211122080848.39566-2-jianhua.wu@intel.com>
In-Reply-To: <20211122080848.39566-1-jianhua.wu@intel.com>
References: <20211122080848.39566-1-jianhua.wu@intel.com>
Subject: [FFmpeg-devel] [PATCH v3 2/3] avfilter/x86/vf_exposure: add ff_exposure_avx2

Performance data(Less is better):
    exposure_sse:  500491
    exposure_avx2: 449122

Signed-off-by: Wu Jianhua
---
 libavfilter/x86/vf_exposure.asm    | 15 +++++++++++++++
 libavfilter/x86/vf_exposure_init.c |  4 ++++
 2 files changed, 19 insertions(+)

diff --git a/libavfilter/x86/vf_exposure.asm b/libavfilter/x86/vf_exposure.asm
index 3351c6fb3b..4ee9fbcb15 100644
--- a/libavfilter/x86/vf_exposure.asm
+++ b/libavfilter/x86/vf_exposure.asm
@@ -36,11 +36,21 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
     VBROADCASTSS m1, xmm1
 %endif
 
+%if cpuflag(fma3)
+    mulps       m0, m0, m1 ; black * scale
+%endif
+
 .loop:
+%if cpuflag(fma3)
+    mova        m2, m0
+    vfmsub231ps m2, m1, [ptrq]
+    movu    [ptrq], m2
+%else
     movu        m2, [ptrq]
     subps       m2, m2, m0
     mulps       m2, m2, m1
     movu    [ptrq], m2
+%endif
     add       ptrq, mmsize
     sub    lengthq, mmsize/4
 
@@ -52,4 +62,9 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
 %if ARCH_X86_64
 INIT_XMM sse
 EXPOSURE
+
+%if HAVE_AVX2_EXTERNAL
+INIT_YMM avx2
+EXPOSURE
+%endif
 %endif
diff --git a/libavfilter/x86/vf_exposure_init.c b/libavfilter/x86/vf_exposure_init.c
index de1b360f6c..edc1452850 100644
--- a/libavfilter/x86/vf_exposure_init.c
+++ b/libavfilter/x86/vf_exposure_init.c
@@ -24,6 +24,7 @@
 #include "libavfilter/exposure.h"
 
 void ff_exposure_sse(float *ptr, int length, float black, float scale);
+void ff_exposure_avx2(float *ptr, int length, float black, float scale);
 
 av_cold void ff_exposure_init_x86(ExposureContext *s)
 {
@@ -32,5 +33,8 @@ av_cold void ff_exposure_init_x86(ExposureContext *s)
 #if ARCH_X86_64
     if (EXTERNAL_SSE(cpu_flags))
         s->exposure_func = ff_exposure_sse;
+
+    if (EXTERNAL_AVX2_FAST(cpu_flags))
+        s->exposure_func = ff_exposure_avx2;
 #endif
 }
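
Note on the FMA path: the rewrite relies on the identity
(x - black) * scale == x * scale - black * scale, which lets black * scale be
computed once before the loop and leaves a single multiply-subtract per
vector. Below is a minimal scalar sketch of that identity in C, not the
patch's code. The function names (exposure_sub_mul, exposure_mul_sub,
exposure_selfcheck) are hypothetical, and the C version uses a separate
multiply and subtract, whereas vfmsub231ps fuses them with a single rounding,
so results may differ in the last bit for general inputs.

```c
#include <assert.h>
#include <stddef.h>

/* Subtract-then-multiply form, mirroring the non-FMA branch of the loop. */
static void exposure_sub_mul(float *ptr, size_t length, float black, float scale)
{
    for (size_t i = 0; i < length; i++)
        ptr[i] = (ptr[i] - black) * scale;
}

/* Rewritten form: black * scale is hoisted out of the loop, and each
 * element needs one multiply and one subtract (fused in the asm version). */
static void exposure_mul_sub(float *ptr, size_t length, float black, float scale)
{
    const float black_scaled = black * scale; /* computed once */

    for (size_t i = 0; i < length; i++)
        ptr[i] = ptr[i] * scale - black_scaled;
}

/* Largest absolute difference between two buffers of equal length. */
static float max_abs_diff(const float *a, const float *b, size_t n)
{
    float max = 0.0f;

    for (size_t i = 0; i < n; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f)
            d = -d;
        if (d > max)
            max = d;
    }
    return max;
}

/* Runs both forms on the same input and checks that they agree.
 * The inputs are hypothetical and exactly representable in binary
 * floating point, so rounding does not affect the comparison. */
static void exposure_selfcheck(void)
{
    float a[8] = { 0.0f, 0.125f, 0.25f, 0.375f, 0.5f, 0.625f, 0.75f, 1.0f };
    float b[8] = { 0.0f, 0.125f, 0.25f, 0.375f, 0.5f, 0.625f, 0.75f, 1.0f };

    exposure_sub_mul(a, 8, 0.0625f, 1.5f);
    exposure_mul_sub(b, 8, 0.0625f, 1.5f);
    assert(max_abs_diff(a, b, 8) < 1e-6f);
}
```

Calling exposure_selfcheck() from a test harness exercises both forms. For
arbitrary float inputs the two forms are only expected to agree to within
rounding error, so a comparison against the SSE path should use a tolerance
rather than exact equality.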