From patchwork Mon Aug 2 05:34:35 2021
X-Patchwork-Submitter: Wu Jianhua
X-Patchwork-Id: 29182
From: Wu Jianhua
To: ffmpeg-devel@ffmpeg.org
Cc: Wu Jianhua, yanfei.cheng@intel.com
Date: Mon, 2 Aug 2021 13:34:35 +0800
Message-Id: <20210802053439.42828-1-jianhua.wu@intel.com>
Subject: [FFmpeg-devel] [PATCH 1/5] libavfilter/x86/vf_gblur: add ff_postscale_slice_avx512()

Co-authored-by: Cheng Yanfei
Co-authored-by: Jin Jun
Signed-off-by: Wu Jianhua
---
 libavfilter/x86/vf_gblur.asm    | 21 ++++++++++++---------
 libavfilter/x86/vf_gblur_init.c |  4 ++++
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/libavfilter/x86/vf_gblur.asm b/libavfilter/x86/vf_gblur.asm
index 4d84e6d011..276fe347f5 100644
--- a/libavfilter/x86/vf_gblur.asm
+++ b/libavfilter/x86/vf_gblur.asm
@@ -194,19 +194,17 @@ cglobal postscale_slice, 2, 2, 4, ptr, length, postscale, min, max
     VBROADCASTSS m1, minm
     VBROADCASTSS m2, maxm
 %elif WIN64
-    SWAP 0, 2
-    SWAP 1, 3
-    VBROADCASTSS m0, xm0
-    VBROADCASTSS m1, xm1
+    VBROADCASTSS m0, xmm2
+    VBROADCASTSS m1, xmm3
     VBROADCASTSS m2, maxm
-%else ; UNIX64
-    VBROADCASTSS m0, xm0
-    VBROADCASTSS m1, xm1
-    VBROADCASTSS m2, xm2
+%else ; UNIX
+    VBROADCASTSS m0, xmm0
+    VBROADCASTSS m1, xmm1
+    VBROADCASTSS m2, xmm2
 %endif
 
 .loop:
-%if cpuflag(avx2)
+%if cpuflag(avx2) || cpuflag(avx512)
     mulps m3, m0, [ptrq + lengthq]
 %else
     movu m3, [ptrq + lengthq]
@@ -229,3 +227,8 @@ POSTSCALE_SLICE
 INIT_YMM avx2
 POSTSCALE_SLICE
 %endif
+
+%if HAVE_AVX512_EXTERNAL
+INIT_ZMM avx512
+POSTSCALE_SLICE
+%endif
diff --git a/libavfilter/x86/vf_gblur_init.c b/libavfilter/x86/vf_gblur_init.c
index d80fb46fe4..34aba4ca6e 100644
--- a/libavfilter/x86/vf_gblur_init.c
+++ b/libavfilter/x86/vf_gblur_init.c
@@ -29,6 +29,7 @@ void ff_horiz_slice_avx2(float *ptr, int width, int height, int steps, float nu,
 
 void ff_postscale_slice_sse(float *ptr, int length, float postscale, float min, float max);
 void ff_postscale_slice_avx2(float *ptr, int length, float postscale, float min, float max);
+void ff_postscale_slice_avx512(float *ptr, int length, float postscale, float min, float max);
 
 av_cold void ff_gblur_init_x86(GBlurContext *s)
 {
@@ -47,5 +48,8 @@ av_cold void ff_gblur_init_x86(GBlurContext *s)
     if (EXTERNAL_AVX2(cpu_flags)) {
         s->horiz_slice = ff_horiz_slice_avx2;
     }
+    if (EXTERNAL_AVX512(cpu_flags)) {
+        s->postscale_slice = ff_postscale_slice_avx512;
+    }
 #endif
 }
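
Note for readers of the asm hunks: the diff context shows only the broadcast of postscale, min and max and the multiply at the top of the loop. As a rough scalar picture of what each SIMD lane presumably computes, here is a minimal sketch. It assumes the min and max arguments clamp the scaled sample, which the argument names suggest but the visible context does not show. The name postscale_slice_ref is hypothetical and is not part of the patch or of FFmpeg.

```c
#include <assert.h>

/* Hypothetical scalar reference, not taken from the patch.
 * Assumption: each sample is multiplied by postscale and the result is
 * clamped to [min, max]; the clamp is inferred from the argument names. */
static void postscale_slice_ref(float *ptr, int length,
                                float postscale, float min, float max)
{
    assert(length >= 0);

    for (int i = 0; i < length; i++) {
        float v = ptr[i] * postscale;   /* corresponds to mulps m3, m0, [ptr] */

        if (v < min)                    /* assumed lower clamp */
            v = min;
        if (v > max)                    /* assumed upper clamp */
            v = max;
        ptr[i] = v;
    }
}
```

A scalar version like this could serve as a comparison baseline when checking the ZMM variant on a buffer whose length is not a multiple of the vector width.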