From patchwork Wed Aug 4 02:06:12 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wu Jianhua
X-Patchwork-Id: 29226
From: Wu Jianhua
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 4 Aug 2021 10:06:12 +0800
Message-Id: <20210804020616.82866-1-jianhua.wu@intel.com>
X-Mailer: git-send-email 2.17.1
Subject: [FFmpeg-devel] [PATCH v2 1/5] libavfilter/x86/vf_gblur: add ff_postscale_slice_avx512()
Cc: Wu Jianhua

Co-authored-by: Cheng Yanfei
Co-authored-by: Jin Jun
Signed-off-by: Wu Jianhua
---
 libavfilter/x86/vf_gblur.asm    | 21 ++++++++++++---------
 libavfilter/x86/vf_gblur_init.c |  4 ++++
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/libavfilter/x86/vf_gblur.asm b/libavfilter/x86/vf_gblur.asm
index 4d84e6d011..276fe347f5 100644
--- a/libavfilter/x86/vf_gblur.asm
+++ b/libavfilter/x86/vf_gblur.asm
@@ -194,19 +194,17 @@ cglobal postscale_slice, 2, 2, 4, ptr, length, postscale, min, max
     VBROADCASTSS  m1, minm
     VBROADCASTSS  m2, maxm
 %elif WIN64
-    SWAP 0, 2
-    SWAP 1, 3
-    VBROADCASTSS  m0, xm0
-    VBROADCASTSS  m1, xm1
+    VBROADCASTSS  m0, xmm2
+    VBROADCASTSS  m1, xmm3
     VBROADCASTSS  m2, maxm
-%else ; UNIX64
-    VBROADCASTSS  m0, xm0
-    VBROADCASTSS  m1, xm1
-    VBROADCASTSS  m2, xm2
+%else ; UNIX
+    VBROADCASTSS  m0, xmm0
+    VBROADCASTSS  m1, xmm1
+    VBROADCASTSS  m2, xmm2
 %endif
 
 .loop:
-%if cpuflag(avx2)
+%if cpuflag(avx2) || cpuflag(avx512)
     mulps         m3, m0, [ptrq + lengthq]
 %else
     movu          m3, [ptrq + lengthq]
@@ -229,3 +227,8 @@ POSTSCALE_SLICE
 INIT_YMM avx2
 POSTSCALE_SLICE
 %endif
+
+%if HAVE_AVX512_EXTERNAL
+INIT_ZMM avx512
+POSTSCALE_SLICE
+%endif
diff --git a/libavfilter/x86/vf_gblur_init.c b/libavfilter/x86/vf_gblur_init.c
index d80fb46fe4..34aba4ca6e 100644
--- a/libavfilter/x86/vf_gblur_init.c
+++ b/libavfilter/x86/vf_gblur_init.c
@@ -29,6 +29,7 @@ void ff_horiz_slice_avx2(float *ptr, int width, int height, int steps, float nu,
 
 void ff_postscale_slice_sse(float *ptr, int length, float postscale, float min, float max);
 void ff_postscale_slice_avx2(float *ptr, int length, float postscale, float min, float max);
+void ff_postscale_slice_avx512(float *ptr, int length, float postscale, float min, float max);
 
 av_cold void ff_gblur_init_x86(GBlurContext *s)
 {
@@ -47,5 +48,8 @@ av_cold void ff_gblur_init_x86(GBlurContext *s)
     if (EXTERNAL_AVX2(cpu_flags)) {
         s->horiz_slice = ff_horiz_slice_avx2;
     }
+    if (EXTERNAL_AVX512(cpu_flags)) {
+        s->postscale_slice = ff_postscale_slice_avx512;
+    }
 #endif
 }
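
Reviewer note: for anyone checking the new AVX-512 path, here is a minimal scalar sketch of what postscale_slice appears to compute. It assumes the semantics suggested by the prototype and the mulps in the loop: each float is multiplied by postscale and then clamped to [min, max]. The name postscale_slice_ref is hypothetical and not part of this patch, and the clamp step is an inference from the min/max parameters, since the hunks above do not show that part of the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference, NOT part of the patch.
 * Assumed behavior: scale every sample by postscale, then clamp the
 * result to the closed range [min, max]. */
static void postscale_slice_ref(float *ptr, int length,
                                float postscale, float min, float max)
{
    /* A NULL buffer is only acceptable when there is nothing to process. */
    assert(ptr != NULL || length == 0);

    for (int i = 0; i < length; i++) {
        float v = ptr[i] * postscale;   /* scalar counterpart of mulps */
        if (v < min)
            v = min;                    /* lower clamp (assumed) */
        if (v > max)
            v = max;                    /* upper clamp (assumed) */
        ptr[i] = v;
    }
}
```

Running this and ff_postscale_slice_avx512 on copies of the same buffer and comparing the results is one way to sanity-check the new entry point, provided the length meets whatever alignment or padding the vector loop expects.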