From patchwork Tue Sep 6 18:43:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37718 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp3443677pzh; Tue, 6 Sep 2022 11:44:11 -0700 (PDT) X-Google-Smtp-Source: AA6agR6QUh3aLuyoF+qmFF0ovvkjTj4Nn6swzTvgi51EqKqh6zmDz4oRWuph7ahY7N7kUuV6Qq2j X-Received: by 2002:a17:907:72d6:b0:742:133b:42be with SMTP id du22-20020a17090772d600b00742133b42bemr25323792ejc.581.1662489851598; Tue, 06 Sep 2022 11:44:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662489851; cv=none; d=google.com; s=arc-20160816; b=RkTTRw2qH8aqtCnFXWgOgtJIuUD82twfvtKUAUBrt2jKVP6iFA0rcQWlX8DIzcSm6V Akwfdk4DkybU2vDqgbMKxpg1QLtnJMqepLkKhgl1xGCOi9dQPxFZh91VGP/eBmvQbVHS kHWH9S6chdISoIT9fU+EPxr7+fs4HkUXKqxYw4iWooh2pVDsLeU4mARAFfkyN27l0mEf hzxvWuKbaRG/kwjVfUAx440C7YrhfQrulUu3kfbsi9b2ndfZhZRXPvo7FIS0LLg2V1rf mOuvzfNpQj9d2nmViaFckyO3VaK+Mq7TbuSjJyEoQp7D8ilKFvXjxRbfAEZPXx3+Ge1u qVtQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=x4Z7817OBUBQjLadX40/XLCrT1iCj6y3Gii2EFLeS6A=; b=r3PnO+xh2go/eCouVEMI8bVrW9+aKcrVRO5XNBEWRv1fviQR8RiMv3tuenlQVad89w XS/iIrWUsSj0OHOVXchcAiHeFd0c+1GHF7ugkkrvqvFMY9/ghNMX4y3zPMtXNZ1NqJqb gA7CSme68xVXYjMXcIqRaikxA8iPjYHlIT2kV3ZutD4MUMR2FoYLd4nNLHPv5REpPoAM kbo9k/TcVQD4vDb6ppAXs9OanGW2bAq8YLc5doEvY7Ejlk/mA5KBhIidh9yJI2H8uYe7 wZR87OFNGdb1y39i2WJsqnEPxhmesWiYRETSLmCiFd/YHWz021XCNTX84nwaD3k8ZdDu WT1g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org 
Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id hv13-20020a17090760cd00b0076f7f824407si1640051ejc.948.2022.09.06.11.44.11; Tue, 06 Sep 2022 11:44:11 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 26B6268BB18; Tue, 6 Sep 2022 21:44:09 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8675C68BABD for ; Tue, 6 Sep 2022 21:44:02 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 3286BC00AD for ; Tue, 6 Sep 2022 21:44:02 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 6 Sep 2022 21:43:51 +0300 Message-Id: <20220906184402.119826-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5753736.MhkbZ0Pkbq@basile.remlab.net> References: <5753736.MhkbZ0Pkbq@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 01/12] lavu/riscv: add CPU flags for the RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: T++Z0LO14uzQ From: Rémi Denis-Courmont RVV defines a total of 12 different extensions, including: - 5 different instruction subsets: - 
Zve32x: 8-, 16- and 32-bit integers,
  - Zve32f: Zve32x plus single precision floats,
  - Zve64x: Zve32x plus 64-bit integers,
  - Zve64f: Zve32f plus Zve64x,
  - Zve64d: Zve64f plus double precision floats.

- 6 different vector lengths:
  - Zvl32b (embedded only),
  - Zvl64b (embedded only),
  - Zvl128b,
  - Zvl256b,
  - Zvl512b,
  - Zvl1024b,

- and the V extension proper: equivalent to Zve64d and Zvl128b.

In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each type set: up-to-32-bit ints (ZVE32X), floats (ZVE32F),
64-bit ints (ZVE64X) and doubles (ZVE64D).

When the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
---
 libavutil/cpu.c          | 15 +++++++++++
 libavutil/cpu.h          |  6 +++++
 libavutil/cpu_internal.h |  1 +
 libavutil/riscv/Makefile |  1 +
 libavutil/riscv/cpu.c    | 57 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 80 insertions(+)
 create mode 100644 libavutil/riscv/Makefile
 create mode 100644 libavutil/riscv/cpu.c

diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 0035e927a5..89d2fb6f56 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -62,6 +62,8 @@ static int get_cpu_flags(void)
     return ff_get_cpu_flags_arm();
 #elif ARCH_PPC
     return ff_get_cpu_flags_ppc();
+#elif ARCH_RISCV
+    return ff_get_cpu_flags_riscv();
 #elif ARCH_X86
     return ff_get_cpu_flags_x86();
 #elif ARCH_LOONGARCH
@@ -178,6 +180,19 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
 #elif ARCH_LOONGARCH
         { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" },
         { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" },
+#elif ARCH_RISCV
+#define AV_CPU_FLAG_ZVE32X_M (AV_CPU_FLAG_ZVE32X)
+#define AV_CPU_FLAG_ZVE32F_M (AV_CPU_FLAG_ZVE32X_M | AV_CPU_FLAG_ZVE32F)
+#define AV_CPU_FLAG_ZVE64X_M (AV_CPU_FLAG_ZVE32X_M | AV_CPU_FLAG_ZVE64X)
+#define AV_CPU_FLAG_ZVE64F_M (AV_CPU_FLAG_ZVE32F_M | AV_CPU_FLAG_ZVE64X)
+#define AV_CPU_FLAG_ZVE64D_M (AV_CPU_FLAG_ZVE64F_M | AV_CPU_FLAG_ZVE64D)
+#define AV_CPU_FLAG_VECTORS AV_CPU_FLAG_ZVE64D_M
+        { "vectors", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VECTORS }, .unit = "flags" },
+        { "zve32x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32X }, .unit = "flags" },
+        { "zve32f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32F_M }, .unit = "flags" },
+        { "zve64x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64X_M }, .unit = "flags" },
+        { "zve64f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64F_M }, .unit = "flags" },
+        { "zve64d", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64D_M }, .unit = "flags" },
 #endif
         { NULL },
     };
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 9711e574c5..44836e50d6 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -78,6 +78,12 @@
 #define AV_CPU_FLAG_LSX          (1 << 0)
 #define AV_CPU_FLAG_LASX         (1 << 1)
 
+// RISC-V Vector extension
+#define AV_CPU_FLAG_ZVE32X       (1 << 0) /* 8-, 16-, 32-bit integers */
+#define AV_CPU_FLAG_ZVE32F       (1 << 1) /* single precision scalars */
+#define AV_CPU_FLAG_ZVE64X       (1 << 2) /* 64-bit integers */
+#define AV_CPU_FLAG_ZVE64D       (1 << 3) /* double precision scalars */
+
 /**
  * Return the flags which specify extensions supported by the CPU.
  * The returned value is affected by av_force_cpu_flags() if that was used
diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h
index 650d47fc96..634f28bac4 100644
--- a/libavutil/cpu_internal.h
+++ b/libavutil/cpu_internal.h
@@ -48,6 +48,7 @@ int ff_get_cpu_flags_mips(void);
 int ff_get_cpu_flags_aarch64(void);
 int ff_get_cpu_flags_arm(void);
 int ff_get_cpu_flags_ppc(void);
+int ff_get_cpu_flags_riscv(void);
 int ff_get_cpu_flags_x86(void);
 int ff_get_cpu_flags_loongarch(void);
 
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
new file mode 100644
index 0000000000..1f818043dc
--- /dev/null
+++ b/libavutil/riscv/Makefile
@@ -0,0 +1 @@
+OBJS += riscv/cpu.o
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
new file mode 100644
index 0000000000..9e4cce5e8b
--- /dev/null
+++ b/libavutil/riscv/cpu.c
@@ -0,0 +1,57 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/cpu.h"
+#include "libavutil/cpu_internal.h"
+#include "config.h"
+
+#if HAVE_GETAUXVAL
+#include <sys/auxv.h>
+#endif
+
+#define HWCAP_RV(letter) (1ul << ((letter) - 'A'))
+
+int ff_get_cpu_flags_riscv(void)
+{
+    int ret = 0;
+
+    /* If RV-V is enabled statically at compile-time, check the details. */
+#ifdef __riscv_vector
+    ret |= AV_CPU_FLAG_ZVE32X;
+#if __riscv_v_elen >= 64
+    ret |= AV_CPU_FLAG_ZVE64X;
+#endif
+#if __riscv_v_elen_fp >= 32
+    ret |= AV_CPU_FLAG_ZVE32F;
+#if __riscv_v_elen_fp >= 64
+    ret |= AV_CPU_FLAG_ZVE64D;
+#endif
+#endif
+#endif
+
+#if HAVE_GETAUXVAL
+    const unsigned long hwcap = getauxval(AT_HWCAP);
+
+    /* The V extension implies all subsets */
+    if (hwcap & HWCAP_RV('V'))
+        ret |= AV_CPU_FLAG_ZVE32X | AV_CPU_FLAG_ZVE64X
+             | AV_CPU_FLAG_ZVE32F | AV_CPU_FLAG_ZVE64D;
+#endif
+
+    return ret;
+}
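
The cumulative masks that av_parse_cpu_caps() builds from the four flag
bits can be looked at in isolation. What follows is a minimal C sketch and
not part of the series: the FLAG_* and MASK_* names and the has_subset()
helper are hypothetical stand-ins that copy the bit values proposed for
AV_CPU_FLAG_ZVE* in this patch (an assumption, since those values may still
change). It only illustrates why asking for a larger subset such as
"zve64f" also selects the bits of the smaller subsets it contains.

```c
#include <stdbool.h>

/* Local copies of the bit values proposed in this patch (assumption:
 * these are illustrative and not the final FFmpeg definitions). */
#define FLAG_ZVE32X (1 << 0) /* 8-, 16-, 32-bit integers */
#define FLAG_ZVE32F (1 << 1) /* single precision floats */
#define FLAG_ZVE64X (1 << 2) /* 64-bit integers */
#define FLAG_ZVE64D (1 << 3) /* double precision floats */

/* Cumulative masks, mirroring the *_M macros in av_parse_cpu_caps(). */
#define MASK_ZVE32X (FLAG_ZVE32X)
#define MASK_ZVE32F (MASK_ZVE32X | FLAG_ZVE32F)
#define MASK_ZVE64X (MASK_ZVE32X | FLAG_ZVE64X)
#define MASK_ZVE64F (MASK_ZVE32F | FLAG_ZVE64X)
#define MASK_ZVE64D (MASK_ZVE64F | FLAG_ZVE64D)

/* Hypothetical helper: true if every bit of mask is set in flags. */
static bool has_subset(int flags, int mask)
{
    return (flags & mask) == mask;
}
```

With these definitions, a CPU reporting MASK_ZVE64D satisfies every smaller
subset, whereas a CPU reporting only MASK_ZVE32F does not satisfy
MASK_ZVE64X because the 64-bit integer bit is missing.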
From: remi@remlab.net
Date: Tue, 6 Sep 2022 21:43:52 +0300
Message-Id: <20220906184402.119826-2-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 02/12] checkasm: register the RISC-V V subsets

From: Rémi Denis-Courmont

---
 tests/checkasm/checkasm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index e56fd3850e..a5d0503811 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -226,6 +226,11 @@ static const struct {
     { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC },
     { "VSX", "vsx", AV_CPU_FLAG_VSX },
     { "POWER8", "power8", AV_CPU_FLAG_POWER8 },
+#elif ARCH_RISCV
+    { "Zve32x", "zve32x", AV_CPU_FLAG_ZVE32X },
+    { "Zve32f", "zve32f", AV_CPU_FLAG_ZVE32F },
+    { "Zve64x", "zve64x", AV_CPU_FLAG_ZVE64X },
+    { "Zve64d", "zve64d", AV_CPU_FLAG_ZVE64D },
 #elif ARCH_MIPS
     { "MMI", "mmi", AV_CPU_FLAG_MMI },
     { "MSA", "msa", AV_CPU_FLAG_MSA },

From: remi@remlab.net
Date: Tue, 6 Sep 2022 21:43:53 +0300
Message-Id: <20220906184402.119826-3-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 03/12] lavu/riscv: initial common header for assembler macros

From: Rémi Denis-Courmont

---
 libavutil/riscv/asm.S | 74 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)
 create mode 100644 libavutil/riscv/asm.S

diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 0000000000..7623c161cf
--- /dev/null
+++ b/libavutil/riscv/asm.S
@@ -0,0 +1,74 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+
+#if defined (__riscv_float_abi_soft)
+#define NOHWF
+#define NOHWD
+#define HWF   #
+#define HWD   #
+#elif defined (__riscv_float_abi_single)
+#define NOHWF #
+#define NOHWD
+#define HWF
+#define HWD   #
+#else
+#define NOHWF #
+#define NOHWD #
+#define HWF
+#define HWD
+#endif
+
+        .macro func sym, ext=
+            .text
+            .align 2
+
+            .option push
+            .ifnb \ext
+            .option arch, +\ext
+            .endif
+
+            .global \sym
+            .hidden \sym
+            .type   \sym, %function
+            \sym:
+
+            .macro endfunc
+                .size   \sym, . - \sym
+                .option pop
+                .previous
+                .purgem endfunc
+            .endm
+        .endm
+
+        .macro const sym, align=3, relocate=0
+            .if \relocate
+                .pushsection .data.rel.ro
+            .else
+                .pushsection .rodata
+            .endif
+            .align \align
+            \sym:
+
+            .macro endconst
+                .size  \sym, . - \sym
+                .popsection
+                .purgem endconst
+            .endm
+        .endm

From: remi@remlab.net
Date: Tue, 6 Sep 2022 21:43:54 +0300
Message-Id: <20220906184402.119826-4-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 04/12] lavu/riscv: float vector-scalar multiplication with RVV

From: Rémi Denis-Courmont

This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
 libavutil/float_dsp.c            |  2 ++
 libavutil/float_dsp.h            |  1 +
 libavutil/riscv/Makefile         |  4 ++-
 libavutil/riscv/float_dsp_init.c | 41 +++++++++++++++++++++++
 libavutil/riscv/float_dsp_rvv.S  | 56 ++++++++++++++++++++++++++++++++
 5 files changed, 103 insertions(+), 1 deletion(-)
 create mode 100644 libavutil/riscv/float_dsp_init.c
 create mode 100644 libavutil/riscv/float_dsp_rvv.S

diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c
index 8676c8b0f8..742dd679d2 100644
--- a/libavutil/float_dsp.c
+++ b/libavutil/float_dsp.c
@@ -156,6 +156,8 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact)
     ff_float_dsp_init_arm(fdsp);
 #elif ARCH_PPC
     ff_float_dsp_init_ppc(fdsp, bit_exact);
+#elif ARCH_RISCV
+    ff_float_dsp_init_riscv(fdsp);
 #elif ARCH_X86
     ff_float_dsp_init_x86(fdsp);
 #elif ARCH_MIPS
diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h
index 9c664592bd..7cad9fc622 100644
--- a/libavutil/float_dsp.h
+++ b/libavutil/float_dsp.h
@@ -205,6 +205,7 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len);
 void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict);
+void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_x86(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_mips(AVFloatDSPContext *fdsp);
 
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
index 1f818043dc..6bf8243e8d 100644
--- a/libavutil/riscv/Makefile
+++ b/libavutil/riscv/Makefile
@@ -1 +1,3 @@
-OBJS += riscv/cpu.o
+OBJS += riscv/cpu.o \
+        riscv/float_dsp_init.o \
+        riscv/float_dsp_rvv.o
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
new file mode 100644
index 0000000000..279412c036
--- /dev/null
+++ b/libavutil/riscv/float_dsp_init.c
@@ -0,0 +1,41 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/float_dsp.h"
+
+void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
+                               int len);
+
+void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
+                               int len);
+
+av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
+{
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_ZVE32F) {
+        fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+
+        if (flags & AV_CPU_FLAG_ZVE64D)
+            fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+    }
+}
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
new file mode 100644
index 0000000000..365e00190c
--- /dev/null
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -0,0 +1,56 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_fmul_scalar_rvv, zve32f
+NOHWF   fmv.w.x  fa0, a2
+NOHWF   mv       a2, a3
+
+1:      vsetvli  t0, a2, e32, m8, ta, ma
+        slli     t1, t0, 2
+        vle32.v  v16, (a1)
+        add      a1, a1, t1
+        vfmul.vf v16, v16, fa0
+        sub      a2, a2, t0
+        vse32.v  v16, (a0)
+        add      a0, a0, t1
+        bnez     a2, 1b
+
+        ret
+endfunc
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_dmul_scalar_rvv, zve64d
+NOHWD   fmv.d.x  fa0, a2
+NOHWD   mv       a2, a3
+
+1:      vsetvli  t0, a2, e64, m8, ta, ma
+        slli     t1, t0, 3
+        vle64.v  v16, (a1)
+        add      a1, a1, t1
+        vfmul.vf v16, v16, fa0
+        sub      a2, a2, t0
+        vse64.v  v16, (a0)
+        add      a0, a0, t1
+        bnez     a2, 1b
+
+        ret
+endfunc
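
The loops in float_dsp_rvv.S follow the usual RVV strip-mining pattern:
vsetvli decides how many elements are handled in the current pass, the
pointers advance by that many elements, and the loop repeats until the
remaining length reaches zero. A scalar C model may make the control flow
easier to follow. This is a sketch only and not part of the series:
MODEL_VLMAX and model_vector_fmul_scalar() are hypothetical names, 32
elements corresponds to e32/m8 on an assumed 128-bit VLEN, and the model
always takes min(len, VLMAX) per pass, whereas the vector specification
allows hardware to pick a smaller value when the remaining length lies
between VLMAX and 2*VLMAX.

```c
#include <stddef.h>

/* Assumed stand-in for VLMAX with e32, m8 on a 128-bit VLEN
 * (8 registers * 128 bits / 32 bits per element). */
enum { MODEL_VLMAX = 32 };

/* Scalar model of the strip-mined loop in ff_vector_fmul_scalar_rvv. */
static void model_vector_fmul_scalar(float *dst, const float *src,
                                     float mul, int len)
{
    while (len > 0) {
        /* vsetvli t0, a2, e32, m8, ta, ma */
        int vl = len < MODEL_VLMAX ? len : MODEL_VLMAX;

        /* vle32.v + vfmul.vf + vse32.v on vl elements */
        for (int i = 0; i < vl; i++)
            dst[i] = src[i] * mul;

        src += vl; /* add a1, a1, t1 */
        dst += vl; /* add a0, a0, t1 */
        len -= vl; /* sub a2, a2, t0 */
    }
}
```

Because the per-pass element count is recomputed on every iteration, the
same loop handles lengths that are not a multiple of the vector size
without a separate tail loop.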

From: remi@remlab.net
Date: Tue, 6 Sep 2022 21:43:55 +0300
Message-Id: <20220906184402.119826-5-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 05/12] lavu/riscv: float vector-vector multiplication with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  9 ++++++++-
 libavutil/riscv/float_dsp_rvv.S  | 34 ++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 279412c036..4135284c76 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -22,9 +22,13 @@
 #include "libavutil/cpu.h"
 #include "libavutil/float_dsp.h"
 
+void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
+                        int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 
+void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
+                        int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                int len);
 
@@ -33,9 +37,12 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
     int flags = av_get_cpu_flags();
 
     if (flags & AV_CPU_FLAG_ZVE32F) {
+        fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
-        if (flags & AV_CPU_FLAG_ZVE64D)
+        if (flags & AV_CPU_FLAG_ZVE64D) {
+            fdsp->vector_dmul = ff_vector_dmul_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+        }
     }
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 365e00190c..65c3a77b01 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -19,6 +19,23 @@
 #include "config.h"
 #include "asm.S"
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_fmul_rvv, zve32f
+1:      vsetvli  t0, a3, e32, m8, ta, ma
+        slli     t1, t0, 2
+        vle32.v  v16, (a1)
+        add      a1, a1, t1
+        vle32.v  v24, (a2)
+        add      a2, a2, t1
+        vfmul.vv v16, v16, v24
+        sub      a3, a3, t0
+        vse32.v  v16, (a0)
+        add      a0, a0, t1
+        bnez     a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x  fa0, a2
@@ -37,6 +54,23 @@ NOHWF mv a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_dmul_rvv, zve64d
+1:      vsetvli  t0, a3, e64, m8, ta, ma
+        slli     t1, t0, 3
+        vle64.v  v16, (a1)
+        add      a1, a1, t1
+        vle64.v  v24, (a2)
+        add      a2, a2, t1
+        vfmul.vv v16, v16, v24
+        sub      a3, a3, t0
+        vse64.v  v16, (a0)
+        add      a0, a0, t1
+        bnez
a3, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_dmul_scalar_rvv, zve64d NOHWD fmv.d.x fa0, a2 From patchwork Tue Sep 6 18:43:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37721 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp3443895pzh; Tue, 6 Sep 2022 11:44:37 -0700 (PDT) X-Google-Smtp-Source: AA6agR6QLWA4eFIXuwD0XPFZtjYBxWVKUOzCMA4Zq6xR52A12nMUT3gOe8bebxR2k3Nmp8SQJgcj X-Received: by 2002:a05:6402:b29:b0:44e:d429:749d with SMTP id bo9-20020a0564020b2900b0044ed429749dmr3672599edb.423.1662489876890; Tue, 06 Sep 2022 11:44:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662489876; cv=none; d=google.com; s=arc-20160816; b=rxO0dO0EiSulRvhF37kE+kl6O6tyibq/vcsW8gmHKz0tB3NoIK/Yq3jZ2aMuRLRxRY POCyq17+6qg8KJlc4JLWfdjopSZ33otp/P1vJmI5XFHMZ2IC4dEwgbrR7iKz3TaJVwIy tPMcEIX3KGv1diUNa3A7SVFAXy7ke5QAcA/Xw5YUCLz3dS8lSiY2IFKs/DHWa+vuy3yv i/7+WFWfjOD8x/MraFc2r6zay5mx6kOfIQx3BQ9Ays5idKTLDBtTnFUHG+utRbZ+0F6c cp3HzpyWEEy5cOIc4/lZ+h5lnxkSSI6AYyCDPxquoBuR6l38AWcTEQLTxFPxz2Ryq+yC Hv4A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=hulmNl2ewJRdyvn9n9TGI54EM2AimjP9/757rqolLrY=; b=GGOHtJZrm7K5xIZB6zBoR+qHhevYM+V+7fzfWFJf99qlwiskAdpxayO0wLJqahkLKf 340Jyz1RcPMBQUovqlPUJNwDH+KI4EiJ+DgrjSwk6Y4SoIZbWjW/4kAzXplwJj9Hooxd Zx51mkcdQepsCRE/txDyHCaA5jViobuSf40yd3TWlz0S0dxhv1q7W8Vu9IkzRqVHCMFE yqottTktCyBd+n+DGqJt9/ZQlHP1NmwDr5hm3NKA1KD58Gd3MjDptju4LnfE+mLTBtQJ wkUNkFn66Oomsw8uPiq/WvNDlKhp7KrKNMtoC5jS5aPHrPdaSQIez7o6uHJ+gNo0wNnA yzxA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of 
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 6 Sep 2022 21:43:56 +0300
Message-Id: <20220906184402.119826-6-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 06/12] lavu/riscv: float vector multiply-accumulate with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  6 +++++
 libavutil/riscv/float_dsp_rvv.S  | 38 ++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 4135284c76..a1bb112ec7 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -24,11 +24,15 @@
 
 void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
                         int len);
+void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
+                                int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
+void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul,
+                                int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -38,10 +42,12 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
     if (flags & AV_CPU_FLAG_ZVE32F) {
         fdsp->vector_fmul = ff_vector_fmul_rvv;
+        fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
+            fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
         }
     }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 65c3a77b01..5a7d92abd6 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -36,6 +36,25 @@ func ff_vector_fmul_rvv, zve32f
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_fmac_scalar_rvv, zve32f
+NOHWF   fmv.w.x fa0, a2
+NOHWF   mv a2, a3
+
+1:      vsetvli t0, a2, e32, m8, ta, ma
+        slli t1, t0, 2
+        vle32.v v24, (a1)
+        add a1, a1, t1
+        vle32.v v16, (a0)
+        vfmacc.vf v16, fa0, v24
+        sub a2, a2, t0
+        vse32.v v16, (a0)
+        add a0, a0, t1
+        bnez a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x fa0, a2
@@ -71,6 +90,25 @@ func ff_vector_dmul_rvv, zve64d
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_dmac_scalar_rvv, zve64d
+NOHWD   fmv.d.x fa0, a2
+NOHWD   mv a2, a3
+
+1:      vsetvli t0, a2, e64, m8, ta, ma
+        slli t1, t0, 3
+        vle64.v v24, (a1)
+        add a1, a1, t1
+        vle64.v v16, (a0)
+        vfmacc.vf v16, fa0, v24
+        sub a2, a2, t0
+        vse64.v v16, (a0)
+        add a0, a0, t1
+        bnez a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x fa0, a2
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 6 Sep 2022 21:43:57 +0300
Message-Id: <20220906184402.119826-7-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 07/12] lavu/riscv: float vector multiplication-addition with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index a1bb112ec7..8539fe9ac5 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -28,6 +28,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
+void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
+                            const float *src2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
@@ -44,6 +46,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 5a7d92abd6..efbf12179f 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -73,6 +73,25 @@ NOHWF   mv a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) + (a3) [0..a4-1]
+func ff_vector_fmul_add_rvv, zve32f
+1:      vsetvli t0, a4, e32, m8, ta, ma
+        slli t1, t0, 2
+        vle32.v v8, (a1)
+        add a1, a1, t1
+        vle32.v v16, (a2)
+        add a2, a2, t1
+        vle32.v v24, (a3)
+        add a3, a3, t1
+        vfmadd.vv v8, v16, v24
+        sub a4, a4, t0
+        vse32.v v8, (a0)
+        add a0, a0, t1
+        bnez a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:      vsetvli t0, a3, e64, m8, ta, ma
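Note for readers of the series (not part of the patch): the loop in ff_vector_fmul_add_rvv follows the same strip-mining pattern as the other routines here, so a scalar C model may help when checking the pointer arithmetic. This is only a sketch under assumptions: `VL_MAX` and `fmul_add_model` are made-up names, and the chunk size stands in for whatever `vsetvli` grants on real hardware. Also note that `vfmadd.vv` rounds once, whereas the C expression below may round twice unless the compiler contracts it.

```c
#include <assert.h>

/* Hypothetical stand-in for the number of 32-bit elements that one
 * "vsetvli ..., e32, m8" grants per iteration; real hardware decides this. */
enum { VL_MAX = 8 };

/* Scalar model of the strip-mined loop:
 * dst[i] = src0[i] * src1[i] + src2[i] for i in [0, len). */
void fmul_add_model(float *dst, const float *src0, const float *src1,
                    const float *src2, int len)
{
    assert(len >= 0);
    while (len > 0) {
        /* vsetvli t0, a4, ...: take at most VL_MAX elements this round. */
        int vl = len < VL_MAX ? len : VL_MAX;

        /* vfmadd.vv v8, v16, v24: v8 = v8 * v16 + v24, element-wise. */
        for (int i = 0; i < vl; i++)
            dst[i] = src0[i] * src1[i] + src2[i];

        /* The assembly advances every pointer by t1 = vl * 4 bytes. */
        dst  += vl;
        src0 += vl;
        src1 += vl;
        src2 += vl;
        len  -= vl;   /* sub a4, a4, t0 */
    }
}
```

A length that is not a multiple of `VL_MAX` exercises the short final chunk, which is the case the `vsetvli` handshake exists to handle.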
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 6 Sep 2022 21:43:58 +0300
Message-Id: <20220906184402.119826-8-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 08/12] lavu/riscv: float vector sum-and-difference with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  2 ++
 libavutil/riscv/float_dsp_rvv.S  | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 8539fe9ac5..2165394585 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -30,6 +30,7 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                             const float *src2, int len);
+void ff_butterflies_float_rvv(float *v1, float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
@@ -47,6 +48,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
+        fdsp->butterflies_float = ff_butterflies_float_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index efbf12179f..1c3b08b94f 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -92,6 +92,24 @@ func ff_vector_fmul_add_rvv, zve32f
         ret
 endfunc
 
+// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
+func ff_butterflies_float_rvv, zve32f
+1:      vsetvli t0, a2, e32, m8, ta, ma
+        slli t1, t0, 2
+        vle32.v v16, (a0)
+        vle32.v v24, (a1)
+        vfadd.vv v0, v16, v24
+        vfsub.vv v8, v16, v24
+        sub a2, a2, t0
+        vse32.v v0, (a0)
+        add a0, a0, t1
+        vse32.v v8, (a1)
+        add a1, a1, t1
+        bnez a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:      vsetvli t0, a3, e64, m8, ta, ma
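Note for readers of the series (not part of the patch): the comment on ff_butterflies_float_rvv reads "(a0) = (a0) + (a1), (a1) = (a0) - (a1)", where the second (a0) is the value before the update. The assembly gets this by loading both inputs into v16/v24 before either store. A scalar C sketch of that in-place behaviour, with the hypothetical name `butterflies_model`, could look like this:

```c
#include <assert.h>

/* Scalar model of the in-place butterfly:
 * v1[i] becomes the sum and v2[i] the difference of the ORIGINAL pair. */
void butterflies_model(float *v1, float *v2, int len)
{
    assert(len >= 0);
    for (int i = 0; i < len; i++) {
        /* Both loads happen before either store (vle32.v v16 / v24). */
        float a = v1[i];
        float b = v2[i];

        v1[i] = a + b;   /* vfadd.vv v0, v16, v24 ; vse32.v v0, (a0) */
        v2[i] = a - b;   /* vfsub.vv v8, v16, v24 ; vse32.v v8, (a1) */
    }
}
```

Writing `v1[i]` first and then computing the difference from the updated `v1[i]` would be a different (and wrong) operation, which is why the temporaries matter.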
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 6 Sep 2022 21:43:59 +0300
Message-Id: <20220906184402.119826-9-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 09/12] lavu/riscv: float reversed vector multiplication with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 22 ++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 2165394585..1183460181 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -30,6 +30,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                             const float *src2, int len);
+void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
+                                const float *src1, int len);
 void ff_butterflies_float_rvv(float *v1, float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
@@ -48,6 +50,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
+        fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 1c3b08b94f..b376392294 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -92,6 +92,28 @@ func ff_vector_fmul_add_rvv, zve32f
         ret
 endfunc
 
+// (a0) = (a1) * reverse(a2) [0..a3-1]
+func ff_vector_fmul_reverse_rvv, zve32f
+        add t3, a3, -1
+        li t2, -4 // byte stride
+        slli t3, t3, 2
+        add a2, a2, t3
+
+1:      vsetvli t0, a3, e32, m8, ta, ma
+        slli t1, t0, 2
+        vle32.v v16, (a1)
+        add a1, a1, t1
+        vlse32.v v24, (a2), t2
+        sub a2, a2, t1
+        vfmul.vv v16, v16, v24
+        sub a3, a3, t0
+        vse32.v v16, (a0)
+        add a0, a0, t1
+        bnez a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
 func ff_butterflies_float_rvv, zve32f
 1:      vsetvli t0, a2, e32, m8, ta, ma
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 6 Sep 2022 21:44:00 +0300
Message-Id: <20220906184402.119826-10-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 10/12] lavu/riscv: float vector windowed overlap/add with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 35 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 1183460181..887706d899 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -28,6 +28,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
+void ff_vector_fmul_window_rvv(float *dst, const float *src0,
+                               const float *src1, const float *win, int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                             const float *src2, int len);
 void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
@@ -49,6 +51,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_window = ff_vector_fmul_window_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
         fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index b376392294..65daaa2d27 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -73,6 +73,41 @@ NOHWF   mv a2, a3
         ret
 endfunc
 
+func ff_vector_fmul_window_rvv, zve32f
+        // a0: dst, a1: src0, a2: src1, a3: window, a4: length
+        addi t0, a4, -1
+        add t1, t0, a4
+        slli t0, t0, 2
+        slli t1, t1, 2
+        add a2, a2, t0
+        add t0, a0, t1
+        add t3, a3, t1
+        li t1, -4 // byte stride
+
+1:      vsetvli t2, a4, e32, m4, ta, ma
+        slli t4, t2, 2
+        vle32.v v16, (a1)
+        add a1, a1, t4
+        vlse32.v v20, (a2), t1
+        sub a2, a2, t4
+        vle32.v v24, (a3)
+        add a3, a3, t4
+        vlse32.v v28, (t3), t1
+        sub t3, t3, t4
+        vfmul.vv v0, v16, v28
+        sub a4, a4, t2
+        vfmul.vv v8, v16, v24
+        vfnmsac.vv v0, v20, v24
+        vfmacc.vv v8, v20, v28
+        vse32.v v0, (a0)
+        add a0, a0, t4
+        vsse32.v v8, (t0), t1
+        sub t0, t0, t4
+        bnez a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) + (a3) [0..a4-1]
 func ff_vector_fmul_add_rvv, zve32f
 1:      vsetvli t0, a4, e32, m8, ta, ma
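Note for readers of the series (not part of the patch): ff_vector_fmul_window_rvv only carries a register-assignment comment, so the indexing is easiest to check against a scalar model. The sketch below is my reading of the assembly, written with 0-based indices; `fmul_window_model` is a hypothetical name. It assumes `dst` and `win` hold 2 * len elements while `src0` and `src1` hold len, which is what the initial address setup (t0 = dst + 2*len - 1, t3 = win + 2*len - 1, a2 = src1 + len - 1) implies.

```c
#include <assert.h>

/* Scalar model of the windowed overlap/add, for k in [0, len):
 *   dst[k]             = src0[k] * win[2*len-1-k] - src1[len-1-k] * win[k]
 *   dst[2*len-1-k]     = src0[k] * win[k]         + src1[len-1-k] * win[2*len-1-k]
 * src1 and the upper halves of win and dst are walked backwards, which the
 * assembly does with a byte stride of -4 (vlse32.v / vsse32.v). */
void fmul_window_model(float *dst, const float *src0, const float *src1,
                       const float *win, int len)
{
    assert(len >= 0);
    for (int k = 0; k < len; k++) {
        int j = 2 * len - 1 - k;          /* mirrored index into dst/win  */
        float s0 = src0[k];               /* v16: forward load of src0    */
        float s1 = src1[len - 1 - k];     /* v20: reversed load of src1   */
        float wi = win[k];                /* v24: forward load of window  */
        float wj = win[j];                /* v28: reversed load of window */

        dst[k] = s0 * wj - s1 * wi;       /* vfmul.vv + vfnmsac.vv -> v0  */
        dst[j] = s0 * wi + s1 * wj;       /* vfmul.vv + vfmacc.vv  -> v8  */
    }
}
```

As with the other multiply-accumulate routines, the vector instructions fuse the multiply and add, so low-order bits can differ from this two-step C arithmetic on inputs that are not exactly representable.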
ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id p27-20020a1709060e9b00b0076fcc543c5esi918205ejf.151.2022.09.06.11.45.18; Tue, 06 Sep 2022 11:45:18 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id CAE3E68BB59; Tue, 6 Sep 2022 21:44:16 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F345168BB18 for ; Tue, 6 Sep 2022 21:44:07 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 11130C00B7 for ; Tue, 6 Sep 2022 21:44:04 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 6 Sep 2022 21:44:01 +0300 Message-Id: <20220906184402.119826-11-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5753736.MhkbZ0Pkbq@basile.remlab.net> References: <5753736.MhkbZ0Pkbq@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 11/12] lavu/riscv: float vector dot product with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: l8DejMH8TFQP From: Rémi 
Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 21 +++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 887706d899..7c2fc10e99 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -35,6 +35,7 @@ void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); +float ff_scalarproduct_float_rvv(const float *v1, const float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -55,6 +56,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; + fdsp->scalarproduct_float = ff_scalarproduct_float_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 65daaa2d27..81bd0e510a 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -167,6 +167,27 @@ func ff_butterflies_float_rvv, zve32f ret endfunc +// a0 = (a0).(a1) [0..a2-1] +func ff_scalarproduct_float_rvv, zve32f + vsetvli zero, zero, e32, m8, ta, ma + vmv.s.x v8, zero + +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + add a0, a0, t1 + vle32.v v24, (a1) + add a1, a1, t1 + vfmul.vv v16, v16, v24 + sub a2, a2, t0 + vfredusum.vs v8, v16, v8 + bnez a2, 1b + + vfmv.f.s fa0, v8 +NOHWF fmv.x.w a0, fa0 + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv, zve64d 1: vsetvli t0, a3, e64, m8, ta, ma From patchwork Tue Sep 6 18:44:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37728
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp3444445pzh;
 Tue, 6 Sep 2022 11:45:36 -0700 (PDT)
X-Google-Smtp-Source: AA6agR4s1v/+voEgZDDMutnAywL3m0UU65le5o6DkHoDvv3WwtoFrtmYH9uBF+qaXw+JxXxhjp46
X-Received: by 2002:a05:6402:3485:b0:448:a1c8:d640 with SMTP id
 v5-20020a056402348500b00448a1c8d640mr32624753edc.279.1662489936728;
 Tue, 06 Sep 2022 11:45:36 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1662489936; cv=none; d=google.com;
 s=arc-20160816;
 b=fv/HT7Duf2kVD8BMoaRgm4XGeV7OjNgYWMFAUTgnlM1jXbPKRB0jEs/nXJRPBRx1H6
 dTIiIaK+UwMeWH73TXk28l8XuNW+ObZsvx9wK3+isx5F77qwZCx+MlGeiPa82gBjpNgk
 xlW7WQUzLQne1CRRMwDpUGDjxC9UGZvgvYJcOEFPpYXT6lJKM83IXEWHJmNcGevWJuXc
 9mwx8y3KZwrRTts8eXjY2S3tlOQfdl/GKImdMCNkzOXHIzo7dY199jKO+HFPwXE0YiJB
 S9xo1eRzq/gzSoQUFBgZcKFLFSIKvqJHcnamjdbgH+H6L8p1ZYLpR2RF/y1+jwsY9/sl
 88zw==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
 h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe
 :list-help:list-post:list-archive:list-unsubscribe:list-id
 :precedence:subject:mime-version:references:in-reply-to:message-id
 :date:to:from:delivered-to;
 bh=ydgkx3pGgrruUm+lh4mc5XmQ4yftpx/o5yzWRthxs4Y=;
 b=bPu4/FYDDvgeDtclkkFMGz6JSproPUF00O4W6eLUrjUNZilFPxTkP2hBX6brDbaiPc
 Ys4cO8kxN6mPJ7MkhiifQqX57eyhpqhwIc+YorzrTMGIQpMPimCwJ3R680nlwp/toSJ7
 EKI7PLjDNAkhu914PmhVp55CmwZkIRiXJLHS3fLo7eZeWibeXFlknDd7IwjpYcBQ/d31
 rkVUWeotJB1P4tKOBkKJ8PS4tPAmK4wUl/nrI6wFIhsCTWL8RrlaxuMvXDKlxUDZfMhp
 3Yn2bK+SDdrrSGHNNhrHfGzK8IPnbDZ3U+mC4wcoFatFIHOY6m/hAWLV35Ct8VuTcy3t
 E/CQ==
ARC-Authentication-Results: i=1; mx.google.com;
 spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates
 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Return-Path:
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100])
 by mx.google.com with ESMTP id
 sd18-20020a1709076e1200b0073ce636490esi10765607ejc.272.2022.09.06.11.45.36;
 Tue, 06 Sep 2022 11:45:36 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
 designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com;
 spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates
 79.124.17.100 as permitted sender)
 smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BF2A368BB65;
 Tue, 6 Sep 2022 21:44:18 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2455668BB1C
 for ; Tue, 6 Sep 2022 21:44:08 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1])
 by ursule.remlab.net (Postfix) with ESMTP id 3AA08C00B8
 for ; Tue, 6 Sep 2022 21:44:04 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 6 Sep 2022 21:44:02 +0300
Message-Id: <20220906184402.119826-12-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5753736.MhkbZ0Pkbq@basile.remlab.net>
References: <5753736.MhkbZ0Pkbq@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 12/12] lavu/riscv: fixed vector
 sum-and-difference with RVV
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: NrvGLHJFzGic

From: Rémi Denis-Courmont

---
 libavutil/fixed_dsp.c            |  4 +++-
 libavutil/fixed_dsp.h            |  1 +
 libavutil/riscv/Makefile         |  2 ++
 libavutil/riscv/fixed_dsp_init.c | 33 +++++++++++++++++++++++++++
 libavutil/riscv/fixed_dsp_rvv.S  | 38 ++++++++++++++++++++++++++++++++
 5 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 libavutil/riscv/fixed_dsp_init.c
 create mode 100644 libavutil/riscv/fixed_dsp_rvv.S

diff --git a/libavutil/fixed_dsp.c b/libavutil/fixed_dsp.c
index 154f3bc2d3..bc847949dc 100644
--- a/libavutil/fixed_dsp.c
+++ b/libavutil/fixed_dsp.c
@@ -162,7 +162,9 @@ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int bit_exact)
     fdsp->butterflies_fixed = butterflies_fixed_c;
     fdsp->scalarproduct_fixed = scalarproduct_fixed_c;
 
-#if ARCH_X86
+#if ARCH_RISCV
+    ff_fixed_dsp_init_riscv(fdsp);
+#elif ARCH_X86
     ff_fixed_dsp_init_x86(fdsp);
 #endif
 
diff --git a/libavutil/fixed_dsp.h b/libavutil/fixed_dsp.h
index fec806ff2d..1217d3a53b 100644
--- a/libavutil/fixed_dsp.h
+++ b/libavutil/fixed_dsp.h
@@ -161,6 +161,7 @@ typedef struct AVFixedDSPContext {
  */
 AVFixedDSPContext * avpriv_alloc_fixed_dsp(int strict);
 
+void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp);
 void ff_fixed_dsp_init_x86(AVFixedDSPContext *fdsp);
 
 /**
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
index 6bf8243e8d..0f2fcbd41d 100644
--- a/libavutil/riscv/Makefile
+++ b/libavutil/riscv/Makefile
@@ -1,3 +1,5 @@
 OBJS +=     riscv/cpu.o \
+            riscv/fixed_dsp_init.o \
+            riscv/fixed_dsp_rvv.o \
             riscv/float_dsp_init.o \
             riscv/float_dsp_rvv.o
diff --git a/libavutil/riscv/fixed_dsp_init.c b/libavutil/riscv/fixed_dsp_init.c
new file mode 100644
index 0000000000..08d4c4d9a7
--- /dev/null
+++ b/libavutil/riscv/fixed_dsp_init.c
@@ -0,0 +1,33 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/fixed_dsp.h"
+
+void ff_butterflies_fixed_rvv(int *v1, int *v2, int len);
+
+av_cold void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp)
+{
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_ZVE32X)
+        fdsp->butterflies_fixed = ff_butterflies_fixed_rvv;
+}
diff --git a/libavutil/riscv/fixed_dsp_rvv.S b/libavutil/riscv/fixed_dsp_rvv.S
new file mode 100644
index 0000000000..beb1b949f7
--- /dev/null
+++ b/libavutil/riscv/fixed_dsp_rvv.S
@@ -0,0 +1,38 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
+func ff_butterflies_fixed_rvv, zve32x
+1:      vsetvli     t0, a2, e32, m8, ta, ma
+        slli        t1, t0, 2
+        vle32.v     v16, (a0)
+        vle32.v     v24, (a1)
+        vadd.vv     v0, v16, v24
+        vsub.vv     v8, v16, v24
+        sub         a2, a2, t0
+        vse32.v     v0, (a0)
+        add         a0, a0, t1
+        vse32.v     v8, (a1)
+        add         a1, a1, t1
+        bnez        a2, 1b
+
+        ret
+endfunc
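
Reviewer note (not part of the patch): the strip-mined loop in ff_butterflies_fixed_rvv can be modelled in plain C as a sanity check. The sketch below is only an illustration under stated assumptions: butterflies_fixed_ref and SKETCH_VL are hypothetical names that do not exist in FFmpeg, and the fixed chunk size merely stands in for whatever vector length vsetvli grants on real hardware. Unsigned arithmetic is used so that overflow wraps the way vadd.vv and vsub.vv do.

```c
#include <assert.h>

/* Hypothetical chunk size standing in for the vector length that
 * vsetvli would return per iteration; real hardware picks its own. */
#define SKETCH_VL 8

/* Scalar model of the loop: v1[i] becomes v1[i] + v2[i] and v2[i]
 * becomes (old v1[i]) - v2[i], processed in chunks of at most SKETCH_VL. */
void butterflies_fixed_ref(int *v1, int *v2, int len)
{
    assert(len >= 0);

    while (len > 0) {
        /* models: vsetvli t0, a2, e32, m8, ta, ma */
        int vl = len < SKETCH_VL ? len : SKETCH_VL;

        for (int i = 0; i < vl; i++) {
            /* both loads happen before either store (vle32.v x2) */
            unsigned a = (unsigned)v1[i];
            unsigned b = (unsigned)v2[i];

            v1[i] = (int)(a + b); /* models vadd.vv + vse32.v (a0) */
            v2[i] = (int)(a - b); /* models vsub.vv + vse32.v (a1) */
        }

        /* advance pointers and decrement the remaining length */
        v1  += vl;
        v2  += vl;
        len -= vl;
    }
}
```

Calling butterflies_fixed_ref(v1, v2, len) on copies of the inputs and comparing against the assembly output, for lengths that are not a multiple of the chunk size, would exercise the tail handling that vsetvli provides implicitly.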