From patchwork Sun Sep 4 13:54:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37650 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093469pzh; Sun, 4 Sep 2022 06:56:08 -0700 (PDT) X-Google-Smtp-Source: AA6agR7a8RFOx9MyrIO2fzsuUwYuDfUvu2j+Uq3im+vphb3OSdl0mAygng6w1PTptQ7/sjQ/wixC X-Received: by 2002:a17:906:cc5d:b0:741:38a8:a50a with SMTP id mm29-20020a170906cc5d00b0074138a8a50amr27681717ejb.650.1662299768698; Sun, 04 Sep 2022 06:56:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299768; cv=none; d=google.com; s=arc-20160816; b=A+vbW6CvoxFivAr0Q86/F+jcqauT5ldH3oM7F3MpwC1zIvcdHRFjUWeko15udVUX7A iC9+Ap4oUKp2VkRJWFizNT6OSc8ghwN59gUyytqqTc61lYckDfjFz8TDNfd9h1L5f/Ln ptbN/xAR7EeHnS+CKcZfz0ZW+9Zf0qxY844bKlqeupn2c9hoqkgxbzW7YVCk5zT/3TrH uirxSMndzySTpQtbewzCmAscSXBKaEjG0VT0alBGD5NqARwd/+A+4qxluGNVshSMZ85V 3lziYbAzMipVQ62BlKLqIfQYD3ELxB0HdO23Av9fIdmANfDVomGaWTml1aXKa/bJhDTc Noew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=AQ1xsagXpofZH+cTBsQpuHYKiCFnlUSns+CIMjPxL9g=; b=tGCiHo5H1jWlx222LfwK97OAXsWwH/6xDLWFdyWkRVxx6kquui8Wt7ZhC2dS2FwTfw ER9CGklKa1nBe/bcT/vVHLpH5y8u6Gu62laweC4zBd5i4hUm0KiAElkT2zpOew8zNAmo ChH9UBoFo8B0DAepAZJo67LJ8iwnPTEwvRHbuiL96d3g6fJdCkhYJKN5Oc4UkWC9/RGn RC396LTBiNASAJ2/spepOxC6C2Wz46mXWO9wF95v5lVe34dyK/WVf3sawTHFaWEtSw4e EDvIsPXJ4BP6WqVlRueU1mb7p7JQZ329vEfLxFzn01b00NZtgE/e2OQlwOL0dAUoTB3j 6nLw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id dm21-20020a170907949500b0073da13bf4c3si6319976ejc.726.2022.09.04.06.56.08; Sun, 04 Sep 2022 06:56:08 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 605C768BB06; Sun, 4 Sep 2022 16:55:12 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 459C468BACF for ; Sun, 4 Sep 2022 16:55:03 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 5DDEBC006F for ; Sun, 4 Sep 2022 16:55:03 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:54:54 +0300 Message-Id: <20220904135503.116704-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 01/10] riscv: add CPU flags for the RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: BvIIOWi9YRun From: Rémi Denis-Courmont RVV defines a total of 12 different extensions: V, Zvl32b, Zvl64b, Zvl128b, Zvl256b, Zvl512b, Zvl1024b, Zve32x, Zve32f, Zve64x, Zve64f and Zve64d. At this stage, we don't expose the vector length extensions Zvl*, as the vector length is most commonly determined at run-time depending on the element size and the effective multipler. There are anyways no other run-time mechanisms defiend to determine the actual maximum vector length than to invoke VSETVL. Zve64f is equivalent to Zve32f plus Zve64x, so it is exposed as a convenience flag, but not tracked internally. Likewise V is the equivalent of Zve64d plus Zvl128b. Technically, Zve32f and Zve64x are both implied by Zve64d and both imply Zve32x, leaving only 5 possibilities (including no vector support), but we keep 4 separate bits for easy run-time checks as on other instruction set architectures. --- libavutil/cpu.c | 14 ++++++++++ libavutil/cpu.h | 6 +++++ libavutil/cpu_internal.h | 1 + libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 57 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 79 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..83bf513cf2 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -178,6 +180,18 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV +#define AV_CPU_FLAG_ZVE32F_M (AV_CPU_FLAG_ZVE32F | AV_CPU_FLAG_ZVE32X) +#define AV_CPU_FLAG_ZVE64X_M (AV_CPU_FLAG_ZVE64X | AV_CPU_FLAG_ZVE32X) +#define AV_CPU_FLAG_ZVE64D_M (AV_CPU_FLAG_ZVE64D | AV_CPU_FLAG_ZVE64F_M) +#define AV_CPU_FLAG_ZVE64F_M (AV_CPU_FLAG_ZVE32F_M | AV_CPU_FLAG_ZVE64X_M) +#define AV_CPU_FLAG_VECTORS AV_CPU_FLAG_ZVE64D_M + { "vectors", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VECTORS }, .unit = "flags" }, + { "zve32x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32X }, .unit = "flags" }, + { "zve32f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32F_M }, .unit = "flags" }, + { "zve64x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64X_M }, .unit = "flags" }, + { "zve64f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64F_M }, .unit = "flags" }, + { "zve64d", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64D_M }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..44836e50d6 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,12 @@ #define AV_CPU_FLAG_LSX (1 << 0) #define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V Vector extension +#define AV_CPU_FLAG_ZVE32X (1 << 0) /* 8-, 16-, 32-bit integers */ +#define AV_CPU_FLAG_ZVE32F (1 << 1) /* single precision scalars */ +#define AV_CPU_FLAG_ZVE64X (1 << 2) /* 64-bit integers */ +#define AV_CPU_FLAG_ZVE64D (1 << 3) /* double precision scalars */ + /** * Return the flags which specify extensions supported by the CPU. * The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..634f28bac4 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,6 +48,7 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..9e4cce5e8b --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,57 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "config.h" + +#if HAVE_GETAUXVAL +#include +#endif + +#define HWCAP_RV(letter) (1ul << ((letter) - 'A')) + +int ff_get_cpu_flags_riscv(void) +{ + int ret = 0; + + /* If RV-V is enabled statically at compile-time, check the details. */ +#ifdef __riscv_vectors + ret |= AV_CPU_FLAG_ZVE32X; +#if __riscv_v_elen >= 64 + ret |= AV_CPU_FLAG_ZVE64X; +#endif +#if __riscv_v_elen_fp >= 32 + ret |= AV_CPU_FLAG_ZVE32F; +#if __riscv_v_elen_fp >= 64 + ret |= AV_CPU_FLAG_ZVE64F; +#endif +#endif +#endif + +#if HAVE_GETAUXVAL + const unsigned long hwcap = getauxval(AT_HWCAP); + + /* The V extension implies all subsets */ + if (hwcap & HWCAP_RV('V')) + ret |= AV_CPU_FLAG_ZVE32X | AV_CPU_FLAG_ZVE64X + | AV_CPU_FLAG_ZVE32F | AV_CPU_FLAG_ZVE64D; +#endif + + return ret; +} From patchwork Sun Sep 4 13:54:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37651 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093540pzh; Sun, 4 Sep 2022 06:56:17 -0700 (PDT) X-Google-Smtp-Source: AA6agR4GGOH3gyiuS9zHCzF4FJxI3r0rX9tVHutnQ1+FesttnbXdY4lqSf4bB3iyY0aRA9+1736F X-Received: by 2002:a17:907:2cd3:b0:741:550e:17ea with SMTP id hg19-20020a1709072cd300b00741550e17eamr25277782ejc.595.1662299776983; Sun, 04 Sep 2022 06:56:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299776; cv=none; d=google.com; s=arc-20160816; b=pCxp6b7MYcfvpmTXTVp7Xh1n1vFsjIm+icFjfuO3pqxsYo+G7iEzmHqwm3EoJwdU75 XFRASJcmLvNwEwCduGMM4X3hz883KrD1zkCDl/rhmoOy1d++CMQM+CzsZdnAcd833FHX Gxb67Hztd0Giix58ud4fQJNa+SS5F59QRa4peqTcirYScjxGhHOb7AV+OR2S3IBhK89i IVpFrdAusqW8zRJTTdoU3rTGoZR+UYPrVBjTlRXFTT/zFn3x0DJsvTyWqR419sQ+xfeD VMvId+rNsJ470c3PkM8fP0dwEgYFZ83R5LoPkOmw98mlP3j42mEDQsfi5jJWE4Me6vlc M4TA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=15KVPaap7MtMEZqLUc60kQT5ens4XOM7+MOoXRyWGdc=; b=ypne0020ROzc7nrxPSSNzScaU0p/4wai/T/CX8OVO3GFkWE2KDxeF1WmsGkYyUZsgA hTRHhiG/2hS9b/05L7apjAXLS8JachBvdnvOSTvSV8UI+LTDL3cOloIvl6uyrGCmi5Vl AhjY/OROYULIJJZGdBSUZ8z5jN+G715sTov266G90yb1etSrbeYaUXu/oW3FEyEIZKKg tgKUGsb1ev329A96nY6PvCW0LhqvHU+qZ/bGk3VU9LruJew5colVaOoGGMVmJwLe9mOA o8F8eMoKhlu0nE0439B6Yt8hHBujyvqAdYzIRTARWwVZOIqHzRyiY4jRRQPIp81JoQ/7 I7fg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id v11-20020a1709067d8b00b00741521e9a3esi5659499ejo.235.2022.09.04.06.56.16; Sun, 04 Sep 2022 06:56:16 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 60F6C68BB0B; Sun, 4 Sep 2022 16:55:13 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4593768BAC8 for ; Sun, 4 Sep 2022 16:55:03 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 8B6EDC0070 for ; Sun, 4 Sep 2022 16:55:03 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:54:55 +0300 Message-Id: <20220904135503.116704-2-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 02/10] riscv: initial common header for assembler macros X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 7khkjO5Yi829 From: Rémi Denis-Courmont --- libavutil/riscv/asm.S | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) create mode 100644 libavutil/riscv/asm.S diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S new file mode 100644 index 0000000000..31001b8bdb --- /dev/null +++ b/libavutil/riscv/asm.S @@ -0,0 +1,33 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + + .macro func sym + .text + .align 2 + + .global \sym + .type \sym, %function + \sym: + + .macro endfunc + .size \sym, . - \sym + .purgem endfunc + .endm + .endm From patchwork Sun Sep 4 13:54:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37652 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093589pzh; Sun, 4 Sep 2022 06:56:25 -0700 (PDT) X-Google-Smtp-Source: AA6agR7cELv+jfSfHu7okl3R3kZFGyvP6Mce9SeruwDMCHyA6w4RJimyGsre/4oWO2+dRw5naKT4 X-Received: by 2002:a05:6402:40b:b0:44e:6c5c:441b with SMTP id q11-20020a056402040b00b0044e6c5c441bmr2287000edv.223.1662299785293; Sun, 04 Sep 2022 06:56:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299785; cv=none; d=google.com; s=arc-20160816; b=DserELjfBD2+cPkfh40PqR1C0MLJ6KR+t+HCakukcgn0vnqjHgyidiicm7OuXVJ+bD osUG2LLn0pbFtoQGMjm6rml6a3ZdSbKFtly/El+98+AxxJLC8XBvBiDFp/EPInSddpXH 9B2LGYnWjz+MfRrilxSE+gJv7ALAkrQdkNZzqQRj5AeXEkhpdtoqINvZTykDmQ25Vhdb Hu4gVHRVcsTNLKHacuAzl6uzq0Tywae4S4kRkKCN/RM2BUfdV1o8mCU79t8OmVN34cjG 7NQ7rSySo4DisVqs38ro9g0oNILHJD76fMcJRNNsGHpM0vEdF3etBOzX8V4ZlEuypT2v wtrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=AnMe5zSv9gdBgR14vgPKck6aCRgr9s1/NDrDSny2SAY=; b=q3Qd6b1Bv9zjMZUDlfVcnpwl0MEjOMf4KhYEbNtFcu2E2cQRNWEZ1m/I5w7hxLW3+u ZtYhcWlQtyomlPL/bV1iNGqqcSUQ4VMEyGhKlIJZYAQrlFsGyFZxHxpKy+txG+iLm2Kz P8QmTeRqevMXNM0E23ZMFZVmMCtnlkuTjlUl9iUQJDrYSLz2Rd67BeHkKdyr8uOlshq4 TPozPXNEoMFBUr3iuCTBAwwnM/WbcH8+oANJPHp40z/IY2lTWXp7F9D+E/HZHYnITtE9 PsJvkW2QuhxfcNyX4epYm7pvmQB5WjpbkD9PzkK8jZdqkI5yN2Yf3H1597YtBSFzJnf3 DPPg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id x19-20020a1709064bd300b0073dced7204bsi4920617ejv.767.2022.09.04.06.56.25; Sun, 04 Sep 2022 06:56:25 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 672BE68BB12; Sun, 4 Sep 2022 16:55:14 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 47CCA68BAD2 for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id B8E17C00AC for ; Sun, 4 Sep 2022 16:55:03 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:54:56 +0300 Message-Id: <20220904135503.116704-3-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 03/10] riscv: float vector-scalar multiplication with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: +FPdyti5BCRC From: Rémi Denis-Courmont This is based on existing code from the VLC git tree with two minor changes to account for the different function prototypes. --- libavutil/float_dsp.c | 2 ++ libavutil/float_dsp.h | 1 + libavutil/riscv/Makefile | 4 ++- libavutil/riscv/float_dsp_init.c | 41 +++++++++++++++++++++ libavutil/riscv/float_dsp_rvv.S | 62 ++++++++++++++++++++++++++++++++ 5 files changed, 109 insertions(+), 1 deletion(-) create mode 100644 libavutil/riscv/float_dsp_init.c create mode 100644 libavutil/riscv/float_dsp_rvv.S diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c index 8676c8b0f8..742dd679d2 100644 --- a/libavutil/float_dsp.c +++ b/libavutil/float_dsp.c @@ -156,6 +156,8 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact) ff_float_dsp_init_arm(fdsp); #elif ARCH_PPC ff_float_dsp_init_ppc(fdsp, bit_exact); +#elif ARCH_RISCV + ff_float_dsp_init_riscv(fdsp); #elif ARCH_X86 ff_float_dsp_init_x86(fdsp); #elif ARCH_MIPS diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h index 9c664592bd..7cad9fc622 100644 --- a/libavutil/float_dsp.h +++ b/libavutil/float_dsp.h @@ -205,6 +205,7 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len); void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp); void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp); void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict); +void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp); void ff_float_dsp_init_x86(AVFloatDSPContext *fdsp); void ff_float_dsp_init_mips(AVFloatDSPContext *fdsp); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile index 1f818043dc..6bf8243e8d 100644 --- a/libavutil/riscv/Makefile +++ b/libavutil/riscv/Makefile @@ -1 +1,3 @@ -OBJS += riscv/cpu.o +OBJS += riscv/cpu.o \ + riscv/float_dsp_init.o \ + riscv/float_dsp_rvv.o diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c new file mode 100644 index 0000000000..279412c036 --- /dev/null +++ b/libavutil/riscv/float_dsp_init.c @@ -0,0 +1,41 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavutil/float_dsp.h" + +void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, + int len); + +void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul, + int len); + +av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_ZVE32F) { + fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + + if (flags & AV_CPU_FLAG_ZVE64D) + fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv; + } +} diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S new file mode 100644 index 0000000000..98d06c6d07 --- /dev/null +++ b/libavutil/riscv/float_dsp_rvv.S @@ -0,0 +1,62 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "asm.S" + + .option arch, +v + +// (a0) = (a1) * fa0 [0..a2-1] +func ff_vector_fmul_scalar_rvv +#if defined (__riscv_float_abi_soft) + fmv.w.x fa0, a2 + mv a2, a3 +#endif + +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + add a1, a1, t1 + vfmul.vf v16, v16, fa0 + sub a2, a2, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc + +// (a0) = (a1) * fa0 [0..a2-1] +func ff_vector_dmul_scalar_rvv +#if defined (__riscv_float_abi_soft) || defined (__riscv_float_abi_single) + fmv.d.x fa0, a2 + mv a2, a3 +#endif + +1: vsetvli t0, a2, e64, m8, ta, ma + slli t1, t0, 3 + vle64.v v16, (a1) + add a1, a1, t1 + vfmul.vf v16, v16, fa0 + sub a2, a2, t0 + vse64.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc From patchwork Sun Sep 4 13:54:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37643 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093116pzh; Sun, 4 Sep 2022 06:55:08 -0700 (PDT) X-Google-Smtp-Source: AA6agR7PRzgHNvqsbrneEO0hNXkZqjMhg3schfaYMgvaWi76G9wMqv6pwgmgyfxKthFZjjMLtK+d X-Received: by 2002:a05:6402:d69:b0:448:4c7a:cb6a with SMTP id ec41-20020a0564020d6900b004484c7acb6amr30429714edb.18.1662299708328; Sun, 04 Sep 2022 06:55:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299708; cv=none; d=google.com; s=arc-20160816; b=AtBrzA9D/SVP9/mwlj9BcCSCYFl2zElovc5XXaXRtm979nTJMGk39WeTEIq1y0PntB BX6WIudRjOERGHeOBgMxAY1ERIz/2gTzeDVvMmFMrH+jG2XjoHP1wfUm2WZccwTp1rqX FiaXR1ctJlov/tMUp+F/y8svY0yBTFa+EYtpOpzgAUkZlaZymS169pOXfudHBKmjTV5D p91pCKlP9v2cUgLbGrJkUPP4h44oLKTZmy/oyvpwrnYakixQpLGn/RXN3PwXeDGVuOY4 00J+7gq2KYUGDMAZf+QCUvOOtjio2JT7NMVt/dhY6V1Eu/Bek7GD5FBVE0HAI9FHHs0z GQ3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=mTH3CsHGYJDNnM3laiVZqf4dJI64VycbSUqHexjxfdo=; b=ZrxoCc2W+kCJaLJpoSlf8Hy8ovigZ72kj5RpMzTQ9mi8RTeyUxP6uc4eLk4DUDUp7+ GMpJ1Vo5NXlq7tsBDQo2sZiKm544NbksrgtIu8b5oHuZcFh4qTMGUe7Eg4jYXO7RBXOU yYIT3N+3zR3ObAFqFCNjXLjHHFaoLYHqOh3XATMc9a+6nnBL0vcCyUctB/0l4oQ9HlTY UtzAhFUuChkVbejL38WQBAoN170tafJtwQbKHpFWMnTKZaYUdJDg5ykceL1jOaWZZJYy hKR5ZT7nfF4xaIRcFoYhcJE2RJ1bZwhZexfWZldJq7+cfy/5gH2GZU6pQSomlapZVcOs WEYQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id e21-20020a170906375500b0073d7d0aaa16si4762908ejc.226.2022.09.04.06.55.07; Sun, 04 Sep 2022 06:55:08 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8A69268BAB4; Sun, 4 Sep 2022 16:55:05 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3F2C168BAA3 for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id E6B92C00AD for ; Sun, 4 Sep 2022 16:55:03 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:54:57 +0300 Message-Id: <20220904135503.116704-4-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 04/10] riscv: float vector-vector multiplication with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: SIId+HP4kMMr From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 9 ++++++++- libavutil/riscv/float_dsp_rvv.S | 34 ++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+), 1 deletion(-) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 279412c036..4135284c76 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -22,9 +22,13 @@ #include "libavutil/cpu.h" #include "libavutil/float_dsp.h" +void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1, + int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); +void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, + int len); void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul, int len); @@ -33,9 +37,12 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) int flags = av_get_cpu_flags(); if (flags & AV_CPU_FLAG_ZVE32F) { + fdsp->vector_fmul = ff_vector_fmul_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; - if (flags & AV_CPU_FLAG_ZVE64D) + if (flags & AV_CPU_FLAG_ZVE64D) { + fdsp->vector_dmul = ff_vector_dmul_rvv; fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv; + } } } diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 98d06c6d07..15c875f9d2 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -21,6 +21,23 @@ .option arch, +v +// (a0) = (a1) * (a2) [0..a3-1] +func ff_vector_fmul_rvv +1: vsetvli t0, a3, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + add a1, a1, t1 + vle32.v v24, (a2) + add a2, a2, t1 + vfmul.vv v16, v16, v24 + sub a3, a3, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_fmul_scalar_rvv #if defined (__riscv_float_abi_soft) @@ -41,6 +58,23 @@ func ff_vector_fmul_scalar_rvv ret endfunc +// (a0) = (a1) * (a2) [0..a3-1] +func ff_vector_dmul_rvv +1: vsetvli t0, a3, e64, m8, ta, ma + slli t1, t0, 3 + vle64.v v16, (a1) + add a1, a1, t1 + vle64.v v24, (a2) + add a2, a2, t1 + vfmul.vv v16, v16, v24 + sub a3, a3, t0 + vse64.v v16, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_dmul_scalar_rvv #if defined (__riscv_float_abi_soft) || defined (__riscv_float_abi_single) From patchwork Sun Sep 4 13:54:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37644 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093189pzh; Sun, 4 Sep 2022 06:55:17 -0700 (PDT) X-Google-Smtp-Source: AA6agR4RUH2w64Hw4wF95OXeOaKUd6RhSzqvCi0qALglBlYGRgNJDcJ3LzpkcISEBmS4C4nminVF X-Received: by 2002:a17:907:968f:b0:742:1244:b02 with SMTP id hd15-20020a170907968f00b0074212440b02mr17898232ejc.496.1662299717518; Sun, 04 Sep 2022 06:55:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299717; cv=none; d=google.com; s=arc-20160816; b=K/GHAwYycGBATpX/t5fz4OZPoEOlBmcKKELTDZb06wiDv3XUMqTrKCd4yT6GUrFX3O RwICtx+PL5IjWXcHJVad9CbZ1AaB/gtIjOUA00qhp5rze/PxZcqlh4L3FuwxmNpE+dEX hrmWXIWpJ5n+yN/I4y29mh7N5ouun4a4wAeBreStNVyx6IHFY5/hg8yev6PpHXLuaekL 6KhcNKFZwECeTHUUSX6l4wjhBZgbfLzE8niFQdUK8d3oh4J+DedTEPC1gW4uG+xVh+yb bPkUhbsKC3J9EWer6tpj0YtdmcN7Hb4XDa8CsI0939hrKnoKPcLP8JTnzgLFsQpJ+/cs Ozpg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=MnqgTZuOheLzdLn7T5fJBEYmdblzgM7FFqZiQOCZvcY=; b=ppUkYC21KSvDpJNjHUVICKRvsMNbbNxD7iR2IM4x8CK+GUc/VFDOk1UC73ATn0O7RQ 0iwf5jt5TFzA78YTxD0jQXPWAAuiKDo2XNcETj+X/kCNSFZCIxxfqOdedATJVg05ya1V aiOb7iuvB0OLj6NaFGiBkzTG6zfgUuLsqhf4IozdGQGjqj+wFjoBGgOdQrZ/9BdhD43I euNUBCnyOQ9T3uLyECQUuaKhHFM4g+jqyDMEScWHkMLsG6Gn3aN0DlmVA2JpZDQXJVDv EQ6vLYie/0Qp4U78i+bMZWHF+chSWwfuDqKyzT759ZZ3dXUhO8+56J7yW/9nmCeaqd6R iX0Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id bj15-20020a170906b04f00b0073da846c2c1si4927461ejb.524.2022.09.04.06.55.17; Sun, 04 Sep 2022 06:55:17 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6FB1E68BAC8; Sun, 4 Sep 2022 16:55:06 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 62CE868BAAA for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 1F895C00AE for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:54:58 +0300 Message-Id: <20220904135503.116704-5-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 05/10] riscv: float vector multiply-accumulate with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 3FHPhSwK+UW9 From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 6 +++++ libavutil/riscv/float_dsp_rvv.S | 42 ++++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 4135284c76..a1bb112ec7 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -24,11 +24,15 @@ void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1, int len); +void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul, + int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); +void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul, + int len); void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul, int len); @@ -38,10 +42,12 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) if (flags & AV_CPU_FLAG_ZVE32F) { fdsp->vector_fmul = ff_vector_fmul_rvv; + fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; + fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv; fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv; } } diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 15c875f9d2..8adfa6085c 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -38,6 +38,27 @@ func ff_vector_fmul_rvv ret endfunc +// (a0) += (a1) * fa0 [0..a2-1] +func ff_vector_fmac_scalar_rvv +#if defined (__riscv_float_abi_soft) + fmv.w.x fa0, a2 + mv a2, a3 +#endif + +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v24, (a1) + add a1, a1, t1 + vle32.v v16, (a0) + vfmacc.vf v16, fa0, v24 + sub a2, a2, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_fmul_scalar_rvv #if defined (__riscv_float_abi_soft) @@ -75,6 +96,27 @@ func ff_vector_dmul_rvv ret endfunc +// (a0) += (a1) * fa0 [0..a2-1] +func ff_vector_dmac_scalar_rvv +#if defined (__riscv_float_abi_soft) || defined (__riscv_float_abi_single) + fmv.d.x fa0, a2 + mv a2, a3 +#endif + +1: vsetvli t0, a2, e64, m8, ta, ma + slli t1, t0, 3 + vle64.v v24, (a1) + add a1, a1, t1 + vle64.v v16, (a0) + vfmacc.vf v16, fa0, v24 + sub a2, a2, t0 + vse64.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_dmul_scalar_rvv #if defined (__riscv_float_abi_soft) || defined (__riscv_float_abi_single) From patchwork Sun Sep 4 13:54:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37645 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093232pzh; Sun, 4 Sep 2022 06:55:26 -0700 (PDT) X-Google-Smtp-Source: AA6agR5hKUEl7iCQa28qRpTwvDgadS8pza0Hm8rlpMb8KZgJ49IE1UxnjWOILCnXNglurADPRx4a X-Received: by 2002:a05:6402:120d:b0:448:e049:c665 with SMTP id c13-20020a056402120d00b00448e049c665mr20440729edw.70.1662299725941; Sun, 04 Sep 2022 06:55:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299725; cv=none; d=google.com; s=arc-20160816; b=gdPMmAC/t04qnonJ+m6ga5rX6bhywfyzAayv00VqdTctQjCFh37pmm2mp/AjopXzhG 4N4y5FhFYzX5POeLu+UrbjZBODzOYzIKiBYB80ZVm+1owDZzf79qk5krMV8ePeVaa0Dt XqXjLWpN3iT57kiAdP7JHtM/BNwapr0xMoPiNC37im2gYWLqVjqOw59ZOeGXEo7tQHLt s43MYnOARaB/OgKE10rKu8YlICduevkt5nGiJM0biOYmtMr6Wz/YoAWkjxHM2HSoAwtk nrzT5aIewZixkmukrF/wnvGLI2YOQ20hMbwXZsvSaXbs0IKlqWO4afEYvDe4Yrp2y0Vp nLFA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=8GnB9SRkDGV0aVUNVAW+8K1Vbt8H9z/+vjivXuXeeEc=; b=pQdFnA9O3bkMnRNrucmkY9r3/5iPwf7A7EoJv9UEki5/xBT3WTIiJjwW3jtKl9ilVz Ae+gVbHgIE3/AW+kC99s/0Ot0G4FSH9Shat5Y6ALCBF+pZmOu4rXvSNm7GBQpvqLw2Hi uXZa4u9dNFb26jXlnmrgmePk7pG+aCUtYEhJL69XHVlPZDbwh6OqgrbNIRGKMKvF24GT 4wbsC+PettVdpgrlF0XED1OCxWYC/Gokj1ntcdcpD4xq3WJPyqEzZzHCyWJbhQyABCb/ br+FPyd8gP5N7dPx2trIbcuGbszO+F1jwKcmgRYMoU4B0VwnmuyMl0JGU/gW09ympC0j UM7A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id sc18-20020a1709078a1200b0074133918ba1si6668485ejc.331.2022.09.04.06.55.25; Sun, 04 Sep 2022 06:55:25 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 78B5F68BAB0; Sun, 4 Sep 2022 16:55:07 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8F98568BAAA for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 4C65AC00AF for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:54:59 +0300 Message-Id: <20220904135503.116704-6-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 06/10] riscv: float vector multiplication-addition with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: u2FV4DlNnrfL From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 19 +++++++++++++++++++ 2 files changed, 22 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index a1bb112ec7..8539fe9ac5 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -28,6 +28,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); +void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, + const float *src2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -44,6 +46,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul = ff_vector_fmul_rvv; fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 8adfa6085c..27190c21ff 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -79,6 +79,25 @@ func ff_vector_fmul_scalar_rvv ret endfunc +// (a0) = (a1) * (a2) + (a3) [0..a4-1] +func ff_vector_fmul_add_rvv +1: vsetvli t0, a4, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v8, (a1) + add a1, a1, t1 + vle32.v v16, (a2) + add a2, a2, t1 + vle32.v v24, (a3) + add a3, a3, t1 + vfmadd.vv v8, v16, v24 + sub a4, a4, t0 + vse32.v v8, (a0) + add a0, a0, t1 + bnez a4, 1b + + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv 1: vsetvli t0, a3, e64, m8, ta, ma From patchwork Sun Sep 4 13:55:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37646 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093265pzh; Sun, 4 Sep 2022 06:55:34 -0700 (PDT) X-Google-Smtp-Source: AA6agR4hbadQzskFDEAcA0ZoYe4PYDJjmer57Aw8RJSXhGOPpohVsPWz9u3HNcXK1VLrHuUjT4Z0 X-Received: by 2002:a17:907:3e07:b0:741:7db9:eb74 with SMTP id hp7-20020a1709073e0700b007417db9eb74mr23662687ejc.83.1662299734056; Sun, 04 Sep 2022 06:55:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299734; cv=none; d=google.com; s=arc-20160816; b=BhDvXrQh440NTNA9gNjgHekI9POBZbVDGqA3MDJNEAs0mh6whr8GH0E5QGpCFC/hmm RloMkEb9lcbxbH9i24++dR0OMJSyJkZ8qs1tzUvwOvdJNTzXET7z2dcOOC8gCwrQN9m7 rdD6+mipuTUv9QutZCYZUGcPnxUjwnkmDmTH5eta2OeY6ieL2DZTciiRzI0S4ICrfwG3 Vy9Fls7Lg7srgFGslwv2yBkMEU+XHgZfVcltMfVhiJ41kvEdW33Ck5UUSaQqT6g29VPw dp8tOWX42DSSBpBiIg8/vym1es6cuHC2o3OwaroiqirzlbyHG9CPudhGOh2htgwV4oBn LIRg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=2y9TGIhqDkFBImbF4f7P6zj0QylSAws/7ZnVI9KOkk0=; b=jYidqL25ItUoANQaQLH5Vz9c6awKM8ZPkJ+PdZ+NdjqEUxCUDjyKF6ZCnRVVA7bJYb ZdJU09oFFP5TplmR1r4GmqxIx1sNbJ9nBkhlO3jPID35zGZWR89pHHnJTmJgFNARRRUz 3jq7hyjQ+iZvlgnvPBI+DEF96c7r4qWVaoKZuWWpd47LG8UBB3COV52zouvXEbA9m4h0 i2NJxru9rA5nqKdJVhsZ4LRUBl/maGVANs2oWRsteXMl1w6mE/dCtIwVPm9gJ42Lw6g3 k9Vo6DsuYWVYeArMyZqVhQWcGaLO+Q1SVOakyGqzvM+hTSv+5CvcRgcuGHM9rGLa3yBO 1pow== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id w10-20020a056402268a00b00448da245f3bsi6717400edd.6.2022.09.04.06.55.33; Sun, 04 Sep 2022 06:55:34 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8846B68BAAA; Sun, 4 Sep 2022 16:55:08 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B994E68BA9F for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 791E6C00AD for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:55:00 +0300 Message-Id: <20220904135503.116704-7-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 07/10] riscv: float vector sum-and-difference with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: iaUNBXwXJfyT From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 18 ++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 8539fe9ac5..2165394585 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -30,6 +30,7 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); +void ff_butterflies_float_rvv(float *v1, float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -47,6 +48,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; + fdsp->butterflies_float = ff_butterflies_float_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 27190c21ff..61beb868b0 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -98,6 +98,24 @@ func ff_vector_fmul_add_rvv ret endfunc +// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] +func ff_butterflies_float_rvv +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + vle32.v v24, (a1) + vfadd.vv v0, v16, v24 + vfsub.vv v8, v16, v24 + sub a2, a2, t0 + vse32.v v0, (a0) + add a0, a0, t1 + vse32.v v8, (a1) + add a1, a1, t1 + bnez a2, 1b + + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv 1: vsetvli t0, a3, e64, m8, ta, ma From patchwork Sun Sep 4 13:55:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37647 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093318pzh; Sun, 4 Sep 2022 06:55:43 -0700 (PDT) X-Google-Smtp-Source: AA6agR5aVlr/INz2KQKppkNiJeY4nCjlcs3GTwS0E4U8aLS8x/DM8LrkrlQYuYvAQeNv7NS1fg8R X-Received: by 2002:a17:906:eec9:b0:73d:c369:690f with SMTP id wu9-20020a170906eec900b0073dc369690fmr33474663ejb.767.1662299743050; Sun, 04 Sep 2022 06:55:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299743; cv=none; d=google.com; s=arc-20160816; b=ORYKO4NkFkZdaBMsiKh9SPUg1W1K+1NQP+J+4bHtvHAv0bfvSry3qTi3Laz/oBcnn1 flV3xD+xATz5f0Kce0zvDQA0axZf48IeD/8ALIjEDf8jAIepPJiHviO7PrsDMJsgHG4y h/cr259XIhBop0IgLLT90F+1ZaJD9WgkJ+pEEqTNhW3neJVczHqgww6WHaXzc4YiOl6O 32BZ4KFSmGZP03h165AF6fc1PGHX24ULKBdKILfiaCIgzJISM5X976tTW7L/c0Ruyorn FbpTM9piYnYANonbr7qXLMSzoi5SG9Oc5A+PZs/sr4nc91V3Fb0zxfcW2ai4nc/Ko7kG gRVg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ZjuQcdgZS+VV3c6QHtS6bnO0EgWIx0/XnNi7QE218vY=; b=C6hvhjNj/nZF914rkXwEEDevKXklMg23w2xzj9AIJv4VteXtOsqNL9JcoXdIuFeY1B Rm1jhIADcw6Cps2OalqlGGOYFuWljVoGJr3m6CX2ub2Uo1qspzRoElgCvyneDAS45YCQ jDcvV+x6GXUPME9rNs+6X68Fr77fKSvYi7ZQ81j2GqH/6baVSEA1K0nEijg9LKxXuiuy ZM32TCxhG5e/vXSSriit0ohHbaYXT6GgFwv320ZlBRgTl8bpaV2GfXyAjvz1WFEP6HYw 0JHN4mPUWu2swufb1+ChHD0+Q383WL/NX7ngnd9iYvdbrLYRNPzczy2/MqSNax8QgG/0 FXww== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id d4-20020a056402144400b0044e85e75120si423532edx.84.2022.09.04.06.55.42; Sun, 04 Sep 2022 06:55:43 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8A8D468BACA; Sun, 4 Sep 2022 16:55:09 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id D517E68BAAE for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id A626AC00AE for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:55:01 +0300 Message-Id: <20220904135503.116704-8-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 08/10] riscv: float reversed vector multiplication with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: AjPzfA+f9QpC From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 22 ++++++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 2165394585..1183460181 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -30,6 +30,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); +void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, + const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, @@ -48,6 +50,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; + fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 61beb868b0..e3738ef7c5 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -98,6 +98,28 @@ func ff_vector_fmul_add_rvv ret endfunc +// (a0) = (a1) * reverse(a2) [0..a3-1] +func ff_vector_fmul_reverse_rvv + add t3, a3, -1 + li t2, -4 // byte stride + slli t3, t3, 2 + add a2, a2, t3 + +1: vsetvli t0, a3, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + add a1, a1, t1 + vlse32.v v24, (a2), t2 + sub a2, a2, t1 + vfmul.vv v16, v16, v24 + sub a3, a3, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc + // (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] func ff_butterflies_float_rvv 1: vsetvli t0, a2, e32, m8, ta, ma From patchwork Sun Sep 4 13:55:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37648 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093359pzh; Sun, 4 Sep 2022 06:55:51 -0700 (PDT) X-Google-Smtp-Source: AA6agR7i4xvb/LydZg/AYP+oRD9BC4hZZFcIiEeVLtA4wyl1+VkDv4jxz3MqEgVJYszUaKsugWUZ X-Received: by 2002:a05:6402:3484:b0:448:cc83:2bff with SMTP id v4-20020a056402348400b00448cc832bffmr22436621edc.65.1662299751484; Sun, 04 Sep 2022 06:55:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299751; cv=none; d=google.com; s=arc-20160816; b=ibo6aCEldOGemGlDsA/xIs+XAV4cjvHtaci2bhtUBCTTgGrVdCz7iDnvL1JDL4GjDh 9tvLSLpbzohEXNZquRhp+n+wXvvyAgGucLfA4vmqBcZ0lt7/jlQ4Sz6gMo2Vs0CtK1P+ TMr6ubJocJp2rPpmPj1cPz/mEH4B+HAveSl7XV754+FL3VlGBsXSYLPFwO2Pvj4JlM0S jrkwhfbUdqzHH9CyDRnHZGO9TqeR2dmuA9PpGI1swX6RAB1cbhDzd3IiOC8CaX1RCLhy nn90gGgACNZ5T+1OrhrSyQ0eDdFEaZoJ+c/Z0LLl8S6jprwuQa70LoKZURTT+5XEj+3h IjLg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=6Sj7W7mvmmu4A84bq1Cox9mUGWaHSOoxWgDubCg6dgc=; b=STscrnHMWgbSUkkPXaT/1fgz0YmMpmnkZt1liHiSD14hSREAH8R+eNnpiKkbZV+lnx Lh+MrjJsLFB1EKzCvT0OkeP5VO0XUfkN+QnPNDWVT2NWLxS4ZCuqy+qQBKLb461zPHtM xZ32r5ltyJfAxi44ANcZFly9smzgWM5qs1U/R513cejuM3uoMqEdT6YV/lXLopm7DqCx D4J+0nTZP12ks1TM0K6mLGso8+w3LoStMTnDoeIF0YBMfh0MaIOgqVG6AIJWANIoGnYV id7aVSmOj7OfJBMWNRH7jjn1pL7uhCZ/4jXaw5fA9v5ZNJTqKDFK7POlAqUX+WDtK2K8 FqrA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id d4-20020a056402144400b0044e85e75120si423692edx.84.2022.09.04.06.55.51; Sun, 04 Sep 2022 06:55:51 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 7DC9868BAFC; Sun, 4 Sep 2022 16:55:10 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 25E3F68BAAE for ; Sun, 4 Sep 2022 16:55:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id D44E9C00AD for ; Sun, 4 Sep 2022 16:55:04 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:55:02 +0300 Message-Id: <20220904135503.116704-9-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 09/10] riscv: float vector windowed overlap/add with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: e3RBVXuXqpt0 From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 35 ++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 1183460181..887706d899 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -28,6 +28,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); +void ff_vector_fmul_window_rvv(float *dst, const float *src0, + const float *src1, const float *win, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, @@ -49,6 +51,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul = ff_vector_fmul_rvv; fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + fdsp->vector_fmul_window = ff_vector_fmul_window_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index e3738ef7c5..7e7d48374f 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -79,6 +79,41 @@ func ff_vector_fmul_scalar_rvv ret endfunc +func ff_vector_fmul_window_rvv + // a0: dst, a1: src0, a2: src1, a3: window, a4: length + addi t0, a4, -1 + add t1, t0, a4 + slli t0, t0, 2 + slli t1, t1, 2 + add a2, a2, t0 + add t0, a0, t1 + add t3, a3, t1 + li t1, -4 // byte stride + +1: vsetvli t2, a4, e32, m4, ta, ma + slli t4, t2, 2 + vle32.v v16, (a1) + add a1, a1, t4 + vlse32.v v20, (a2), t1 + sub a2, a2, t4 + vle32.v v24, (a3) + add a3, a3, t4 + vlse32.v v28, (t3), t1 + sub t3, t3, t4 + vfmul.vv v0, v16, v28 + sub a4, a4, t2 + vfmul.vv v8, v16, v24 + vfnmsac.vv v0, v20, v24 + vfmacc.vv v8, v20, v28 + vse32.v v0, (a0) + add a0, a0, t4 + vsse32.v v8, (t0), t1 + sub t0, t0, t4 + bnez a4, 1b + + ret +endfunc + // (a0) = (a1) * (a2) + (a3) [0..a4-1] func ff_vector_fmul_add_rvv 1: vsetvli t0, a4, e32, m8, ta, ma From patchwork Sun Sep 4 13:55:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37649 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp2093402pzh; Sun, 4 Sep 2022 06:56:00 -0700 (PDT) X-Google-Smtp-Source: AA6agR4BOx1xe8NFQa6udLPLpc/Tz6aOCCF+9OEHSNSRAx/b4VLwqTViu8yvsBHDr18HuUh5gwSd X-Received: by 2002:a17:907:1b1f:b0:72f:56db:cce9 with SMTP id mp31-20020a1709071b1f00b0072f56dbcce9mr32230881ejc.605.1662299759988; Sun, 04 Sep 2022 06:55:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662299759; cv=none; d=google.com; s=arc-20160816; b=kUj84d7HauSkVe/hkWtwceRIC02Cm/C+FrtYny+YWaIW4qqYgeVN9WSZZLjs6yoVXn gB4/AfApjMDFLYrckfQmsYiYGmD6YE/47d4UUVA4WNQ6wxAa22VGcxRQ8AxQ83Hf0cO8 8xde2pGFSPdZdfOYNbI3g7oQsye/zNHDMUCBS8xPKFd62XpPqgfAC+bjevbeHpdJw9K8 nqhTdSUl+jrCuMDZMBk8hlRt33msvx3BU8zg3/vYi8FB1GKdBqlP0UAeKIRMFVk0dU4R lTfHs+BufKWQQZ6ahKEH17CV2jFsi/L8RQap9HBjMG2s6E9faoV264QKBLuN66IDj1hp aIZQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=2T7dXJdzwOe2oaJf+eRi4COzkFKH0cOb9oEZYtlVojw=; b=GY2b2VykJmTK+r6WRXxtn62KOf1h7JOWVDUjTWZ05qiqr2OpnvbtAqdSUA+mNLs7vd EqKhRcpmYtXdZlnS7vvm38XIqV2/tmqaLM9VV13Z36e7pIue0AVM2J5aCqxnODo6OJJA ok6GJJbM/RWzUzTWwzNoJKoGopQPIKxSOe/vxpOZRJGGWajfNXgDUw5a3a95/EWoyPnh XlTUuroQNrV49WwHyL16+BvlzD3Ghn6dz0ROv0udnMjy6rxyC1ECKB1BikVRnNCvQlFh 0GwejVQiGWB1K0OIBqiyfA0tIKxj7d/YDJ28c4WqbBTerJ+YTOlMLSXJtbWIRodA0+Wx JqGw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id m30-20020a50d7de000000b00447c97a309csi3843997edj.169.2022.09.04.06.55.59; Sun, 04 Sep 2022 06:55:59 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6DB8368BB02; Sun, 4 Sep 2022 16:55:11 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 41F1968BABE for ; Sun, 4 Sep 2022 16:55:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 0FC9CC00AE for ; Sun, 4 Sep 2022 16:55:05 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 4 Sep 2022 16:55:03 +0300 Message-Id: <20220904135503.116704-10-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <3372981.QJadu78ljV@basile.remlab.net> References: <3372981.QJadu78ljV@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 10/10] riscv: float vector dot product with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: I8Zmjm9sOHoY From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 23 +++++++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 887706d899..7c2fc10e99 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -35,6 +35,7 @@ void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); +float ff_scalarproduct_float_rvv(const float *v1, const float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -55,6 +56,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; + fdsp->scalarproduct_float = ff_scalarproduct_float_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 7e7d48374f..7616abb9f6 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -173,6 +173,29 @@ func ff_butterflies_float_rvv ret endfunc +// a0 = (a0).(a1) [0..a2-1] +func ff_scalarproduct_float_rvv + vsetvli zero, zero, e32, m8, ta, ma + vmv.s.x v8, zero + +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + add a0, a0, t1 + vle32.v v24, (a1) + add a1, a1, t1 + vfmul.vv v16, v16, v24 + sub a2, a2, t0 + vfredusum.vs v8, v16, v8 + bnez a2, 1b + + vfmv.f.s fa0, v8 +#if defined (__riscv_float_abi_soft) + fmv.x.w a0, fa0 +#endif + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv 1: vsetvli t0, a3, e64, m8, ta, ma