From patchwork Mon Sep 26 14:52:21 2022
X-Patchwork-Id: 38338
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:21 +0300
Message-Id: <20220926145251.56351-1-remi@remlab.net>
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 01/31] lavu/cpu: detect RISC-V base extensions

From: Rémi Denis-Courmont

This introduces compile-time and run-time CPU detection on RISC-V.

In practice, I doubt that FFmpeg will ever see a RISC-V CPU without all
of I, F and D extensions, and if it does, it probably won't have
run-time detection. So the flags are essentially always set. But as
things stand, checkasm wants them that way. Compare the ARMV8 flag on
AArch64. We are nowhere near running short on CPU flag bits.
---
 libavutil/cpu.c           |  6 +++++
 libavutil/cpu.h           |  5 ++++
 libavutil/cpu_internal.h  |  1 +
 libavutil/riscv/Makefile  |  1 +
 libavutil/riscv/cpu.c     | 56 +++++++++++++++++++++++++++++++++++++++
 tests/checkasm/checkasm.c |  4 +++
 6 files changed, 73 insertions(+)
 create mode 100644 libavutil/riscv/Makefile
 create mode 100644 libavutil/riscv/cpu.c

diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 0035e927a5..8b6eef9873 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -62,6 +62,8 @@ static int get_cpu_flags(void)
     return ff_get_cpu_flags_arm();
 #elif ARCH_PPC
     return ff_get_cpu_flags_ppc();
+#elif ARCH_RISCV
+    return ff_get_cpu_flags_riscv();
 #elif ARCH_X86
     return ff_get_cpu_flags_x86();
 #elif ARCH_LOONGARCH
@@ -178,6 +180,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
 #elif ARCH_LOONGARCH
         { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" },
         { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" },
+#elif ARCH_RISCV
+        { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" },
+        { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" },
+        { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" },
 #endif
         { NULL },
     };
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 9711e574c5..9aae2ccc7a 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -78,6 +78,11 @@
 #define AV_CPU_FLAG_LSX          (1 << 0)
 #define AV_CPU_FLAG_LASX         (1 << 1)
 
+// RISC-V extensions
+#define AV_CPU_FLAG_RVI          (1 << 0) ///< I (full GPR bank)
+#define AV_CPU_FLAG_RVF          (1 << 1) ///< F (single precision FP)
+#define AV_CPU_FLAG_RVD          (1 << 2) ///< D (double precision FP)
+
 /**
  * Return the flags which specify extensions supported by the CPU.
  * The returned value is affected by av_force_cpu_flags() if that was used
diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h
index 650d47fc96..634f28bac4 100644
--- a/libavutil/cpu_internal.h
+++ b/libavutil/cpu_internal.h
@@ -48,6 +48,7 @@ int ff_get_cpu_flags_mips(void);
 int ff_get_cpu_flags_aarch64(void);
 int ff_get_cpu_flags_arm(void);
 int ff_get_cpu_flags_ppc(void);
+int ff_get_cpu_flags_riscv(void);
 int ff_get_cpu_flags_x86(void);
 int ff_get_cpu_flags_loongarch(void);
 
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
new file mode 100644
index 0000000000..1f818043dc
--- /dev/null
+++ b/libavutil/riscv/Makefile
@@ -0,0 +1 @@
+OBJS += riscv/cpu.o
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
new file mode 100644
index 0000000000..6803f035e5
--- /dev/null
+++ b/libavutil/riscv/cpu.c
@@ -0,0 +1,56 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/cpu.h"
+#include "libavutil/cpu_internal.h"
+#include "libavutil/log.h"
+#include "config.h"
+
+#if HAVE_GETAUXVAL
+#include <sys/auxv.h>
+#define HWCAP_RV(letter) (1ul << ((letter) - 'A'))
+#endif
+
+int ff_get_cpu_flags_riscv(void)
+{
+    int ret = 0;
+#if HAVE_GETAUXVAL
+    const unsigned long hwcap = getauxval(AT_HWCAP);
+
+    if (hwcap & HWCAP_RV('I'))
+        ret |= AV_CPU_FLAG_RVI;
+    if (hwcap & HWCAP_RV('F'))
+        ret |= AV_CPU_FLAG_RVF;
+    if (hwcap & HWCAP_RV('D'))
+        ret |= AV_CPU_FLAG_RVD;
+#endif
+
+#ifdef __riscv_i
+    ret |= AV_CPU_FLAG_RVI;
+#endif
+#if defined (__riscv_flen) && (__riscv_flen >= 32)
+    ret |= AV_CPU_FLAG_RVF;
+#if (__riscv_flen >= 64)
+    ret |= AV_CPU_FLAG_RVD;
+#endif
+#endif
+
+    return ret;
+}
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 8fd9bba0b0..e1135a84ac 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -232,6 +232,10 @@ static const struct {
     { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC },
     { "VSX", "vsx", AV_CPU_FLAG_VSX },
     { "POWER8", "power8", AV_CPU_FLAG_POWER8 },
+#elif ARCH_RISCV
+    { "RVI", "rvi", AV_CPU_FLAG_RVI },
+    { "RVF", "rvf", AV_CPU_FLAG_RVF },
+    { "RVD", "rvd", AV_CPU_FLAG_RVD },
 #elif ARCH_MIPS
     { "MMI", "mmi", AV_CPU_FLAG_MMI },
     { "MSA", "msa", AV_CPU_FLAG_MSA },

From patchwork Mon Sep 26 14:52:22 2022
X-Patchwork-Id: 38340
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:22 +0300
Message-Id: <20220926145251.56351-2-remi@remlab.net>
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 02/31] lavu/riscv: initial common header for assembler macros

From: Rémi Denis-Courmont

---
 libavutil/riscv/asm.S | 77 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)
 create mode 100644 libavutil/riscv/asm.S

diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
new file mode 100644
index 0000000000..dbd97f40a4
--- /dev/null
+++ b/libavutil/riscv/asm.S
@@ -0,0 +1,77 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ * Loosely based on earlier work copyrighted by Måns Rullgård, 2008.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+
+#if defined (__riscv_float_abi_soft)
+#define NOHWF
+#define NOHWD
+#define HWF   #
+#define HWD   #
+#elif defined (__riscv_float_abi_single)
+#define NOHWF #
+#define NOHWD
+#define HWF
+#define HWD   #
+#else
+#define NOHWF #
+#define NOHWD #
+#define HWF
+#define HWD
+#endif
+
+        .macro  func sym, ext=
+            .text
+            .align 2
+
+            .option push
+            .ifnb \ext
+            .option arch, +\ext
+            .endif
+
+            .global \sym
+            .hidden \sym
+            .type   \sym, %function
+            \sym:
+
+            .macro  endfunc
+                .size   \sym, . - \sym
+                .option pop
+                .previous
+                .purgem endfunc
+            .endm
+        .endm
+
+        .macro  const sym, align=3, relocate=0
+            .if \relocate
+                .pushsection .data.rel.ro
+            .else
+                .pushsection .rodata
+            .endif
+            .align \align
+            \sym:
+
+            .macro  endconst
+                .size   \sym, . - \sym
+                .popsection
+                .purgem endconst
+            .endm
+        .endm

From patchwork Mon Sep 26 14:52:23 2022
X-Patchwork-Id: 38339
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:23 +0300
Message-Id: <20220926145251.56351-3-remi@remlab.net>
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 03/31] lavc/audiodsp: RISC-V F vector_clipf

From: Rémi Denis-Courmont

RV64G supports MIN & MAX instructions natively only on floating point
registers, not general purpose ones. The latter would require the Zbb
extension. Due to that, it is actually faster to perform the clipping
"properly" in FPU.

Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech):
audiodsp.vector_clipf_c: 29551.5
audiodsp.vector_clipf_rvf: 17871.0

Also tried unrolling with 2 or 8 elements but it gets worse either way.
---
 libavcodec/audiodsp.c            |  2 ++
 libavcodec/audiodsp.h            |  1 +
 libavcodec/riscv/Makefile        |  2 ++
 libavcodec/riscv/audiodsp_init.c | 33 +++++++++++++++++++++
 libavcodec/riscv/audiodsp_rvf.S  | 49 ++++++++++++++++++++++++++++++++
 5 files changed, 87 insertions(+)
 create mode 100644 libavcodec/riscv/Makefile
 create mode 100644 libavcodec/riscv/audiodsp_init.c
 create mode 100644 libavcodec/riscv/audiodsp_rvf.S

diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c
index ff43e87dce..eba6e809fd 100644
--- a/libavcodec/audiodsp.c
+++ b/libavcodec/audiodsp.c
@@ -113,6 +113,8 @@ av_cold void ff_audiodsp_init(AudioDSPContext *c)
     ff_audiodsp_init_arm(c);
 #elif ARCH_PPC
     ff_audiodsp_init_ppc(c);
+#elif ARCH_RISCV
+    ff_audiodsp_init_riscv(c);
 #elif ARCH_X86
     ff_audiodsp_init_x86(c);
 #endif
diff --git a/libavcodec/audiodsp.h b/libavcodec/audiodsp.h
index aa6fa7898b..485b512839 100644
--- a/libavcodec/audiodsp.h
+++ b/libavcodec/audiodsp.h
@@ -55,6 +55,7 @@ typedef struct AudioDSPContext {
 void ff_audiodsp_init(AudioDSPContext *c);
 void ff_audiodsp_init_arm(AudioDSPContext *c);
 void ff_audiodsp_init_ppc(AudioDSPContext *c);
+void ff_audiodsp_init_riscv(AudioDSPContext *c);
 void ff_audiodsp_init_x86(AudioDSPContext *c);
 
 #endif /* AVCODEC_AUDIODSP_H */
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
new file mode 100644
index 0000000000..414a9e9bd8
--- /dev/null
+++ b/libavcodec/riscv/Makefile
@@ -0,0 +1,2 @@
+OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
+                           riscv/audiodsp_rvf.o
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
new file mode 100644
index 0000000000..c5842815d6
--- /dev/null
+++ b/libavcodec/riscv/audiodsp_init.c
@@ -0,0 +1,33 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavcodec/audiodsp.h"
+
+void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max);
+
+av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
+{
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVF)
+        c->vector_clipf = ff_vector_clipf_rvf;
+}
diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S
new file mode 100644
index 0000000000..2ec8a11691
--- /dev/null
+++ b/libavcodec/riscv/audiodsp_rvf.S
@@ -0,0 +1,49 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_vector_clipf_rvf, f
+NOHWF   fmv.w.x fa0, a3
+NOHWF   fmv.w.x fa1, a4
+1:
+        flw     ft0, (a1)
+        flw     ft1, 4(a1)
+        fmax.s  ft0, ft0, fa0
+        flw     ft2, 8(a1)
+        fmax.s  ft1, ft1, fa0
+        flw     ft3, 12(a1)
+        fmax.s  ft2, ft2, fa0
+        addi    a2, a2, -4
+        fmax.s  ft3, ft3, fa0
+        addi    a1, a1, 16
+        fmin.s  ft0, ft0, fa1
+        fmin.s  ft1, ft1, fa1
+        fsw     ft0, (a0)
+        fmin.s  ft2, ft2, fa1
+        fsw     ft1, 4(a0)
+        fmin.s  ft3, ft3, fa1
+        fsw     ft2, 8(a0)
+        fsw     ft3, 12(a0)
+        addi    a0, a0, 16
+        bnez    a2, 1b
+
+        ret
+endfunc

From patchwork Mon Sep 26 14:52:24 2022
X-Patchwork-Id: 38342
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:24 +0300
Message-Id: <20220926145251.56351-4-remi@remlab.net>
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 04/31] lavc/pixblockdsp: RISC-V I get_pixels

From: Rémi Denis-Courmont

Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech):
get_pixels_c: 180.0
get_pixels_rvi: 136.7
---
 libavcodec/pixblockdsp.c            |  2 +
 libavcodec/pixblockdsp.h            |  2 +
 libavcodec/riscv/Makefile           |  2 +
 libavcodec/riscv/pixblockdsp_init.c | 45 ++++++++++++++++++++++
 libavcodec/riscv/pixblockdsp_rvi.S  | 59 +++++++++++++++++++++++++++++
 5 files changed, 110 insertions(+)
 create mode 100644 libavcodec/riscv/pixblockdsp_init.c
 create mode 100644 libavcodec/riscv/pixblockdsp_rvi.S

diff --git a/libavcodec/pixblockdsp.c b/libavcodec/pixblockdsp.c
index 17c487da1e..4294075cee 100644
--- a/libavcodec/pixblockdsp.c
+++ b/libavcodec/pixblockdsp.c
@@ -109,6 +109,8 @@ av_cold void ff_pixblockdsp_init(PixblockDSPContext *c, AVCodecContext *avctx)
     ff_pixblockdsp_init_arm(c, avctx, high_bit_depth);
 #elif ARCH_PPC
     ff_pixblockdsp_init_ppc(c, avctx, high_bit_depth);
+#elif ARCH_RISCV
+    ff_pixblockdsp_init_riscv(c, avctx, high_bit_depth);
 #elif ARCH_X86
     ff_pixblockdsp_init_x86(c, avctx, high_bit_depth);
 #elif ARCH_MIPS
diff --git a/libavcodec/pixblockdsp.h b/libavcodec/pixblockdsp.h
index 07c2ec4f40..9b002aa3d6 100644
--- a/libavcodec/pixblockdsp.h
+++ b/libavcodec/pixblockdsp.h
@@ -52,6 +52,8 @@ void ff_pixblockdsp_init_arm(PixblockDSPContext *c, AVCodecContext *avctx,
                              unsigned high_bit_depth);
 void ff_pixblockdsp_init_ppc(PixblockDSPContext *c, AVCodecContext *avctx,
                              unsigned high_bit_depth);
+void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, AVCodecContext *avctx,
+                               unsigned high_bit_depth);
 void ff_pixblockdsp_init_x86(PixblockDSPContext *c, AVCodecContext *avctx,
                              unsigned high_bit_depth);
 void ff_pixblockdsp_init_mips(PixblockDSPContext *c, AVCodecContext *avctx,
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 414a9e9bd8..da07f1fe96 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,2 +1,4 @@
 OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
                            riscv/audiodsp_rvf.o
+OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
+                              riscv/pixblockdsp_rvi.o
diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c
new file mode 100644
index 0000000000..04bf52649f
--- /dev/null
+++ b/libavcodec/riscv/pixblockdsp_init.c
@@ -0,0 +1,45 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavcodec/avcodec.h"
+#include "libavcodec/pixblockdsp.h"
+
+void ff_get_pixels_8_rvi(int16_t *block, const uint8_t *pixels,
+                         ptrdiff_t stride);
+void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels,
+                          ptrdiff_t stride);
+
+av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
+                                       AVCodecContext *avctx,
+                                       unsigned high_bit_depth)
+{
+    int cpu_flags = av_get_cpu_flags();
+
+    if (cpu_flags & AV_CPU_FLAG_RVI) {
+        if (high_bit_depth)
+            c->get_pixels = ff_get_pixels_16_rvi;
+        else
+            c->get_pixels = ff_get_pixels_8_rvi;
+    }
+}
diff --git a/libavcodec/riscv/pixblockdsp_rvi.S b/libavcodec/riscv/pixblockdsp_rvi.S
new file mode 100644
index 0000000000..93ece4405e
--- /dev/null
+++ b/libavcodec/riscv/pixblockdsp_rvi.S
@@ -0,0 +1,59 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_get_pixels_8_rvi +.irp row, 0, 1, 2, 3, 4, 5, 6, 7 + ld t0, (a1) + add a1, a1, a2 + sd zero, ((\row * 16) + 0)(a0) + addi t6, t6, -1 + sd zero, ((\row * 16) + 8)(a0) + srli t1, t0, 8 + sb t0, ((\row * 16) + 0)(a0) + srli t2, t0, 16 + sb t1, ((\row * 16) + 2)(a0) + srli t3, t0, 24 + sb t2, ((\row * 16) + 4)(a0) + srli t4, t0, 32 + sb t3, ((\row * 16) + 6)(a0) + srli t1, t0, 40 + sb t4, ((\row * 16) + 8)(a0) + srli t2, t0, 48 + sb t1, ((\row * 16) + 10)(a0) + srli t3, t0, 56 + sb t2, ((\row * 16) + 12)(a0) + sb t3, ((\row * 16) + 14)(a0) +.endr + ret +endfunc + +func ff_get_pixels_16_rvi +.irp row, 0, 1, 2, 3, 4, 5, 6, 7 + ld t0, 0(a1) + ld t1, 8(a1) + add a1, a1, a2 + sd t0, ((\row * 16) + 0)(a0) + sd t1, ((\row * 16) + 8)(a0) +.endr + ret +endfunc From patchwork Mon Sep 26 14:52:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38349 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2304865pzh; Mon, 26 Sep 2022 07:54:51 -0700 (PDT) X-Google-Smtp-Source: 
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:25 +0300
Message-Id: <20220926145251.56351-5-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 05/31] lavu/cpu: CPU flags for the RISC-V Vector extension

From: Rémi Denis-Courmont

RVV defines a total of 12 different extensions, including:

- 5 different instruction subsets:
  - Zve32x: 8-, 16- and 32-bit integers,
  - Zve32f: Zve32x plus single precision floats,
  - Zve64x: Zve32x plus 64-bit integers,
  - Zve64f: Zve32f plus Zve64x,
  - Zve64d: Zve64f plus double precision floats.

- 6 different vector lengths:
  - Zvl32b (embedded only),
  - Zvl64b (embedded only),
  - Zvl128b,
  - Zvl256b,
  - Zvl512b,
  - Zvl1024b,

- and the V extension proper: equivalent to Zve64d and Zvl128b.

In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each type set: up-to-32-bit ints (RVV_I32), floats (RVV_F32),
64-bit ints (RVV_I64) and doubles (RVV_F64).

When the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
---
 libavutil/cpu.c           |  4 ++++
 libavutil/cpu.h           |  4 ++++
 libavutil/riscv/cpu.c     | 19 +++++++++++++++++++
 tests/checkasm/checkasm.c |  4 ++++
 4 files changed, 31 insertions(+)

diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 8b6eef9873..5818fd9c1c 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -184,6 +184,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
         { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" },
         { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" },
         { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" },
+        { "rvv-i32", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_I32 }, .unit = "flags" },
+        { "rvv-f32", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_F32 }, .unit = "flags" },
+        { "rvv-i64", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_I64 }, .unit = "flags" },
+        { "rvv", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVV_F64 }, .unit = "flags" },
 #endif
         { NULL },
     };
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 9aae2ccc7a..18f42af015 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -82,6 +82,10 @@
 #define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank)
 #define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP)
 #define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP)
+#define AV_CPU_FLAG_RVV_I32 (1 << 3) ///< Vectors of 8/16/32-bit int's
+#define AV_CPU_FLAG_RVV_F32 (1 << 4) ///< Vectors of float's
+#define AV_CPU_FLAG_RVV_I64 (1 << 5) ///< Vectors of 64-bit int's
+#define AV_CPU_FLAG_RVV_F64 (1 << 6) ///< Vectors of double's
 
 /**
  * Return the flags which specify extensions supported by the CPU.
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
index 6803f035e5..e234201395 100644
--- a/libavutil/riscv/cpu.c
+++ b/libavutil/riscv/cpu.c
@@ -40,6 +40,11 @@ int ff_get_cpu_flags_riscv(void)
         ret |= AV_CPU_FLAG_RVF;
     if (hwcap & HWCAP_RV('D'))
         ret |= AV_CPU_FLAG_RVD;
+
+    /* The V extension implies all Zve* functional subsets */
+    if (hwcap & HWCAP_RV('V'))
+        ret |= AV_CPU_FLAG_RVV_I32 | AV_CPU_FLAG_RVV_I64
+             | AV_CPU_FLAG_RVV_F32 | AV_CPU_FLAG_RVV_F64;
 #endif
 
 #ifdef __riscv_i
@@ -50,6 +55,20 @@ int ff_get_cpu_flags_riscv(void)
 #if (__riscv_flen >= 64)
     ret |= AV_CPU_FLAG_RVD;
 #endif
+#endif
+
+    /* If RV-V is enabled statically at compile-time, check the details. */
+#ifdef __riscv_vector
+    ret |= AV_CPU_FLAG_RVV_I32;
+#if __riscv_v_elen >= 64
+    ret |= AV_CPU_FLAG_RVV_I64;
+#endif
+#if __riscv_v_elen_fp >= 32
+    ret |= AV_CPU_FLAG_RVV_F32;
+#if __riscv_v_elen_fp >= 64
+    ret |= AV_CPU_FLAG_RVV_F64;
+#endif
+#endif
 #endif
 
     return ret;
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index e1135a84ac..90dd7e4634 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -236,6 +236,10 @@ static const struct {
     { "RVI", "rvi", AV_CPU_FLAG_RVI },
     { "RVF", "rvf", AV_CPU_FLAG_RVF },
     { "RVD", "rvd", AV_CPU_FLAG_RVD },
+    { "RVVi32", "rvv_i32", AV_CPU_FLAG_RVV_I32 },
+    { "RVVf32", "rvv_f32", AV_CPU_FLAG_RVV_F32 },
+    { "RVVi64", "rvv_i64", AV_CPU_FLAG_RVV_I64 },
+    { "RVVf64", "rvv_f64", AV_CPU_FLAG_RVV_F64 },
 #elif ARCH_MIPS
     { "MMI", "mmi", AV_CPU_FLAG_MMI },
     { "MSA", "msa", AV_CPU_FLAG_MSA },
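
To make the "6 possible sets, 4 bits" remark in the commit message concrete, the sketch below maps each Zve* instruction subset to the flag bits it would set. The flag values are copied from the cpu.h hunk above; the enum rvv_subset and the function rvv_subset_to_flags are hypothetical names introduced only for this illustration and do not exist in FFmpeg.

```c
/* Flag values as defined by this patch in libavutil/cpu.h. */
#define AV_CPU_FLAG_RVV_I32 (1 << 3) /* vectors of 8/16/32-bit integers */
#define AV_CPU_FLAG_RVV_F32 (1 << 4) /* vectors of floats */
#define AV_CPU_FLAG_RVV_I64 (1 << 5) /* vectors of 64-bit integers */
#define AV_CPU_FLAG_RVV_F64 (1 << 6) /* vectors of doubles */

/* Hypothetical enumeration of the six possible instruction sets. */
enum rvv_subset { RVV_NONE, ZVE32X, ZVE32F, ZVE64X, ZVE64F, ZVE64D };

/* Hypothetical helper: which flag bits each subset corresponds to. */
static int rvv_subset_to_flags(enum rvv_subset subset)
{
    switch (subset) {
    case ZVE32X:
        return AV_CPU_FLAG_RVV_I32;
    case ZVE32F:
        return AV_CPU_FLAG_RVV_I32 | AV_CPU_FLAG_RVV_F32;
    case ZVE64X:
        return AV_CPU_FLAG_RVV_I32 | AV_CPU_FLAG_RVV_I64;
    case ZVE64F:
        return AV_CPU_FLAG_RVV_I32 | AV_CPU_FLAG_RVV_I64 |
               AV_CPU_FLAG_RVV_F32;
    case ZVE64D:
        return AV_CPU_FLAG_RVV_I32 | AV_CPU_FLAG_RVV_I64 |
               AV_CPU_FLAG_RVV_F32 | AV_CPU_FLAG_RVV_F64;
    default:
        return 0; /* no vector support */
    }
}
```

One bit per element type means a DSP init function can test exactly the capability it needs (for example AV_CPU_FLAG_RVV_F32 for a float kernel) without knowing which named subset the CPU implements.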
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:26 +0300
Message-Id: <20220926145251.56351-6-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 06/31] configure: probe RISC-V Vector extension

From: Rémi Denis-Courmont

---
 Makefile         |  2 +-
 configure        | 15 +++++++++++++++
 ffbuild/arch.mak |  2 ++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 61f79e27ae..1fb742f390 100644
--- a/Makefile
+++ b/Makefile
@@ -91,7 +91,7 @@ ffbuild/.config: $(CONFIGURABLE_COMPONENTS)
 SUBDIR_VARS := CLEANFILES FFLIBS HOSTPROGS TESTPROGS TOOLS \
                HEADERS ARCH_HEADERS BUILT_HEADERS SKIPHEADERS \
                ARMV5TE-OBJS ARMV6-OBJS ARMV8-OBJS VFP-OBJS NEON-OBJS \
-               ALTIVEC-OBJS VSX-OBJS MMX-OBJS X86ASM-OBJS \
+               ALTIVEC-OBJS VSX-OBJS RVV-OBJS MMX-OBJS X86ASM-OBJS \
                MIPSFPU-OBJS MIPSDSPR2-OBJS MIPSDSP-OBJS MSA-OBJS \
                MMI-OBJS LSX-OBJS LASX-OBJS OBJS SLIBOBJS SHLIBOBJS \
                STLIBOBJS HOSTOBJS TESTOBJS
diff --git a/configure b/configure
index c157338b1f..a41ebda6d4 100755
--- a/configure
+++ b/configure
@@ -462,6 +462,7 @@ Optimization options (experts only):
   --disable-mmi disable Loongson MMI optimizations
   --disable-lsx disable Loongson LSX optimizations
   --disable-lasx disable Loongson LASX optimizations
+  --disable-rvv disable RISC-V Vector optimizations
   --disable-fast-unaligned consider unaligned accesses slow
 
 Developer options (useful when working on FFmpeg itself):
@@ -2126,6 +2127,10 @@ ARCH_EXT_LIST_PPC="
     vsx
 "
 
+ARCH_EXT_LIST_RISCV="
+    rvv
+"
+
 ARCH_EXT_LIST_X86="
     $ARCH_EXT_LIST_X86_SIMD
     cpunop
@@ -2135,6 +2140,7 @@ ARCH_EXT_LIST_X86="
 ARCH_EXT_LIST="
     $ARCH_EXT_LIST_ARM
     $ARCH_EXT_LIST_PPC
+    $ARCH_EXT_LIST_RISCV
     $ARCH_EXT_LIST_X86
     $ARCH_EXT_LIST_MIPS
     $ARCH_EXT_LIST_LOONGSON
@@ -2642,6 +2648,8 @@ ppc4xx_deps="ppc"
 vsx_deps="altivec"
 power8_deps="vsx"
 
+rvv_deps="riscv"
+
 loongson2_deps="mips"
 loongson3_deps="mips"
 mmi_deps_any="loongson2 loongson3"
@@ -6110,6 +6118,10 @@ elif enabled ppc; then
         check_cpp_condition power8 "altivec.h" "defined(_ARCH_PWR8)"
     fi
 
+elif enabled riscv; then
+
+    enabled rvv && check_inline_asm rvv '".option arch, +v\nvsetivli zero, 0, e8, m1, ta, ma"'
+
 elif enabled x86; then
 
     check_builtin rdtsc intrin.h "__rdtsc()"
@@ -7596,6 +7608,9 @@ if enabled loongarch; then
     echo "LSX enabled ${lsx-no}"
     echo "LASX enabled ${lasx-no}"
 fi
+if enabled riscv; then
+    echo "RISC-V Vector enabled ${rvv-no}"
+fi
 echo "debug symbols ${debug-no}"
 echo "strip symbols ${stripping-no}"
 echo "optimize for size ${small-no}"
diff --git a/ffbuild/arch.mak b/ffbuild/arch.mak
index 997e31e85e..39d76ee152 100644
--- a/ffbuild/arch.mak
+++ b/ffbuild/arch.mak
@@ -15,5 +15,7 @@ OBJS-$(HAVE_LASX) += $(LASX-OBJS) $(LASX-OBJS-yes)
 OBJS-$(HAVE_ALTIVEC) += $(ALTIVEC-OBJS) $(ALTIVEC-OBJS-yes)
 OBJS-$(HAVE_VSX) += $(VSX-OBJS) $(VSX-OBJS-yes)
 
+OBJS-$(HAVE_RVV) += $(RVV-OBJS) $(RVV-OBJS-yes)
+
 OBJS-$(HAVE_MMX) += $(MMX-OBJS) $(MMX-OBJS-yes)
 OBJS-$(HAVE_X86ASM) += $(X86ASM-OBJS) $(X86ASM-OBJS-yes)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:27 +0300
Message-Id: <20220926145251.56351-7-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 07/31] lavu/riscv: fallback macros for SH{1, 2, 3}ADD

From: Rémi Denis-Courmont

The SH1ADD, SH2ADD and SH3ADD mnemonics (from the Zba extension) require
the very latest binutils release at the time of writing. These macros
provide seamless backward compatibility with older assemblers.
---
 libavutil/riscv/asm.S | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
index dbd97f40a4..de5e1ad0a6 100644
--- a/libavutil/riscv/asm.S
+++ b/libavutil/riscv/asm.S
@@ -75,3 +75,22 @@
         .purgem endconst
         .endm
     .endm
+
+#if !defined (__riscv_zba)
+    /* SH{1,2,3}ADD definitions for pre-Zba assemblers */
+    .macro  shnadd n, rd, rs1, rs2
+        .insn r OP, 2 * \n, 16, \rd, \rs1, \rs2
+    .endm
+
+    .macro  sh1add rd, rs1, rs2
+        shnadd  1, \rd, \rs1, \rs2
+    .endm
+
+    .macro  sh2add rd, rs1, rs2
+        shnadd  2, \rd, \rs1, \rs2
+    .endm
+
+    .macro  sh3add rd, rs1, rs2
+        shnadd  3, \rd, \rs1, \rs2
+    .endm
+#endif
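
As a rough illustration of what the fallback macros emit, the C sketch below models both the arithmetic of SHnADD (rd = rs2 + (rs1 << n)) and the R-type instruction word that `.insn r OP, 2 * n, 16, rd, rs1, rs2` would assemble, assuming the standard OP major opcode value 0x33. The function names shnadd_model and shnadd_encode are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical model of the SHnADD semantics: rd = rs2 + (rs1 << n). */
static uint64_t shnadd_model(unsigned n, uint64_t rs1, uint64_t rs2)
{
    return rs2 + (rs1 << n);
}

/* Hypothetical encoder mirroring ".insn r OP, 2 * n, 16, rd, rs1, rs2":
 * R-type layout is funct7 | rs2 | rs1 | funct3 | rd | opcode. */
static uint32_t shnadd_encode(unsigned n, unsigned rd, unsigned rs1,
                              unsigned rs2)
{
    const uint32_t opcode = 0x33;   /* OP major opcode (assumed value) */
    const uint32_t funct3 = 2 * n;  /* 2, 4 or 6 for n = 1, 2, 3 */
    const uint32_t funct7 = 16;

    return (funct7 << 25) | ((uint32_t)rs2 << 20) | ((uint32_t)rs1 << 15) |
           (funct3 << 12) | ((uint32_t)rd << 7) | opcode;
}
```

For instance, `sh2add a1, t0, a1` (register numbers 11, 5, 11) advances a pointer in a1 by t0 elements of 4 bytes each, which is how the vector loops later in the series step through float arrays.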
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:28 +0300
Message-Id: <20220926145251.56351-8-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 08/31] lavu/floatdsp: RISC-V V vector_fmul_scalar

From: Rémi Denis-Courmont

This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
 libavutil/float_dsp.c            |  2 ++
 libavutil/float_dsp.h            |  1 +
 libavutil/riscv/Makefile         |  4 +++-
 libavutil/riscv/float_dsp_init.c | 39 ++++++++++++++++++++++++++++++++
 libavutil/riscv/float_dsp_rvv.S  | 39 ++++++++++++++++++++++++++++++++
 5 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100644 libavutil/riscv/float_dsp_init.c
 create mode 100644 libavutil/riscv/float_dsp_rvv.S

diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c
index 8676c8b0f8..742dd679d2 100644
--- a/libavutil/float_dsp.c
+++ b/libavutil/float_dsp.c
@@ -156,6 +156,8 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact)
     ff_float_dsp_init_arm(fdsp);
 #elif ARCH_PPC
     ff_float_dsp_init_ppc(fdsp, bit_exact);
+#elif ARCH_RISCV
+    ff_float_dsp_init_riscv(fdsp);
 #elif ARCH_X86
     ff_float_dsp_init_x86(fdsp);
 #elif ARCH_MIPS
diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h
index 9c664592bd..7cad9fc622 100644
--- a/libavutil/float_dsp.h
+++ b/libavutil/float_dsp.h
@@ -205,6 +205,7 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len);
 void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict);
+void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_x86(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_mips(AVFloatDSPContext *fdsp);
 
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
index 1f818043dc..89a8d0d990 100644
--- a/libavutil/riscv/Makefile
+++ b/libavutil/riscv/Makefile
@@ -1 +1,3 @@
-OBJS += riscv/cpu.o
+OBJS += riscv/float_dsp_init.o \
+        riscv/cpu.o
+RVV-OBJS += riscv/float_dsp_rvv.o
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
new file mode 100644
index 0000000000..f4299049b0
--- /dev/null
+++ b/libavutil/riscv/float_dsp_init.c
@@ -0,0 +1,39 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/float_dsp.h"
+
+void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
+                                int len);
+
+av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVV_F32)
+        fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+#endif
+}
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
new file mode 100644
index 0000000000..50cb1fa90f
--- /dev/null
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -0,0 +1,39 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_fmul_scalar_rvv, zve32f
+NOHWF   fmv.w.x fa0, a2
+NOHWF   mv      a2, a3
+1:
+        vsetvli t0, a2, e32, m1, ta, ma
+        vle32.v v16, (a1)
+        sub     a2, a2, t0
+        vfmul.vf v16, v16, fa0
+        sh2add  a1, t0, a1
+        vse32.v v16, (a0)
+        sh2add  a0, t0, a0
+        bnez    a2, 1b
+
+        ret
+endfunc
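
The loop in ff_vector_fmul_scalar_rvv follows the usual RVV strip-mining pattern: vsetvli reports how many elements the hardware will process this iteration, and the pointers and remaining length advance by that amount. The scalar C model below is only a sketch of that control flow. MODEL_VLMAX and vector_fmul_scalar_model are hypothetical names, and the model assumes vsetvli returns min(len, VLMAX), which the vector specification permits but does not always require.

```c
/* Hypothetical stand-in for VLMAX (elements per vector register at
 * e32, m1); real hardware determines this value. */
#define MODEL_VLMAX 4

/* Hypothetical scalar model of the strip-mined loop, for illustration. */
static void vector_fmul_scalar_model(float *dst, const float *src,
                                     float mul, int len)
{
    while (len > 0) {
        /* vsetvli t0, a2, e32, m1, ta, ma */
        int vl = len < MODEL_VLMAX ? len : MODEL_VLMAX;

        /* vle32.v, vfmul.vf, vse32.v act on vl elements at once */
        for (int i = 0; i < vl; i++)
            dst[i] = src[i] * mul;

        len -= vl; /* sub    a2, a2, t0 */
        src += vl; /* sh2add a1, t0, a1 */
        dst += vl; /* sh2add a0, t0, a0 */
    }
}
```

Because the element count comes from vsetvli rather than a compile-time constant, the same assembly works unchanged on any vector length, and no scalar tail loop is needed for lengths that are not a multiple of the vector size.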
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:29 +0300
Message-Id: <20220926145251.56351-9-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 09/31] lavu/floatdsp: RISC-V V vector_dmul_scalar

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  6 ++++++
 libavutil/riscv/float_dsp_rvv.S  | 17 +++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index f4299049b0..3386139d49 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -28,6 +28,9 @@
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
+void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
+                                int len);
+
 av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 {
 #if HAVE_RVV
@@ -35,5 +38,8 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
     if (flags & AV_CPU_FLAG_RVV_F32)
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+
+    if (flags & AV_CPU_FLAG_RVV_F64)
+        fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 50cb1fa90f..17dda471b4 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -37,3 +37,20 @@ NOHWF   mv      a2, a3
 
         ret
 endfunc
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_dmul_scalar_rvv, zve64d
+NOHWD   fmv.d.x fa0, a2
+NOHWD   mv      a2, a3
+1:
+        vsetvli t0, a2, e64, m1, ta, ma
+        vle64.v v16, (a1)
+        sub     a2, a2, t0
+        vfmul.vf v16, v16, fa0
+        sh3add  a1, t0, a1
+        vse64.v v16, (a0)
+        sh3add  a0, t0, a0
+        bnez    a2, 1b
+
+        ret
+endfunc
charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38343
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:30 +0300
Message-Id: <20220926145251.56351-10-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 10/31] lavu/floatdsp: RISC-V V vector_fmul

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  6 +++++-
 libavutil/riscv/float_dsp_rvv.S  | 17 +++++++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 3386139d49..2482094ab4 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -25,6 +25,8 @@
 #include "libavutil/cpu.h"
 #include "libavutil/float_dsp.h"
 
+void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
+                         int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
@@ -36,8 +38,10 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if (flags & AV_CPU_FLAG_RVV_F32)
+    if (flags & AV_CPU_FLAG_RVV_F32) {
+        fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+    }
 
     if (flags & AV_CPU_FLAG_RVV_F64)
         fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 17dda471b4..00fb7354bb 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -21,6 +21,23 @@
 #include "config.h"
 #include "asm.S"
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_fmul_rvv, zve32f
+1:
+        vsetvli  t0, a3, e32, m1, ta, ma
+        vle32.v  v16, (a1)
+        sub      a3, a3, t0
+        vle32.v  v24, (a2)
+        sh2add   a1, t0, a1
+        vfmul.vv v16, v16, v24
+        sh2add   a2, t0, a2
+        vse32.v  v16, (a0)
+        sh2add   a0, t0, a0
+        bnez     a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x  fa0, a2

From patchwork Mon Sep 26 14:52:31 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38344
X-Received: by 2002:a17:906:6086:b0:731:3970:48d0 with SMTP id
t6-20020a170906608600b00731397048d0mr18715465ejj.16.1664204043495; Mon, 26 Sep 2022 07:54:03 -0700 (PDT)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:31 +0300
Message-Id: <20220926145251.56351-11-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 11/31] lavu/floatdsp: RISC-V V vector_dmul

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  6 +++++-
 libavutil/riscv/float_dsp_rvv.S  | 17 +++++++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 2482094ab4..29114dfb82 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -30,6 +30,8 @@ void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
+void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
+                         int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -43,7 +45,9 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
     }
 
-    if (flags & AV_CPU_FLAG_RVV_F64)
+    if (flags & AV_CPU_FLAG_RVV_F64) {
+        fdsp->vector_dmul = ff_vector_dmul_rvv;
         fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+    }
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 00fb7354bb..710e122444 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -55,6 +55,23 @@ NOHWF mv a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_dmul_rvv, zve64d
+1:
+        vsetvli  t0, a3, e64, m1, ta, ma
+        vle64.v  v16, (a1)
+        sub      a3, a3, t0
+        vle64.v  v24, (a2)
+        sh3add   a1, t0, a1
+        vfmul.vv v16, v16, v24
+        sh3add   a2, t0, a2
+        vse64.v  v16, (a0)
+        sh3add   a0, t0, a0
+        bnez     a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x  fa0, a2

From patchwork Mon Sep 26 14:52:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38345
X-Received: by 2002:a17:907:3dab:b0:783:4b01:1ffe
with SMTP id he43-20020a1709073dab00b007834b011ffemr6461504ejc.107.1664204052525; Mon, 26 Sep 2022 07:54:12 -0700 (PDT)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:32 +0300
Message-Id: <20220926145251.56351-12-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 12/31] lavu/floatdsp: RISC-V V vector_fmac_scalar

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 29114dfb82..9e19413d5d 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -27,6 +27,8 @@
 
 void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
                          int len);
+void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
+                                int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
@@ -42,6 +44,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
     if (flags & AV_CPU_FLAG_RVV_F32) {
         fdsp->vector_fmul = ff_vector_fmul_rvv;
+        fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
     }
 
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 710e122444..4c325db9fd 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -38,6 +38,25 @@ func ff_vector_fmul_rvv, zve32f
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_fmac_scalar_rvv, zve32f
+NOHWF   fmv.w.x  fa0, a2
+NOHWF   mv       a2, a3
+1:
+        vsetvli  t0, a2, e32, m1, ta, ma
+        slli     t1, t0, 2
+        vle32.v  v24, (a1)
+        sub      a2, a2, t0
+        vle32.v  v16, (a0)
+        sh2add   a1, t0, a1
+        vfmacc.vf v16, fa0, v24
+        vse32.v  v16, (a0)
+        sh2add   a0, t0, a0
+        bnez     a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x  fa0, a2

From patchwork Mon Sep 26 14:52:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38346
X-Received: by 2002:a17:906:5d07:b0:781:c281:f6e4 with SMTP id g7-20020a1709065d0700b00781c281f6e4mr18820112ejt.744.1664204062715; Mon, 26 Sep 2022
07:54:22 -0700 (PDT)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:33 +0300
Message-Id: <20220926145251.56351-13-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 13/31] lavu/floatdsp: RISC-V V vector_dmac_scalar

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 9e19413d5d..a559bbb32b 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -34,6 +34,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
+void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul,
+                                int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -50,6 +52,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
     if (flags & AV_CPU_FLAG_RVV_F64) {
         fdsp->vector_dmul = ff_vector_dmul_rvv;
+        fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv;
         fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
     }
 #endif
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 4c325db9fd..048ec0bc40 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -91,6 +91,24 @@ func ff_vector_dmul_rvv, zve64d
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_dmac_scalar_rvv, zve64d
+NOHWD   fmv.d.x  fa0, a2
+NOHWD   mv       a2, a3
+1:
+        vsetvli  t0, a2, e64, m1, ta, ma
+        vle64.v  v24, (a1)
+        sub      a2, a2, t0
+        vle64.v  v16, (a0)
+        sh3add   a1, t0, a1
+        vfmacc.vf v16, fa0, v24
+        vse64.v  v16, (a0)
+        sh3add   a0, t0, a0
+        bnez     a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x  fa0, a2

From patchwork Mon Sep 26 14:52:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38352
X-Received: by 2002:a17:906:5d11:b0:780:bf2d:6d14 with SMTP id
g17-20020a1709065d1100b00780bf2d6d14mr18878547ejt.543.1664204121885; Mon, 26 Sep 2022 07:55:21 -0700 (PDT)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:34 +0300
Message-Id: <20220926145251.56351-14-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 14/31] lavu/floatdsp: RISC-V V vector_fmul_add

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index a559bbb32b..8982436647 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -31,6 +31,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
+void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
+                             const float *src2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
@@ -48,6 +50,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
     }
 
     if (flags & AV_CPU_FLAG_RVV_F64) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 048ec0bc40..db62402878 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -74,6 +74,25 @@ NOHWF mv a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) + (a3) [0..a4-1]
+func ff_vector_fmul_add_rvv, zve32f
+1:
+        vsetvli  t0, a4, e32, m1, ta, ma
+        vle32.v  v8, (a1)
+        sub      a4, a4, t0
+        vle32.v  v16, (a2)
+        sh2add   a1, t0, a1
+        vle32.v  v24, (a3)
+        sh2add   a2, t0, a2
+        vfmadd.vv v8, v16, v24
+        sh2add   a3, t0, a3
+        vse32.v  v8, (a0)
+        sh2add   a0, t0, a0
+        bnez     a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:

From patchwork Mon Sep 26 14:52:35 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38353
X-Received: by
2002:a05:6402:35d2:b0:450:be1b:d7cf with SMTP id z18-20020a05640235d200b00450be1bd7cfmr23087759edc.51.1664204132180; Mon, 26 Sep 2022 07:55:32 -0700 (PDT)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:35 +0300
Message-Id: <20220926145251.56351-15-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 15/31] lavu/floatdsp: RISC-V V butterflies_float

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  2 ++
 libavutil/riscv/float_dsp_rvv.S  | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 8982436647..a1cd180cdc 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -33,6 +33,7 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                              const float *src2, int len);
+void ff_butterflies_float_rvv(float *v1, float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
@@ -51,6 +52,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
+        fdsp->butterflies_float = ff_butterflies_float_rvv;
     }
 
     if (flags & AV_CPU_FLAG_RVV_F64) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index db62402878..a721c44667 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -93,6 +93,24 @@ func ff_vector_fmul_add_rvv, zve32f
         ret
 endfunc
 
+// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
+func ff_butterflies_float_rvv, zve32f
+1:
+        vsetvli  t0, a2, e32, m1, ta, ma
+        vle32.v  v16, (a0)
+        sub      a2, a2, t0
+        vle32.v  v24, (a1)
+        vfadd.vv v0, v16, v24
+        vfsub.vv v8, v16, v24
+        vse32.v  v0, (a0)
+        sh2add   a0, t0, a0
+        vse32.v  v8, (a1)
+        sh2add   a1, t0, a1
+        bnez     a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:

From patchwork Mon Sep 26 14:52:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38347
X-Received: by
2002:a05:6402:3550:b0:451:473a:5ca3 with SMTP id f16-20020a056402355000b00451473a5ca3mr22814649edd.48.1664204072000; Mon, 26 Sep 2022 07:54:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204071; cv=none; d=google.com; s=arc-20160816; b=z1bbuWmxzC3VnwbbKqhCfrl5gQiMvm7c5tDqaQX+uUbMljFF87cwet6tUWw+btsSCB yhUg/3r1rzr9PE3WzXOp+KdDjNjLPOezFvHLicRjxHosQ+PW5QXixv7i1Hp5NBZwxY2s b9oDIjzLhNhKyPARFRdgtvatOgViOXAa87R4Nqjl9sBCtxOqKA0rn+B6Ng2pQItIt4hF 8MAbfSxrMBfQancRRWmcVLET2LJbD1r4hg0P2NDKQ6QlFSd4BH/aPL0QLN0JfvDaW0/g tWJiioBemtAPo4lNgcMYgseST3mNV/USxOdKBDOg6nStlQmi5O1FrEwbYTuRD2TjR78/ V4aw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=qaHjoRCzyCrkJHR8OXXlJV7nsD3igpg9H4IrqJiPEyc=; b=yqhY9UvJONvZ+ki51VlCKb4Mum9fMs1Hm6r9GdJO/D+SSCCJIbiuvb6WBCqtEBCzvz LpG7IrdA2lWP0Eu76YLXz3WlueofpQY1jQq8e8HV5H3oRTYUbDhJqDwoqzUfm0FcEHTn 90FMJjW0HoyayEAIX9S6IgCJcJeJxOwF+cFDIWEbvmbdm+v17TaxSOjhdD3Hx5pMhhfg YbL4ycdvdcXoxJ8c9C4aJtsUCE17DcDj53K+Ao61ZW/paDGaKDWorOee4u3WDxVdpLFt mxJwU2g33SMV+iDUP45mczJY4X6QVR78FAOiarwm5S+r+eO786DG8RYL9NfLt6WfunS9 kaXQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:36 +0300
Message-Id: <20220926145251.56351-16-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 16/31] lavu/floatdsp: RISC-V V vector_fmul_reverse

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index a1cd180cdc..b99e3080c9 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -33,6 +33,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                             const float *src2, int len);
+void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
+                                const float *src1, int len);
 void ff_butterflies_float_rvv(float *v1, float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
@@ -52,6 +54,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
+        fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
     }
 
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index a721c44667..fbd2777463 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -93,6 +93,27 @@ func ff_vector_fmul_add_rvv, zve32f
         ret
 endfunc
 
+// (a0) = (a1) * reverse(a2) [0..a3-1]
+func ff_vector_fmul_reverse_rvv, zve32f
+        sh2add       a2, a3, a2
+        li           t2, -4 // byte stride
+        addi         a2, a2, -4
+1:
+        vsetvli      t0, a3, e32, m1, ta, ma
+        slli         t1, t0, 2
+        vle32.v      v16, (a1)
+        sub          a3, a3, t0
+        vlse32.v     v24, (a2), t2
+        add          a1, a1, t1
+        vfmul.vv     v16, v16, v24
+        sub          a2, a2, t1
+        vse32.v      v16, (a0)
+        add          a0, a0, t1
+        bnez         a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
 func ff_butterflies_float_rvv, zve32f
 1:

From patchwork Mon Sep 26 14:52:37 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38348
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:37 +0300
Message-Id: <20220926145251.56351-17-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 17/31] lavu/floatdsp: RISC-V V vector_fmul_window

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 33 ++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index b99e3080c9..44a505308d 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -31,6 +31,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
+void ff_vector_fmul_window_rvv(float *dst, const float *src0,
+                               const float *src1, const float *win, int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                             const float *src2, int len);
 void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
@@ -53,6 +55,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_window = ff_vector_fmul_window_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
         fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index fbd2777463..ce530f6108 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -74,6 +74,39 @@ NOHWF   mv           a2, a3
         ret
 endfunc
 
+func ff_vector_fmul_window_rvv, zve32f
+        // a0: dst, a1: src0, a2: src1, a3: window, a4: length
+        addi         t0, a4, -1
+        add          t1, t0, a4
+        sh2add       a2, t0, a2
+        sh2add       t0, t1, a0
+        sh2add       t3, t1, a3
+        li           t1, -4 // byte stride
+1:
+        vsetvli      t2, a4, e32, m1, ta, ma
+        vle32.v      v16, (a1)
+        slli         t4, t2, 2
+        vlse32.v     v20, (a2), t1
+        sub          a4, a4, t2
+        vle32.v      v24, (a3)
+        add          a1, a1, t4
+        vlse32.v     v28, (t3), t1
+        sub          a2, a2, t4
+        vfmul.vv     v0, v16, v28
+        add          a3, a3, t4
+        vfmul.vv     v8, v16, v24
+        sub          t3, t3, t4
+        vfnmsac.vv   v0, v20, v24
+        vfmacc.vv    v8, v20, v28
+        vse32.v      v0, (a0)
+        add          a0, a0, t4
+        vsse32.v     v8, (t0), t1
+        sub          t0, t0, t4
+        bnez         a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) + (a3) [0..a4-1]
 func ff_vector_fmul_add_rvv, zve32f
 1:

From patchwork Mon Sep 26 14:52:38 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38350
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:38 +0300
Message-Id: <20220926145251.56351-18-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 18/31] lavu/floatdsp: RISC-V V scalarproduct_float

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  2 ++
 libavutil/riscv/float_dsp_rvv.S  | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 44a505308d..e61f887862 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -38,6 +38,7 @@ void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
 void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
                                 const float *src1, int len);
 void ff_butterflies_float_rvv(float *v1, float *v2, int len);
+float ff_scalarproduct_float_rvv(const float *v1, const float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
@@ -59,6 +60,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
         fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
+        fdsp->scalarproduct_float = ff_scalarproduct_float_rvv;
     }
 
     if (flags & AV_CPU_FLAG_RVV_F64) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index ce530f6108..ab2e0c42d7 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -165,6 +165,26 @@ func ff_butterflies_float_rvv, zve32f
         ret
 endfunc
 
+// a0 = (a0).(a1) [0..a2-1]
+func ff_scalarproduct_float_rvv, zve32f
+        vsetvli      zero, zero, e32, m1, ta, ma
+        vmv.s.x      v8, zero
+1:
+        vsetvli      t0, a2, e32, m1, ta, ma
+        vle32.v      v16, (a0)
+        sub          a2, a2, t0
+        vle32.v      v24, (a1)
+        sh2add       a0, t0, a0
+        vfmul.vv     v16, v16, v24
+        sh2add       a1, t0, a1
+        vfredusum.vs v8, v16, v8
+        bnez         a2, 1b
+
+        vfmv.f.s     fa0, v8
+NOHWF   fmv.x.w      a0, fa0
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:

From patchwork Mon Sep 26 14:52:39 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38354
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:39 +0300
Message-Id: <20220926145251.56351-19-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 19/31] lavu/fixeddsp: RISC-V V butterflies_fixed

From: Rémi Denis-Courmont

---
 libavutil/fixed_dsp.c            |  4 +++-
 libavutil/fixed_dsp.h            |  1 +
 libavutil/riscv/Makefile         |  4 +++-
 libavutil/riscv/fixed_dsp_init.c | 38 ++++++++++++++++++++++++++++++
 libavutil/riscv/fixed_dsp_rvv.S  | 40 ++++++++++++++++++++++++++++++++
 5 files changed, 85 insertions(+), 2 deletions(-)
 create mode 100644 libavutil/riscv/fixed_dsp_init.c
 create mode 100644 libavutil/riscv/fixed_dsp_rvv.S

diff --git a/libavutil/fixed_dsp.c b/libavutil/fixed_dsp.c
index 154f3bc2d3..bc847949dc 100644
--- a/libavutil/fixed_dsp.c
+++ b/libavutil/fixed_dsp.c
@@ -162,7 +162,9 @@ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int bit_exact)
     fdsp->butterflies_fixed = butterflies_fixed_c;
     fdsp->scalarproduct_fixed = scalarproduct_fixed_c;
 
-#if ARCH_X86
+#if ARCH_RISCV
+    ff_fixed_dsp_init_riscv(fdsp);
+#elif ARCH_X86
     ff_fixed_dsp_init_x86(fdsp);
 #endif
 
diff --git a/libavutil/fixed_dsp.h b/libavutil/fixed_dsp.h
index fec806ff2d..1217d3a53b 100644
--- a/libavutil/fixed_dsp.h
+++ b/libavutil/fixed_dsp.h
@@ -161,6 +161,7 @@ typedef struct AVFixedDSPContext {
  */
 AVFixedDSPContext * avpriv_alloc_fixed_dsp(int strict);
 
+void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp);
 void ff_fixed_dsp_init_x86(AVFixedDSPContext *fdsp);
 
 /**
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
index 89a8d0d990..1597154ba5 100644
--- a/libavutil/riscv/Makefile
+++ b/libavutil/riscv/Makefile
@@ -1,3 +1,5 @@
 OBJS +=     riscv/float_dsp_init.o \
+            riscv/fixed_dsp_init.o \
             riscv/cpu.o
-RVV-OBJS += riscv/float_dsp_rvv.o
+RVV-OBJS += riscv/float_dsp_rvv.o \
+            riscv/fixed_dsp_rvv.o
diff --git a/libavutil/riscv/fixed_dsp_init.c b/libavutil/riscv/fixed_dsp_init.c
new file mode 100644
index 0000000000..e2915f1fcd
--- /dev/null
+++ b/libavutil/riscv/fixed_dsp_init.c
@@ -0,0 +1,38 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/fixed_dsp.h"
+
+void ff_butterflies_fixed_rvv(int *v1, int *v2, int len);
+
+av_cold void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVV_I32)
+        fdsp->butterflies_fixed = ff_butterflies_fixed_rvv;
+#endif
+}
diff --git a/libavutil/riscv/fixed_dsp_rvv.S b/libavutil/riscv/fixed_dsp_rvv.S
new file mode 100644
index 0000000000..0e78734b4c
--- /dev/null
+++ b/libavutil/riscv/fixed_dsp_rvv.S
@@ -0,0 +1,40 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
+func ff_butterflies_fixed_rvv, zve32x
+1:
+        vsetvli      t0, a2, e32, m1, ta, ma
+        vle32.v      v16, (a0)
+        sub          a2, a2, t0
+        vle32.v      v24, (a1)
+        vadd.vv      v0, v16, v24
+        vsub.vv      v8, v16, v24
+        vse32.v      v0, (a0)
+        sh2add       a0, t0, a0
+        vse32.v      v8, (a1)
+        sh2add       a1, t0, a1
+        bnez         a2, 1b
+
+        ret
+endfunc

From patchwork Mon Sep 26 14:52:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38351
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 26 Sep 2022 17:52:40 +0300
Message-Id: <20220926145251.56351-20-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net>
References: <5862173.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 20/31] lavc/audiodsp: RISC-V V vector_clip_int32

From: Rémi Denis-Courmont

---
 libavcodec/riscv/Makefile        |  1 +
 libavcodec/riscv/audiodsp_init.c |  9 ++++++++
 libavcodec/riscv/audiodsp_rvv.S  | 36 ++++++++++++++++++++++++++++++++
 3 files changed, 46 insertions(+)
 create mode 100644 libavcodec/riscv/audiodsp_rvv.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index da07f1fe96..99541b075e 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,4 +1,5 @@
 OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
                            riscv/audiodsp_rvf.o
+RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
 OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
                               riscv/pixblockdsp_rvi.o
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index c5842815d6..ac06848a82 100644
--- a/libavcodec/riscv/audiodsp_init.c
+++ b/libavcodec/riscv/audiodsp_init.c
@@ -18,16 +18,25 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
+#include "config.h"
+
 #include "libavutil/attributes.h"
 #include "libavutil/cpu.h"
 #include "libavcodec/audiodsp.h"
 
 void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max);
 
+void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min,
+                              int32_t max, unsigned int len);
+
 av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
 {
     int flags = av_get_cpu_flags();
 
     if (flags & AV_CPU_FLAG_RVF)
         c->vector_clipf = ff_vector_clipf_rvf;
+#if HAVE_RVV
+    if (flags & AV_CPU_FLAG_RVV_I32)
+        c->vector_clip_int32 = ff_vector_clip_int32_rvv;
+#endif
 }
diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S
new file mode 100644
index 0000000000..49546ee3c4
--- /dev/null
+++ b/libavcodec/riscv/audiodsp_rvv.S
@@ -0,0 +1,36 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_vector_clip_int32_rvv, zve32x
+1:
+        vsetvli      t0, a4, e32, m1, ta, ma
+        vle32.v      v8, (a1)
+        sub          a4, a4, t0
+        vmax.vx      v8, v8, a2
+        sh2add       a1, t0, a1
+        vmin.vx      v8, v8, a3
+        vse32.v      v8, (a0)
+        sh2add       a0, t0, a0
+        bnez         a4, 1b
+
+        ret
+endfunc

From patchwork Mon Sep 26 14:52:41 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38358
[79.124.17.100]) by mx.google.com with ESMTP id dn12-20020a17090794cc00b0077b2e822b8fsi173050ejc.76.2022.09.26.07.56.18; Mon, 26 Sep 2022 07:56:19 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5CE7268BBD9; Mon, 26 Sep 2022 17:53:19 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 65AAE68BAB6 for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id BD9B4C00C1 for ; Mon, 26 Sep 2022 17:52:54 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:41 +0300 Message-Id: <20220926145251.56351-21-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 21/31] lavc/audiodsp: RISC-V V vector_clipf X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: v7o54GV+u8GB From: Rémi Denis-Courmont --- libavcodec/riscv/audiodsp_init.c | 3 +++ libavcodec/riscv/audiodsp_rvv.S | 17 +++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/libavcodec/riscv/audiodsp_init.c 
b/libavcodec/riscv/audiodsp_init.c index ac06848a82..9c9265531d 100644 --- a/libavcodec/riscv/audiodsp_init.c +++ b/libavcodec/riscv/audiodsp_init.c @@ -28,6 +28,7 @@ void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min, int32_t max, unsigned int len); +void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float max); av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) { @@ -38,5 +39,7 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) #if HAVE_RVV if (flags & AV_CPU_FLAG_RVV_I32) c->vector_clip_int32 = ff_vector_clip_int32_rvv; + if (flags & AV_CPU_FLAG_RVV_F32) + c->vector_clipf = ff_vector_clipf_rvv; #endif } diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S index 49546ee3c4..427b424cb9 100644 --- a/libavcodec/riscv/audiodsp_rvv.S +++ b/libavcodec/riscv/audiodsp_rvv.S @@ -34,3 +34,20 @@ func ff_vector_clip_int32_rvv, zve32x ret endfunc + +func ff_vector_clipf_rvv, zve32f +NOHWF fmv.w.x fa0, a3 +NOHWF fmv.w.x fa1, a4 +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v8, (a1) + sub a2, a2, t0 + vfmax.vf v8, v8, fa0 + sh2add a1, t0, a1 + vfmin.vf v8, v8, fa1 + vse32.v v8, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc From patchwork Mon Sep 26 14:52:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38356 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2305665pzh; Mon, 26 Sep 2022 07:56:00 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6kXtMpYUAGumgwnHccA5pl1sBTBwcKftNd7VgBNRSiShRLNmM8thddPMzZeWxInhROYhTc X-Received: by 2002:a05:6402:5507:b0:452:183f:16d1 with SMTP id fi7-20020a056402550700b00452183f16d1mr23203455edb.96.1664204160834; Mon, 26 Sep 2022 07:56:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204160; 
cv=none; d=google.com; s=arc-20160816; b=nimvLIH883kA8nD/SPwyels4GqjGUMnUvrD9sTWFM51sU+HvFaRsaRZoZ1CkmxXJfg kRzKK0pl6TtAcYKgQ60TZrL9Dv2256ptY9eO0NdIaKVWzWo//kBPOuUzeZIF1u6Q2ztE lUMiG/9TpsZ/oTHlZu1P7vgfnWvO5qSmxR458TBDB147WWt8bjaMrGTVp63QbQMe7h3f iUH3Dr3SNCwjWcyZy02KGVdHTx2+IflH4NhD72m8ORCuvdJOK3FLgxC2caJGnBNP2W+Q NlxxpaM0ierRxWEM11wIe2tadWUS8KDfwkgNxBiK4Q9h98bQ9WnCA1g7Iy8F6QR/3kID DbEQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=uOOrARZ57KKS0DIeFykShX9HUUeEFsBz/9NIeV/loXw=; b=QT44ZomfCO0+6NEiGO2lt/iG020VJtM1c6hHZuRNcvKJve8OEA7CuQ41Raziyfdlpk Qd/kjR6+SHxkMxOkMhQ/1DdAGbSt3NBx7Au2mjCS+U+s/9s5aRgKh1L8iRwKj0l9F1vX te+Ibaeo4YdMhkpxj3Zjc0yGVgAP7yFOn2tDJ5rKVr2aPvuP2iBidR/M4cC4VY5KKIaX h9p92fx6b5Ug7aUhYHogN1SVf80Pk2dgIoKZrAxgGZJ5eQjBmDHsYlvlVU3bDvjZe+ze VyJpXls1/T6QjRVFFpDVc8UVwwQgQYJyydsJNQQ5v3eKMLHlmky2cBwZaGfXCFfrx2i9 Goww== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id fx17-20020a170906b75100b0076c593c2d58si72935ejb.479.2022.09.26.07.56.00; Mon, 26 Sep 2022 07:56:00 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 384DD68BBCC; Mon, 26 Sep 2022 17:53:17 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6421768BA85 for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id E6B02C00C2 for ; Mon, 26 Sep 2022 17:52:54 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:42 +0300 Message-Id: <20220926145251.56351-22-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 22/31] lavc/audiodsp: RISC-V V scalarproduct_int16 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: NCVrfCk46Z09 From: Rémi Denis-Courmont --- libavcodec/riscv/audiodsp_init.c | 5 ++++- libavcodec/riscv/audiodsp_rvv.S | 19 +++++++++++++++++++ 2 files changed, 23 insertions(+), 1 deletion(-) diff --git 
a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c index 9c9265531d..32c3c6794d 100644 --- a/libavcodec/riscv/audiodsp_init.c +++ b/libavcodec/riscv/audiodsp_init.c @@ -26,6 +26,7 @@ void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); +int32_t ff_scalarproduct_int16_rvv(const int16_t *v1, const int16_t *v2, int len); void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min, int32_t max, unsigned int len); void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float max); @@ -37,8 +38,10 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) if (flags & AV_CPU_FLAG_RVF) c->vector_clipf = ff_vector_clipf_rvf; #if HAVE_RVV - if (flags & AV_CPU_FLAG_RVV_I32) + if (flags & AV_CPU_FLAG_RVV_I32) { + c->scalarproduct_int16 = ff_scalarproduct_int16_rvv; c->vector_clip_int32 = ff_vector_clip_int32_rvv; + } if (flags & AV_CPU_FLAG_RVV_F32) c->vector_clipf = ff_vector_clipf_rvv; #endif diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S index 427b424cb9..f4308f27c5 100644 --- a/libavcodec/riscv/audiodsp_rvv.S +++ b/libavcodec/riscv/audiodsp_rvv.S @@ -20,6 +20,25 @@ #include "libavutil/riscv/asm.S" +func ff_scalarproduct_int16_rvv, zve32x + vsetivli zero, 1, e32, m1, ta, ma + vmv.s.x v8, zero +1: + vsetvli t0, a2, e16, m1, ta, ma + vle16.v v16, (a0) + sub a2, a2, t0 + vle16.v v24, (a1) + sh1add a0, t0, a0 + vwmul.vv v0, v16, v24 + sh1add a1, t0, a1 + vsetvli zero, t0, e32, m2, ta, ma + vredsum.vs v8, v0, v8 + bnez a2, 1b + + vmv.x.s a0, v8 + ret +endfunc + func ff_vector_clip_int32_rvv, zve32x 1: vsetvli t0, a4, e32, m1, ta, ma From patchwork Mon Sep 26 14:52:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38360 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id 
c28csp2306091pzh; Mon, 26 Sep 2022 07:56:39 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7cLNch8h5cDGsjkK1RUt6POuO0OBfEmwDf2v6vpSRp91MIofjle9vrpiURfV+TSGZCv0HV X-Received: by 2002:a05:6402:2709:b0:451:d665:e787 with SMTP id y9-20020a056402270900b00451d665e787mr22663784edd.317.1664204198972; Mon, 26 Sep 2022 07:56:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204198; cv=none; d=google.com; s=arc-20160816; b=H38PZCOTecaSFq3cPpimK7K5nFkeGRAjNMpq562m7thkSjhdg/jRM/3fSmBWVVrSUb I/FJjMnodFCYukjnHN49CI73cbICiF1MUWMKwlQXY75vyXsHrYw7dWIBKET1np2ymr/d njBTv1ZmyizEhsK75YXzyAV9JbngGmY43d7XxsUKOuOIYN2i67O6paI4otVa7bghmPT1 4vVb37bXymoyb65zWxNi6yzZKPqCHfMnARqw+w1tTUuav92GswZIGTuVYoTMiwFjqBET hcPnVoKohpl9/HAOzTlo8lNTpg41ZTYCOI/CN0OolG9TvlAaVW1MQO34d2c/SOursIhl UGag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=JNszohZd+F9Y6CPC2Lq2w20pnvb50+DJwAFDxLwSMVM=; b=n/40G+SK1Y3bqjiLfYR7057fqHQdRGtxf4p/18oaWreDaIpiEvPlu+KIWxL8fmIMoF xLQiAF5452aVAVyZX0VB8GGI0TvEF5d5FTc1adIdChw2Abmi8Har8wkpCwX53PhexgWe wBdH4t+iVj+YDDVjmCmYIIHfM9nQOsgrOKad8mC+y3v53/XesxypBebwwiHu4eMJeBHl B9zDXVRLjzWokOyqD0Pz+gKzHszuoIeMyestkhhX/H8ad8dEIcTToEQWp14l2FLVRhxd I0vWDZBso/q3FBUdlbPnkzDuN4vrBFyT/eb3j4h7xOO1ALKHqbGgS9sGO6EWBaxIMSLq 7Qaw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id e15-20020a17090658cf00b00782ece08669si185191ejs.13.2022.09.26.07.56.38; Mon, 26 Sep 2022 07:56:38 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 54D8868BBE2; Mon, 26 Sep 2022 17:53:21 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 69B6568BAC6 for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 1C1FCC00C3 for ; Mon, 26 Sep 2022 17:52:55 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:43 +0300 Message-Id: <20220926145251.56351-23-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 23/31] lavc/fmtconvert: RISC-V V int32_to_float_fmul_scalar X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: xGOcle5aWLp9 From: Rémi Denis-Courmont --- libavcodec/fmtconvert.c | 2 ++ libavcodec/fmtconvert.h | 1 + libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/fmtconvert_init.c | 39 ++++++++++++++++++++++++++++++ 
libavcodec/riscv/fmtconvert_rvv.S | 39 ++++++++++++++++++++++++++++++ 5 files changed, 83 insertions(+) create mode 100644 libavcodec/riscv/fmtconvert_init.c create mode 100644 libavcodec/riscv/fmtconvert_rvv.S diff --git a/libavcodec/fmtconvert.c b/libavcodec/fmtconvert.c index cedfd61138..d889e61aca 100644 --- a/libavcodec/fmtconvert.c +++ b/libavcodec/fmtconvert.c @@ -52,6 +52,8 @@ av_cold void ff_fmt_convert_init(FmtConvertContext *c) ff_fmt_convert_init_arm(c); #elif ARCH_PPC ff_fmt_convert_init_ppc(c); +#elif ARCH_RISCV + ff_fmt_convert_init_riscv(c); #elif ARCH_X86 ff_fmt_convert_init_x86(c); #endif diff --git a/libavcodec/fmtconvert.h b/libavcodec/fmtconvert.h index da244e05a5..1cb4628a64 100644 --- a/libavcodec/fmtconvert.h +++ b/libavcodec/fmtconvert.h @@ -61,6 +61,7 @@ void ff_fmt_convert_init(FmtConvertContext *c); void ff_fmt_convert_init_aarch64(FmtConvertContext *c); void ff_fmt_convert_init_arm(FmtConvertContext *c); void ff_fmt_convert_init_ppc(FmtConvertContext *c); +void ff_fmt_convert_init_riscv(FmtConvertContext *c); void ff_fmt_convert_init_x86(FmtConvertContext *c); void ff_fmt_convert_init_mips(FmtConvertContext *c); diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 99541b075e..682174e875 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,5 +1,7 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o +OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o +RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c new file mode 100644 index 0000000000..b2c240c1ce --- /dev/null +++ b/libavcodec/riscv/fmtconvert_init.c @@ -0,0 +1,39 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include <stdint.h> + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/fmtconvert.h" + +void ff_int32_to_float_fmul_scalar_rvv(float *dst, const int32_t *src, + float mul, int len); + +av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c) +{ +#ifdef HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RVV_F32) + c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_rvv; +#endif +} diff --git a/libavcodec/riscv/fmtconvert_rvv.S b/libavcodec/riscv/fmtconvert_rvv.S new file mode 100644 index 0000000000..b7c78831a0 --- /dev/null +++ b/libavcodec/riscv/fmtconvert_rvv.S @@ -0,0 +1,39 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_int32_to_float_fmul_scalar_rvv, zve32f +NOHWF fmv.w.x fa0, a2 +NOHWF mv a2, a3 +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v24, (a1) + sub a2, a2, t0 + vfcvt.f.x.v v24, v24 + sh2add a1, t0, a1 + vfmul.vf v24, v24, fa0 + vse32.v v24, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc From patchwork Mon Sep 26 14:52:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38355 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2305551pzh; Mon, 26 Sep 2022 07:55:51 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7aNzISIZM/R/i9RuOR6AxGqedf8DgUjgJkimT60pTFEaxTB4jKo+yUTZx4myC3dgQBBrYJ X-Received: by 2002:a50:fc13:0:b0:457:1075:42de with SMTP id i19-20020a50fc13000000b00457107542demr10435955edr.310.1664204151177; Mon, 26 Sep 2022 07:55:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204151; cv=none; d=google.com; s=arc-20160816; b=m4yQkc+i5gAq7pDdvYYTdWIxj05kR+Ix3kXyHT2zzFYvEfM9rZz2pWmQtIqlc63gzh cgL9csDlMgJQrbNCD7pxwHp+sxEL8Yz7AOvxSMacBxCyyI0N6BlkNU+4hHGLCmo0MdV2 TUQW1bqTp5OSzEVlWGShwQb9MSsFLBMscBQJMf3Hs8+aA2z71Eql9AbqEsWd4MvnBwST OqFO2Iqn9reNQEXJhuCveSM53a0OOWPrbUVL2GubdMGI3NR7s3XxALo1xtXFh3lSvZjV QAr7BMMk8N+JZmh/DsnIemdVRcIfFNmCKDe+Y86jLM02qPfCHex03fFtwx4T57Pc6u/9 FLnw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id 
:precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=9935FaDriaNQg5cEJdFat3cxD7HibT+IxnvEJeRTBhw=; b=FU8S9xru726Wu/o/UKQmgMxXHveXr6mqu/EhwlRA1iOnGQlDKGpmoEOXH60NdjbU+c hxYZf3squekZ7lUn3FRQAZbAXM97QcyDo2k4QJOUOsPhGLo3Op8D2seCkxTJbNUZPJoD Nymi7y6ytAkqBJqUjInb7tkLQJ6zq/OAgFNiKjakh/P3xg5BL9tI596FRojXH+8Jzluh ltAoF5lypqGEHQ9Un47M2GhZgjvNoHqzHnPheBqaWsuwbfzxGdhGFXD8UdwYoeTb7j7k PBu3SntQyDNyxjaJpX8DLd3ASjQ/kJAtI69jsg629Vx+KfpieFNu1byFuNr9+wH8GUhb DSag== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id mp37-20020a1709071b2500b007313314bb73si30333ejc.806.2022.09.26.07.55.50; Mon, 26 Sep 2022 07:55:51 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3CEDA68BAB1; Mon, 26 Sep 2022 17:53:16 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6328A68BAA9 for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 45C7DC00C4 for ; Mon, 26 Sep 2022 17:52:55 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:44 +0300 Message-Id: <20220926145251.56351-24-remi@remlab.net> X-Mailer: 
git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 24/31] lavc/fmtconvert: RISC-V V int32_to_float_fmul_array8 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: fapGC0F0G3LR From: Rémi Denis-Courmont --- libavcodec/riscv/fmtconvert_init.c | 7 ++++++- libavcodec/riscv/fmtconvert_rvv.S | 28 ++++++++++++++++++++++++++++ 2 files changed, 34 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c index b2c240c1ce..fd1a8e0ca1 100644 --- a/libavcodec/riscv/fmtconvert_init.c +++ b/libavcodec/riscv/fmtconvert_init.c @@ -27,13 +27,18 @@ void ff_int32_to_float_fmul_scalar_rvv(float *dst, const int32_t *src, float mul, int len); +void ff_int32_to_float_fmul_array8_rvv(FmtConvertContext *c, float *dst, + const int32_t *src, const float *mul, + int len); av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c) { #ifdef HAVE_RVV int flags = av_get_cpu_flags(); - if (flags & AV_CPU_FLAG_RVV_F32) + if (flags & AV_CPU_FLAG_RVV_F32) { c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_rvv; + c->int32_to_float_fmul_array8 = ff_int32_to_float_fmul_array8_rvv; + } #endif } diff --git a/libavcodec/riscv/fmtconvert_rvv.S b/libavcodec/riscv/fmtconvert_rvv.S index b7c78831a0..c79f80cc47 100644 --- a/libavcodec/riscv/fmtconvert_rvv.S +++ b/libavcodec/riscv/fmtconvert_rvv.S @@ -37,3 +37,31 @@ NOHWF mv a2, a3 ret endfunc + +func ff_int32_to_float_fmul_array8_rvv, zve32f + srai a4, a4, 3 + +1: vsetvli t0, a4, e32, m1, ta, ma + vle32.v v24, (a3) + slli t2, t0, 2 + 3 + vlseg8e32.v v16, (a2) + vsetvli t3, zero, e32, m8, ta, ma 
+ vfcvt.f.x.v v16, v16 + vsetvli zero, a4, e32, m1, ta, ma + vfmul.vv v16, v16, v24 + sub a4, a4, t0 + vfmul.vv v17, v17, v24 + sh2add a3, t0, a3 + vfmul.vv v18, v18, v24 + add a2, a2, t2 + vfmul.vv v19, v19, v24 + vfmul.vv v20, v20, v24 + vfmul.vv v21, v21, v24 + vfmul.vv v22, v22, v24 + vfmul.vv v23, v23, v24 + vsseg8e32.v v16, (a1) + add a1, a1, t2 + bnez a4, 1b + + ret +endfunc From patchwork Mon Sep 26 14:52:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38357 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2305781pzh; Mon, 26 Sep 2022 07:56:11 -0700 (PDT) X-Google-Smtp-Source: AMsMyM50q4C+o/QOiSPHzPfna6BAy6j4vEwpt17BtkKF9aq3L/YVMjbdUdoOYFBRU0phryUqAIbB X-Received: by 2002:a17:906:ef90:b0:77f:8f0d:e925 with SMTP id ze16-20020a170906ef9000b0077f8f0de925mr18650122ejb.622.1664204171283; Mon, 26 Sep 2022 07:56:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204171; cv=none; d=google.com; s=arc-20160816; b=GJglne1vPPmiv2uLdJr/APb+qKsT0LgasUJtxFckexFHLLtxSLlRh6STlcKo40Iglm GYyJ0qpUN8r/HBH8Hhuvk6f1AddNcINwhPLtscna8MmlInyJ5s3U9Mdm0OMmlwk9Er3B PIFlwqSEI4B5LUVqMFcOAm7C2IRcTS0KMVKHXgv2l9KJeQ9Xu0tvC2OCxasrLfb7nYAv h4kJEqDaQvhlpRLhHVyWDcDm3MbPJqWo2AqMR9Reot6MGQvuUNmoU+k66hi58VdXXPk/ RG1NlePHKCjldhlViTb9Ci9aHNcVRnkgCFC0JFMeOUhb+eEZRIxIltMc5TkE65zAb4wf tOqg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=E+RxQkENRPavJh7yvhwHQrWv/7E1Qbg1hfHJTrHFVkA=; b=U0EOWO3vIbdAoVuSuesMJHMs6ghwZkoVYdfXkss8i7rnButySPhnXNT46RTDJZC7tD n4WgFKBFrtL7sgEMKIU1PwQjg3XBToaH90aIl6bPJEu/LgGO/+z0tMqgbea5SVznlBvg 
bPytexGV1gi6GLOqW3IOFQafZFkmIHdyShlWAFd+bBWkfRu2X13cj/xFqsI+/06UlKow a5S6mBV1ZjiPvVdnpPuLlWSU1siQEe2CRJtG8gOuPKfseIVPq+SV+NpN2DjgOsS2nIy7 nXp5Shp8MWmk07eWaGNTAIUn4pE+bXCpZdO75uIZ52nkf1qDCGhG88PgMF5Ys6i2r1iA 3Lwg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id w3-20020a17090652c300b00781a47397b1si71895ejn.502.2022.09.26.07.56.09; Mon, 26 Sep 2022 07:56:11 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5C1E268BBD2; Mon, 26 Sep 2022 17:53:18 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 652FA68BAB1 for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 6FBF0C00C5 for ; Mon, 26 Sep 2022 17:52:55 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:45 +0300 Message-Id: <20220926145251.56351-25-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 25/31] lavc/vorbisdsp: RISC-V V inverse_coupling X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 
Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: XdmHZCi6LbCq From: Rémi Denis-Courmont This uses the following vectorisation: for (i = 0; i < blocksize; i++) { float a = ang[i], m = mag[i]; ang[i] = m - copysignf(fmaxf(a, 0.f), m); mag[i] = m - copysignf(fminf(a, 0.f), m); } --- libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/vorbisdsp_init.c | 37 ++++++++++++++++++++++++++ libavcodec/riscv/vorbisdsp_rvv.S | 44 +++++++++++++++++++++++++++++++ libavcodec/vorbisdsp.c | 2 ++ libavcodec/vorbisdsp.h | 1 + 5 files changed, 86 insertions(+) create mode 100644 libavcodec/riscv/vorbisdsp_init.c create mode 100644 libavcodec/riscv/vorbisdsp_rvv.S diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 682174e875..03a95301d7 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -5,3 +5,5 @@ OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ riscv/pixblockdsp_rvi.o +OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o +RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o diff --git a/libavcodec/riscv/vorbisdsp_init.c b/libavcodec/riscv/vorbisdsp_init.c new file mode 100644 index 0000000000..0c56ffcb9b --- /dev/null +++ b/libavcodec/riscv/vorbisdsp_init.c @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/vorbisdsp.h" + +void ff_vorbis_inverse_coupling_rvv(float *mag, float *ang, + ptrdiff_t blocksize); + +av_cold void ff_vorbisdsp_init_riscv(VorbisDSPContext *c) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RVV_F32) + c->vorbis_inverse_coupling = ff_vorbis_inverse_coupling_rvv; +#endif +} diff --git a/libavcodec/riscv/vorbisdsp_rvv.S b/libavcodec/riscv/vorbisdsp_rvv.S new file mode 100644 index 0000000000..e8953fb548 --- /dev/null +++ b/libavcodec/riscv/vorbisdsp_rvv.S @@ -0,0 +1,44 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_vorbis_inverse_coupling_rvv, zve32f + fmv.w.x ft0, zero +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v16, (a1) + sub a2, a2, t0 + vle32.v v24, (a0) + vfmax.vf v8, v16, ft0 + vfmin.vf v16, v16, ft0 + vfsgnj.vv v8, v8, v24 + vfsgnj.vv v16, v16, v24 + vfsub.vv v8, v24, v8 + vfsub.vv v24, v24, v16 + vse32.v v8, (a1) + sh2add a1, t0, a1 + vse32.v v24, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc diff --git a/libavcodec/vorbisdsp.c b/libavcodec/vorbisdsp.c index 693c44dfcb..70022bd262 100644 --- a/libavcodec/vorbisdsp.c +++ b/libavcodec/vorbisdsp.c @@ -53,6 +53,8 @@ av_cold void ff_vorbisdsp_init(VorbisDSPContext *dsp) ff_vorbisdsp_init_arm(dsp); #elif ARCH_PPC ff_vorbisdsp_init_ppc(dsp); +#elif ARCH_RISCV + ff_vorbisdsp_init_riscv(dsp); #elif ARCH_X86 ff_vorbisdsp_init_x86(dsp); #endif diff --git a/libavcodec/vorbisdsp.h b/libavcodec/vorbisdsp.h index 1775a92cf2..5c369ecf22 100644 --- a/libavcodec/vorbisdsp.h +++ b/libavcodec/vorbisdsp.h @@ -34,5 +34,6 @@ void ff_vorbisdsp_init_aarch64(VorbisDSPContext *dsp); void ff_vorbisdsp_init_x86(VorbisDSPContext *dsp); void ff_vorbisdsp_init_arm(VorbisDSPContext *dsp); void ff_vorbisdsp_init_ppc(VorbisDSPContext *dsp); +void ff_vorbisdsp_init_riscv(VorbisDSPContext *dsp); #endif /* AVCODEC_VORBISDSP_H */ From patchwork Mon Sep 26 14:52:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38359 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2305980pzh; Mon, 26 Sep 2022 07:56:29 -0700 (PDT) X-Google-Smtp-Source: 
AMsMyM5W5PbxHL+U0yZgoohRQB/dcDWeas1WeY1MtygNFDf4mJK+LLZI46v5ChTUpsZLpv9d7VGW X-Received: by 2002:a17:906:9749:b0:782:287f:d217 with SMTP id o9-20020a170906974900b00782287fd217mr18342161ejy.259.1664204189247; Mon, 26 Sep 2022 07:56:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204189; cv=none; d=google.com; s=arc-20160816; b=neZf3j/t/TCXEFYRTtlip24aIGaXAwSiA/ynLfDh3EV5iLZuhOY0jKcs9ffnCbPaiL 03GvJ1R+lzDgSg4K3lfMx6cKj7xR/SjbjEVkCk7CQnBzOl3t7Zuo5UIAtTcc70oUuATf NpGTX7hThkof2EByzdpko7GnrIKVh5WfLZvoR6r1+jQUsKSIjb//SJmiZdBkWhp7ZYRF cdiZZjQvDQQJcZ+P19E/hnzDhQncEO7h7M85FCFHQtp50XesncbJuFxgGpcVN3LDkKw2 wJEeKJEKrOqZZyDmWeDmAlZnC3IAdnuI1YAjF2DggixGodfn4aGjtb6Zn5PnalDVS0PF N8lA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=d36caztwqEEj4UCYSQs++vfMFwEKg29wriP2PIK80Mk=; b=N+TClEC6QbDThMWG6+1qPi1rfxJCiwTytm91XwDFIzWRId0+XC11H8v6Hqw9XhDrfM 7/FKg864i0NXmh3a3mCxvRpta7rqEGgse8csg/swJ+V3OlDIWVvcR7Tk6enKvU8zzhz2 aIYQy3U8AAt9DR9vd0Lyg9EFDB8up/nlFKXPp0knYhvvg2OQKa/F6HGQMmDTjcU4gQY0 17pi0tSRcbMY/5YIQ2RfBoNGNpjorqVlNQc9R93oBkjxUR/rRjoB44v0vACTj/VnL5Eo pj7ZGnlgEopZ26mvXZknWk6MM/2MQgN/yo56jH3iUeJOHzitlah8E7A/GxIb/a+0ymUt 9QRQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id n11-20020a05640205cb00b0043a9bb390d3si19844346edx.278.2022.09.26.07.56.28; Mon, 26 Sep 2022 07:56:29 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 78EFF68BBCE; Mon, 26 Sep 2022 17:53:20 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 68A5B68BAC0 for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 99A18C00C6 for ; Mon, 26 Sep 2022 17:52:55 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:46 +0300 Message-Id: <20220926145251.56351-26-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 26/31] lavc/aacpsdsp: RISC-V V add_squares X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: THdeCzL7RMda From: Rémi Denis-Courmont --- libavcodec/aacpsdsp.h | 1 + libavcodec/aacpsdsp_template.c | 2 ++ libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/aacpsdsp_init.c | 37 ++++++++++++++++++++++++++++++++ 
libavcodec/riscv/aacpsdsp_rvv.S | 37 ++++++++++++++++++++++++++++++++ 5 files changed, 79 insertions(+) create mode 100644 libavcodec/riscv/aacpsdsp_init.c create mode 100644 libavcodec/riscv/aacpsdsp_rvv.S diff --git a/libavcodec/aacpsdsp.h b/libavcodec/aacpsdsp.h index 917ac5303f..8b32761bdb 100644 --- a/libavcodec/aacpsdsp.h +++ b/libavcodec/aacpsdsp.h @@ -55,6 +55,7 @@ void AAC_RENAME(ff_psdsp_init)(PSDSPContext *s); void ff_psdsp_init_arm(PSDSPContext *s); void ff_psdsp_init_aarch64(PSDSPContext *s); void ff_psdsp_init_mips(PSDSPContext *s); +void ff_psdsp_init_riscv(PSDSPContext *s); void ff_psdsp_init_x86(PSDSPContext *s); #endif /* AVCODEC_AACPSDSP_H */ diff --git a/libavcodec/aacpsdsp_template.c b/libavcodec/aacpsdsp_template.c index e3cbf3feec..c063788b89 100644 --- a/libavcodec/aacpsdsp_template.c +++ b/libavcodec/aacpsdsp_template.c @@ -230,6 +230,8 @@ av_cold void AAC_RENAME(ff_psdsp_init)(PSDSPContext *s) ff_psdsp_init_aarch64(s); #elif ARCH_MIPS ff_psdsp_init_mips(s); +#elif ARCH_RISCV + ff_psdsp_init_riscv(s); #elif ARCH_X86 ff_psdsp_init_x86(s); #endif diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 03a95301d7..829a1823d2 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,3 +1,5 @@ +OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o +RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c new file mode 100644 index 0000000000..83f6d9b16b --- /dev/null +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/aacpsdsp.h" + +void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); + +av_cold void ff_psdsp_init_riscv(PSDSPContext *c) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RVV_F32) + c->add_squares = ff_ps_add_squares_rvv; +#endif +} diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S new file mode 100644 index 0000000000..b516063ea7 --- /dev/null +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_ps_add_squares_rvv, zve32f +1: + vsetvli t0, a2, e32, m1, ta, ma + vlseg2e32.v v24, (a1) + sub a2, a2, t0 + vle32.v v16, (a0) + sh3add a1, t0, a1 + vfmacc.vv v16, v24, v24 + vfmacc.vv v16, v25, v25 + vse32.v v16, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc From patchwork Mon Sep 26 14:52:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38361 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2306293pzh; Mon, 26 Sep 2022 07:56:59 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6F2hSFfOs7k2sgp6DupBmw0x0W8JujMZF7GRMd/z+tmxB5jZMlwruIzbj2C7AV2aowgWCR X-Received: by 2002:a17:906:fe46:b0:730:ca2b:cb7b with SMTP id wz6-20020a170906fe4600b00730ca2bcb7bmr19276112ejb.703.1664204218961; Mon, 26 Sep 2022 07:56:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204218; cv=none; d=google.com; s=arc-20160816; b=gBXNFKCln5jxdg8oYR0qxxrTGfYMURm13nk1GJf9S21bI5K/X7qmY3oNFPwUPSvENF UBD9DG5HA4/KtzmN49BizPAAj0mxHujazPceSMfL2TNzqoUlJjY5nrEVsfUPNF8a8as6 xCpfqZMfRsmOPsm7xak01oIPbwBGWkt/K3vshnYl8mqJs/XslxKokHNOnPtA0maZlqFg 37sZguyFY1YAjUUd90lzm+H5TeBLRc8KSMiNvAN5dGturCo3cBlu9ipR8nCPIaRa//zf YeZYDAZnvZBWJ15n0NRVhWzu9hQG0JFcvHNhM3a+FqkHRmBnk2cTKUdVZWU3qth7iBEF lJNQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=IKr7JNZvpn70VKlD16Z383uHW/+7BBNU7QPz6yhuzdc=; 
b=rddwi/B5eZHIgN/fizX0GRYG6LqOo+biwznMe2ROwc+XOUQvRmTUMwTIrELBG94LJG OK1fUmohjHEhWCpkirkp4V4pZSaUMxFXwMTZJPdcOlMgpcjzmM+CZ7ZdSFXZfIoVTgWU XkV+R9MpE59UMdXNdZvSJ8Z4qx2tNb6UkoM+qn0+Ro3wtKwIBddj6vtjyJlU7rgDrm+X ofzu/qiu3pyDgRNY7wsnLBBbOWsz4D8N5S8xNrOsRxC3y9p0nVkbzl3O0MVe/+KdrT8z 9tar4xSueZWI1pjP2tKSiBYR/3Z3X1J7O3OeTdI2AEmWB/JnS31N6+U0cjRDvAR3F8ZI fcng== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id js1-20020a17090797c100b00782b2a97827si134749ejc.242.2022.09.26.07.56.58; Mon, 26 Sep 2022 07:56:58 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2C1DD68BBF1; Mon, 26 Sep 2022 17:53:23 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 7F60B68B32D for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id C4121C00C7 for ; Mon, 26 Sep 2022 17:52:55 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:47 +0300 Message-Id: <20220926145251.56351-27-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 
Subject: [FFmpeg-devel] [PATCH 27/31] lavc/aacpsdsp: RISC-V V mul_pair_single X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: x2BwJtwSHAYW From: Rémi Denis-Courmont --- libavcodec/riscv/aacpsdsp_init.c | 6 +++++- libavcodec/riscv/aacpsdsp_rvv.S | 17 +++++++++++++++++ 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 83f6d9b16b..21fd5b8470 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -25,13 +25,17 @@ #include "libavcodec/aacpsdsp.h" void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); +void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1, + int n); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { #if HAVE_RVV int flags = av_get_cpu_flags(); - if (flags & AV_CPU_FLAG_RVV_F32) + if (flags & AV_CPU_FLAG_RVV_F32) { c->add_squares = ff_ps_add_squares_rvv; + c->mul_pair_single = ff_ps_mul_pair_single_rvv; + } #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index b516063ea7..70b7b72218 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -35,3 +35,20 @@ func ff_ps_add_squares_rvv, zve32f ret endfunc + +func ff_ps_mul_pair_single_rvv, zve32f +1: + vsetvli t0, a3, e32, m1, ta, ma + vlseg2e32.v v24, (a1) + sub a3, a3, t0 + vle32.v v16, (a2) + sh3add a1, t0, a1 + vfmul.vv v24, v24, v16 + sh2add a2, t0, a2 + vfmul.vv v25, v25, v16 + vsseg2e32.v v24, (a0) + sh3add a0, t0, a0 + bnez a3, 1b + + ret +endfunc From patchwork Mon Sep 26 14:52:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: 
=?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38362 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2306374pzh; Mon, 26 Sep 2022 07:57:09 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6UbxjwowouWCfo/Dlhg+MpZxc85xoRHOKXzQWp3105UYymd3fHpeuKy1Gh12h77cXZXU8o X-Received: by 2002:a17:907:2c54:b0:77d:971f:be12 with SMTP id hf20-20020a1709072c5400b0077d971fbe12mr17786931ejc.560.1664204229059; Mon, 26 Sep 2022 07:57:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204229; cv=none; d=google.com; s=arc-20160816; b=0mCtXr+pIy1/7FyAXh6swFYSuSL70orv4c3vDgNuJ2SlyRxRUcSjJFN53eV6sZ5RMb VvDunWIrf+MVwD4b2Mpd+P96yJu6GcZ9gxV0wZrjTe8eOmJQJDW3KDN/Gj7WIbjxYRk5 3o4r3VnbvIEUk8HJthlrC/VMp5NQyU2H1sxvYa1Tus156gL6KlcsbVus5jAOJKm3IS2u SQmsO6VUggqOeuw0gAalM11HnhHVnfYaN2PFlFGeFlerrzCn2kZj5nc9lr0MVL38BOqE GPvOpfF/r69tUeXsSVHRpAKtap3euvpnD8uG8HjTVCaW+/M/XW+oqEt1M4R3uzNMEBWK /5Ew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=PPAvPNDAN3cYa+/I+luZXIs5yBTZVBlH4NYS8gksQSg=; b=BzDs+gv8EgLPdXhQaee/qee5nTwiG6ywiTqm0G/TCjfOC/WcGS9zrhx5pJ6hr/O/RC Sxow18AG+DVt9mh9NqrCJdaZeskRjjP7gq4EtFwqyKZ/lsBM6h5H8OzZZdKoJZIPOzWF y+pCXrCq5F1VScShQgOlAQ/797Sa+n62qEljKnIloFpctMrnIyhcalzQ4DzMUbO3yRB5 41GDnU85XqIW8wPEIZFjelAlLp87IBDmTDSyuFgj29jUD7TlsgJGKHTmgTj1FFY+lg79 gQkCuf4cFnBfn54yp/z+WuoEYxTK4KYtdm5hLz/nR/YzOmDIWxhtm+PCl3+6kTl4YRz9 b/sQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id dr12-20020a170907720c00b0076ed46e4445si38845ejc.810.2022.09.26.07.57.08; Mon, 26 Sep 2022 07:57:09 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3867A68BBF6; Mon, 26 Sep 2022 17:53:24 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 838FE68BAFE for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id EDA1BC00C8 for ; Mon, 26 Sep 2022 17:52:55 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:48 +0300 Message-Id: <20220926145251.56351-28-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 28/31] lavc/aacpsdsp: RISC-V V hybrid_analysis X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: nktRAtMesPBz From: Rémi Denis-Courmont This starts with one-time initialisation of the 26 constant factors like 08edacc248bce3f8946d75e97188d189c74a6de6. That is done with the scalar instruction set. 
While the formula can readily be vectorised, the gains would (probably) be more than lost in transferring the results back to FP registers (or suitably reshuffling them into vector registers). Note that the main loop could likely be scheduled slightly better by expanding the filter macro and interleaving loads with arithmetic. It is not clear yet if that would be relevant for vector processing (as opposed to traditional SIMD). We could also use fewer vectors, but there is not much point in sparing them (they are *all* callee-clobbered). --- libavcodec/riscv/aacpsdsp_init.c | 3 + libavcodec/riscv/aacpsdsp_rvv.S | 97 ++++++++++++++++++++++++++++++++ 2 files changed, 100 insertions(+) diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 21fd5b8470..09f16f1041 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -27,6 +27,8 @@ void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1, int n); +void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2], + const float (*filter)[8][2], ptrdiff_t, int n); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { @@ -36,6 +38,7 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c) if (flags & AV_CPU_FLAG_RVV_F32) { c->add_squares = ff_ps_add_squares_rvv; c->mul_pair_single = ff_ps_mul_pair_single_rvv; + c->hybrid_analysis = ff_ps_hybrid_analysis_rvv; } #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index 70b7b72218..65e5e0be4f 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -52,3 +52,100 @@ func ff_ps_mul_pair_single_rvv, zve32f ret endfunc + +func ff_ps_hybrid_analysis_rvv, zve32f + /* We need 26 FP registers, for 20 scratch ones. Spill fs0-fs5.
*/ + addi sp, sp, -32 + .irp n, 0, 1, 2, 3, 4, 5 + fsw fs\n, (4 * \n)(sp) + .endr + + .macro input, j, fd0, fd1, fd2, fd3 + flw \fd0, (4 * ((\j * 2) + 0))(a1) + flw fs4, (4 * (((12 - \j) * 2) + 0))(a1) + flw \fd1, (4 * ((\j * 2) + 1))(a1) + fsub.s \fd3, \fd0, fs4 + flw fs5, (4 * (((12 - \j) * 2) + 1))(a1) + fadd.s \fd2, \fd1, fs5 + fadd.s \fd0, \fd0, fs4 + fsub.s \fd1, \fd1, fs5 + .endm + + // re0, re1, im0, im1 + input 0, ft0, ft1, ft2, ft3 + input 1, ft4, ft5, ft6, ft7 + input 2, ft8, ft9, ft10, ft11 + input 3, fa0, fa1, fa2, fa3 + input 4, fa4, fa5, fa6, fa7 + input 5, fs0, fs1, fs2, fs3 + flw fs4, (4 * ((6 * 2) + 0))(a1) + flw fs5, (4 * ((6 * 2) + 1))(a1) + + add a2, a2, 6 * 2 * 4 // point to filter[i][6][0] + li t4, 8 * 2 * 4 // filter byte stride + slli a3, a3, 3 // output byte stride +1: + .macro filter, vs0, vs1, fo0, fo1, fo2, fo3 + vfmacc.vf v8, \fo0, \vs0 + vfmacc.vf v9, \fo2, \vs0 + vfnmsac.vf v8, \fo1, \vs1 + vfmacc.vf v9, \fo3, \vs1 + .endm + + vsetvli t0, a4, e32, m1, ta, ma + /* + * The filter (a2) has 16 segments, of which 13 need to be extracted. + * R-V V supports only up to 8 segments, so unrolling is unavoidable. 
+ */ + addi t1, a2, -48 + vlse32.v v22, (a2), t4 + addi t2, a2, -44 + vlse32.v v16, (t1), t4 + addi t1, a2, -40 + vfmul.vf v8, v22, fs4 + vlse32.v v24, (t2), t4 + addi t2, a2, -36 + vfmul.vf v9, v22, fs5 + vlse32.v v17, (t1), t4 + addi t1, a2, -32 + vlse32.v v25, (t2), t4 + addi t2, a2, -28 + filter v16, v24, ft0, ft1, ft2, ft3 + vlse32.v v18, (t1), t4 + addi t1, a2, -24 + vlse32.v v26, (t2), t4 + addi t2, a2, -20 + filter v17, v25, ft4, ft5, ft6, ft7 + vlse32.v v19, (t1), t4 + addi t1, a2, -16 + vlse32.v v27, (t2), t4 + addi t2, a2, -12 + filter v18, v26, ft8, ft9, ft10, ft11 + vlse32.v v20, (t1), t4 + addi t1, a2, -8 + vlse32.v v28, (t2), t4 + addi t2, a2, -4 + filter v19, v27, fa0, fa1, fa2, fa3 + vlse32.v v21, (t1), t4 + sub a4, a4, t0 + vlse32.v v29, (t2), t4 + slli t1, t0, 3 + 1 + 2 // ctz(8 * 2 * 4) + add a2, a2, t1 + filter v20, v28, fa4, fa5, fa6, fa7 + filter v21, v29, fs0, fs1, fs2, fs3 + + add t2, a0, 4 + vsse32.v v8, (a0), a3 + mul t0, t0, a3 + vsse32.v v9, (t2), a3 + add a0, a0, t0 + bnez a4, 1b + + .irp n, 5, 4, 3, 2, 1, 0 + flw fs\n, (4 * \n)(sp) + .endr + addi sp, sp, 32 + ret + .purgem input + .purgem filter +endfunc From patchwork Mon Sep 26 14:52:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38363 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2306487pzh; Mon, 26 Sep 2022 07:57:20 -0700 (PDT) X-Google-Smtp-Source: AMsMyM414HRk5JV5b7TQyHOuB4DkgTFQ9/kqqYqL27boCw+2aOk01fc7A0Puwpcm3tc+fAyepKkC X-Received: by 2002:a17:907:a088:b0:780:da07:9df3 with SMTP id hu8-20020a170907a08800b00780da079df3mr18626287ejc.252.1664204239889; Mon, 26 Sep 2022 07:57:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204239; cv=none; d=google.com; s=arc-20160816; b=x+nblCp/Wm6d7DYu2X7/BHJcEc05w1iZskkoj2QRO9XQROs7P1uZ7PN8PplpZ5h+hP 
HGLpoqwJTeef3dbo58stDyfj56bIBT6c39zlQXFAGyc6T83+y1Buqr3IGJjn5GroY1r+ J5bX5/UkvBGsvB087j1LH+YbmUv9gax3R3WTngE93Ni0O0dLh6iXSbP0kn+b+RBN8Xhv o1g5c0B28z3sYn1Ya047XtNjuI7Ne0uuOpatB2iEghEt7EwEJlTxs8QUhmVD37v6ayxt QyG/MNkpskNsDK5oECQMF/H+7gWx9t4hyXq1lFOG+IEgDMBBS5iDHcifGgVKaooj1Yxg tk8A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=RYQ0IKRr5qQqw10yQ6BNEEB0mTazIO8Og9dU9X3irFg=; b=vsx9qzYnognRoke5ueutMfx7S74PpaG9zRZr6Y9fDKKruBPF3ZnpaOg0gPRpE7okii 8wATajVBzZ7LyPcTH+foEIwyG+ectvyIVFZkznfSSA8tCuDE+Abjf/xU4qZXNktxWZws xRhq5UMOhW3aZm5P5E808tyJShyjSFizVIiRRb2poIzOvIMg7LlVkR0RmG+vdKG/CMrB FjWq5AScYVlvQuocwkHhlxXH/hZ6hzaOG8ayDyF9IXGCe0ycFrpeiDDIIP/IB0nC/IaV 8i6SpCAh6Nsst5cAJ/Oy55Uv3RiovKV/sd36Pr9JesTEaDZPGcY/bMUuY0RuJpf1Amcd 09Dw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id q10-20020a1709060e4a00b007813594dc31si68282eji.523.2022.09.26.07.57.18; Mon, 26 Sep 2022 07:57:19 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6DB9568BBFD; Mon, 26 Sep 2022 17:53:25 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 86FF368BB05 for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 232BFC00C9 for ; Mon, 26 Sep 2022 17:52:56 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:49 +0300 Message-Id: <20220926145251.56351-29-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 29/31] lavc/aacpsdsp: RISC-V V hybrid_analysis_ileave X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: o8Gj1kBlnSux From: Rémi Denis-Courmont --- libavcodec/riscv/aacpsdsp_init.c | 5 +++++ libavcodec/riscv/aacpsdsp_rvv.S | 35 ++++++++++++++++++++++++++++++++ 2 files changed, 40 insertions(+) diff --git 
a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 09f16f1041..1d36f89f6e 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -29,6 +29,8 @@ void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1, int n); void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2], const float (*filter)[8][2], ptrdiff_t, int n); +void ff_ps_hybrid_analysis_ileave_rvv(float (*out)[32][2], float L[2][38][64], + int i, int len); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { @@ -40,5 +42,8 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c) c->mul_pair_single = ff_ps_mul_pair_single_rvv; c->hybrid_analysis = ff_ps_hybrid_analysis_rvv; } + + if (flags & AV_CPU_FLAG_RVV_I32) + c->hybrid_analysis_ileave = ff_ps_hybrid_analysis_ileave_rvv; #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index 65e5e0be4f..c9cc15e73d 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -149,3 +149,38 @@ func ff_ps_hybrid_analysis_rvv, zve32f .purgem input .purgem filter endfunc + +func ff_ps_hybrid_analysis_ileave_rvv, zve32x /* no needs for zve32f here */ + slli t0, a2, 5 + 1 + 2 // ctz(32 * 2 * 4) + sh2add a1, a2, a1 + add a0, a0, t0 + addi a2, a2, -64 + li t1, 38 * 64 * 4 + li t6, 64 * 4 // (uint8_t *)L[x][j+1][i] - L[x][j][i] + add a4, a1, t1 // &L[1] + beqz a2, 3f +1: + mv t0, a0 + mv t1, a1 + mv t3, a3 + mv t4, a4 + addi a2, a2, 1 +2: + vsetvli t5, t3, e32, m1, ta, ma + vlse32.v v16, (t1), t6 + sub t3, t3, t5 + vlse32.v v17, (t4), t6 + mul t2, t5, t6 + vsseg2e32.v v16, (t0) + sh3add t0, t5, t0 + add t1, t1, t2 + add t4, t4, t2 + bnez t3, 2b + + add a0, a0, 32 * 2 * 4 + add a1, a1, 4 + add a4, a4, 4 + bnez a2, 1b +3: + ret +endfunc From patchwork Mon Sep 26 14:52:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 
38337 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2306209pzh; Mon, 26 Sep 2022 07:56:49 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7rSpWfM/lNLYcD/AjrQ7ZDEJk5HXbG3uHNAV/DUe/4FD7D6i3KfZF7aRZK0YsSGVFnWfB+ X-Received: by 2002:aa7:db12:0:b0:457:2973:7e24 with SMTP id t18-20020aa7db12000000b0045729737e24mr8337983eds.264.1664204209384; Mon, 26 Sep 2022 07:56:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204209; cv=none; d=google.com; s=arc-20160816; b=bQuuUD9IkgxA340WWMVYLSpxiEml33vFFNcbj5RY+VYreGxzvyHRTNyBSxbQpkv6DR IpsPfLb/rcAglxt4YkynioY7ojVYdcxvp75T+B95jIG563VA+giU7xHCik/1S1vedqDz 04lapO6+oH6uaoISvpJzdooLzC+N9k0rDwDAnelQeFWn/J9JQxUAoLz1UOSggRyPhBLx sVSNx9rNvkSvUmbcKbeeldqfby4EbSfE+g41qvH6Vo2QZrl6TM+Npp3/aIKNwIoSObd5 OvXoCE+PpLEN0Sip/wAxaJp9JAmsGnxA4MpSgodhkrlxCuGFQmKprYiT3sBg5CGdQPWs XYXA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=rJxJlDHJl7M81gUHprJ1XBjUHOwnIXqiu07P9Fc1Y1E=; b=vkXSYsETPR/P7Xx6BqougdVljliiU0Qf+K0jb0I5Es7Cr10iOL27wGXMfIVKXfwe8b 7sUOvpoC7Bn0KA2MufCSlCbNmXaNoYMnDzAaA6zXzDUE0ReVDJnjAypMQckuf6PKjBdB ToTByKl/HnEJs1BQAZSySw1cClAhLsx4nAIYSsnBge/wG6m/sFASsWtHALTfBhmrNwrp jureFj7ggnTEfaPCU/AmHz40Wsy63E3Vf9lFX3J4vTfcYdD+JRDhX8NEIneNBYD11HX3 0qlcMbCHzZ/8ThAQkRb6MbA1xECx7ihhb9QASF7uRCXCRIIcT0noKjbZ0fGpriPBBBiq Z6zg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id ey4-20020a0564022a0400b004513abe8f74si15897461edb.249.2022.09.26.07.56.48; Mon, 26 Sep 2022 07:56:49 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 293AF68BBE6; Mon, 26 Sep 2022 17:53:22 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6B66468BAD2 for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 4C869C00CA for ; Mon, 26 Sep 2022 17:52:56 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:50 +0300 Message-Id: <20220926145251.56351-30-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 30/31] lavc/aacpsdsp: RISC-V V hybrid_synthesis_deint X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: KxrW5a/25Rxb From: Rémi Denis-Courmont --- libavcodec/riscv/aacpsdsp_init.c | 6 +++++- libavcodec/riscv/aacpsdsp_rvv.S | 35 ++++++++++++++++++++++++++++++++ 2 files changed, 40 insertions(+), 1 deletion(-) diff --git 
a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 1d36f89f6e..c2201ffb6a 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -31,6 +31,8 @@ void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2], const float (*filter)[8][2], ptrdiff_t, int n); void ff_ps_hybrid_analysis_ileave_rvv(float (*out)[32][2], float L[2][38][64], int i, int len); +void ff_ps_hybrid_synthesis_deint_rvv(float out[2][38][64], float (*in)[32][2], + int i, int len); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { @@ -43,7 +45,9 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c) c->hybrid_analysis = ff_ps_hybrid_analysis_rvv; } - if (flags & AV_CPU_FLAG_RVV_I32) + if (flags & AV_CPU_FLAG_RVV_I32) { c->hybrid_analysis_ileave = ff_ps_hybrid_analysis_ileave_rvv; + c->hybrid_synthesis_deint = ff_ps_hybrid_synthesis_deint_rvv; + } #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index c9cc15e73d..0cbe4c1d3c 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -184,3 +184,38 @@ func ff_ps_hybrid_analysis_ileave_rvv, zve32x /* no needs for zve32f here */ 3: ret endfunc + +func ff_ps_hybrid_synthesis_deint_rvv, zve32x + slli t1, a2, 5 + 1 + 2 + sh2add a0, a2, a0 + add a1, a1, t1 + addi a2, a2, -64 + li t1, 38 * 64 * 4 + li t6, 64 * 4 + add a4, a0, t1 + beqz a2, 3f +1: + mv t0, a0 + mv t1, a1 + mv t3, a3 + mv t4, a4 + addi a2, a2, 1 +2: + vsetvli t5, t3, e32, m1, ta, ma + vlseg2e32.v v16, (t1) + sub t3, t3, t5 + vsse32.v v16, (t0), t6 + mul t2, t5, t6 + vsse32.v v17, (t4), t6 + sh3add t1, t5, t1 + add t0, t0, t2 + add t4, t4, t2 + bnez t3, 2b + + add a0, a0, 4 + add a1, a1, 32 * 2 * 4 + add a4, a4, 4 + bnez a2, 1b +3: + ret +endfunc From patchwork Mon Sep 26 14:52:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38364 Delivered-To: 
ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2306561pzh; Mon, 26 Sep 2022 07:57:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM48JXpwp4NOQ/7idhJ97TjrWCDXsm2QOxmcz/5QvitSXOMh5pYrYufcSLB1lW7QizZQiJ0n X-Received: by 2002:a17:906:6a25:b0:782:1ce3:7e94 with SMTP id qw37-20020a1709066a2500b007821ce37e94mr18347780ejc.507.1664204249109; Mon, 26 Sep 2022 07:57:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204249; cv=none; d=google.com; s=arc-20160816; b=Cyhw0taUIC1CS7gTBfXMBj4Ll++eXUZaOGe9vh93xpoT5khhKdR08bsFyrI8b62Mxr qqOzkFHWs2h0IP+K97PWLbS9AbcncuXgawXfbAB1R21sPb39N/tB1aYszMgLegnBp9ht BWIwLFIcM/8FraK5bxF19IVU12vvy4T6xqLBwBqdC8AXwE9GqItXV1YJf1nkqGlz3Oql oxbIklciU7C3JnUX3XbEitTirScA7NBi/GbArhJ6cagc16GQcoXm8EbBIX8cBgqx+dzD w2C+yNdj4PCLD3NTO2FgLMT8VTsT5cHIRsz5jWAhbpXkGVBcejPcpX5h/0/18Oi1/ofe bL6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=lfAWaa4gsTnFa7YWly9SXaogWEB+z4giIN/n+ceC1u0=; b=kdmiYjT0BgQUWshPpqHCfKXLdlpLc3sp+aZvPVBcDy2AEiJ2fek28dEULy0mCZ9kHZ Kidyy+cY2ozTVhlgWaNbrV0kRpUaXEHCbjpBigx7t0cUwhgdVQQtCiH9I5FqHl25Iv5z 2L7BaTX4ym/XAzfLNXUxYk81peMXAgbIBrsRJHWFvU6FeyjOIbwr2ByIj8LC0C6RAGpq EjNqnvQwe2y+AS7BIECIxZztwiAUq8hwa0aglMqUcZG624jqW9mnZRXNY+qs9yLjLWe6 C8ckChCioX+DeSXv08NmAU3SGGgh7LG0nf47QjFoHLjKZBx8zKwkSjFeqD0mcgK6QWcR O5qw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id du3-20020a17090772c300b0077bb3c728c5si176011ejc.20.2022.09.26.07.57.28; Mon, 26 Sep 2022 07:57:29 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BE76D68BC01; Mon, 26 Sep 2022 17:53:26 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 925B968BA9B for ; Mon, 26 Sep 2022 17:52:57 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 765B2C00CB for ; Mon, 26 Sep 2022 17:52:56 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:51 +0300 Message-Id: <20220926145251.56351-31-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 31/31] lavc/aacpsdsp: RISC-V V stereo_interpolate[0] X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: UowJCGPjxL4q From: Rémi Denis-Courmont --- libavcodec/riscv/aacpsdsp_init.c | 4 +++ libavcodec/riscv/aacpsdsp_rvv.S | 56 ++++++++++++++++++++++++++++++++ 2 files changed, 60 insertions(+) diff --git 
a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index c2201ffb6a..f42baf4251 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -34,6 +34,9 @@ void ff_ps_hybrid_analysis_ileave_rvv(float (*out)[32][2], float L[2][38][64], void ff_ps_hybrid_synthesis_deint_rvv(float out[2][38][64], float (*in)[32][2], int i, int len); +void ff_ps_stereo_interpolate_rvv(float (*l)[2], float (*r)[2], + float h[2][4], float h_step[2][4], int len); + av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { #if HAVE_RVV @@ -43,6 +46,7 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c) c->add_squares = ff_ps_add_squares_rvv; c->mul_pair_single = ff_ps_mul_pair_single_rvv; c->hybrid_analysis = ff_ps_hybrid_analysis_rvv; + c->stereo_interpolate[0] = ff_ps_stereo_interpolate_rvv; } if (flags & AV_CPU_FLAG_RVV_I32) { diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index 0cbe4c1d3c..1d6e73fd2d 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -219,3 +219,59 @@ func ff_ps_hybrid_synthesis_deint_rvv, zve32x 3: ret endfunc + +func ff_ps_stereo_interpolate_rvv, zve32f + vsetvli t0, zero, e32, m1, ta, ma + vid.v v24 + flw ft0, (a2) + vadd.vi v24, v24, 1 // v24[i] = i + 1 + flw ft1, 4(a2) + vfcvt.f.xu.v v24, v24 + flw ft2, 8(a2) + vfmv.v.f v16, ft0 + flw ft3, 12(a2) + vfmv.v.f v17, ft1 + flw ft0, (a3) + vfmv.v.f v18, ft2 + flw ft1, 4(a3) + vfmv.v.f v19, ft3 + flw ft2, 8(a3) + vfmv.v.f v20, ft0 + flw ft3, 12(a3) + vfmv.v.f v21, ft1 + fcvt.s.wu ft4, t0 // (float)(vlenb / sizeof (float)) + vfmv.v.f v22, ft2 + fmul.s ft0, ft0, ft4 + vfmv.v.f v23, ft3 + fmul.s ft1, ft1, ft4 + vfmacc.vv v16, v24, v20 // h0 += (i + 1) * h0_step + fmul.s ft2, ft2, ft4 + vfmacc.vv v17, v24, v21 + fmul.s ft3, ft3, ft4 + vfmacc.vv v18, v24, v22 + vfmacc.vv v19, v24, v23 +1: + vsetvli t0, a4, e32, m1, ta, ma + vlseg2e32.v v8, (a0) // v8:l_re, v9:l_im + sub a4, a4, t0 + vlseg2e32.v v10, (a1) // v10:r_re, 
v11:r_im + vfmul.vv v12, v8, v16 + vfmul.vv v13, v9, v16 + vfmul.vv v14, v8, v17 + vfmul.vv v15, v9, v17 + vfmacc.vv v12, v10, v18 + vfmacc.vv v13, v11, v18 + vfmacc.vv v14, v10, v19 + vfmacc.vv v15, v11, v19 + vsseg2e32.v v12, (a0) + sh3add a0, t0, a0 + vsseg2e32.v v14, (a1) + sh3add a1, t0, a1 + vfadd.vf v16, v16, ft0 // h0 += (vlenb / sizeof (float)) * h0_step + vfadd.vf v17, v17, ft1 + vfadd.vf v18, v18, ft2 + vfadd.vf v19, v19, ft3 + bnez a4, 1b + + ret +endfunc
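For review purposes, here is a scalar C sketch of what the two vector routines in patches 30 and 31 appear to compute, reconstructed only from the assembly above. The `_ref` names are hypothetical and not part of the series, and the `assert` bounds are assumptions taken from the array sizes in the prototypes. This is a reading aid, not the project's reference implementation.

```c
#include <assert.h>

/* Hypothetical scalar model of ff_ps_hybrid_synthesis_deint_rvv.
 * For each band from i up to 63, split the interleaved (re, im) pairs
 * of in[band][n] into two planes, out[0] (re) and out[1] (im).
 * The asm does this with vlseg2e32.v plus two strided stores
 * (stride 64 * 4 bytes, i.e. one row of out). */
static void hybrid_synthesis_deint_ref(float out[2][38][64],
                                       float (*in)[32][2], int i, int len)
{
    /* Assumed bounds, taken from the array dimensions in the prototype. */
    assert(i >= 0 && i <= 64 && len >= 0 && len <= 32);

    for (; i < 64; i++) {
        for (int n = 0; n < len; n++) {
            out[0][n][i] = in[i][n][0];
            out[1][n][i] = in[i][n][1];
        }
    }
}

/* Hypothetical scalar model of ff_ps_stereo_interpolate_rvv.
 * Only h[0][0..3] and h_step[0][0..3] are read (the four flw from a2
 * and from a3). Each coefficient advances by its step before use. */
static void stereo_interpolate_ref(float (*l)[2], float (*r)[2],
                                   float h[2][4], float h_step[2][4], int len)
{
    float h0 = h[0][0], h1 = h[0][1], h2 = h[0][2], h3 = h[0][3];
    const float s0 = h_step[0][0], s1 = h_step[0][1];
    const float s2 = h_step[0][2], s3 = h_step[0][3];

    assert(len >= 0);

    for (int n = 0; n < len; n++) {
        const float l_re = l[n][0], l_im = l[n][1];
        const float r_re = r[n][0], r_im = r[n][1];

        h0 += s0;
        h1 += s1;
        h2 += s2;
        h3 += s3;

        l[n][0] = h0 * l_re + h2 * r_re; /* v12 */
        l[n][1] = h0 * l_im + h2 * r_im; /* v13 */
        r[n][0] = h1 * l_re + h3 * r_re; /* v14 */
        r[n][1] = h1 * l_im + h3 * r_im; /* v15 */
    }
}
```

One difference to keep in mind when comparing outputs: the sketch accumulates each step by repeated addition, whereas the vector code forms h + (i + 1) * h_step per lane with vfmacc.vv and then adds a precomputed multiple of the step once per loop iteration. The two can therefore differ in the last bits for inexact inputs.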