From patchwork Sun Sep 25 14:25:49 2022
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:25:49 +0300
Message-Id: <20220925142619.67917-1-remi@remlab.net>
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 01/31] lavu/cpu: detect RISC-V base extensions
X-Patchwork-Id: 38261

From: Rémi Denis-Courmont

This introduces compile-time and run-time CPU detection on RISC-V.
In practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of I, F and D extensions, and if it does, it probably won't have run-time detection. So the flags are essentially always set. But as things stand, checkasm wants them that way. Compare the ARMV8 flag on AArch64. We are nowhere near running short on CPU flag bits. --- libavutil/cpu.c | 9 ++++++ libavutil/cpu.h | 5 +++ libavutil/cpu_internal.h | 3 ++ libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 66 +++++++++++++++++++++++++++++++++++++++ tests/checkasm/checkasm.c | 4 +++ 6 files changed, 88 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..78e92a1bf6 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -95,6 +97,9 @@ void av_force_cpu_flags(int arg){ arg |= AV_CPU_FLAG_MMX; } +#if ARCH_RISCV + arg = ff_force_cpu_flags_riscv(arg); +#endif atomic_store_explicit(&cpu_flags, arg, memory_order_relaxed); } @@ -178,6 +183,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV + { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" }, + { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" }, + { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..9aae2ccc7a 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,11 @@ #define AV_CPU_FLAG_LSX (1 << 0) 
#define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V extensions +#define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank) +#define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP) +#define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP) + /** * Return the flags which specify extensions supported by the CPU. * The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..9ddf11488b 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,9 +48,12 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); +int ff_force_cpu_flags_riscv(int flags); + size_t ff_get_cpu_max_align_mips(void); size_t ff_get_cpu_max_align_aarch64(void); size_t ff_get_cpu_max_align_arm(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..fec1f7822a --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,66 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/cpu.h"
+#include "libavutil/cpu_internal.h"
+#include "libavutil/log.h"
+#include "config.h"
+
+#if HAVE_GETAUXVAL
+#include <sys/auxv.h>
+#define HWCAP_RV(letter) (1ul << ((letter) - 'A'))
+#endif
+
+int ff_force_cpu_flags_riscv(int flags)
+{
+    if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) {
+        av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "F");
+        flags |= AV_CPU_FLAG_RVF;
+    }
+
+    return flags;
+}
+
+int ff_get_cpu_flags_riscv(void)
+{
+    int ret = 0;
+#if HAVE_GETAUXVAL
+    const unsigned long hwcap = getauxval(AT_HWCAP);
+
+    if (hwcap & HWCAP_RV('I'))
+        ret |= AV_CPU_FLAG_RVI;
+    if (hwcap & HWCAP_RV('F'))
+        ret |= AV_CPU_FLAG_RVF;
+    if (hwcap & HWCAP_RV('D'))
+        ret |= AV_CPU_FLAG_RVD;
+#endif
+
+#ifdef __riscv_i
+    ret |= AV_CPU_FLAG_RVI;
+#endif
+#if defined (__riscv_flen) && (__riscv_flen >= 32)
+    ret |= AV_CPU_FLAG_RVF;
+#if (__riscv_flen >= 64)
+    ret |= AV_CPU_FLAG_RVD;
+#endif
+#endif
+
+    return ret;
+}
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 8fd9bba0b0..e1135a84ac 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -232,6 +232,10 @@ static const struct {
     { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC },
     { "VSX", "vsx", AV_CPU_FLAG_VSX },
     { "POWER8", "power8", AV_CPU_FLAG_POWER8 },
+#elif ARCH_RISCV
+    { "RVI", "rvi", AV_CPU_FLAG_RVI },
+    { "RVF", "rvf", AV_CPU_FLAG_RVF },
+    { "RVD", "rvd", AV_CPU_FLAG_RVD },
 #elif ARCH_MIPS
     { "MMI", "mmi", AV_CPU_FLAG_MMI },
     { "MSA", "msa", AV_CPU_FLAG_MSA },

From patchwork Sun Sep 25 14:25:50 2022
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:25:50 +0300
Message-Id: <20220925142619.67917-2-remi@remlab.net>
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 02/31] lavu/riscv: initial common header for assembler macros
X-Patchwork-Id: 38263

From: Rémi Denis-Courmont

---
 libavutil/riscv/asm.S | 77 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)
 create mode 100644 libavutil/riscv/asm.S

diff --git
a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S new file mode 100644 index 0000000000..dbd97f40a4 --- /dev/null +++ b/libavutil/riscv/asm.S @@ -0,0 +1,77 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * Loosely based on earlier work copyrighted by Måns Rullgård, 2008. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#if defined (__riscv_float_abi_soft) +#define NOHWF +#define NOHWD +#define HWF # +#define HWD # +#elif defined (__riscv_float_abi_single) +#define NOHWF # +#define NOHWD +#define HWF +#define HWD # +#else +#define NOHWF # +#define NOHWD # +#define HWF +#define HWD +#endif + + .macro func sym, ext= + .text + .align 2 + + .option push + .ifnb \ext + .option arch, +\ext + .endif + + .global \sym + .hidden \sym + .type \sym, %function + \sym: + + .macro endfunc + .size \sym, . - \sym + .option pop + .previous + .purgem endfunc + .endm + .endm + + .macro const sym, align=3, relocate=0 + .if \relocate + .pushsection .data.rel.ro + .else + .pushsection .rodata + .endif + .align \align + \sym: + + .macro endconst + .size \sym, . 
- \sym
+            .popsection
+            .purgem endconst
+        .endm
+        .endm

From patchwork Sun Sep 25 14:25:51 2022
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:25:51 +0300
Message-Id: <20220925142619.67917-3-remi@remlab.net>
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 03/31] lavc/audiodsp: RISC-V F vector_clipf
X-Patchwork-Id: 38262

From: Rémi Denis-Courmont

RV64G supports MIN & MAX instructions natively only on floating point
registers, not general purpose ones. The latter would require the Zbb
extension. Due to that, it is actually faster to perform the clipping
"properly" in FPU.

Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech):
audiodsp.vector_clipf_c: 29551.5
audiodsp.vector_clipf_rvf: 17871.0

Also tried unrolling with 2 or 8 elements but it gets worse either way.
---
 libavcodec/audiodsp.c            |  2 ++
 libavcodec/audiodsp.h            |  1 +
 libavcodec/riscv/Makefile        |  2 ++
 libavcodec/riscv/audiodsp_init.c | 33 +++++++++++++++++++++
 libavcodec/riscv/audiodsp_rvf.S  | 49 ++++++++++++++++++++++++++++++++
 5 files changed, 87 insertions(+)
 create mode 100644 libavcodec/riscv/Makefile
 create mode 100644 libavcodec/riscv/audiodsp_init.c
 create mode 100644 libavcodec/riscv/audiodsp_rvf.S

diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c
index ff43e87dce..eba6e809fd 100644
--- a/libavcodec/audiodsp.c
+++ b/libavcodec/audiodsp.c
@@ -113,6 +113,8 @@ av_cold void ff_audiodsp_init(AudioDSPContext *c)
     ff_audiodsp_init_arm(c);
 #elif ARCH_PPC
     ff_audiodsp_init_ppc(c);
+#elif ARCH_RISCV
+    ff_audiodsp_init_riscv(c);
 #elif ARCH_X86
     ff_audiodsp_init_x86(c);
 #endif
diff --git a/libavcodec/audiodsp.h b/libavcodec/audiodsp.h
index aa6fa7898b..485b512839 100644
--- a/libavcodec/audiodsp.h
+++ b/libavcodec/audiodsp.h
@@ -55,6 +55,7 @@ typedef struct AudioDSPContext {
 void ff_audiodsp_init(AudioDSPContext *c);
 void ff_audiodsp_init_arm(AudioDSPContext *c);
 void ff_audiodsp_init_ppc(AudioDSPContext *c);
+void ff_audiodsp_init_riscv(AudioDSPContext *c);
 void ff_audiodsp_init_x86(AudioDSPContext *c);
 
 #endif /* AVCODEC_AUDIODSP_H */
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
new file mode 100644
index 0000000000..414a9e9bd8
--- /dev/null
+++ b/libavcodec/riscv/Makefile
@@ -0,0 +1,2 @@
+OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
+                           riscv/audiodsp_rvf.o
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
new file mode 100644
index
0000000000..c5842815d6 --- /dev/null +++ b/libavcodec/riscv/audiodsp_init.c @@ -0,0 +1,33 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/audiodsp.h" + +void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); + +av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RVF) + c->vector_clipf = ff_vector_clipf_rvf; +} diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S new file mode 100644 index 0000000000..2ec8a11691 --- /dev/null +++ b/libavcodec/riscv/audiodsp_rvf.S @@ -0,0 +1,49 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_vector_clipf_rvf, f
+NOHWF   fmv.w.x fa0, a3
+NOHWF   fmv.w.x fa1, a4
+1:
+        flw     ft0, (a1)
+        flw     ft1, 4(a1)
+        fmax.s  ft0, ft0, fa0
+        flw     ft2, 8(a1)
+        fmax.s  ft1, ft1, fa0
+        flw     ft3, 12(a1)
+        fmax.s  ft2, ft2, fa0
+        addi    a2, a2, -4
+        fmax.s  ft3, ft3, fa0
+        addi    a1, a1, 16
+        fmin.s  ft0, ft0, fa1
+        fmin.s  ft1, ft1, fa1
+        fsw     ft0, (a0)
+        fmin.s  ft2, ft2, fa1
+        fsw     ft1, 4(a0)
+        fmin.s  ft3, ft3, fa1
+        fsw     ft2, 8(a0)
+        fsw     ft3, 12(a0)
+        addi    a0, a0, 16
+        bnez    a2, 1b
+
+        ret
+endfunc

From patchwork Sun Sep 25 14:25:52 2022
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:25:52 +0300
Message-Id: <20220925142619.67917-4-remi@remlab.net>
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 04/31] lavc/pixblockdsp: RISC-V I get_pixels
X-Patchwork-Id: 38264

From: Rémi Denis-Courmont

Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech):
get_pixels_c: 180.0
get_pixels_rvi: 136.7
---
 libavcodec/pixblockdsp.c | 2 +
 libavcodec/pixblockdsp.h | 2 +
libavcodec/riscv/Makefile | 2 + libavcodec/riscv/pixblockdsp_init.c | 45 ++++++++++++++++++++++ libavcodec/riscv/pixblockdsp_rvi.S | 59 +++++++++++++++++++++++++++++ 5 files changed, 110 insertions(+) create mode 100644 libavcodec/riscv/pixblockdsp_init.c create mode 100644 libavcodec/riscv/pixblockdsp_rvi.S diff --git a/libavcodec/pixblockdsp.c b/libavcodec/pixblockdsp.c index 17c487da1e..4294075cee 100644 --- a/libavcodec/pixblockdsp.c +++ b/libavcodec/pixblockdsp.c @@ -109,6 +109,8 @@ av_cold void ff_pixblockdsp_init(PixblockDSPContext *c, AVCodecContext *avctx) ff_pixblockdsp_init_arm(c, avctx, high_bit_depth); #elif ARCH_PPC ff_pixblockdsp_init_ppc(c, avctx, high_bit_depth); +#elif ARCH_RISCV + ff_pixblockdsp_init_riscv(c, avctx, high_bit_depth); #elif ARCH_X86 ff_pixblockdsp_init_x86(c, avctx, high_bit_depth); #elif ARCH_MIPS diff --git a/libavcodec/pixblockdsp.h b/libavcodec/pixblockdsp.h index 07c2ec4f40..9b002aa3d6 100644 --- a/libavcodec/pixblockdsp.h +++ b/libavcodec/pixblockdsp.h @@ -52,6 +52,8 @@ void ff_pixblockdsp_init_arm(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); void ff_pixblockdsp_init_ppc(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); +void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, AVCodecContext *avctx, + unsigned high_bit_depth); void ff_pixblockdsp_init_x86(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); void ff_pixblockdsp_init_mips(PixblockDSPContext *c, AVCodecContext *avctx, diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 414a9e9bd8..da07f1fe96 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,2 +1,4 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o +OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ + riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c new file mode 100644 index 0000000000..04bf52649f --- /dev/null 
+++ b/libavcodec/riscv/pixblockdsp_init.c @@ -0,0 +1,45 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/avcodec.h" +#include "libavcodec/pixblockdsp.h" + +void ff_get_pixels_8_rvi(int16_t *block, const uint8_t *pixels, + ptrdiff_t stride); +void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels, + ptrdiff_t stride); + +av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, + AVCodecContext *avctx, + unsigned high_bit_depth) +{ + int cpu_flags = av_get_cpu_flags(); + + if (cpu_flags & AV_CPU_FLAG_RVI) { + if (high_bit_depth) + c->get_pixels = ff_get_pixels_16_rvi; + else + c->get_pixels = ff_get_pixels_8_rvi; + } +} diff --git a/libavcodec/riscv/pixblockdsp_rvi.S b/libavcodec/riscv/pixblockdsp_rvi.S new file mode 100644 index 0000000000..93ece4405e --- /dev/null +++ b/libavcodec/riscv/pixblockdsp_rvi.S @@ -0,0 +1,59 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_get_pixels_8_rvi +.irp row, 0, 1, 2, 3, 4, 5, 6, 7 + ld t0, (a1) + add a1, a1, a2 + sd zero, ((\row * 16) + 0)(a0) + addi t6, t6, -1 + sd zero, ((\row * 16) + 8)(a0) + srli t1, t0, 8 + sb t0, ((\row * 16) + 0)(a0) + srli t2, t0, 16 + sb t1, ((\row * 16) + 2)(a0) + srli t3, t0, 24 + sb t2, ((\row * 16) + 4)(a0) + srli t4, t0, 32 + sb t3, ((\row * 16) + 6)(a0) + srli t1, t0, 40 + sb t4, ((\row * 16) + 8)(a0) + srli t2, t0, 48 + sb t1, ((\row * 16) + 10)(a0) + srli t3, t0, 56 + sb t2, ((\row * 16) + 12)(a0) + sb t3, ((\row * 16) + 14)(a0) +.endr + ret +endfunc + +func ff_get_pixels_16_rvi +.irp row, 0, 1, 2, 3, 4, 5, 6, 7 + ld t0, 0(a1) + ld t1, 8(a1) + add a1, a1, a2 + sd t0, ((\row * 16) + 0)(a0) + sd t1, ((\row * 16) + 8)(a0) +.endr + ret +endfunc
From patchwork Sun Sep 25 14:25:53 2022 X-Patchwork-Id: 38265 From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:25:53 +0300 Message-Id: <20220925142619.67917-5-remi@remlab.net> In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> Subject: [FFmpeg-devel] [PATCH 05/31] lavu/cpu: CPU flags for the RISC-V Vector extension From: Rémi Denis-Courmont RVV defines a total of 12 different extensions, including: - 5 different instruction subsets: - Zve32x: 8-, 16- and 32-bit integers, - Zve32f: Zve32x plus single precision
floats, - Zve64x: Zve32x plus 64-bit integers, - Zve64f: Zve32f plus Zve64x, - Zve64d: Zve64f plus double precision floats. - 6 different vector lengths: - Zvl32b (embedded only), - Zvl64b (embedded only), - Zvl128b, - Zvl256b, - Zvl512b, - Zvl1024b, - and the V extension proper: equivalent to Zve64d and Zvl128b. In total, there are 6 different possible sets of supported instructions (including the empty set), but for convenience we allocate one bit for each type set: up-to-32-bit ints (ZVE32X), floats (ZVE32F), 64-bit ints (ZVE64X) and doubles (ZVE64D). When the vector size is needed, it can be retrieved by reading the unprivileged read-only vlenb CSR. This should probably be a separate helper macro if needed at a later point. --- libavutil/cpu.c | 4 ++++ libavutil/cpu.h | 4 ++++ libavutil/riscv/cpu.c | 46 ++++++++++++++++++++++++++++++++++++++- tests/checkasm/checkasm.c | 10 ++++++--- 4 files changed, 60 insertions(+), 4 deletions(-) diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 78e92a1bf6..58ae4858b4 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -187,6 +187,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" }, { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" }, { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" }, + { "rvve32", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE32X}, .unit = "flags" }, + { "rvvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE32F}, .unit = "flags" }, + { "rvve64", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE64X}, .unit = "flags" }, + { "rvv", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE64D}, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9aae2ccc7a..00698e30ef 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -82,6 +82,10 @@ #define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank)
#define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP) #define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP) +#define AV_CPU_FLAG_RV_ZVE32X (1 << 3) ///< Vectors of 8/16/32-bit int's +#define AV_CPU_FLAG_RV_ZVE32F (1 << 4) ///< Vectors of float's +#define AV_CPU_FLAG_RV_ZVE64X (1 << 5) ///< Vectors of 64-bit int's +#define AV_CPU_FLAG_RV_ZVE64D (1 << 6) ///< Vectors of double's /** * Return the flags which specify extensions supported by the CPU. diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c index fec1f7822a..6f862635b3 100644 --- a/libavutil/riscv/cpu.c +++ b/libavutil/riscv/cpu.c @@ -30,7 +30,32 @@ int ff_force_cpu_flags_riscv(int flags) { - if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) { + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RV_ZVE64X)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE64X"); + flags |= AV_CPU_FLAG_RV_ZVE64X; + } + + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RV_ZVE32F)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE32F"); + flags |= AV_CPU_FLAG_RV_ZVE32F; + } + + if ((flags & (AV_CPU_FLAG_RV_ZVE64X | AV_CPU_FLAG_RV_ZVE32F)) + && !(flags & AV_CPU_FLAG_RV_ZVE32X)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE32X"); + flags |= AV_CPU_FLAG_RV_ZVE32X; + } + + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RVD)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "D"); + flags |= AV_CPU_FLAG_RVD; + } + + if ((flags & (AV_CPU_FLAG_RVD | AV_CPU_FLAG_RV_ZVE32F)) + && !(flags & AV_CPU_FLAG_RVF)) { av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "F"); flags |= AV_CPU_FLAG_RVF; } @@ -50,6 +75,11 @@ int ff_get_cpu_flags_riscv(void) ret |= AV_CPU_FLAG_RVF; if (hwcap & HWCAP_RV('D')) ret |= AV_CPU_FLAG_RVD; + + /* The V extension implies all Zve* functional subsets */ + if (hwcap & HWCAP_RV('V')) + ret |= AV_CPU_FLAG_RV_ZVE32X |
AV_CPU_FLAG_RV_ZVE64X + | AV_CPU_FLAG_RV_ZVE32F | AV_CPU_FLAG_RV_ZVE64D; #endif #ifdef __riscv_i @@ -60,6 +90,20 @@ int ff_get_cpu_flags_riscv(void) #if (__riscv_flen >= 64) ret |= AV_CPU_FLAG_RVD; #endif +#endif + + /* If RV-V is enabled statically at compile-time, check the details. */ +#ifdef __riscv_vectors + ret |= AV_CPU_FLAG_RV_ZVE32X; +#if __riscv_v_elen >= 64 + ret |= AV_CPU_FLAG_RV_ZVE64X; +#endif +#if __riscv_v_elen_fp >= 32 + ret |= AV_CPU_FLAG_RV_ZVE32F; +#if __riscv_v_elen_fp >= 64 + ret |= AV_CPU_FLAG_RV_ZVE64D; +#endif +#endif #endif return ret; diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index e1135a84ac..f7d108e8ea 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -233,9 +233,13 @@ static const struct { { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, #elif ARCH_RISCV - { "RVI", "rvi", AV_CPU_FLAG_RVI }, - { "RVF", "rvf", AV_CPU_FLAG_RVF }, - { "RVD", "rvd", AV_CPU_FLAG_RVD }, + { "RVI", "rvi", AV_CPU_FLAG_RVI }, + { "RVF", "rvf", AV_CPU_FLAG_RVF }, + { "RVD", "rvd", AV_CPU_FLAG_RVD }, + { "RV_Zve32x", "rv_zve32x", AV_CPU_FLAG_RV_ZVE32X }, + { "RV_Zve32f", "rv_zve32f", AV_CPU_FLAG_RV_ZVE32F }, + { "RV_Zve64x", "rv_zve64x", AV_CPU_FLAG_RV_ZVE64X }, + { "RV_Zve64d", "rv_zve64d", AV_CPU_FLAG_RV_ZVE64D }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA },
From patchwork Sun Sep 25 14:25:54 2022 X-Patchwork-Id: 38273 From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:25:54 +0300 Message-Id: <20220925142619.67917-6-remi@remlab.net> In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> Subject: [FFmpeg-devel] [PATCH 06/31] configure: probe RISC-V Vector extension From: Rémi Denis-Courmont --- Makefile | 2 +- configure | 15 +++++++++++++++ ffbuild/arch.mak | 2 ++ 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 61f79e27ae..1fb742f390
100644 --- a/Makefile +++ b/Makefile @@ -91,7 +91,7 @@ ffbuild/.config: $(CONFIGURABLE_COMPONENTS) SUBDIR_VARS := CLEANFILES FFLIBS HOSTPROGS TESTPROGS TOOLS \ HEADERS ARCH_HEADERS BUILT_HEADERS SKIPHEADERS \ ARMV5TE-OBJS ARMV6-OBJS ARMV8-OBJS VFP-OBJS NEON-OBJS \ - ALTIVEC-OBJS VSX-OBJS MMX-OBJS X86ASM-OBJS \ + ALTIVEC-OBJS VSX-OBJS RVV-OBJS MMX-OBJS X86ASM-OBJS \ MIPSFPU-OBJS MIPSDSPR2-OBJS MIPSDSP-OBJS MSA-OBJS \ MMI-OBJS LSX-OBJS LASX-OBJS OBJS SLIBOBJS SHLIBOBJS \ STLIBOBJS HOSTOBJS TESTOBJS diff --git a/configure b/configure index c157338b1f..a41ebda6d4 100755 --- a/configure +++ b/configure @@ -462,6 +462,7 @@ Optimization options (experts only): --disable-mmi disable Loongson MMI optimizations --disable-lsx disable Loongson LSX optimizations --disable-lasx disable Loongson LASX optimizations + --disable-rvv disable RISC-V Vector optimizations --disable-fast-unaligned consider unaligned accesses slow Developer options (useful when working on FFmpeg itself): @@ -2126,6 +2127,10 @@ ARCH_EXT_LIST_PPC=" vsx " +ARCH_EXT_LIST_RISCV=" + rvv +" + ARCH_EXT_LIST_X86=" $ARCH_EXT_LIST_X86_SIMD cpunop @@ -2135,6 +2140,7 @@ ARCH_EXT_LIST_X86=" ARCH_EXT_LIST=" $ARCH_EXT_LIST_ARM $ARCH_EXT_LIST_PPC + $ARCH_EXT_LIST_RISCV $ARCH_EXT_LIST_X86 $ARCH_EXT_LIST_MIPS $ARCH_EXT_LIST_LOONGSON @@ -2642,6 +2648,8 @@ ppc4xx_deps="ppc" vsx_deps="altivec" power8_deps="vsx" +rvv_deps="riscv" + loongson2_deps="mips" loongson3_deps="mips" mmi_deps_any="loongson2 loongson3" @@ -6110,6 +6118,10 @@ elif enabled ppc; then check_cpp_condition power8 "altivec.h" "defined(_ARCH_PWR8)" fi +elif enabled riscv; then + + enabled rvv && check_inline_asm rvv '".option arch, +v\nvsetivli zero, 0, e8, m1, ta, ma"' + elif enabled x86; then check_builtin rdtsc intrin.h "__rdtsc()" @@ -7596,6 +7608,9 @@ if enabled loongarch; then echo "LSX enabled ${lsx-no}" echo "LASX enabled ${lasx-no}" fi +if enabled riscv; then + echo "RISC-V Vector enabled ${rvv-no}" +fi echo "debug symbols ${debug-no}" echo "strip 
symbols ${stripping-no}" echo "optimize for size ${small-no}" diff --git a/ffbuild/arch.mak b/ffbuild/arch.mak index 997e31e85e..39d76ee152 100644 --- a/ffbuild/arch.mak +++ b/ffbuild/arch.mak @@ -15,5 +15,7 @@ OBJS-$(HAVE_LASX) += $(LASX-OBJS) $(LASX-OBJS-yes) OBJS-$(HAVE_ALTIVEC) += $(ALTIVEC-OBJS) $(ALTIVEC-OBJS-yes) OBJS-$(HAVE_VSX) += $(VSX-OBJS) $(VSX-OBJS-yes) +OBJS-$(HAVE_RVV) += $(RVV-OBJS) $(RVV-OBJS-yes) + OBJS-$(HAVE_MMX) += $(MMX-OBJS) $(MMX-OBJS-yes) OBJS-$(HAVE_X86ASM) += $(X86ASM-OBJS) $(X86ASM-OBJS-yes)
From patchwork Sun Sep 25 14:25:55 2022 X-Patchwork-Id: 38268 From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:25:55 +0300 Message-Id: <20220925142619.67917-7-remi@remlab.net> In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> Subject: [FFmpeg-devel] [PATCH 07/31] lavu/riscv: fallback macros for SH{1, 2, 3}ADD From: Rémi Denis-Courmont Those mnemonics require the very latest binutils release at the time of writing. These macros provide seamless backward compatibility. --- libavutil/riscv/asm.S | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S index dbd97f40a4..de5e1ad0a6 100644 --- a/libavutil/riscv/asm.S +++ b/libavutil/riscv/asm.S @@ -75,3 +75,22 @@ .purgem endconst .endm .endm + +#if !defined (__riscv_zba) + /* SH{1,2,3}ADD definitions for pre-Zba assemblers */ + .macro shnadd n, rd, rs1, rs2 + .insn r OP, 2 * \n, 16, \rd, \rs1, \rs2 + .endm + + .macro sh1add rd, rs1, rs2 + shnadd 1, \rd, \rs1, \rs2 + .endm + + .macro sh2add rd, rs1, rs2 + shnadd 2, \rd, \rs1, \rs2 + .endm + + .macro sh3add rd, rs1, rs2 + shnadd 3, \rd, \rs1, \rs2 + .endm +#endif
From patchwork Sun Sep 25 14:25:56 2022 X-Patchwork-Id: 38271 From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:25:56 +0300 Message-Id: <20220925142619.67917-8-remi@remlab.net> In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> Subject: [FFmpeg-devel] [PATCH 08/31] lavu/floatdsp: RISC-V V vector_fmul_scalar From: Rémi Denis-Courmont This is based on existing code from the VLC git tree with two minor changes to account for the different function prototypes.
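For readers less familiar with RVV strip-mining, the semantics of the vsetvli loop in float_dsp_rvv.S can be sketched in portable C. This is only an illustrative model, not the code from this patch or from VLC: sketch_vector_fmul_scalar() and SKETCH_VLMAX are hypothetical names, and the value 4 merely stands in for whatever vector length the hardware grants on each iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the number of 32-bit elements that one
 * vsetvli may grant; real hardware decides this at run time. */
#define SKETCH_VLMAX 4

/* Scalar model of dst[i] = src[i] * mul for i in [0, len).
 * Assumption: len is non-negative, as in the FFmpeg prototype's usage. */
static void sketch_vector_fmul_scalar(float *dst, const float *src,
                                      float mul, int len)
{
    assert(len >= 0);

    while (len > 0) {
        /* vsetvli: take at most SKETCH_VLMAX elements this round */
        int vl = len < SKETCH_VLMAX ? len : SKETCH_VLMAX;

        /* vle32.v / vfmul.vf / vse32.v on vl elements */
        for (int i = 0; i < vl; i++)
            dst[i] = src[i] * mul;

        len -= vl; /* sub    a2, a2, t0 */
        src += vl; /* sh2add a1, t0, a1 (advance by vl * 4 bytes) */
        dst += vl; /* sh2add a0, t0, a0 */
    }
}
```

With len = 6 and SKETCH_VLMAX = 4, the loop runs twice, handling 4 and then 2 elements, which mirrors how vsetvli shrinks the final iteration instead of requiring a separate scalar tail loop.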
--- libavutil/float_dsp.c | 2 ++ libavutil/float_dsp.h | 1 + libavutil/riscv/Makefile | 4 +++- libavutil/riscv/float_dsp_init.c | 39 ++++++++++++++++++++++++++++++++ libavutil/riscv/float_dsp_rvv.S | 39 ++++++++++++++++++++++++++++++++ 5 files changed, 84 insertions(+), 1 deletion(-) create mode 100644 libavutil/riscv/float_dsp_init.c create mode 100644 libavutil/riscv/float_dsp_rvv.S diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c index 8676c8b0f8..742dd679d2 100644 --- a/libavutil/float_dsp.c +++ b/libavutil/float_dsp.c @@ -156,6 +156,8 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact) ff_float_dsp_init_arm(fdsp); #elif ARCH_PPC ff_float_dsp_init_ppc(fdsp, bit_exact); +#elif ARCH_RISCV + ff_float_dsp_init_riscv(fdsp); #elif ARCH_X86 ff_float_dsp_init_x86(fdsp); #elif ARCH_MIPS diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h index 9c664592bd..7cad9fc622 100644 --- a/libavutil/float_dsp.h +++ b/libavutil/float_dsp.h @@ -205,6 +205,7 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len); void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp); void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp); void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict); +void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp); void ff_float_dsp_init_x86(AVFloatDSPContext *fdsp); void ff_float_dsp_init_mips(AVFloatDSPContext *fdsp); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile index 1f818043dc..89a8d0d990 100644 --- a/libavutil/riscv/Makefile +++ b/libavutil/riscv/Makefile @@ -1 +1,3 @@ -OBJS += riscv/cpu.o +OBJS += riscv/float_dsp_init.o \ + riscv/cpu.o +RVV-OBJS += riscv/float_dsp_rvv.o diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c new file mode 100644 index 0000000000..de567c50d2 --- /dev/null +++ b/libavutil/riscv/float_dsp_init.c @@ -0,0 +1,39 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavutil/float_dsp.h" + +void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, + int len); + +av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; +#endif +} diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S new file mode 100644 index 0000000000..50cb1fa90f --- /dev/null +++ b/libavutil/riscv/float_dsp_rvv.S @@ -0,0 +1,39 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "asm.S" + +// (a0) = (a1) * fa0 [0..a2-1] +func ff_vector_fmul_scalar_rvv, zve32f +NOHWF fmv.w.x fa0, a2 +NOHWF mv a2, a3 +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v16, (a1) + sub a2, a2, t0 + vfmul.vf v16, v16, fa0 + sh2add a1, t0, a1 + vse32.v v16, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc From patchwork Sun Sep 25 14:25:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38267 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1706347pzh; Sun, 25 Sep 2022 07:27:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM69cAINX7FHChJ3cx7tDWxDJcCjNhmabCslFisKwBn8UwRTmKaEOTTkWN7IF4n9R+wT4FLe X-Received: by 2002:a17:907:1b24:b0:76d:7b9d:2f8b with SMTP id mp36-20020a1709071b2400b0076d7b9d2f8bmr14222638ejc.414.1664116048931; Sun, 25 Sep 2022 07:27:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116048; cv=none; d=google.com; s=arc-20160816; b=obtIlAwg4rvo2BaWAUrmV/PwCkEGu4rUPe7tDdBqw11UzKIvSkodFH33J0/1Sar65K xK5SQFu5KZVQR2J4Qgf1O144gaSQhJEc11Ht6Qw9WjWxQ/IpoAAO2BR2AmlBSwln81xm 0OAzWait+UKG3yHrsHmXBQp/rbdM4bO7MSqQ4h5CFBf70fv+iX5Fs3IdOnJe3HjmyITN ICOnmoMoQ7SjybcdD+G77KyjQxcGbvOpGBXcHVY+FY7ln3cBhZyJKMwEc7c9nPON+iZP LeF0ThmqO4Ua5C50vBq4Djt53SQyYVGDqXx+P/NwkntARDODRqGbF19HZmmPWrcl2CN3 hcBA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; 
bh=3TOJzYqY/2xLgYr/2+Nvr/1OIaXaw83RMX3AAJsrl0M=; b=WBQPREYNfyAPeWQrPQZFgWrxg6d8HhuF6vZO2ttLriiUhD97Z+iF4MauXUh8jhiQSA qjZ5CO1kM1uY8dv9mfjSfu7KJhCsrydn/U/55LrC3mkZJPiU/Sw4cGdd4OkCady5w267 kiWw7FPqnGtBezy/P8Q9zkemNHeT2yv1gB2nD6JoTZvx6zb0E+f0RWtlHekZbplLukzw Dbym7T4iocAUEXkrfIL+LBO8YnfSDWpxz1HBkMvvet4dV0phlUItRoSe+AHBTxh46TrQ UkggAapbrvCmIc11WGvd+QnY+4GKzOHkfr5ORQd/7twTAMeCIs1jkAZadiVz9kM26bTk wFWg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id p26-20020a17090628da00b0073d8659db5csi10490230ejd.966.2022.09.25.07.27.27; Sun, 25 Sep 2022 07:27:28 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 085C068BBC7; Sun, 25 Sep 2022 17:26:33 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F17E568BB14 for ; Sun, 25 Sep 2022 17:26:25 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 2B360C00B5 for ; Sun, 25 Sep 2022 17:26:21 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:25:57 +0300 Message-Id: <20220925142619.67917-9-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: 
<5861881.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 09/31] lavu/floatdsp: RISC-V V vector_dmul_scalar
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: Qupltu+Dk1im

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  9 ++++++++-
 libavutil/riscv/float_dsp_rvv.S  | 17 +++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index de567c50d2..b829c0f736 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -28,12 +28,19 @@
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
+void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
+                                int len);
+
 av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 {
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+    if (flags & AV_CPU_FLAG_RV_ZVE32F) {
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+
+        if (flags & AV_CPU_FLAG_RV_ZVE64D)
+            fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+    }
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 50cb1fa90f..17dda471b4 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -37,3 +37,20 @@ NOHWF   mv        a2, a3
 
         ret
 endfunc
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_dmul_scalar_rvv, zve64d
+NOHWD   fmv.d.x   fa0, a2
+NOHWD   mv        a2, a3
+1:
+        vsetvli   t0, a2, e64, m1, ta, ma
+        vle64.v   v16, (a1)
+        sub       a2, a2, t0
+        vfmul.vf  v16, v16, fa0
+        sh3add    a1, t0, a1
+        vse64.v   v16, (a0)
+        sh3add    a0, t0, a0
+        bnez      a2, 1b
+
+        ret
+endfunc

From patchwork Sun Sep 25 14:25:58 2022
Content-Type:
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38272 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1706689pzh; Sun, 25 Sep 2022 07:28:13 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7GmhUxRRLLo5FE2gKrRQijV16amsaaVJXXf8Hzt0Q7StL7w8T2LQw2G6zLUXa9slC7ebsH X-Received: by 2002:a17:907:2d89:b0:781:eff0:9999 with SMTP id gt9-20020a1709072d8900b00781eff09999mr14672538ejc.194.1664116093341; Sun, 25 Sep 2022 07:28:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116093; cv=none; d=google.com; s=arc-20160816; b=ZvvTajFr0JzTCUWf4ABLdtRgOV0rdv6RAECNz73UiSNie3wa9TLf9MzhObf9SQdu2C 7KHWzfJrZdEilrLHHiAOrTUH9I0Irk5qqvZhrHvMTumUVM7DR/JWxpaMMtWM7HfynKNQ jyviizTXUgo+LXKf4d8DbILtshfKc3Ahlj0lD111C6pdKxPWMRy0uRlfiJ57DNASjx26 E3vPiljDbEN/faMPsqlh4nAxcdZuIDp6gPpzLpryx0Ocaa9pjM0TGWHyq2bWbD685Bzo Zif8P8Sv/ppGD7AHR9bCkMwfU05VCz/B+Us4r9Q5aKZkzUrQblYhLb8vc99WL9yBdmD3 ZwUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=KKr964wm+d6buTg7enxxCbp1wlh2VWZS6AwyBd644No=; b=BDuLIAb5ltVaAYNMwn3PSozW33z0AF+pG96Q7jlZz/ooKUmwMVV13qj/df2HBYpWJ9 nsWYGH1F4vVj7PiqFN9x8ZGu7P4gAJcQXKTLXz1eMVd54sFcAfc7ASU1bW/jPaADgXoi 5y+IejhSJ91F72OVyUZRKH1TKGfBsO7rERPIwj8Z1J+CWX9WrQm3PkhbS0hEZsg68LF5 RE9jHQuM/mtsmmgET0MnLsY25aVAkdqylvRcP1VkQcnW/3YpZ91M3egLqNdlZTUV+kxF yNBgsXteYokl3YX1ceL2lSISiBKVyradswgCBE/h+tOD8QlxSoUcip0PoqE1cEYhVI/j LV0A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu 
(ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id x4-20020a1709065ac400b00781d984d289si11105588ejs.495.2022.09.25.07.28.12; Sun, 25 Sep 2022 07:28:13 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 436E968BB6F; Sun, 25 Sep 2022 17:26:37 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 04FF168BB17 for ; Sun, 25 Sep 2022 17:26:25 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 54AA2C00B6 for ; Sun, 25 Sep 2022 17:26:21 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:25:58 +0300
Message-Id: <20220925142619.67917-10-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 10/31] lavu/floatdsp: RISC-V V vector_fmul
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: v/+wE5t6HkQQ

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 17 +++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index b829c0f736..60b79bd59e 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -25,6 +25,8 @@
 #include "libavutil/cpu.h"
 #include "libavutil/float_dsp.h"
 
+void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
+                         int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
@@ -37,6 +39,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
     int flags = av_get_cpu_flags();
 
     if (flags & AV_CPU_FLAG_RV_ZVE32F) {
+        fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D)
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 17dda471b4..00fb7354bb 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -21,6 +21,23 @@
 #include "config.h"
 #include "asm.S"
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_fmul_rvv, zve32f
+1:
+        vsetvli   t0, a3, e32, m1, ta, ma
+        vle32.v   v16, (a1)
+        sub       a3, a3, t0
+        vle32.v   v24, (a2)
+        sh2add    a1, t0, a1
+        vfmul.vv  v16, v16, v24
+        sh2add    a2, t0, a2
+        vse32.v   v16, (a0)
+        sh2add    a0, t0, a0
+        bnez      a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x   fa0, a2

From patchwork Sun Sep 25 14:25:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38269
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1706496pzh; Sun, 25 Sep 2022 07:27:46 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM6nh3OAGOshDshYrWMO3VOR8wJso06VVPTUjt2VBz1dZiV1S9fLMc6o+VBawA5iYVwVimGs
X-Received: by 2002:aa7:db12:0:b0:457:2973:7e24 with SMTP id t18-20020aa7db12000000b0045729737e24mr3312584eds.264.1664116065925; Sun, 25 Sep 2022 07:27:45 -0700 (PDT)
ARC-Seal:
i=1; a=rsa-sha256; t=1664116065; cv=none; d=google.com; s=arc-20160816; b=0rq9MYUtvv/sm/QcCkGJRTM8ppZBm/WJsgNRzN8VxQu77bWRRv6Dno/K90+rPw4vSA qZkAyWtLC8nEvHxfoGqPACWoYWPwp7mQkNVXJxvSZ9eUpuxgOuoVs5ZWNNLWTm2iGQaT XoflA10MRIINaWPDVPvhHIpBStEERV6WaYkHQIm5r5YdMZG57dFVWc3dQl+ACZKexvyP 0BTbDjg4Lr+4HYeucBEHtAYtgzvERRtY94+jXLVyPPzEdGThDmzrAhiNRaolu/svr9wN f7MLxcurwjn8UCDZGVM+1LM0+gHB2YLMoQkNUeiNnuwlxhshgdw40zWIuC1x9NlKlKmc w9GA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=venbTIas6+ijQUaqaNouAjBeT+ebml92+IyOAaUxoNk=; b=cLYKYh/0ynsYbDbKIT4+eB/7X5Vql59NJk51fhHZCCmOmBq/akkLIs6YruvDT+NMG6 sGQicxUR6Vdrq0Owp82k8Jw7e+xfiwovWsFYkZ4ebBrsEgZbmIH2fVzQCJ+ch3eB43DW kRUazqzeDlmkLZqu1/j4RYsnmdnt755Her8PEm8rne84KUN1xfE7k8Q0e+SXX8FlU0Bq 6Miuwj9coFWkZbCREPdMoEviQGVHLAYLxSzkoL5KWnY+dFoOjD6ff/sfpbN7sTRkqdlH Icb1LmYOw9aU6CjmoHHuJ/32ap8gId1hMyOcJOpjSQ47MvQTZsDZ/bqgT9W9cr/TxWi0 gxXw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id b17-20020a056402279100b0045181429f39si15519193ede.55.2022.09.25.07.27.45; Sun, 25 Sep 2022 07:27:45 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F2C5868BBCD; Sun, 25 Sep 2022 17:26:34 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F158768BB10 for ; Sun, 25 Sep 2022 17:26:25 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 7E211C00B7 for ; Sun, 25 Sep 2022 17:26:21 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:25:59 +0300
Message-Id: <20220925142619.67917-11-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 11/31] lavu/floatdsp: RISC-V V vector_dmul
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 43kAQ4rB0M9H

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  6 +++++-
 libavutil/riscv/float_dsp_rvv.S  | 17 +++++++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 60b79bd59e..6027a67b46 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -30,6 +30,8 @@ void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
+void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
+                         int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -42,8 +44,10 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
-        if (flags & AV_CPU_FLAG_RV_ZVE64D)
+        if (flags & AV_CPU_FLAG_RV_ZVE64D) {
+            fdsp->vector_dmul = ff_vector_dmul_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+        }
     }
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 00fb7354bb..710e122444 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -55,6 +55,23 @@ NOHWF   mv        a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_dmul_rvv, zve64d
+1:
+        vsetvli   t0, a3, e64, m1, ta, ma
+        vle64.v   v16, (a1)
+        sub       a3, a3, t0
+        vle64.v   v24, (a2)
+        sh3add    a1, t0, a1
+        vfmul.vv  v16, v16, v24
+        sh3add    a2, t0, a2
+        vse64.v   v16, (a0)
+        sh3add    a0, t0, a0
+        bnez      a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x   fa0, a2

From patchwork Sun Sep 25 14:26:00 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38266
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1706284pzh; Sun, 25 Sep 2022 07:27:19 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM775jFmBDtJRTUF6oLyaWcX3tskpUW3/J2BQ6r2QjWY9t4Y+AzTcrj4j0zWFN0i6DBYCOtb
X-Received: by 2002:a17:907:3f0b:b0:781:e783:2773 with SMTP id hq11-20020a1709073f0b00b00781e7832773mr14129390ejc.610.1664116039157; Sun, 25 Sep 2022 07:27:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116039; cv=none; d=google.com; s=arc-20160816; b=0lVn6uYOohE5lRESYMwlgnB0CvaLtAugD4W6uGCin/oUANqQfA5NIdUkT0QxPTAfg1 QCIeE8Lf36Hhu+IGxZKE4gJsUhQQj0Da7zb7fCK/go6P/MyJgbTXeOuOhBHNlglGWS98 xt4f4SqRQ0CFMQMyuT3gjV1VKgncfnOr+Eij8NFOLQw0lV6kNy+frGPwXplTBPycO9by bEaLh1zjj9rdkiDNvm6V2k5nd8yj9b5xdWIWwrU/Fp7nxrTkvFQvMpMBEOspc1Pm9XZE pZXtAdqh3LvHsgXP5IShJLf52C8Q+15JJs3Oz3RGQLIhym4VMFFRDvSW4UasRJrsLer4 jriA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=Jg6RIVjwnq8tBmCa8grdnVF4nut4HXo07yMC8w9zvcc=; b=R+XbY5pQu8YslXHHXSFsXgDuix5kfQcZqouhX1FaeXc+BaSwIAceBEC0OYY9N39QOO +KMl00QMegGzuOUBSrzs1yxcyYYtiy9dKeyLQB0PWwuaAWcjzQKuLcv8b8kA+Ak636kW VanF+eQB89RVRK4ZNXgIq4GSrT1bLJcv0J27AbVzjYsP7e0hA9mt+rfD+vG2Y2aTijLX E9C0z8t0DFalybcOF2AELJXePig+XTAaWVRjp4gzq2R6pAImxxvr/MKmLMXJPfUr3/Zv hd/Jpo+muzro1E+Cjkyi/vbG9tqHJP4ziFNiH5OTo/tGfVeOZ4JGcHYqC7sfUjQ7DGZg Pp6Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id q5-20020a056402032500b0045745ecd01dsi457405edw.108.2022.09.25.07.27.18; Sun, 25 Sep 2022 07:27:19 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id EEE0868BB74; Sun, 25 Sep 2022 17:26:31 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F06D368BAFE for ; Sun, 25 Sep 2022 17:26:25 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id A7694C00B8 for ; Sun, 25 Sep 2022 17:26:21 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:26:00 +0300
Message-Id: <20220925142619.67917-12-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 12/31] lavu/floatdsp: RISC-V V vector_fmac_scalar
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: eQ0kMQzYAbNW

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 6027a67b46..c2d93e0cd7 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -27,6 +27,8 @@
 
 void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
                          int len);
+void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
+                                int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
@@ -42,6 +44,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
     if (flags & AV_CPU_FLAG_RV_ZVE32F) {
         fdsp->vector_fmul = ff_vector_fmul_rvv;
+        fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 710e122444..4c325db9fd 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -38,6 +38,25 @@ func ff_vector_fmul_rvv, zve32f
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_fmac_scalar_rvv, zve32f
+NOHWF   fmv.w.x   fa0, a2
+NOHWF   mv        a2, a3
+1:
+        vsetvli   t0, a2, e32, m1, ta, ma
+        slli      t1, t0, 2
+        vle32.v   v24, (a1)
+        sub       a2, a2, t0
+        vle32.v   v16, (a0)
+        sh2add    a1, t0, a1
+        vfmacc.vf v16, fa0, v24
+        vse32.v   v16, (a0)
+        sh2add    a0, t0, a0
+        bnez      a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x   fa0, a2

From patchwork Sun Sep 25 14:26:01 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38270
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1706573pzh; Sun, 25 Sep 2022 07:27:55 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM5wD21zy33nbgJrHtJKSt/Vw14oV05qvM4iiL/NPkOo1Nu5B00Z9RtW6RY8Hu0nCattduqJ
X-Received: by 2002:a05:6402:f0f:b0:451:1ecd:a61f with SMTP id
i15-20020a0564020f0f00b004511ecda61fmr17999144eda.125.1664116074987; Sun, 25 Sep 2022 07:27:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116074; cv=none; d=google.com; s=arc-20160816; b=zNIFp95xDATEuc0vcMch3DuL9zunVCfASNDCOrjE4r3jvMEalFciUkktesboz02aEd xsyIFAPhe09zVwVbPkOsmSc+OQ6w4dF0dyPWPf5DyE2sUOW1Oj0myrF5wXbd+0W7y+oZ ES89JB/Nz3f1bRXsu0z/LO3mIEpzE8ogMQ0dz2XD+ZkfY8ZZjcVqnquyA/4yBbH2K2sj pRcCIjSPK7bXEn8JvOBUE3a2d8ghcys1yIHilm8LKX/KCa/NV8U086Hc/HxiDqbJU+0h IZdCO95FHulNIugNv3HThagCSa7DlJL6dlKKb8hyNnEehOn/iwhmuviVvlEar0OvcKjm nASQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=/wQXU58pLF2gszBIMhd0UTLjInp+/J5br53Q5FKNE8A=; b=r2Q2Sk2RD6TSqRz0cEtYhAW11VtILLz6WB3zUuW2bt/s6/m2xJsRfkK4mmfCDyzWw2 e3NTrVo2eEEk2J7Cspk06tGxylbgAfAaP0EDhdzJqnMRbUh4R1/jajzOS9e28nB+39it WbX9vT7EbhDBvGO7sUDpjvWftygsREW5y0DQL5Xd2j9muVMKhldv7PkhJomPFxke5vGx usgu8f+gVPl5J3N5eDifQIYNdYCrE8F8huglZxNO6Rd097kLDdl2AeVJdps5VIm+J+E9 EU9eTCK7Z9+VhDCWx4v2vPbN0Y074Wa6z1fETyw24APf+thaRI5sP0OwcYP1o3mSDpRN MHMg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id y20-20020a056402441400b004532dbfc916si15373673eda.615.2022.09.25.07.27.54; Sun, 25 Sep 2022 07:27:54 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id CA95468BBCB; Sun, 25 Sep 2022 17:26:35 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F40FC68BB15 for ; Sun, 25 Sep 2022 17:26:25 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id D0EAEC00B9 for ; Sun, 25 Sep 2022 17:26:21 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:26:01 +0300
Message-Id: <20220925142619.67917-13-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 13/31] lavu/floatdsp: RISC-V V vector_dmac_scalar
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: wciuUi6XaImy

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index c2d93e0cd7..d17d0f66c5 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -34,6 +34,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
+void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul,
+                                int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -49,6 +51,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
+            fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
         }
     }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 4c325db9fd..048ec0bc40 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -91,6 +91,24 @@ func ff_vector_dmul_rvv, zve64d
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_dmac_scalar_rvv, zve64d
+NOHWD   fmv.d.x   fa0, a2
+NOHWD   mv        a2, a3
+1:
+        vsetvli   t0, a2, e64, m1, ta, ma
+        vle64.v   v24, (a1)
+        sub       a2, a2, t0
+        vle64.v   v16, (a0)
+        sh3add    a1, t0, a1
+        vfmacc.vf v16, fa0, v24
+        vse64.v   v16, (a0)
+        sh3add    a0, t0, a0
+        bnez      a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x   fa0, a2

From patchwork Sun Sep 25 14:26:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38275
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1706904pzh; Sun, 25 Sep 2022 07:28:38 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM6z/hsjJu5lqBmFRbNzuusasIkfloMJEB1zeK28zd+LEJpCHmrJkjVqGZc2mtdyrpqchM/H
X-Received: by 2002:a17:907:2c77:b0:77b:4445:a852 with SMTP id
ib23-20020a1709072c7700b0077b4445a852mr14511304ejc.582.1664116118328; Sun, 25 Sep 2022 07:28:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116118; cv=none; d=google.com; s=arc-20160816; b=BGwoEO+k+FzJFCeWYBcA9ZmgFQ3GEWm5C7L1nfb+P1/EIPSNVwiVHY+QNBKB5UvJAx bRkkBZ4O5waYQ30aEik5+tlQ2IpDNzVQkcZaev9Wqa9BF5u0tdI5k+nm8H8Z+KUwVW31 XWkI83LD08sUstaYSdcbulCrhrDpg/RIcV989QyNTgivEVwNtW3Y6xb8YZb0ODQoyig+ bW9pHa67Tn+OP/2rvjoHslvJ+2JeTcyr6U2MwsYPnP3fqhp93SGbxEZxRT8yOWjyojd5 x+o6+qJ98SZshH3WuAvbPB4p9WZbenm8c+3HTsfHoKiOJobgXTVHpYdi+UwiGA449x1i EGJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=gKfsDW3oSjngAyGWef5XyzNPF5+GwH/nj3c+M9WtUC8=; b=rbPzQym/s9lHqg3162s/ASoPqFH4kAfLD3CtQ7rbA8HHOrghaH5L1xkpdrC7oQx++F MJTQ54wwl2Vgv4pkqlHLF/1uE/ob4zHs5l4ntE8y4lYizH4RERfe5ZyIn9yLehNkrzcC 4TNzpZQu51YRvie2J92E+HDbDH5y7x/xFo44MEvvgimq97TrmnGDI5heqboeJ9gYEoxV mIpj3dUgr+Egrxh07f3w36kCLWDIE6594MIRoqoUxRGNL3tnMxGWJ9+126R6/LBCEciN 6vbjG1wuYqK0wWMjQF+U7GT/vgwxUWxQrVKs2Vyplryf+DdRv6poEYKiTJ5cpBViPai+ 6oRw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id i26-20020a1709063c5a00b0073d9b010074si9768713ejg.824.2022.09.25.07.28.38; Sun, 25 Sep 2022 07:28:38 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B1F2B68BBF1; Sun, 25 Sep 2022 17:26:39 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 17AB568BB35 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 05E5BC00BA for ; Sun, 25 Sep 2022 17:26:21 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:26:02 +0300
Message-Id: <20220925142619.67917-14-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 14/31] lavu/floatdsp: RISC-V V vector_fmul_add
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 3gyW5wB6b3pT

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index d17d0f66c5..2ddd2050f7 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -31,6 +31,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
+void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
+                             const float *src2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
@@ -48,6 +50,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 048ec0bc40..db62402878 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -74,6 +74,25 @@ NOHWF   mv        a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) + (a3) [0..a4-1]
+func ff_vector_fmul_add_rvv, zve32f
+1:
+        vsetvli   t0, a4, e32, m1, ta, ma
+        vle32.v   v8, (a1)
+        sub       a4, a4, t0
+        vle32.v   v16, (a2)
+        sh2add    a1, t0, a1
+        vle32.v   v24, (a3)
+        sh2add    a2, t0, a2
+        vfmadd.vv v8, v16, v24
+        sh2add    a3, t0, a3
+        vse32.v   v8, (a0)
+        sh2add    a0, t0, a0
+        bnez      a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:

From patchwork Sun Sep 25 14:26:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38274
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1706837pzh; Sun, 25 Sep 2022 07:28:31 -0700 (PDT)
X-Google-Smtp-Source:
AMsMyM4+TJ0yK54GYe9VGsxhnnY84h7pelUsyvCh8mVA/bHQkeBifNuOB42BxdlpKFVNz5FEjrKg X-Received: by 2002:a17:907:31ca:b0:780:2170:e08c with SMTP id xf10-20020a17090731ca00b007802170e08cmr14724169ejb.145.1664116111042; Sun, 25 Sep 2022 07:28:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116111; cv=none; d=google.com; s=arc-20160816; b=upG3nkvnyugviQNONceSSEJK6o2xWDMJQ1UPqWKVMVB2d185cbHNg++MnyFbBksNW+ X8Votz6JOA7AVVE/MiuXXtBuB3is7eEOt7xo1PzhLgMMXUQFMoW9gJCs4u5/G6h1agwM XuWjOXzGRhIzXVGgkznTYfwUYNnFWLy9Ms5rmT7YbbCxZzwboyM7iJTNXJjKdNl9zZvw Z9rbPDkiAa/yDJIQpc1vl/5Ze8peVwt6icP+4rIqQrKi+5xVgwy+2A5Xkoxgt8wEMxfu UWK5u44TtAtOf95xFxGIl+U12nkv89WRMwOEDJNa0g6x5ZQnVIfVN0rMf+4zIcA2EPLz vB3Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=933/vOG2UDwcYW44SbhDfw+0lsxO6bDK3FrRT/dOcVQ=; b=PnwJFLTMG0oeC50zisVd6xnPXG3j1oTBApHomcSycmLGJmzoLwc5bY1Hs2V+GS7i88 h7BzdM5sWVvMGyN7MsnFKv7/t8QY05oBj0bKGYBsqJNo+yYAPu/1lF1QzlU+AXcFC3kU 3pkNpKfTieVBe0HTRbRE0VE2OLGilHZzKLz2ZR7XS0n2AB22nkaF1OJbVXi1MEXwmbgi CBLzOmsnlsecqjaZduqfEndRxAV+i7PG7USLiZpM8IohM2eVNQpEcKHPNyxQ0OuWanfq iKZyev1jEie7GLpXc42o3FDSi/WRrouyuGTujlgAhU/6xh3J5tY6vD8eR47xKKSitkqs /jag== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id g15-20020a056402320f00b0045515cd7e54si9866855eda.350.2022.09.25.07.28.29; Sun, 25 Sep 2022 07:28:31 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id CB4A968BB3E; Sun, 25 Sep 2022 17:26:38 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 174B868BB2C for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 2F593C00BB for ; Sun, 25 Sep 2022 17:26:22 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:03 +0300 Message-Id: <20220925142619.67917-15-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 15/31] lavu/floatdsp: RISC-V V butterflies_float X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: chYwu+BSCo+h From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 18 ++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c 
b/libavutil/riscv/float_dsp_init.c index 2ddd2050f7..f164b1308f 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -33,6 +33,7 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); +void ff_butterflies_float_rvv(float *v1, float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -51,6 +52,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; + fdsp->butterflies_float = ff_butterflies_float_rvv; if (flags & AV_CPU_FLAG_RV_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index db62402878..a721c44667 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -93,6 +93,24 @@ func ff_vector_fmul_add_rvv, zve32f ret endfunc +// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] +func ff_butterflies_float_rvv, zve32f +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v16, (a0) + sub a2, a2, t0 + vle32.v v24, (a1) + vfadd.vv v0, v16, v24 + vfsub.vv v8, v16, v24 + vse32.v v0, (a0) + sh2add a0, t0, a0 + vse32.v v8, (a1) + sh2add a1, t0, a1 + bnez a2, 1b + + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv, zve64d 1: From patchwork Sun Sep 25 14:26:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38306 Delivered-To: andriy.gelman@gmail.com Received: by 2002:a05:6a10:9905:b0:2f4:3559:b653 with SMTP id j5csp1810977pxh; Sun, 25 Sep 2022 07:28:54 -0700 (PDT) X-Google-Smtp-Source: 
[79.124.17.100]) by mx.google.com with ESMTP id nb29-20020a1709071c9d00b00781599eb7e1si14234814ejc.519.2022.09.25.07.28.54; Sun, 25 Sep 2022 07:28:54 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 83DCA68BC04; Sun, 25 Sep 2022 17:26:41 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1DBF768BA50 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 58647C00BC for ; Sun, 25 Sep 2022 17:26:22 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:04 +0300 Message-Id: <20220925142619.67917-16-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 16/31] lavu/floatdsp: RISC-V V vector_fmul_reverse X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: VOGdIvWdGquu Content-Length: 3421 From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 21 +++++++++++++++++++++ 2 files changed, 24 insertions(+) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index f164b1308f..9b8fd9942b 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -33,6 +33,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); +void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, + const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, @@ -52,6 +54,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; + fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; if (flags & AV_CPU_FLAG_RV_ZVE64D) { diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index a721c44667..fbd2777463 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -93,6 +93,27 @@ func ff_vector_fmul_add_rvv, zve32f ret endfunc +// (a0) = (a1) * reverse(a2) [0..a3-1] +func ff_vector_fmul_reverse_rvv, zve32f + sh2add a2, a3, a2 + li t2, -4 // byte stride + addi a2, a2, -4 +1: + vsetvli t0, a3, e32, m1, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + sub a3, a3, t0 + vlse32.v v24, (a2), t2 + add a1, a1, t1 + vfmul.vv v16, v16, v24 + sub a2, a2, t1 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc + // (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] func ff_butterflies_float_rvv, zve32f 1: From patchwork Sun Sep 25 14:26:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38310 Delivered-To: 
[79.124.17.100]) by mx.google.com with ESMTP id ne17-20020a1709077b9100b007829f6fed9dsi8205013ejc.232.2022.09.25.07.29.11; Sun, 25 Sep 2022 07:29:11 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6130968BC06; Sun, 25 Sep 2022 17:26:43 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1ED8B68BB3E for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 81BD7C00BD for ; Sun, 25 Sep 2022 17:26:22 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:05 +0300 Message-Id: <20220925142619.67917-17-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 17/31] lavu/floatdsp: RISC-V V vector_fmul_window X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: GWcB9NeDVwAn Content-Length: 4142 From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 33 ++++++++++++++++++++++++++++++++ 2 files changed, 36 insertions(+) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 9b8fd9942b..dacd81c08b 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -31,6 +31,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); +void ff_vector_fmul_window_rvv(float *dst, const float *src0, + const float *src1, const float *win, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, @@ -53,6 +55,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul = ff_vector_fmul_rvv; fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + fdsp->vector_fmul_window = ff_vector_fmul_window_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index fbd2777463..ce530f6108 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -74,6 +74,39 @@ NOHWF mv a2, a3 ret endfunc +func ff_vector_fmul_window_rvv, zve32f + // a0: dst, a1: src0, a2: src1, a3: window, a4: length + addi t0, a4, -1 + add t1, t0, a4 + sh2add a2, t0, a2 + sh2add t0, t1, a0 + sh2add t3, t1, a3 + li t1, -4 // byte stride +1: + vsetvli t2, a4, e32, m1, ta, ma + vle32.v v16, (a1) + slli t4, t2, 2 + vlse32.v v20, (a2), t1 + sub a4, a4, t2 + vle32.v v24, (a3) + add a1, a1, t4 + vlse32.v v28, (t3), t1 + sub a2, a2, t4 + vfmul.vv v0, v16, v28 + add a3, a3, t4 + vfmul.vv v8, v16, v24 + sub t3, t3, t4 + vfnmsac.vv v0, v20, v24 + vfmacc.vv v8, v20, v28 + vse32.v v0, (a0) + add a0, a0, t4 + vsse32.v v8, (t0), t1 + sub t0, t0, t4 + bnez a4, 1b + + ret +endfunc + // (a0) = (a1) 
* (a2) + (a3) [0..a4-1] func ff_vector_fmul_add_rvv, zve32f 1: From patchwork Sun Sep 25 14:26:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38292 Delivered-To: andriy.gelman@gmail.com Received: by 2002:a05:6a10:9905:b0:2f4:3559:b653 with SMTP id j5csp1810905pxh; Sun, 25 Sep 2022 07:28:44 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4IcPOFWVPr01gRMSqBFbvcV7hw1QiPVg0cXqW89HKkFM+MImB5FplhMU+VaPmLOnC/Fz9+ X-Received: by 2002:a05:6402:1941:b0:457:138:1e88 with SMTP id f1-20020a056402194100b0045701381e88mr6912734edz.394.1664116124017; Sun, 25 Sep 2022 07:28:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116124; cv=none; d=google.com; s=arc-20160816; b=P5jTr3uX4cU6OONcy5l4tlhqA/g+OYNro5dE2nidBIHvWPElUzcA3FjMQOPxbfSE+2 rnLmtTlgpA/rPGTnVnZZ/LUzr2ZKU8CLdXvvBU8lwsMCSosGI30TOdH3bn0lVdp4+0fo g2hcN8wI+KgIO1ogWHRzYSCC7g6vllAOv+VVQvwJaelxo1MCttoT8RsCL0YU6h2WXl5j By2VTCsWnLZGV1o93Tq4B43z1M6Lc3/2p3raPJL3sOutIR9VZQRbAwNWeCvcvwv4nK2M 4JiwPQEoWiGQqbk3GCXcn0Jno7kgkmE/OToIsO15JnZ+Wd8HVNHO3Cyf1zEctsfJEmAI N77w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=JJwTC9oV2UndM4RW/B2+z8zwFJ8PkAJzQRfh4jBwYIk=; b=GhUNkUvQ3Or+RXCeugaNLMv89dgJAP/OEjJdopiT0jIKJm4hu+MscH/IyqaziJCV94 Rj4eoLbnoy6LCLiIjqsDZ+TZR2x7FTlqvZQ1bS+pH9WbqwdRVPpPXVub+hp0x+M0SdCP dvSyXgnDVOQsFP9qvoW2gkFO/vxVc5F/nu8QCPRksh5UteotQXj83ci5BbGJWlc4mkSy YDI0FiZJXsm709Q45+6dQJL6CA6RsVFC36FV0vq3UBYiOocOZYDvHZxallXgANNIJq1/ iGbHRgLs2LNijgSTT0YtkKM8EKycA4wJ9Z6JQsqWkVHFGyJfSJe3fjpp2CekZhnV7OGn dE9A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted 
sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id i22-20020a1709064fd600b0072b7fac8a7asi13350558ejw.926.2022.09.25.07.28.43; Sun, 25 Sep 2022 07:28:44 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 519B568BBF7; Sun, 25 Sep 2022 17:26:40 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1E3FB68BB37 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id AAED8C00BE for ; Sun, 25 Sep 2022 17:26:22 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:06 +0300 Message-Id: <20220925142619.67917-18-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 18/31] lavu/floatdsp: RISC-V V scalarproduct_float X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: MKQcmJCt6Zev Content-Length: 3299 From: Rémi Denis-Courmont --- 
libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 20 ++++++++++++++++++++ 2 files changed, 22 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index dacd81c08b..cc9b7e83dc 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -38,6 +38,7 @@ void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); +float ff_scalarproduct_float_rvv(const float *v1, const float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -59,6 +60,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; + fdsp->scalarproduct_float = ff_scalarproduct_float_rvv; if (flags & AV_CPU_FLAG_RV_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index ce530f6108..ab2e0c42d7 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -165,6 +165,26 @@ func ff_butterflies_float_rvv, zve32f ret endfunc +// a0 = (a0).(a1) [0..a2-1] +func ff_scalarproduct_float_rvv, zve32f + vsetvli zero, zero, e32, m1, ta, ma + vmv.s.x v8, zero +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v16, (a0) + sub a2, a2, t0 + vle32.v v24, (a1) + sh2add a0, t0, a0 + vfmul.vv v16, v16, v24 + sh2add a1, t0, a1 + vfredusum.vs v8, v16, v8 + bnez a2, 1b + + vfmv.f.s fa0, v8 +NOHWF fmv.x.w a0, fa0 + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv, zve64d 1: From patchwork Sun Sep 25 14:26:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: 
=?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38302 Delivered-To: andriy.gelman@gmail.com Received: by 2002:a05:6a10:9905:b0:2f4:3559:b653 with SMTP id j5csp1811011pxh; Sun, 25 Sep 2022 07:29:00 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4acn/nEBgyNjQXBicu2Ji9NDvuMyYGJzjYOlJOgoYtkUaz6leMXejarBFIv0M+N0RxyUPb X-Received: by 2002:a17:907:6d16:b0:781:be64:d192 with SMTP id sa22-20020a1709076d1600b00781be64d192mr14633587ejc.615.1664116140527; Sun, 25 Sep 2022 07:29:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116140; cv=none; d=google.com; s=arc-20160816; b=Ux3DA7bdLU6p3scaXkzMrbFld9Mo0hjcZe89kA++5P3lkO5fI7PUECMiSCOfyowsyl zx2f3TxT4otHq4aOmZ8saKBr/lzWCLL6nM9WUC4bUG3wx5ZRNL/CMYZn5RHSlqWmGrEI i7Q7dcTDyhcb73Duv1vUiL5ZMTJHNO5XQOx7FbW6g4S9wTAYPlQgkamhFmeYDv5zgxsD +M7mzYDCqFtxUpuQffzRnM+rFRvjqIrEB4fI2GjgEXAoI72MNA5xcvx/sKXS04MxXQoH hrwSIEwFwvcMOUlZUSt/lXg1ckdbPAps6bjg62emvLtTOVX58WPsy9HJMFXnl3HX4SN5 vsPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=5qLrbLWi/m1Uv0lxl79+KMZSHfVxFheKKGlBPiZDwAA=; b=m8u84pTv/X4D0I37pjZd+qgMAZVDja2UzVd6uySCbHk2Q26KqqVxUqvdXIgxXxfuF0 x9r9Qlz70i/2mSwVJrucfFp4ZyOkvKxvT0ju4pUWcB5hQC58kYlIg0HpQK9OwXtw+O87 nUseWCzwNubj+FXhQ8Idk8v4EnXdKIjLDTDafNOLVZehtU9qzKNAIgDZbIMEZTiJ+vpw 7U0nktbudEIgl5RHIQKrCpfIc9cpg8DHrZViAfr751NAujfMywS780YBzG8kLWwo6/h8 JLnINTVsdqqfihLndbdKMEBl5ND7rT5egxcZNmECp0jy7gjAn5NAL513kcNhE9UwabGN JCQA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id gs30-20020a1709072d1e00b0077fd6028710si561702ejc.670.2022.09.25.07.29.00; Sun, 25 Sep 2022 07:29:00 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E906468BC05; Sun, 25 Sep 2022 17:26:41 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1F42768BB42 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id D447AC00BF for ; Sun, 25 Sep 2022 17:26:22 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:07 +0300 Message-Id: <20220925142619.67917-19-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 19/31] lavu/fixeddsp: RISC-V V butterflies_fixed X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: wo0Fq6ANllHU Content-Length: 6687 From: Rémi Denis-Courmont --- libavutil/fixed_dsp.c | 4 +++- libavutil/fixed_dsp.h | 1 + libavutil/riscv/Makefile | 4 +++- libavutil/riscv/fixed_dsp_init.c | 38 
++++++++++++++++++++++++++++++ libavutil/riscv/fixed_dsp_rvv.S | 40 ++++++++++++++++++++++++++++++++ 5 files changed, 85 insertions(+), 2 deletions(-) create mode 100644 libavutil/riscv/fixed_dsp_init.c create mode 100644 libavutil/riscv/fixed_dsp_rvv.S diff --git a/libavutil/fixed_dsp.c b/libavutil/fixed_dsp.c index 154f3bc2d3..bc847949dc 100644 --- a/libavutil/fixed_dsp.c +++ b/libavutil/fixed_dsp.c @@ -162,7 +162,9 @@ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int bit_exact) fdsp->butterflies_fixed = butterflies_fixed_c; fdsp->scalarproduct_fixed = scalarproduct_fixed_c; -#if ARCH_X86 +#if ARCH_RISCV + ff_fixed_dsp_init_riscv(fdsp); +#elif ARCH_X86 ff_fixed_dsp_init_x86(fdsp); #endif diff --git a/libavutil/fixed_dsp.h b/libavutil/fixed_dsp.h index fec806ff2d..1217d3a53b 100644 --- a/libavutil/fixed_dsp.h +++ b/libavutil/fixed_dsp.h @@ -161,6 +161,7 @@ typedef struct AVFixedDSPContext { */ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int strict); +void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp); void ff_fixed_dsp_init_x86(AVFixedDSPContext *fdsp); /** diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile index 89a8d0d990..1597154ba5 100644 --- a/libavutil/riscv/Makefile +++ b/libavutil/riscv/Makefile @@ -1,3 +1,5 @@ OBJS += riscv/float_dsp_init.o \ + riscv/fixed_dsp_init.o \ riscv/cpu.o -RVV-OBJS += riscv/float_dsp_rvv.o +RVV-OBJS += riscv/float_dsp_rvv.o \ + riscv/fixed_dsp_rvv.o diff --git a/libavutil/riscv/fixed_dsp_init.c b/libavutil/riscv/fixed_dsp_init.c new file mode 100644 index 0000000000..4075e521f2 --- /dev/null +++ b/libavutil/riscv/fixed_dsp_init.c @@ -0,0 +1,38 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavutil/fixed_dsp.h" + +void ff_butterflies_fixed_rvv(int *v1, int *v2, int len); + +av_cold void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32X) + fdsp->butterflies_fixed = ff_butterflies_fixed_rvv; +#endif +} diff --git a/libavutil/riscv/fixed_dsp_rvv.S b/libavutil/riscv/fixed_dsp_rvv.S new file mode 100644 index 0000000000..0e78734b4c --- /dev/null +++ b/libavutil/riscv/fixed_dsp_rvv.S @@ -0,0 +1,40 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "asm.S" + +// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] +func ff_butterflies_fixed_rvv, zve32x +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v16, (a0) + sub a2, a2, t0 + vle32.v v24, (a1) + vadd.vv v0, v16, v24 + vsub.vv v8, v16, v24 + vse32.v v0, (a0) + sh2add a0, t0, a0 + vse32.v v8, (a1) + sh2add a1, t0, a1 + bnez a2, 1b + + ret +endfunc From patchwork Sun Sep 25 14:26:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38299 Delivered-To: andriy.gelman@gmail.com Received: by 2002:a05:6a10:9905:b0:2f4:3559:b653 with SMTP id j5csp1811122pxh; Sun, 25 Sep 2022 07:29:17 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5gi0nmcHpckVXMuxh1Fzo+99keoFxwvKpaVoixpllFNc6V5HZbSdD05BT9j0qRGxuWBQjM X-Received: by 2002:a17:907:2d89:b0:781:eff0:9999 with SMTP id gt9-20020a1709072d8900b00781eff09999mr14675189ejc.194.1664116157546; Sun, 25 Sep 2022 07:29:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116157; cv=none; d=google.com; s=arc-20160816; b=xnwc0i/otu90wyXOi37tLZKrSIjvBXrdDRSeT/oFSxM8lOO7UX1b6LBuYdWlGSveN1 ySy5N0BmwhRUfsqSfziYwZrrhPXGhY9171zJTHhhm+Iv4Jwxoc9o6mAuymkCKf5QO9xX k7+KyFV2QZr4WFpT+f19ZWgRU2ykAaWLuJiK9bbgPfMCZ7PR6ke7ZnrsoWViA40tUa6/ z0AmFsulU0+53uWVlyZ/ahhn/oSqs7TI0Xs/ZghOv5CcDknT87hL2dpTVCuxi0ZUrBAX 4iO9iPAkD9WNYhQp5B9uSLzVUB21FC9+aPkPZUaehns5nP0t/j5bZTcqdRWXP4xUVvtm bxUQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; 
bh=LmuxczhjW/9VF6jH/2qMcFDG/XCQ1l1aswDe0vSskEU=; b=KQXPxXNP+H+soWcNASC69J6g2qrJj9cynG53oRLyrrEzCWI4cI+RfG1IjqkRRpJVAc iHe+TQIG+I6R8fSYGfa453dBHWS8Gkxiw89e393mhKUU0ivgrUDdwCXLXSMEK4ZuTOSG A9Yp1N5UBta8WAB8jkX57Lu9CmhKSUQsuNyV2CsIbgiomR4g3iKxJLEZGxxkq76N+omL CzHWSU3IYnatEYSfEK67VqTzgKM6OG+2i92HIkl1r7Ncr2Cd8MQtTBY7hhXDmAE0HMb3 5JpkqbhcYiT0fuufjpddzoHCv6CKqVDGYSvDiF5WCS33L/0jUoqLRsMqvYZvECpGi/rQ LaCw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id qf23-20020a1709077f1700b0078363fceadbsi1102583ejc.528.2022.09.25.07.29.17; Sun, 25 Sep 2022 07:29:17 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 042FA68BC10; Sun, 25 Sep 2022 17:26:44 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1F76D68BB45 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 09B7BC00C0 for ; Sun, 25 Sep 2022 17:26:22 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:08 +0300 Message-Id: <20220925142619.67917-20-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: 
<5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 20/31] lavc/audiodsp: RISC-V V vector_clip_int32 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: Yusah4SUP8zC Content-Length: 4543 From: Rémi Denis-Courmont --- libavcodec/riscv/Makefile | 1 + libavcodec/riscv/audiodsp_init.c | 9 ++++++++ libavcodec/riscv/audiodsp_rvv.S | 36 ++++++++++++++++++++++++++++++++ 3 files changed, 46 insertions(+) create mode 100644 libavcodec/riscv/audiodsp_rvv.S diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index da07f1fe96..99541b075e 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,4 +1,5 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o +RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c index c5842815d6..ce8b60ee52 100644 --- a/libavcodec/riscv/audiodsp_init.c +++ b/libavcodec/riscv/audiodsp_init.c @@ -18,16 +18,25 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "config.h" + #include "libavutil/attributes.h" #include "libavutil/cpu.h" #include "libavcodec/audiodsp.h" void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); +void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min, + int32_t max, unsigned int len); + av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) { int flags = av_get_cpu_flags(); if (flags & AV_CPU_FLAG_RVF) c->vector_clipf = ff_vector_clipf_rvf; +#if HAVE_RVV + if (flags & AV_CPU_FLAG_RV_ZVE32X) + 
c->vector_clip_int32 = ff_vector_clip_int32_rvv; +#endif } diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S new file mode 100644 index 0000000000..49546ee3c4 --- /dev/null +++ b/libavcodec/riscv/audiodsp_rvv.S @@ -0,0 +1,36 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_vector_clip_int32_rvv, zve32x +1: + vsetvli t0, a4, e32, m1, ta, ma + vle32.v v8, (a1) + sub a4, a4, t0 + vmax.vx v8, v8, a2 + sh2add a1, t0, a1 + vmin.vx v8, v8, a3 + vse32.v v8, (a0) + sh2add a0, t0, a0 + bnez a4, 1b + + ret +endfunc From patchwork Sun Sep 25 14:26:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38277 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1707422pzh; Sun, 25 Sep 2022 07:29:52 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5aIL3bfLiI6AsUfztoL2YQxgAtBUuQeZlUImhDtGd2FyePjO/h6D6IzOTgXFE1e/twSCXs X-Received: by 2002:aa7:cad5:0:b0:454:88dc:2c22 with SMTP id l21-20020aa7cad5000000b0045488dc2c22mr18155704edt.352.1664116192751; Sun, 25 Sep 2022 07:29:52 -0700 
(PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116192; cv=none; d=google.com; s=arc-20160816; b=B/z2dFiOqXKAP9x87fAZwYGup8iZgithMbwYL5oYOzGQcQPXODjK1RRa5kV7D8WHIR UyRArMepAdqCFAtrFdu9aURO+P69uYLkakzGRwYPMRCUbUYUS8t71TnjNzjl06xacbyt mC+M44+wCiWSFmovAOda6t0J8unNGeh7Hd/W4gwNc2nTYdus3y6QH+7xTzs/aFP7UsWR KFUn7wuDVDtFkaqBEDJ4nitZ+i7ouzcBi16GNGJv1gTglkpw5AeOexXKCm6g6mKEZ3io dNUBRc7hvJluWz2Y6Kjoa1ippSZ4EI+tVlSPWVj1UFb6DtYV8q2JvKlDPnJBJ47dzu0S deRQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=Ef/0DNeiNBDUgU3bxOWVM9itWVrHRXEWIbk8ldylurw=; b=zbFXz7yD7SGKe9l6a2W5ld4pnnCHC4pvvw7DWH+1E6qe3mPET5gsSBEkaYabYi4YPB XVhfy+oDtquYhqAV/HJ0HAZuHEMZ+k1kWOwnaLYldhkI8LGkKz99EYhtBD7aXQyS81ft 2BN9PwiiWrrJgVP0OZsq89i0UcAd4sidVY20s58s8U9dKes48HW1E7GT+j4PMPUC7ygq 9x0d3Nz09aeemUyRtxr/QF8+4nOJWq/UhySYif42Y41mioh99Oqd1R5eqctIxUmMPXVN Akl4Xv+Wb/YRHf20VHJuqLxUliOncu18lPN1anh2RfIxhIxr4t368PL0fo44W0/rvGR/ CIww== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id op25-20020a170906bcf900b006ff49b183e9si11969611ejb.971.2022.09.25.07.29.52; Sun, 25 Sep 2022 07:29:52 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0FE0C68BC1E; Sun, 25 Sep 2022 17:26:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3771F68B940 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 32C27C00C1 for ; Sun, 25 Sep 2022 17:26:23 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:09 +0300 Message-Id: <20220925142619.67917-21-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 21/31] lavc/audiodsp: RISC-V V vector_clipf X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: vo3yULEPfc1Z From: Rémi Denis-Courmont --- libavcodec/riscv/audiodsp_init.c | 7 ++++++- libavcodec/riscv/audiodsp_rvv.S | 17 +++++++++++++++++ 2 files changed, 23 insertions(+), 1 deletion(-) diff --git 
a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c index ce8b60ee52..ddd561484f 100644 --- a/libavcodec/riscv/audiodsp_init.c +++ b/libavcodec/riscv/audiodsp_init.c @@ -26,6 +26,7 @@ void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); +void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float max); void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min, int32_t max, unsigned int len); @@ -36,7 +37,11 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) if (flags & AV_CPU_FLAG_RVF) c->vector_clipf = ff_vector_clipf_rvf; #if HAVE_RVV - if (flags & AV_CPU_FLAG_RV_ZVE32X) + if (flags & AV_CPU_FLAG_RV_ZVE32X) { c->vector_clip_int32 = ff_vector_clip_int32_rvv; + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + c->vector_clipf = ff_vector_clipf_rvv; + } #endif } diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S index 49546ee3c4..427b424cb9 100644 --- a/libavcodec/riscv/audiodsp_rvv.S +++ b/libavcodec/riscv/audiodsp_rvv.S @@ -34,3 +34,20 @@ func ff_vector_clip_int32_rvv, zve32x ret endfunc + +func ff_vector_clipf_rvv, zve32f +NOHWF fmv.w.x fa0, a3 +NOHWF fmv.w.x fa1, a4 +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v8, (a1) + sub a2, a2, t0 + vfmax.vf v8, v8, fa0 + sh2add a1, t0, a1 + vfmin.vf v8, v8, fa1 + vse32.v v8, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc From patchwork Sun Sep 25 14:26:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38298 Delivered-To: andriy.gelman@gmail.com Received: by 2002:a05:6a10:9905:b0:2f4:3559:b653 with SMTP id j5csp1811179pxh; Sun, 25 Sep 2022 07:29:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6PwiqA2SK18raaJfUVrPOAqCts12FQyEKB2g7kF3mSbip2SH+tPD/uPQlCWJYB2KeUd4Tn X-Received: by 2002:a17:906:dacd:b0:780:a90c:e144 with SMTP id 
xi13-20020a170906dacd00b00780a90ce144mr14705539ejb.153.1664116169158; Sun, 25 Sep 2022 07:29:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116169; cv=none; d=google.com; s=arc-20160816; b=Uzv2aWykWaWx0A612vfQ9c4yWO/btfZ7479KM+nc2bn/v/UDY6buQ9ehrs+l5l0O2p jgB7008kKfnU+wjMM2ty+jCxtEKqMCqAGs0MrwXwE8yE88K/0+S+FBAEfUGs2pljkzzb uEfJaE94RX9d3e3NNov+RY29m1mIAx5WZhqF4ndMohCA7JX66oFycU6ljoVvplKLYurd iZfr7gTUZols08y4ZPSCIS2UUm9Dkx1dt7QILmtCpTn1oiBJbhm+xr369QnFjj38JRaR 7s553gopHjPolExfbexyWZ+uPnrOmvz9kRxaMuTehZYHVfP8fVwspLfPLmf/huwXWJ5E xSNA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=d1NvtEE26BvPQJxIP/ix/3C0CBxzL/HjJ9SZfnoe0aQ=; b=m2BDAKvzgOY7sad8cHoZIBiU2C7ICiPddNINKV5IWH/cPaOZqtlAv+EJu01AEA0WCq 0cyQ5AsFlZgSD0tMKbPgCIf7si7P47Tb60KZpW0T5KsgB1touYnyZ64p4JCpXkCqlxjk svMQ7NJ9FhcI+mgtCZtZDNxkGpJKUdAaPTXcLXxaBPM+AeI4f8rY683p2P/3RoxRDmxZ g0LgFuI/WWznR9O29zz8rEAGyVHALQrtLSfIZGbTSBVGdgHwMS7lYcN6j7cjFhI8pBOU CIB8u3x/LhuxGUKDv4bafzoyqNtFHII1bElldsB4909Z7gSH4xAakZBAAyLal5/3bhKd 6+aA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id i9-20020a1709064fc900b0077f61588231si14445070ejw.539.2022.09.25.07.29.28; Sun, 25 Sep 2022 07:29:29 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8DF3B68BB67; Sun, 25 Sep 2022 17:26:45 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2277D68BB53 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 5BDB9C00C2 for ; Sun, 25 Sep 2022 17:26:23 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:10 +0300 Message-Id: <20220925142619.67917-22-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 22/31] lavc/audiodsp: RISC-V V scalarproduct_int16 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: tDKWcNq59CZF Content-Length: 3133 From: Rémi Denis-Courmont --- libavcodec/riscv/audiodsp_init.c | 2 ++ libavcodec/riscv/audiodsp_rvv.S | 19 +++++++++++++++++++ 2 files changed, 21 insertions(+) diff --git 
a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c index ddd561484f..6f38b7bc83 100644 --- a/libavcodec/riscv/audiodsp_init.c +++ b/libavcodec/riscv/audiodsp_init.c @@ -29,6 +29,7 @@ void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float max); void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min, int32_t max, unsigned int len); +int32_t ff_scalarproduct_int16_rvv(const int16_t *v1, const int16_t *v2, int len); av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) { @@ -38,6 +39,7 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) c->vector_clipf = ff_vector_clipf_rvf; #if HAVE_RVV if (flags & AV_CPU_FLAG_RV_ZVE32X) { + c->scalarproduct_int16 = ff_scalarproduct_int16_rvv; c->vector_clip_int32 = ff_vector_clip_int32_rvv; if (flags & AV_CPU_FLAG_RV_ZVE32F) diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S index 427b424cb9..f4308f27c5 100644 --- a/libavcodec/riscv/audiodsp_rvv.S +++ b/libavcodec/riscv/audiodsp_rvv.S @@ -20,6 +20,25 @@ #include "libavutil/riscv/asm.S" +func ff_scalarproduct_int16_rvv, zve32x + vsetvli zero, zero, e16, m1, ta, ma + vmv.s.x v8, zero +1: + vsetvli t0, a2, e16, m1, ta, ma + vle16.v v16, (a0) + sub a2, a2, t0 + vle16.v v24, (a1) + sh1add a0, t0, a0 + vwmul.vv v0, v16, v24 + sh1add a1, t0, a1 + vsetvli zero, t0, e32, m2, ta, ma + vredsum.vs v8, v0, v8 + bnez a2, 1b + + vmv.x.s a0, v8 + ret +endfunc + func ff_vector_clip_int32_rvv, zve32x 1: vsetvli t0, a4, e32, m1, ta, ma From patchwork Sun Sep 25 14:26:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38278 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1707490pzh; Sun, 25 Sep 2022 07:30:00 -0700 (PDT) X-Google-Smtp-Source: 
AMsMyM7Y4YbM7aE5bBcUl+OgsdizxTDEEqQje/6GsqqfcMQscD4cqFWP6T0/6j2hRfEwBULYeJEj X-Received: by 2002:a05:6402:538f:b0:444:c17b:1665 with SMTP id ew15-20020a056402538f00b00444c17b1665mr18034563edb.98.1664116200656; Sun, 25 Sep 2022 07:30:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116200; cv=none; d=google.com; s=arc-20160816; b=mKZRUJOnzdqDmlaWaNdQ7WEK3DPUjwfxowpMprSXS/k5AJJ/dEYPylDPs/fYQnTkyA QFE+j4605gOy8TQ+uEN6X0u/PFhEgtkpGv2kKGL3eEJ3fNpYEeamAZZqo2KJ5Oobz6oO 729uJui2QodKD04MN8wKR+uWiQhXZS6TIWWLW4DWX5fyYWTqVRm7nBFkQ4tR/W/3RUIe iY+1lQfDbkag4n3ba1N+6Uat4d3fGIISWuX81YAYtdHIHeIdOrrrzrMTvL9TCNYP81Ay tGUxWm103U01BZJ7UcLqJ2rsq1QBcoQlxxuJAr4Xr9Rh9NDgNJQDMlxR8oxWYehQjNtJ 5FFg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ePHmr/P7rfpaV72S7ofj9xbBtyei9GokvzOo7e18WC0=; b=x2q0gbjxHiLLt6Xlja0FmAW9JE5XktDWLYTr/n2VvsgYmnemcQLgrlVDdb/mjUhjAW wewJGzx/3D6ZPqsC8OzcIEp9MPnWZp5K2UOekACiXLNwqukWghLTcItE3urq+UJIr3iV sLc02mSEXTx6Yj/MbOpPkmRUVZDh7vAcx9hYzbQ6cqT8mxC6prQGSI7Dicrh5sS6uyCC SXJ5Ap63J7LvlR7FF+fWblB77JWiPRXcTlgkvuYzVF81w2ZF9Op4T1WZrrwHRbwIA6E6 OS65k41aGApAA1FkMAEV5qzGmYPxG4HgHwjs9nk0YsVXe55kJm4tsxQNeuVeJYZbh7pU /PEA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id bq13-20020a170906d0cd00b00780440a9b90si10625977ejb.149.2022.09.25.07.30.00; Sun, 25 Sep 2022 07:30:00 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id DF09568BB18; Sun, 25 Sep 2022 17:26:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3EBBF68BB67 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 85227C00C3 for ; Sun, 25 Sep 2022 17:26:23 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:11 +0300 Message-Id: <20220925142619.67917-23-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 23/31] lavc/fmtconvert: RISC-V V int32_to_float_fmul_scalar X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 6oUyN4ym2QxK From: Rémi Denis-Courmont --- libavcodec/fmtconvert.c | 2 ++ libavcodec/fmtconvert.h | 1 + libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/fmtconvert_init.c | 39 ++++++++++++++++++++++++++++++ 
libavcodec/riscv/fmtconvert_rvv.S | 39 ++++++++++++++++++++++++++++++ 5 files changed, 83 insertions(+) create mode 100644 libavcodec/riscv/fmtconvert_init.c create mode 100644 libavcodec/riscv/fmtconvert_rvv.S diff --git a/libavcodec/fmtconvert.c b/libavcodec/fmtconvert.c index cedfd61138..d889e61aca 100644 --- a/libavcodec/fmtconvert.c +++ b/libavcodec/fmtconvert.c @@ -52,6 +52,8 @@ av_cold void ff_fmt_convert_init(FmtConvertContext *c) ff_fmt_convert_init_arm(c); #elif ARCH_PPC ff_fmt_convert_init_ppc(c); +#elif ARCH_RISCV + ff_fmt_convert_init_riscv(c); #elif ARCH_X86 ff_fmt_convert_init_x86(c); #endif diff --git a/libavcodec/fmtconvert.h b/libavcodec/fmtconvert.h index da244e05a5..1cb4628a64 100644 --- a/libavcodec/fmtconvert.h +++ b/libavcodec/fmtconvert.h @@ -61,6 +61,7 @@ void ff_fmt_convert_init(FmtConvertContext *c); void ff_fmt_convert_init_aarch64(FmtConvertContext *c); void ff_fmt_convert_init_arm(FmtConvertContext *c); void ff_fmt_convert_init_ppc(FmtConvertContext *c); +void ff_fmt_convert_init_riscv(FmtConvertContext *c); void ff_fmt_convert_init_x86(FmtConvertContext *c); void ff_fmt_convert_init_mips(FmtConvertContext *c); diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 99541b075e..682174e875 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,5 +1,7 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o +OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o +RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c new file mode 100644 index 0000000000..fd2f58d060 --- /dev/null +++ b/libavcodec/riscv/fmtconvert_init.c @@ -0,0 +1,39 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include <stdint.h> + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/fmtconvert.h" + +void ff_int32_to_float_fmul_scalar_rvv(float *dst, const int32_t *src, + float mul, int len); + +av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c) +{ +#ifdef HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_rvv; +#endif +} diff --git a/libavcodec/riscv/fmtconvert_rvv.S b/libavcodec/riscv/fmtconvert_rvv.S new file mode 100644 index 0000000000..b7c78831a0 --- /dev/null +++ b/libavcodec/riscv/fmtconvert_rvv.S @@ -0,0 +1,39 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_int32_to_float_fmul_scalar_rvv, zve32f +NOHWF fmv.w.x fa0, a2 +NOHWF mv a2, a3 +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v24, (a1) + sub a2, a2, t0 + vfcvt.f.x.v v24, v24 + sh2add a1, t0, a1 + vfmul.vf v24, v24, fa0 + vse32.v v24, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc From patchwork Sun Sep 25 14:26:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38291 Delivered-To: andriy.gelman@gmail.com Received: by 2002:a05:6a10:9905:b0:2f4:3559:b653 with SMTP id j5csp1811220pxh; Sun, 25 Sep 2022 07:29:35 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4i5W2pkhxbbq2clt7SxxgCYjsu5hdDNsQCzyVFBZh+yMvUa3bYVm/YoEYMYMAM9WyKXDRZ X-Received: by 2002:a17:907:2cc8:b0:77d:6f62:7661 with SMTP id hg8-20020a1709072cc800b0077d6f627661mr14586254ejc.233.1664116174836; Sun, 25 Sep 2022 07:29:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116174; cv=none; d=google.com; s=arc-20160816; b=HpCzXcq1oyvEpym1yV3vMxLtRWCkAD0w+mIAsRuwa4Pifvj+fGGfICyMJ7E/PidAC2 y1LgxqV/vIr5SZwFKpZn/NQ2AOxps4R3Up0aLoguj4SmM+K0bJkS9f06hoKIJAkNso3I sXoljXTYsemNw2TVGbhfekIkfylP1HV0qP9tAYPPu1y2K01q9ksZ1DUUWVMdSjN/xqMt 44vObkUPJC+G9VCtXXICILq/IPCdnWxe1c1/tYxSepbUBefioFSYSVkDTzJI2iFll7GL 2V4KVdgjAFag43Dz3Lm6bik00JRJ/mWh6PRYLnie4Y0QdOshhZupOm0Tqx2v9ruB7/fs BEAw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id 
:precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=uuwsOy5WFqcXJRYpJCgXfMqtqL26EKMyZz0xFmZFtBI=; b=YELT/CvbngkTiLgmVRjNa8Q57eeyZZbI9l7fxnN5dHPdA7BiAHl6Q5iu31c8N5GgqI lAW27YdY0Q3ypsqdDXKGD/NKlDZ5xbbLzJdT2XUksa81HUDvH0qqeGxG8sLwjPeiAbYS h58zHboR9u9MNke20R9ruyY90fm2J5MYYgduis06oIQoVDscHTL1EnymEj5SFDfA5KWU Bvv4l6R/b+hxdGMzfbE5DMT3MJYtPzguwJlbm3IhzwTYhpo7H9M1tYiRorxofeedefI4 gKiEdWsJw6DVQ2HssMNkNLi1GOliivXoxSSuETcVzmGu5o9dsZb2Y09loODSK5NyySyw t9Aw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 23-20020a508e17000000b004517955b673si12410505edw.124.2022.09.25.07.29.34; Sun, 25 Sep 2022 07:29:34 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1A39A68BC12; Sun, 25 Sep 2022 17:26:46 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2414E68B468 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id AF013C00C4 for ; Sun, 25 Sep 2022 17:26:23 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:12 +0300 Message-Id: <20220925142619.67917-24-remi@remlab.net> X-Mailer: 
git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 24/31] lavc/fmtconvert: RISC-V V int32_to_float_fmul_array8 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: ZI/pLdB1BcTL Content-Length: 3417 From: Rémi Denis-Courmont --- libavcodec/riscv/fmtconvert_init.c | 7 ++++++- libavcodec/riscv/fmtconvert_rvv.S | 28 ++++++++++++++++++++++++++++ 2 files changed, 34 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c index fd2f58d060..1796717a1c 100644 --- a/libavcodec/riscv/fmtconvert_init.c +++ b/libavcodec/riscv/fmtconvert_init.c @@ -27,13 +27,18 @@ void ff_int32_to_float_fmul_scalar_rvv(float *dst, const int32_t *src, float mul, int len); +void ff_int32_to_float_fmul_array8_rvv(FmtConvertContext *c, float *dst, + const int32_t *src, const float *mul, + int len); av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c) { #ifdef HAVE_RVV int flags = av_get_cpu_flags(); - if (flags & AV_CPU_FLAG_RV_ZVE32F) + if (flags & AV_CPU_FLAG_RV_ZVE32F) { c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_rvv; + c->int32_to_float_fmul_array8 = ff_int32_to_float_fmul_array8_rvv; + } #endif } diff --git a/libavcodec/riscv/fmtconvert_rvv.S b/libavcodec/riscv/fmtconvert_rvv.S index b7c78831a0..c79f80cc47 100644 --- a/libavcodec/riscv/fmtconvert_rvv.S +++ b/libavcodec/riscv/fmtconvert_rvv.S @@ -37,3 +37,31 @@ NOHWF mv a2, a3 ret endfunc + +func ff_int32_to_float_fmul_array8_rvv, zve32f + srai a4, a4, 3 + +1: vsetvli t0, a4, e32, m1, ta, ma + vle32.v v24, (a3) + slli t2, t0, 2 + 3 + vlseg8e32.v v16, (a2) + vsetvli 
t3, zero, e32, m8, ta, ma + vfcvt.f.x.v v16, v16 + vsetvli zero, a4, e32, m1, ta, ma + vfmul.vv v16, v16, v24 + sub a4, a4, t0 + vfmul.vv v17, v17, v24 + sh2add a3, t0, a3 + vfmul.vv v18, v18, v24 + add a2, a2, t2 + vfmul.vv v19, v19, v24 + vfmul.vv v20, v20, v24 + vfmul.vv v21, v21, v24 + vfmul.vv v22, v22, v24 + vfmul.vv v23, v23, v24 + vsseg8e32.v v16, (a1) + add a1, a1, t2 + bnez a4, 1b + + ret +endfunc From patchwork Sun Sep 25 14:26:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38280 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1707615pzh; Sun, 25 Sep 2022 07:30:17 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6Ks/zj08o/vKM5IllQhYPX6/r+xYOVfTEwyCvw0OcslzwVwinKfhKfEUI0s4RHZwAh7keA X-Received: by 2002:a17:907:7287:b0:783:5e47:33a0 with SMTP id dt7-20020a170907728700b007835e4733a0mr1759782ejc.449.1664116217605; Sun, 25 Sep 2022 07:30:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116217; cv=none; d=google.com; s=arc-20160816; b=xDyJZ89yGVfLMRl02YRH0hMvsBbIBfj+a7xtE/1Ph3hNHx5I7+zI9DZ2rcHwXC3N2W mtAngsLWZ3J1uDeY/K4hP+Ni5K+hqbNfvLu4ZnWzIXZAgT+pnFs2QVGaH19nRLcRTx9g tsAvwPQ2TsjZRZwwAq+Bi+WUEfyPo9QjmwHyRToJlQT0ljckrUp11ENnV8IJ7t1DCj6v 74T3K8xJL+FHPRKWxiuF341WCAn3LVjhnkRV5dL8oEeGSAMsH/kAh3z3KR/SoOMYg5w6 4ivy29yvi281oizShcGyQRjUuehKcmxgSKYlEbTWec4VbUpQSsYXe5gF+Csq+EH5lYXK gtXg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=LzVWChJeyWho2o9o/d4rvEtb2+zWRXthaahEeocIqmg=; b=CBn+eSR6Q7k6BTtFXOf0A3hf+fze6icmjJOIOoDYT0kkqSXNptpEQDwcPLmY9A8t3N 23J2U3SbymVSxzz4mk37EZehfvsr6Mrwyqkmz4q5DDwXxhHf6XHx0FGoL8xmg5LoKKbm 
WnOPYHvL9Irv2C1m3zvQbXnl9es/YTWiHys8YdekjN4pIyRuzkCvOrJESb3F/in21vor bGbFc6f0KwQ2FPoc4QrCvUoifGf4y6vkmGJ7+othb5op3lfkKy/SMMm3vjlQ7uDRyPPv +SkrtOpDwlywIPQbzBbcQmhuYL4q+62si6ioJffJMmvg457w47vX2ltOS09m1Q5rpq19 jnwQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id e13-20020a1709062c0d00b00780dd2b1fb3si10915588ejh.867.2022.09.25.07.30.17; Sun, 25 Sep 2022 07:30:17 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 195C268BC31; Sun, 25 Sep 2022 17:26:51 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4506768BB5A for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id D8563C00C5 for ; Sun, 25 Sep 2022 17:26:23 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:13 +0300 Message-Id: <20220925142619.67917-25-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 25/31] lavc/vorbisdsp: RISC-V V inverse_coupling X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 
Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: YkNkUTDE5RYz From: Rémi Denis-Courmont This uses the following vectorisation: for (i = 0; i < blocksize; i++) { float a = ang[i]; ang[i] = mag[i] - copysignf(fmaxf(a, 0.f), mag[i]); mag[i] = mag[i] - copysignf(fminf(a, 0.f), mag[i]); } --- libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/vorbisdsp_init.c | 37 ++++++++++++++++++++++++++ libavcodec/riscv/vorbisdsp_rvv.S | 44 +++++++++++++++++++++++++++++++ libavcodec/vorbisdsp.c | 2 ++ libavcodec/vorbisdsp.h | 1 + 5 files changed, 86 insertions(+) create mode 100644 libavcodec/riscv/vorbisdsp_init.c create mode 100644 libavcodec/riscv/vorbisdsp_rvv.S diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 682174e875..03a95301d7 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -5,3 +5,5 @@ OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ riscv/pixblockdsp_rvi.o +OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o +RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o diff --git a/libavcodec/riscv/vorbisdsp_init.c b/libavcodec/riscv/vorbisdsp_init.c new file mode 100644 index 0000000000..d8432bc0f8 --- /dev/null +++ b/libavcodec/riscv/vorbisdsp_init.c @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/vorbisdsp.h" + +void ff_vorbis_inverse_coupling_rvv(float *mag, float *ang, + ptrdiff_t blocksize); + +av_cold void ff_vorbisdsp_init_riscv(VorbisDSPContext *c) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + c->vorbis_inverse_coupling = ff_vorbis_inverse_coupling_rvv; +#endif +} diff --git a/libavcodec/riscv/vorbisdsp_rvv.S b/libavcodec/riscv/vorbisdsp_rvv.S new file mode 100644 index 0000000000..e8953fb548 --- /dev/null +++ b/libavcodec/riscv/vorbisdsp_rvv.S @@ -0,0 +1,44 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_vorbis_inverse_coupling_rvv, zve32f + fmv.w.x ft0, zero +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v16, (a1) + sub a2, a2, t0 + vle32.v v24, (a0) + vfmax.vf v8, v16, ft0 + vfmin.vf v16, v16, ft0 + vfsgnj.vv v8, v8, v24 + vfsgnj.vv v16, v16, v24 + vfsub.vv v8, v24, v8 + vfsub.vv v24, v24, v16 + vse32.v v8, (a1) + sh2add a1, t0, a1 + vse32.v v24, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc diff --git a/libavcodec/vorbisdsp.c b/libavcodec/vorbisdsp.c index 693c44dfcb..70022bd262 100644 --- a/libavcodec/vorbisdsp.c +++ b/libavcodec/vorbisdsp.c @@ -53,6 +53,8 @@ av_cold void ff_vorbisdsp_init(VorbisDSPContext *dsp) ff_vorbisdsp_init_arm(dsp); #elif ARCH_PPC ff_vorbisdsp_init_ppc(dsp); +#elif ARCH_RISCV + ff_vorbisdsp_init_riscv(dsp); #elif ARCH_X86 ff_vorbisdsp_init_x86(dsp); #endif diff --git a/libavcodec/vorbisdsp.h b/libavcodec/vorbisdsp.h index 1775a92cf2..5c369ecf22 100644 --- a/libavcodec/vorbisdsp.h +++ b/libavcodec/vorbisdsp.h @@ -34,5 +34,6 @@ void ff_vorbisdsp_init_aarch64(VorbisDSPContext *dsp); void ff_vorbisdsp_init_x86(VorbisDSPContext *dsp); void ff_vorbisdsp_init_arm(VorbisDSPContext *dsp); void ff_vorbisdsp_init_ppc(VorbisDSPContext *dsp); +void ff_vorbisdsp_init_riscv(VorbisDSPContext *dsp); #endif /* AVCODEC_VORBISDSP_H */ From patchwork Sun Sep 25 14:26:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38276 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1707381pzh; Sun, 25 Sep 2022 07:29:45 -0700 (PDT) X-Google-Smtp-Source: 
AMsMyM4jcVF1WUqYHBQt/cCFpxZMr2V/TFx54jXeJwXYNuDUcW0pTWzb2d4uZqNt4MyUJSJJx3l+ X-Received: by 2002:a05:6402:f0f:b0:451:1ecd:a61f with SMTP id i15-20020a0564020f0f00b004511ecda61fmr18005633eda.125.1664116185274; Sun, 25 Sep 2022 07:29:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116185; cv=none; d=google.com; s=arc-20160816; b=vplO5KCllbuXk2mdbdhXSUy2HKgH1J9XoGFUjSMK5wRhfOvAENRHXWXOMuFfSqiQI9 allyO4JzZ7vrTAxmDYxk4Xbiztjy7CtItcV84YxwMbZdputgznYfnED0F0LU4iAFZvIT Idf2MaDHY09xkQp0FKS5aWTsXgQQa6ApVnDz5psz2D49hspm1IZVVkYrRY9HAdxzCP59 WQq6Zdhk15MCujQmEhjpWanJl2DcCG3t2TNZ9YQpM8p8eKywfcVHcDB/VSv6oDXz+UX9 cO35imQmV0rnsT0RH9SZHMemxE1TPkAe5ZvyzhuI5oOiA2UJOLpeZCNLNJu21Q7CpK7L O5vw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=XhZ5Icl5RHSU1Ee2fgXb4H4fLFbfs4BF9MbWqLNzDXk=; b=KQ/41lc3U/XS5DLbU7FSvTNSdwH1C5gHk8RR/RQ7rPAv3uBO4ahYc3hW9yIKkvlnvb vzh5DbvipgceqSHQ7CFhNKRMEPSeOrqrghDt9jw9kwJQy3BORRJZhdSH7N65TJxIpHGx t44fyazGmzcV0jgodtTKhxMNSivJ8A890huPPEyzBo9ZmSVX6VgquUr1b5cQ/Ll5XRFh 9BBCFVj/aUIkD2KVFbQr/08dI7/O2QaYlfksXUIW87PskzsITfBZNWKPSid9O/+RhIG4 XlCqRYLan8SNONKE3Uv+S7SWUYp1B29HUsJp6jfXT7l2h0VEAu6Le0zYPschQAo8HqHd k41A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id hq19-20020a1709073f1300b0077d6d63bd0dsi16188016ejc.184.2022.09.25.07.29.44; Sun, 25 Sep 2022 07:29:45 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3927368BC15; Sun, 25 Sep 2022 17:26:47 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 26BD668BA05 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 0EA88C00C6 for ; Sun, 25 Sep 2022 17:26:24 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:14 +0300 Message-Id: <20220925142619.67917-26-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 26/31] lavc/aacpsdsp: RISC-V V add_squares X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: gxjMn1ypFfOg From: Rémi Denis-Courmont --- libavcodec/aacpsdsp.h | 1 + libavcodec/aacpsdsp_template.c | 2 ++ libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/aacpsdsp_init.c | 37 ++++++++++++++++++++++++++++++++ 
libavcodec/riscv/aacpsdsp_rvv.S | 37 ++++++++++++++++++++++++++++++++ 5 files changed, 79 insertions(+) create mode 100644 libavcodec/riscv/aacpsdsp_init.c create mode 100644 libavcodec/riscv/aacpsdsp_rvv.S diff --git a/libavcodec/aacpsdsp.h b/libavcodec/aacpsdsp.h index 917ac5303f..8b32761bdb 100644 --- a/libavcodec/aacpsdsp.h +++ b/libavcodec/aacpsdsp.h @@ -55,6 +55,7 @@ void AAC_RENAME(ff_psdsp_init)(PSDSPContext *s); void ff_psdsp_init_arm(PSDSPContext *s); void ff_psdsp_init_aarch64(PSDSPContext *s); void ff_psdsp_init_mips(PSDSPContext *s); +void ff_psdsp_init_riscv(PSDSPContext *s); void ff_psdsp_init_x86(PSDSPContext *s); #endif /* AVCODEC_AACPSDSP_H */ diff --git a/libavcodec/aacpsdsp_template.c b/libavcodec/aacpsdsp_template.c index e3cbf3feec..c063788b89 100644 --- a/libavcodec/aacpsdsp_template.c +++ b/libavcodec/aacpsdsp_template.c @@ -230,6 +230,8 @@ av_cold void AAC_RENAME(ff_psdsp_init)(PSDSPContext *s) ff_psdsp_init_aarch64(s); #elif ARCH_MIPS ff_psdsp_init_mips(s); +#elif ARCH_RISCV + ff_psdsp_init_riscv(s); #elif ARCH_X86 ff_psdsp_init_x86(s); #endif diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 03a95301d7..829a1823d2 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,3 +1,5 @@ +OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o +RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c new file mode 100644 index 0000000000..525fc9aa38 --- /dev/null +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/aacpsdsp.h" + +void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); + +av_cold void ff_psdsp_init_riscv(PSDSPContext *c) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + c->add_squares = ff_ps_add_squares_rvv; +#endif +} diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S new file mode 100644 index 0000000000..b516063ea7 --- /dev/null +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_ps_add_squares_rvv, zve32f +1: + vsetvli t0, a2, e32, m1, ta, ma + vlseg2e32.v v24, (a1) + sub a2, a2, t0 + vle32.v v16, (a0) + sh3add a1, t0, a1 + vfmacc.vv v16, v24, v24 + vfmacc.vv v16, v25, v25 + vse32.v v16, (a0) + sh2add a0, t0, a0 + bnez a2, 1b + + ret +endfunc From patchwork Sun Sep 25 14:26:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38279 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1707546pzh; Sun, 25 Sep 2022 07:30:09 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6wNq1A1G17U7ouJQNzszEdgFMyT/n6u1eA7ruSJHiCOu+X32JgsqQHXCx9uuQQd73Zgd5W X-Received: by 2002:a05:6402:e87:b0:456:c93c:5361 with SMTP id h7-20020a0564020e8700b00456c93c5361mr11549736eda.88.1664116208854; Sun, 25 Sep 2022 07:30:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116208; cv=none; d=google.com; s=arc-20160816; b=njIFlfs+7Nlts7oXDKDoQy2E8pwaMScAW9Kdu9HV03fXDapjo6Cvs8priwux+NauUh /N0mu3/s7I3srGkHe9jqXqsfL2e5PSpszck6Y24xCBob7pX/e2VZEpNECKTxTywGhNdS LY5sJ3hCfLdi/Dk7MiKFJZboiqZrxiKYk/6nhJM9VDtCLfspMSyhWBJ5R4tj7C81p2LP RdIcqCnEPogddzm8lpg/LD7KtEpwTp7TiFlvB9RRslbNdg/anesxIGSYVKgIwoO9qZ1U zj8ATTAFy/mnxc1HsrIMyzEpLUwMLBLlwzbwyHgdtjvZbv1Nk+++Om4MdOqaTn3Onbem dPQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=eHYRqsM8t0dH42f7vfL4nou+gTk7+wVKkOvwr27Du8I=; 
b=TLTzEvMcQb03oVLiDp3n2R+O+ahV8xuC7q9Pb2FrfrM09Rs57/LkcV0NrQtzn7CpaU +x043pONRp3e4XF5ubCluGqG6SI95AqzZ+TfAzG3DLEmYl4j+OUnu0lta77sCUC0Nb8Q bTGPImKiiIUWE4aIdLcokqqiC4C1vkuDnbV9CkEgbw+RzIqsyvJASWgrka0nEQv12GDD LJgySqrrXkfctZ8+UreHOWE4dLYFj1JOi4Mb6/W6DM7OYQlmTTseZ1BQDGGqUFvmhLS6 SJfOScNDR8DIiK/fMEDyZxz5Q2nwXQbWx2v0H7of5Ke6ducwTks9+cFUsym8qKDEsYci kubA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id w24-20020a170906b19800b00730672860bdsi10978553ejy.123.2022.09.25.07.30.08; Sun, 25 Sep 2022 07:30:08 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E438B68BC2E; Sun, 25 Sep 2022 17:26:49 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4463D68BB19 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 3881DC00C7 for ; Sun, 25 Sep 2022 17:26:24 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:15 +0300 Message-Id: <20220925142619.67917-27-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 
Subject: [FFmpeg-devel] [PATCH 27/31] lavc/aacpsdsp: RISC-V V mul_pair_single X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: d+tWs8JA/xIP From: Rémi Denis-Courmont --- libavcodec/riscv/aacpsdsp_init.c | 6 +++++- libavcodec/riscv/aacpsdsp_rvv.S | 17 +++++++++++++++++ 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 525fc9aa38..90c9c501c3 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -25,13 +25,17 @@ #include "libavcodec/aacpsdsp.h" void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); +void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1, + int n); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { #if HAVE_RVV int flags = av_get_cpu_flags(); - if (flags & AV_CPU_FLAG_RV_ZVE32F) + if (flags & AV_CPU_FLAG_RV_ZVE32F) { c->add_squares = ff_ps_add_squares_rvv; + c->mul_pair_single = ff_ps_mul_pair_single_rvv; + } #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index b516063ea7..70b7b72218 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -35,3 +35,20 @@ func ff_ps_add_squares_rvv, zve32f ret endfunc + +func ff_ps_mul_pair_single_rvv, zve32f +1: + vsetvli t0, a3, e32, m1, ta, ma + vlseg2e32.v v24, (a1) + sub a3, a3, t0 + vle32.v v16, (a2) + sh3add a1, t0, a1 + vfmul.vv v24, v24, v16 + sh2add a2, t0, a2 + vfmul.vv v25, v25, v16 + vsseg2e32.v v24, (a0) + sh3add a0, t0, a0 + bnez a3, 1b + + ret +endfunc From patchwork Sun Sep 25 14:26:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38281 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1707698pzh; Sun, 25 Sep 2022 07:30:27 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4OtigJnLev7A+DS1oURhtQEf/7DnsWQmtzEAliw0bPuRvWu8FAIO19RRh/nj3QnIqAA4ld X-Received: by 2002:a05:6402:51cc:b0:454:c988:48f0 with SMTP id r12-20020a05640251cc00b00454c98848f0mr17890750edd.74.1664116227100; Sun, 25 Sep 2022 07:30:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116227; cv=none; d=google.com; s=arc-20160816; b=a5lj8APTKv2HGi5gN/U0FkS+voBGN7ayMzUnn5r6PdydXZkPPFq4NqaDgV58FbsVXf 2UGOISjF9NPvnF5rryEuTt80md40TFwWo2/g1IJP+j/qwHl6PmQXGH8u0x19/UyqHHBf 3IQP2K+WVreDzyGsZfpcTEydjpmb/PxHJalMwPzP5uUpwEC75tpQwjLQ2q9W5ff1fmZi q/5/x9ObqKLr+8n20EsAJweVGg2SfqC8zMlCFwxvrIWEK2wfYb9ebUGvc17pXt5x2vVh MRtkd/S1ex1QhPqN0ex/7ouL6TnK0U6W0Nk4W6sNdmD3ENhWQB90tVK8XWIJrCU3OlY+ 4b7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=nelGJhcc0ewgI1TIC+9ICkjTvCHAuYGbgne2y+cYqCI=; b=nD8j9t9V8Fw5UdlYVwcJJF5IzYo0m26KbO/E6xDoCzGEYPiNOsJ937KsUfjZjqqHx1 VR0IOAUamjr98ACxv7drBGPBAvrQA1Axn0qwXGBZyOMOqT9OG2zZT2HM4Cq1kbllYB9F 37aaVBqNTVeXZj0MbyKZHofhwJyPvmVgyE4Ffdt2pZnI6XOiwNlIjlMrH9/TL4uKt99o 7EvgCTpWEsWbLnYAxVTIu8CUnqV/9S1T0rWgyX6sXOwMfjg2eVs/5r918Xhr0qIAj2/c 5QMbhgKN2r8sSa6oA7m2QjO/SE9I4hb3NeQ3oV2AyKeG5IMCsdaf9KRlhIggn6bACrCz 90wg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id s3-20020a1709062ec300b00734c8b99826si12111603eji.803.2022.09.25.07.30.25; Sun, 25 Sep 2022 07:30:27 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0B75768BBA1; Sun, 25 Sep 2022 17:26:52 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4912868BB08 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 61EFDC00C8 for ; Sun, 25 Sep 2022 17:26:24 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:16 +0300 Message-Id: <20220925142619.67917-28-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 28/31] lavc/aacpsdsp: RISC-V V hybrid_analysis X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: C2pRDXYeJBf2 From: Rémi Denis-Courmont This starts with one-time initialisation of the 26 constant factors like 08edacc248bce3f8946d75e97188d189c74a6de6. That is done with the scalar instruction set. 
While the formula can readily be vectorised, the gains would (probably) be more than lost in transferring the results back to FP registers (or suitably reshuffling them into vector registers). Note that the main loop could likely be scheduled slightly better by expanding the filter macro and interleaving loads with arithmetic. It is not clear yet if that would be relevant for vector processing (as opposed to traditional SIMD). We could also use fewer vectors, but there is not much point in sparing them (they are *all* callee-clobbered). --- libavcodec/riscv/aacpsdsp_init.c | 3 + libavcodec/riscv/aacpsdsp_rvv.S | 97 ++++++++++++++++++++++++++++++++ 2 files changed, 100 insertions(+) diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 90c9c501c3..6222d6f787 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -27,6 +27,8 @@ void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1, int n); +void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2], + const float (*filter)[8][2], ptrdiff_t, int n); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { @@ -36,6 +38,7 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c) if (flags & AV_CPU_FLAG_RV_ZVE32F) { c->add_squares = ff_ps_add_squares_rvv; c->mul_pair_single = ff_ps_mul_pair_single_rvv; + c->hybrid_analysis = ff_ps_hybrid_analysis_rvv; } #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index 70b7b72218..65e5e0be4f 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -52,3 +52,100 @@ func ff_ps_mul_pair_single_rvv, zve32f ret endfunc + +func ff_ps_hybrid_analysis_rvv, zve32f + /* We need 26 FP registers, for 20 scratch ones. Spill fs0-fs5. 
*/ + addi sp, sp, -32 + .irp n, 0, 1, 2, 3, 4, 5 + fsw fs\n, (4 * \n)(sp) + .endr + + .macro input, j, fd0, fd1, fd2, fd3 + flw \fd0, (4 * ((\j * 2) + 0))(a1) + flw fs4, (4 * (((12 - \j) * 2) + 0))(a1) + flw \fd1, (4 * ((\j * 2) + 1))(a1) + fsub.s \fd3, \fd0, fs4 + flw fs5, (4 * (((12 - \j) * 2) + 1))(a1) + fadd.s \fd2, \fd1, fs5 + fadd.s \fd0, \fd0, fs4 + fsub.s \fd1, \fd1, fs5 + .endm + + // re0, re1, im0, im1 + input 0, ft0, ft1, ft2, ft3 + input 1, ft4, ft5, ft6, ft7 + input 2, ft8, ft9, ft10, ft11 + input 3, fa0, fa1, fa2, fa3 + input 4, fa4, fa5, fa6, fa7 + input 5, fs0, fs1, fs2, fs3 + flw fs4, (4 * ((6 * 2) + 0))(a1) + flw fs5, (4 * ((6 * 2) + 1))(a1) + + add a2, a2, 6 * 2 * 4 // point to filter[i][6][0] + li t4, 8 * 2 * 4 // filter byte stride + slli a3, a3, 3 // output byte stride +1: + .macro filter, vs0, vs1, fo0, fo1, fo2, fo3 + vfmacc.vf v8, \fo0, \vs0 + vfmacc.vf v9, \fo2, \vs0 + vfnmsac.vf v8, \fo1, \vs1 + vfmacc.vf v9, \fo3, \vs1 + .endm + + vsetvli t0, a4, e32, m1, ta, ma + /* + * The filter (a2) has 16 segments, of which 13 need to be extracted. + * R-V V supports only up to 8 segments, so unrolling is unavoidable. 
+ */ + addi t1, a2, -48 + vlse32.v v22, (a2), t4 + addi t2, a2, -44 + vlse32.v v16, (t1), t4 + addi t1, a2, -40 + vfmul.vf v8, v22, fs4 + vlse32.v v24, (t2), t4 + addi t2, a2, -36 + vfmul.vf v9, v22, fs5 + vlse32.v v17, (t1), t4 + addi t1, a2, -32 + vlse32.v v25, (t2), t4 + addi t2, a2, -28 + filter v16, v24, ft0, ft1, ft2, ft3 + vlse32.v v18, (t1), t4 + addi t1, a2, -24 + vlse32.v v26, (t2), t4 + addi t2, a2, -20 + filter v17, v25, ft4, ft5, ft6, ft7 + vlse32.v v19, (t1), t4 + addi t1, a2, -16 + vlse32.v v27, (t2), t4 + addi t2, a2, -12 + filter v18, v26, ft8, ft9, ft10, ft11 + vlse32.v v20, (t1), t4 + addi t1, a2, -8 + vlse32.v v28, (t2), t4 + addi t2, a2, -4 + filter v19, v27, fa0, fa1, fa2, fa3 + vlse32.v v21, (t1), t4 + sub a4, a4, t0 + vlse32.v v29, (t2), t4 + slli t1, t0, 3 + 1 + 2 // ctz(8 * 2 * 4) + add a2, a2, t1 + filter v20, v28, fa4, fa5, fa6, fa7 + filter v21, v29, fs0, fs1, fs2, fs3 + + add t2, a0, 4 + vsse32.v v8, (a0), a3 + mul t0, t0, a3 + vsse32.v v9, (t2), a3 + add a0, a0, t0 + bnez a4, 1b + + .irp n, 5, 4, 3, 2, 1, 0 + flw fs\n, (4 * \n)(sp) + .endr + addi sp, sp, 32 + ret + .purgem input + .purgem filter +endfunc From patchwork Sun Sep 25 14:26:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38282 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1707794pzh; Sun, 25 Sep 2022 07:30:35 -0700 (PDT) X-Google-Smtp-Source: AMsMyM63fU7SxNuctYlB91IDxEKL0yujcA96FTwvKL57lhy3IfdhCTu2znjvifo2qY1sT7NITzz2 X-Received: by 2002:a17:907:74e:b0:74f:83d4:cf58 with SMTP id xc14-20020a170907074e00b0074f83d4cf58mr14950834ejb.178.1664116235461; Sun, 25 Sep 2022 07:30:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664116235; cv=none; d=google.com; s=arc-20160816; b=ClMZeLNvAPI96T6iXLW3hji/3LR93w1BZDv8GdTQTbHpvfU5FlPatPrYQQKJl+fto2 
FQ2CZ7Jx10cXDtXyRP3xLvVARPcFBf+j34z63xz0GXOdnvOkOJHpTNC8I8GXZGphTHRz +wCll6N/KXTHAA9wNnnmag6myvalaPzWshacSokBWtI+JzJ+aSFVGlUIHlsjhkn+0uHf BM7ccQSJAmqB/H+itOIWKzL6gTL3J9ObBSfOPFKbSTCVK7eEyLUnzysUoFn5GdGiIQA4 z9LgwSDAwUA/64QXqtdw7lw2TFXEztXDMJ/+4SpCOnxlzfKEaY4sSuIJvPmZCeUghG/s z85g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ZUlTz0I9pTVYMbkjk7n7ydueMe8ktacrmM2aCMy09Qc=; b=w5b5adRM5cPdJrIsriuzEv8d4QPqE+ojJhn8c0LuhzTN1UQVvGQhBvePNVFDLsfFpu 1J1ScSd4x1H3T4qIW87MJnjLALGWoupCIz58ery+g7V/GwkDaJy9uycf4kLRCQijJ5+h foiBFE3MeakolVTEhFdhI3o6vqn0VQrmpcVMWrjUdNdynURm+VCvsIEyhth2EI1DGCVh FzxaTtndkXfAbUHc29AlXV6llhbOP33pUsRPHUYcrSzu0x6EVlBZKw2jSWt/kVi6A1np l8+PtNXcv038ouwSgZdL+z+ySCtUIEl14ExF960J5c14FRS+leui1/Z5/CRK2yqodPFu MM4w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id t14-20020a056402524e00b00450bda7e3fdsi15389099edd.28.2022.09.25.07.30.34; Sun, 25 Sep 2022 07:30:35 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0AE7368BC41; Sun, 25 Sep 2022 17:26:53 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4E32D68BB72 for ; Sun, 25 Sep 2022 17:26:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 8BBB1C00C9 for ; Sun, 25 Sep 2022 17:26:24 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sun, 25 Sep 2022 17:26:17 +0300 Message-Id: <20220925142619.67917-29-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net> References: <5861881.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 29/31] lavc/aacpsdsp: RISC-V V hybrid_analysis_ileave X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: qpIRz/Ki0VSv From: Rémi Denis-Courmont --- libavcodec/riscv/aacpsdsp_init.c | 14 +++++++++---- libavcodec/riscv/aacpsdsp_rvv.S | 35 ++++++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+), 4 deletions(-) 
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 6222d6f787..76f55502ee 100644
--- a/libavcodec/riscv/aacpsdsp_init.c
+++ b/libavcodec/riscv/aacpsdsp_init.c
@@ -29,16 +29,22 @@ void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1,
                                int n);
 void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2],
                                const float (*filter)[8][2], ptrdiff_t, int n);
+void ff_ps_hybrid_analysis_ileave_rvv(float (*out)[32][2], float L[2][38][64],
+                                      int i, int len);
 
 av_cold void ff_psdsp_init_riscv(PSDSPContext *c)
 {
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if (flags & AV_CPU_FLAG_RV_ZVE32F) {
-        c->add_squares = ff_ps_add_squares_rvv;
-        c->mul_pair_single = ff_ps_mul_pair_single_rvv;
-        c->hybrid_analysis = ff_ps_hybrid_analysis_rvv;
+    if (flags & AV_CPU_FLAG_RV_ZVE32X) {
+        c->hybrid_analysis_ileave = ff_ps_hybrid_analysis_ileave_rvv;
+
+        if (flags & AV_CPU_FLAG_RV_ZVE32F) {
+            c->add_squares = ff_ps_add_squares_rvv;
+            c->mul_pair_single = ff_ps_mul_pair_single_rvv;
+            c->hybrid_analysis = ff_ps_hybrid_analysis_rvv;
+        }
     }
 #endif
 }
diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S
index 65e5e0be4f..c9cc15e73d 100644
--- a/libavcodec/riscv/aacpsdsp_rvv.S
+++ b/libavcodec/riscv/aacpsdsp_rvv.S
@@ -149,3 +149,38 @@ func ff_ps_hybrid_analysis_rvv, zve32f
         .purgem input
         .purgem filter
 endfunc
+
+func ff_ps_hybrid_analysis_ileave_rvv, zve32x /* no need for zve32f here */
+        slli        t0, a2, 5 + 1 + 2 // ctz(32 * 2 * 4)
+        sh2add      a1, a2, a1
+        add         a0, a0, t0
+        addi        a2, a2, -64
+        li          t1, 38 * 64 * 4
+        li          t6, 64 * 4 // (uint8_t *)L[x][j+1][i] - L[x][j][i]
+        add         a4, a1, t1 // &L[1]
+        beqz        a2, 3f
+1:
+        mv          t0, a0
+        mv          t1, a1
+        mv          t3, a3
+        mv          t4, a4
+        addi        a2, a2, 1
+2:
+        vsetvli     t5, t3, e32, m1, ta, ma
+        vlse32.v    v16, (t1), t6
+        sub         t3, t3, t5
+        vlse32.v    v17, (t4), t6
+        mul         t2, t5, t6
+        vsseg2e32.v v16, (t0)
+        sh3add      t0, t5, t0
+        add         t1, t1, t2
+        add         t4, t4, t2
+        bnez        t3, 2b
+
+        add         a0, a0, 32 * 2 * 4
+        add         a1, a1, 4
+        add         a4, a4, 4
+        bnez        a2, 1b
+3:
+        ret
+endfunc

From patchwork Sun Sep 25 14:26:18 2022
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38283
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:26:18 +0300
Message-Id: <20220925142619.67917-30-remi@remlab.net>
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 30/31] lavc/aacpsdsp: RISC-V V hybrid_synthesis_deint

From: Rémi Denis-Courmont

---
 libavcodec/riscv/aacpsdsp_init.c | 3 +++
 libavcodec/riscv/aacpsdsp_rvv.S | 35 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 76f55502ee..20b1a12741 100644
--- a/libavcodec/riscv/aacpsdsp_init.c
+++ b/libavcodec/riscv/aacpsdsp_init.c
@@ -31,6 +31,8 @@ void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2],
                                const float (*filter)[8][2], ptrdiff_t, int n);
 void ff_ps_hybrid_analysis_ileave_rvv(float (*out)[32][2], float L[2][38][64],
                                       int i, int len);
+void ff_ps_hybrid_synthesis_deint_rvv(float out[2][38][64], float (*in)[32][2],
+                                      int i, int len);
 
 av_cold void ff_psdsp_init_riscv(PSDSPContext *c)
 {
@@ -39,6 +41,7 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c)
 
     if (flags & AV_CPU_FLAG_RV_ZVE32X) {
         c->hybrid_analysis_ileave = ff_ps_hybrid_analysis_ileave_rvv;
+        c->hybrid_synthesis_deint = ff_ps_hybrid_synthesis_deint_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE32F) {
             c->add_squares = ff_ps_add_squares_rvv;
diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S
index c9cc15e73d..0cbe4c1d3c 100644
--- a/libavcodec/riscv/aacpsdsp_rvv.S
+++ b/libavcodec/riscv/aacpsdsp_rvv.S
@@ -184,3 +184,38 @@ func ff_ps_hybrid_analysis_ileave_rvv, zve32x /* no need for zve32f here */
 3:
         ret
 endfunc
+
+func ff_ps_hybrid_synthesis_deint_rvv, zve32x
+        slli        t1, a2, 5 + 1 + 2
+        sh2add      a0, a2, a0
+        add         a1, a1, t1
+        addi        a2, a2, -64
+        li          t1, 38 * 64 * 4
+        li          t6, 64 * 4
+        add         a4, a0, t1
+        beqz        a2, 3f
+1:
+        mv          t0, a0
+        mv          t1, a1
+        mv          t3, a3
+        mv          t4, a4
+        addi        a2, a2, 1
+2:
+        vsetvli     t5, t3, e32, m1, ta, ma
+        vlseg2e32.v v16, (t1)
+        sub         t3, t3, t5
+        vsse32.v    v16, (t0), t6
+        mul         t2, t5, t6
+        vsse32.v    v17, (t4), t6
+        sh3add      t1, t5, t1
+        add         t0, t0, t2
+        add         t4, t4, t2
+        bnez        t3, 2b
+
+        add         a0, a0, 4
+        add         a1, a1, 32 * 2 * 4
+        add         a4, a4, 4
+        bnez        a2, 1b
+3:
+        ret
+endfunc

From patchwork Sun Sep 25 14:26:19 2022
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38284
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 25 Sep 2022 17:26:19 +0300
Message-Id: <20220925142619.67917-31-remi@remlab.net>
In-Reply-To: <5861881.lOV4Wx5bFT@basile.remlab.net>
References: <5861881.lOV4Wx5bFT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 31/31] lavc/aacpsdsp: RISC-V V stereo_interpolate[0]

From: Rémi Denis-Courmont

---
 libavcodec/riscv/aacpsdsp_init.c | 4 ++
 libavcodec/riscv/aacpsdsp_rvv.S | 65 ++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 20b1a12741..58a4c61121 100644
--- a/libavcodec/riscv/aacpsdsp_init.c
+++ b/libavcodec/riscv/aacpsdsp_init.c
@@ -34,6 +34,9 @@ void ff_ps_hybrid_analysis_ileave_rvv(float (*out)[32][2], float L[2][38][64],
 void ff_ps_hybrid_synthesis_deint_rvv(float out[2][38][64], float (*in)[32][2],
                                       int i, int len);
 
+void ff_ps_stereo_interpolate_rvv(float (*l)[2], float (*r)[2],
+                                  float h[2][4], float h_step[2][4], int len);
+
 av_cold void ff_psdsp_init_riscv(PSDSPContext *c)
 {
 #if HAVE_RVV
@@ -47,6 +50,7 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c)
             c->add_squares = ff_ps_add_squares_rvv;
             c->mul_pair_single = ff_ps_mul_pair_single_rvv;
             c->hybrid_analysis = ff_ps_hybrid_analysis_rvv;
+            c->stereo_interpolate[0] = ff_ps_stereo_interpolate_rvv;
         }
     }
 #endif
diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S
index 0cbe4c1d3c..a236dfe43c 100644
--- a/libavcodec/riscv/aacpsdsp_rvv.S
+++ b/libavcodec/riscv/aacpsdsp_rvv.S
@@ -219,3 +219,68 @@ func ff_ps_hybrid_synthesis_deint_rvv, zve32x
 3:
         ret
 endfunc
+
+func ff_ps_stereo_interpolate_rvv, zve32f
+        vsetvli     t0, zero, e32, m1, ta, ma
+        vid.v       v24
+        flw         ft0, (a2)
+        vadd.vi     v24, v24, 1 // v24[i] = i + 1
+        flw         ft1, 4(a2)
+        vfcvt.f.xu.v v24, v24
+        flw         ft2, 8(a2)
+        vfmv.v.f    v16, ft0
+        flw         ft3, 12(a2)
+        vfmv.v.f    v17, ft1
+        flw         ft0, (a3)
+        vfmv.v.f    v18, ft2
+        flw         ft1, 4(a3)
+        vfmv.v.f    v19, ft3
+        flw         ft2, 8(a3)
+        vfmv.v.f    v20, ft0
+        flw         ft3, 12(a3)
+        vfmv.v.f    v21, ft1
+        fcvt.s.wu   ft4, t0 // (float)(vlenb / sizeof (float))
+        vfmv.v.f    v22, ft2
+        li          t1, 8
+        vfmv.v.f    v23, ft3
+        addi        a6, a0, 4 // l[*][1]
+        vfmacc.vv   v16, v24, v20 // h0 += (i + 1) * h0_step
+        addi        a7, a1, 4 // r[*][1]
+        vfmacc.vv   v17, v24, v21
+        fmul.s      ft0, ft0, ft4
+        vfmacc.vv   v18, v24, v22
+        fmul.s      ft1, ft1, ft4
+        vfmacc.vv   v19, v24, v23
+        fmul.s      ft2, ft2, ft4
+        fmul.s      ft3, ft3, ft4
+1:
+        vsetvli     t0, a4, e32, m1, ta, ma
+        vlse32.v    v8, (a0), t1 // l_re
+        sub         a4, a4, t0
+        vlse32.v    v9, (a6), t1 // l_im
+        vlse32.v    v10, (a1), t1 // r_re
+        vlse32.v    v11, (a7), t1 // r_im
+        vfmul.vv    v12, v8, v16
+        vfmul.vv    v13, v9, v16
+        vfmul.vv    v14, v8, v17
+        vfmul.vv    v15, v9, v17
+        vfmacc.vv   v12, v10, v18
+        vfmacc.vv   v13, v11, v18
+        vfmacc.vv   v14, v10, v19
+        vfmacc.vv   v15, v11, v19
+        vsse32.v    v12, (a0), t1
+        sh3add      a0, t0, a0
+        vsse32.v    v13, (a6), t1
+        sh3add      a6, t0, a6
+        vsse32.v    v14, (a1), t1
+        sh3add      a1, t0, a1
+        vsse32.v    v15, (a7), t1
+        sh3add      a7, t0, a7
+        vfadd.vf    v16, v16, ft0 // h0 += (vlenb / sizeof (float)) * h0_step
+        vfadd.vf    v17, v17, ft1
+        vfadd.vf    v18, v18, ft2
+        vfadd.vf    v19, v19, ft3
+        bnez        a4, 1b
+
+        ret
+endfunc
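
Reviewer's note: the arithmetic of the stereo_interpolate routine in patch 31 can be written as a scalar loop. The sketch below is inferred only from the register comments in the assembly (l_re, l_im, r_re, r_im and the "h0 += ... * h0_step" notes). The function name ps_stereo_interpolate_ref is hypothetical, and this is not claimed to be FFmpeg's C implementation. It assumes plain float arithmetic and adds the step once per sample. The vector routine instead seeds each lane with h + (i + 1) * step and then advances by a whole vector length per iteration, so the two may round slightly differently.

```c
#include <assert.h>

/*
 * Hypothetical scalar model of ff_ps_stereo_interpolate_rvv.
 * Only row 0 of h and h_step is read, mirroring the flw offsets
 * 0, 4, 8 and 12 from a2 and a3 in the assembly.
 */
static void ps_stereo_interpolate_ref(float (*l)[2], float (*r)[2],
                                      float h[2][4], float h_step[2][4],
                                      int len)
{
    float h0 = h[0][0], h1 = h[0][1], h2 = h[0][2], h3 = h[0][3];
    const float s0 = h_step[0][0], s1 = h_step[0][1];
    const float s2 = h_step[0][2], s3 = h_step[0][3];

    assert(len >= 0);

    for (int n = 0; n < len; n++) {
        /* Read all four inputs before writing: l and r are updated in place. */
        const float l_re = l[n][0], l_im = l[n][1];
        const float r_re = r[n][0], r_im = r[n][1];

        /* Step first, so sample n is mixed with h + (n + 1) * h_step. */
        h0 += s0;
        h1 += s1;
        h2 += s2;
        h3 += s3;

        l[n][0] = h0 * l_re + h2 * r_re; /* v12 in the vector loop */
        l[n][1] = h0 * l_im + h2 * r_im; /* v13 */
        r[n][0] = h1 * l_re + h3 * r_re; /* v14 */
        r[n][1] = h1 * l_im + h3 * r_im; /* v15 */
    }
}
```

A model like this could serve as a comparison point when testing the vector routine, provided the comparison allows a small tolerance for the rounding difference mentioned above.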