From patchwork Tue Sep 20 14:39:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38075 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1988406pzh; Tue, 20 Sep 2022 07:40:26 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5e0jfbm2dzWZv3S0tlj2k20/WEIjoFEmdqJf6ROYw/70cFHsRFfdeQRwucmmnv+s1OfBuQ X-Received: by 2002:a05:6402:3603:b0:451:fdda:dddd with SMTP id el3-20020a056402360300b00451fddaddddmr10029990edb.81.1663684826457; Tue, 20 Sep 2022 07:40:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684826; cv=none; d=google.com; s=arc-20160816; b=sxrxFd4NG+Invg6Drth53yNBNNRBoYR9DsUgoJcbxbI/y/djC2PSNoxXusI9nCBq3f MNC5UJj+y4mmfWZ/V2l56UA3hcTGC+x5wHDt9NUNT7yPGvZgc0A96Nd0O7ZFnUncbOoL wx46jmGhy/C8zV1ID0UYF87XlGwm8yg1fArhxh7UH5mzrvwkH67LTXh95KeVjAyXjvUc dKgqqXu8A2O5oY0tamxQhfRll9fXYQh+SljAXrtpyKYUeZn4pXX8SVv5oOnt0bDU3jwA Vueu2UR8TX8+J1ezavvQkbJxycJ1wNik6VZfeIPaB314egS4NRE59k7ycJV5GmHVjVHg mQTw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=l6kdHbkMINAW4QTRZKK+SeJpdWviTSClmNijciHlUNE=; b=T4iUGi6fJ1d2qrbvUr7NTsgDZvUCvEIwpgV0P9hlT8rUc2o1sV1KmNwIlA7WvuQsQq C+Bf6sB4HSlHs0MYRvD9YXCOClZD1MgJrs0d+ynKWnPJzxpWJdCNjcg9UDQ3GtyeQTa2 2i+bBZ6pJbJHaGJuABrE2mxRaY+mYA8F3XG3BfZhI2huUQwHz15fzh2aCKsCOw5XwYYR utLlSiSpL9l+JuRuaY369Zhu2LG5iI2jDFMxnhIkYOsifQ4hkgBCIjgUtDFl0EkVUbK2 5ncuzhl5hRAazY7dq64j9vRLTzBkBfz71O5yGpfeMn7PKH4sLl4m+CXsjIW6JhGljV91 odWw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org 
Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id hh9-20020a170906a94900b00781bf154a4asi1385360ejb.578.2022.09.20.07.40.23; Tue, 20 Sep 2022 07:40:26 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2F85F68BB48; Tue, 20 Sep 2022 17:40:21 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 62B6568B7AB for ; Tue, 20 Sep 2022 17:40:14 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 0EC59C00AA for ; Tue, 20 Sep 2022 17:40:14 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:39:48 +0300 Message-Id: <20220920144013.4959-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 01/26] lavu/cpu: detect RISC-V base extensions X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: nmKdm6gc2PD6 From: Rémi Denis-Courmont This introduces compile-time and run-time CPU detection on RISC-V. 
In practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of I, F and D extensions, and if it does, it probably won't have run-time detection. So the flags are essentially always set. But as things stand, checkasm wants them that way. Compare the ARMV8 flag on AArch64. We are nowhere near running short on CPU flag bits. --- libavutil/cpu.c | 9 ++++++ libavutil/cpu.h | 5 +++ libavutil/cpu_internal.h | 3 ++ libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 66 +++++++++++++++++++++++++++++++++++++++ tests/checkasm/checkasm.c | 4 +++ 6 files changed, 88 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..78e92a1bf6 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -95,6 +97,9 @@ void av_force_cpu_flags(int arg){ arg |= AV_CPU_FLAG_MMX; } +#if ARCH_RISCV + arg = ff_force_cpu_flags_riscv(arg); +#endif atomic_store_explicit(&cpu_flags, arg, memory_order_relaxed); } @@ -178,6 +183,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV + { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" }, + { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" }, + { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..9aae2ccc7a 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,11 @@ #define AV_CPU_FLAG_LSX (1 << 0) 
#define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V extensions +#define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank) +#define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP) +#define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP) + /** * Return the flags which specify extensions supported by the CPU. * The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..9ddf11488b 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,9 +48,12 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); +int ff_force_cpu_flags_riscv(int flags); + size_t ff_get_cpu_max_align_mips(void); size_t ff_get_cpu_max_align_aarch64(void); size_t ff_get_cpu_max_align_arm(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..fec1f7822a --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,66 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "libavutil/log.h" +#include "config.h" + +#if HAVE_GETAUXVAL +#include <sys/auxv.h> +#define HWCAP_RV(letter) (1ul << ((letter) - 'A')) +#endif + +int ff_force_cpu_flags_riscv(int flags) +{ + if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "F"); + flags |= AV_CPU_FLAG_RVF; + } + + return flags; +} + +int ff_get_cpu_flags_riscv(void) +{ + int ret = 0; +#if HAVE_GETAUXVAL + const unsigned long hwcap = getauxval(AT_HWCAP); + + if (hwcap & HWCAP_RV('I')) + ret |= AV_CPU_FLAG_RVI; + if (hwcap & HWCAP_RV('F')) + ret |= AV_CPU_FLAG_RVF; + if (hwcap & HWCAP_RV('D')) + ret |= AV_CPU_FLAG_RVD; +#endif + +#ifdef __riscv_i + ret |= AV_CPU_FLAG_RVI; +#endif +#if defined (__riscv_flen) && (__riscv_flen >= 32) + ret |= AV_CPU_FLAG_RVF; +#if (__riscv_flen >= 64) + ret |= AV_CPU_FLAG_RVD; +#endif +#endif + + return ret; +} diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index 6b4a0f22b2..7730b14d98 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -229,6 +229,10 @@ static const struct { { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC }, { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, +#elif ARCH_RISCV + { "RVI", "rvi", AV_CPU_FLAG_RVI }, + { "RVF", "rvf", AV_CPU_FLAG_RVF }, + { "RVD", "rvd", AV_CPU_FLAG_RVD }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA }, From patchwork Tue Sep 20 14:39:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38076 Delivered-To: ffmpegpatchwork2@gmail.com 
From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:39:49 +0300 Message-Id: <20220920144013.4959-2-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 02/26] lavu/riscv: initial common header for assembler macros From: Rémi Denis-Courmont --- libavutil/riscv/asm.S | 77 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) create mode 100644 libavutil/riscv/asm.S diff --git 
a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S new file mode 100644 index 0000000000..dbd97f40a4 --- /dev/null +++ b/libavutil/riscv/asm.S @@ -0,0 +1,77 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * Loosely based on earlier work copyrighted by Måns Rullgård, 2008. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#if defined (__riscv_float_abi_soft) +#define NOHWF +#define NOHWD +#define HWF # +#define HWD # +#elif defined (__riscv_float_abi_single) +#define NOHWF # +#define NOHWD +#define HWF +#define HWD # +#else +#define NOHWF # +#define NOHWD # +#define HWF +#define HWD +#endif + + .macro func sym, ext= + .text + .align 2 + + .option push + .ifnb \ext + .option arch, +\ext + .endif + + .global \sym + .hidden \sym + .type \sym, %function + \sym: + + .macro endfunc + .size \sym, . - \sym + .option pop + .previous + .purgem endfunc + .endm + .endm + + .macro const sym, align=3, relocate=0 + .if \relocate + .pushsection .data.rel.ro + .else + .pushsection .rodata + .endif + .align \align + \sym: + + .macro endconst + .size \sym, . 
- \sym + .popsection + .purgem endconst + .endm + .endm From patchwork Tue Sep 20 14:39:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38077 Delivered-To: ffmpegpatchwork2@gmail.com From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:39:50 +0300 Message-Id: <20220920144013.4959-3-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 03/26] lavc/audiodsp: RISC-V F vector_clipf From: Rémi Denis-Courmont RV64G supports MIN & MAX instructions natively only on floating 
point registers, not general purpose ones. The latter would require the Zbb extension. Due to that, it is actually faster to perform the clipping "properly" in FPU. Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech): audiodsp.vector_clipf_c: 29551.5 audiodsp.vector_clipf_rvf: 17871.0 Also tried unrolling with 2 or 8 elements but it gets worse either way. --- libavcodec/audiodsp.c | 2 ++ libavcodec/audiodsp.h | 1 + libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/audiodsp_init.c | 33 +++++++++++++++++++++ libavcodec/riscv/audiodsp_rvf.S | 49 ++++++++++++++++++++++++++++++++ 5 files changed, 87 insertions(+) create mode 100644 libavcodec/riscv/Makefile create mode 100644 libavcodec/riscv/audiodsp_init.c create mode 100644 libavcodec/riscv/audiodsp_rvf.S diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c index ff43e87dce..eba6e809fd 100644 --- a/libavcodec/audiodsp.c +++ b/libavcodec/audiodsp.c @@ -113,6 +113,8 @@ av_cold void ff_audiodsp_init(AudioDSPContext *c) ff_audiodsp_init_arm(c); #elif ARCH_PPC ff_audiodsp_init_ppc(c); +#elif ARCH_RISCV + ff_audiodsp_init_riscv(c); #elif ARCH_X86 ff_audiodsp_init_x86(c); #endif diff --git a/libavcodec/audiodsp.h b/libavcodec/audiodsp.h index aa6fa7898b..485b512839 100644 --- a/libavcodec/audiodsp.h +++ b/libavcodec/audiodsp.h @@ -55,6 +55,7 @@ typedef struct AudioDSPContext { void ff_audiodsp_init(AudioDSPContext *c); void ff_audiodsp_init_arm(AudioDSPContext *c); void ff_audiodsp_init_ppc(AudioDSPContext *c); +void ff_audiodsp_init_riscv(AudioDSPContext *c); void ff_audiodsp_init_x86(AudioDSPContext *c); #endif /* AVCODEC_AUDIODSP_H */ diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile new file mode 100644 index 0000000000..414a9e9bd8 --- /dev/null +++ b/libavcodec/riscv/Makefile @@ -0,0 +1,2 @@ +OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ + riscv/audiodsp_rvf.o diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c new file mode 100644 index 
0000000000..c5842815d6 --- /dev/null +++ b/libavcodec/riscv/audiodsp_init.c @@ -0,0 +1,33 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/audiodsp.h" + +void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); + +av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RVF) + c->vector_clipf = ff_vector_clipf_rvf; +} diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S new file mode 100644 index 0000000000..2ec8a11691 --- /dev/null +++ b/libavcodec/riscv/audiodsp_rvf.S @@ -0,0 +1,49 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_vector_clipf_rvf, f +NOHWF fmv.w.x fa0, a3 +NOHWF fmv.w.x fa1, a4 +1: + flw ft0, (a1) + flw ft1, 4(a1) + fmax.s ft0, ft0, fa0 + flw ft2, 8(a1) + fmax.s ft1, ft1, fa0 + flw ft3, 12(a1) + fmax.s ft2, ft2, fa0 + addi a2, a2, -4 + fmax.s ft3, ft3, fa0 + addi a1, a1, 16 + fmin.s ft0, ft0, fa1 + fmin.s ft1, ft1, fa1 + fsw ft0, (a0) + fmin.s ft2, ft2, fa1 + fsw ft1, 4(a0) + fmin.s ft3, ft3, fa1 + fsw ft2, 8(a0) + fsw ft3, 12(a0) + addi a0, a0, 16 + bnez a2, 1b + + ret +endfunc From patchwork Tue Sep 20 14:39:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38083 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1989041pzh; Tue, 20 Sep 2022 07:41:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM49InE80GVset16i2tCbLpN6OS42gwyZx0LisjfnCnMnT5A4POrdC4J0SWYEYgc7Kxc/Gxv X-Received: by 2002:a17:907:7fa2:b0:781:ca3d:b385 with SMTP id qk34-20020a1709077fa200b00781ca3db385mr2343605ejc.641.1663684888793; Tue, 20 Sep 2022 07:41:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684888; cv=none; d=google.com; s=arc-20160816; b=kZxttDiEoV7VowoD9RbdfQjZLpv2mpmO/n9QChfGib04qi7+C46g+f0rV/92NwwFrO aEoOkXpy4tjeGl+YU06DXO4hE4hOlfsGbL0wrbkpBeIi8m/48r8j3lj5PTyjtOS6oAuB gTyR1oz6acNSK+RtIuPou5M2AmbkYW3IhxeeOvAm55dzIUqsYdP8V0v2osJlyUS8gtjG SzRBlq9hr6kMMBUKTqvUZwvLSKVZhMq29CU+J5ZrQFKTsm9LmUmm6qJO4Z0lqaexR6FS 
U6l5I5uRljfEFY5uIZfeJNh/fSAheF8lMiZ+ywPETEZf2DzQKMXby1VdwgiyGNrsrlO4 fQ1g== From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:39:51 +0300 Message-Id: <20220920144013.4959-4-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 04/26] lavc/pixblockdsp: RISC-V I get_pixels From: Rémi Denis-Courmont Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech): get_pixels_c: 180.0 get_pixels_rvi: 136.7 --- libavcodec/pixblockdsp.c | 2 + libavcodec/pixblockdsp.h | 2 + 
libavcodec/riscv/Makefile | 2 + libavcodec/riscv/pixblockdsp_init.c | 45 ++++++++++++++++++++++ libavcodec/riscv/pixblockdsp_rvi.S | 59 +++++++++++++++++++++++++++++ 5 files changed, 110 insertions(+) create mode 100644 libavcodec/riscv/pixblockdsp_init.c create mode 100644 libavcodec/riscv/pixblockdsp_rvi.S diff --git a/libavcodec/pixblockdsp.c b/libavcodec/pixblockdsp.c index 17c487da1e..4294075cee 100644 --- a/libavcodec/pixblockdsp.c +++ b/libavcodec/pixblockdsp.c @@ -109,6 +109,8 @@ av_cold void ff_pixblockdsp_init(PixblockDSPContext *c, AVCodecContext *avctx) ff_pixblockdsp_init_arm(c, avctx, high_bit_depth); #elif ARCH_PPC ff_pixblockdsp_init_ppc(c, avctx, high_bit_depth); +#elif ARCH_RISCV + ff_pixblockdsp_init_riscv(c, avctx, high_bit_depth); #elif ARCH_X86 ff_pixblockdsp_init_x86(c, avctx, high_bit_depth); #elif ARCH_MIPS diff --git a/libavcodec/pixblockdsp.h b/libavcodec/pixblockdsp.h index 07c2ec4f40..9b002aa3d6 100644 --- a/libavcodec/pixblockdsp.h +++ b/libavcodec/pixblockdsp.h @@ -52,6 +52,8 @@ void ff_pixblockdsp_init_arm(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); void ff_pixblockdsp_init_ppc(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); +void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, AVCodecContext *avctx, + unsigned high_bit_depth); void ff_pixblockdsp_init_x86(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); void ff_pixblockdsp_init_mips(PixblockDSPContext *c, AVCodecContext *avctx, diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 414a9e9bd8..da07f1fe96 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,2 +1,4 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o +OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ + riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c new file mode 100644 index 0000000000..04bf52649f --- /dev/null 
+++ b/libavcodec/riscv/pixblockdsp_init.c @@ -0,0 +1,45 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/avcodec.h" +#include "libavcodec/pixblockdsp.h" + +void ff_get_pixels_8_rvi(int16_t *block, const uint8_t *pixels, + ptrdiff_t stride); +void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels, + ptrdiff_t stride); + +av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, + AVCodecContext *avctx, + unsigned high_bit_depth) +{ + int cpu_flags = av_get_cpu_flags(); + + if (cpu_flags & AV_CPU_FLAG_RVI) { + if (high_bit_depth) + c->get_pixels = ff_get_pixels_16_rvi; + else + c->get_pixels = ff_get_pixels_8_rvi; + } +} diff --git a/libavcodec/riscv/pixblockdsp_rvi.S b/libavcodec/riscv/pixblockdsp_rvi.S new file mode 100644 index 0000000000..93ece4405e --- /dev/null +++ b/libavcodec/riscv/pixblockdsp_rvi.S @@ -0,0 +1,59 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_get_pixels_8_rvi +.irp row, 0, 1, 2, 3, 4, 5, 6, 7 + ld t0, (a1) + add a1, a1, a2 + sd zero, ((\row * 16) + 0)(a0) + addi t6, t6, -1 + sd zero, ((\row * 16) + 8)(a0) + srli t1, t0, 8 + sb t0, ((\row * 16) + 0)(a0) + srli t2, t0, 16 + sb t1, ((\row * 16) + 2)(a0) + srli t3, t0, 24 + sb t2, ((\row * 16) + 4)(a0) + srli t4, t0, 32 + sb t3, ((\row * 16) + 6)(a0) + srli t1, t0, 40 + sb t4, ((\row * 16) + 8)(a0) + srli t2, t0, 48 + sb t1, ((\row * 16) + 10)(a0) + srli t3, t0, 56 + sb t2, ((\row * 16) + 12)(a0) + sb t3, ((\row * 16) + 14)(a0) +.endr + ret +endfunc + +func ff_get_pixels_16_rvi +.irp row, 0, 1, 2, 3, 4, 5, 6, 7 + ld t0, 0(a1) + ld t1, 8(a1) + add a1, a1, a2 + sd t0, ((\row * 16) + 0)(a0) + sd t1, ((\row * 16) + 8)(a0) +.endr + ret +endfunc From patchwork Tue Sep 20 14:39:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38090 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1989714pzh; Tue, 20 Sep 2022 07:42:33 -0700 (PDT) X-Google-Smtp-Source: 
AMsMyM6EssEruov8dW65ELwiPgKcmEqTihVjDm6PSqCTlhBbO+VY7avIxGBkg4ZghQgbLgRjrN+Z X-Received: by 2002:a05:6402:3489:b0:451:a859:8a4f with SMTP id v9-20020a056402348900b00451a8598a4fmr20220042edc.279.1663684952817; Tue, 20 Sep 2022 07:42:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684952; cv=none; d=google.com; s=arc-20160816; b=d3tXpudQ2t6mNk/lpuvn0uw28p8CCE98UsmrTePKKxCjzB5Sl3W5z9t5t94ATOjtep IzbISnaLhbpkc8Sjtzmitow0RmI4RIRoze4xzu7dumG9mIpsEoEoyiOg2pHTqnaAdtAT Z2K6hBSDeif1rVJxWHp5X8pwxV/ABvJtw1hLUK3nU8y3Pq+J51v/PRCFRpWptHjYVHoy 5kxDvafVJcFiVZOEJ6M3Mf6ZzDfKG9MZFWNsCPv6PwXvXD+xW0TPQxvDKJIP9EW7sXFy bL77jHlWx2eAJChr90Pe08AG6RtOGClYCBQxY8+DJDkbotOGRyq89drQQxi+Q4Ve5vDp 6pRA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=QggFCraTadVgUPv0ZHZTRphOqAQ97J7q4BSXA/dVBDc=; b=FMQapI0FEV0P4J27gqlZqQxvhizuUlpjNlT/gxgJruGeaZyLDw4GqiwIzXxZLUBGxx owNBZzvrHuqOgzJhQuVxc1VELD5HgZSAAOQLeuOH7l+A8AnsnReuuwFsfwtOJv7nwGHc ryKr5mpIJMibOGFYCXneIsl1mK0h9PHsyH3UoVwm/XFHlCo1Uy5pxG6DzVmmURHHdGw0 DzLspy7t/oH/q21WgvVWrvjlu1WVfClDfP8ai6Trmu62alUSnH7hWxBlhu8aFd7aVbpr UQn/jGJwEphQi9yDNsVkjZp1qFnXHFpEU4vYi4ikUy7KyPK7nTeWb3tGIYQ5odtkTifg Opsw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id sc9-20020a1709078a0900b0073d76d8af85si2020364ejc.231.2022.09.20.07.42.32; Tue, 20 Sep 2022 07:42:32 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id CD1FB68BB01; Tue, 20 Sep 2022 17:40:34 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 337D868BABD for ; Tue, 20 Sep 2022 17:40:15 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id DB64BC00B1 for ; Tue, 20 Sep 2022 17:40:14 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:39:52 +0300 Message-Id: <20220920144013.4959-5-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 05/26] lavu/cpu: CPU flags for the RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 3pWi4BrDpN1t From: Rémi Denis-Courmont RVV defines a total of 12 different extensions, including: - 5 different instruction subsets: - Zve32x: 8-, 16- and 32-bit integers, - Zve32f: Zve32x plus single precision 
floats, - Zve64x: Zve32x plus 64-bit integers, - Zve64f: Zve32f plus Zve64x, - Zve64d: Zve64f plus double precision floats. - 6 different vector lengths: - Zvl32b (embedded only), - Zvl64b (embedded only), - Zvl128b, - Zvl256b, - Zvl512b, - Zvl1024b, - and the V extension proper: equivalent to Zve64d and Zvl128b. In total, there are 6 different possible sets of supported instructions (including the empty set), but for convenience we allocate one bit for each type set: up-to-32-bit ints (ZVE32X), floats (ZVE32F), 64-bit ints (ZVE64X) and doubles (ZVE64D). When the vector size is needed, it can be retrieved by reading the unprivileged read-only vlenb CSR. This should probably be a separate helper macro if needed at a later point. --- libavutil/cpu.c | 4 ++++ libavutil/cpu.h | 4 ++++ libavutil/riscv/cpu.c | 46 ++++++++++++++++++++++++++++++++++++++- tests/checkasm/checkasm.c | 10 ++++++--- 4 files changed, 60 insertions(+), 4 deletions(-) diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 78e92a1bf6..58ae4858b4 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -187,6 +187,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" }, { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" }, { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" }, + { "rvve32", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE32X}, .unit = "flags" }, + { "rvvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE32F}, .unit = "flags" }, + { "rvve64", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE64X}, .unit = "flags" }, + { "rvv", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE64D}, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9aae2ccc7a..00698e30ef 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -82,6 +82,10 @@ #define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank)
#define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP) #define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP) +#define AV_CPU_FLAG_RV_ZVE32X (1 << 3) ///< Vectors of 8/16/32-bit ints +#define AV_CPU_FLAG_RV_ZVE32F (1 << 4) ///< Vectors of floats +#define AV_CPU_FLAG_RV_ZVE64X (1 << 5) ///< Vectors of 64-bit ints +#define AV_CPU_FLAG_RV_ZVE64D (1 << 6) ///< Vectors of doubles /** * Return the flags which specify extensions supported by the CPU. diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c index fec1f7822a..6f862635b3 100644 --- a/libavutil/riscv/cpu.c +++ b/libavutil/riscv/cpu.c @@ -30,7 +30,32 @@ int ff_force_cpu_flags_riscv(int flags) { - if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) { + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RV_ZVE64X)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE64X"); + flags |= AV_CPU_FLAG_RV_ZVE64X; + } + + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RV_ZVE32F)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE32F"); + flags |= AV_CPU_FLAG_RV_ZVE32F; + } + + if ((flags & (AV_CPU_FLAG_RV_ZVE64X | AV_CPU_FLAG_RV_ZVE32F)) + && !(flags & AV_CPU_FLAG_RV_ZVE32X)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE32X"); + flags |= AV_CPU_FLAG_RV_ZVE32X; + } + + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RVD)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "D"); + flags |= AV_CPU_FLAG_RVD; + } + + if ((flags & (AV_CPU_FLAG_RVD | AV_CPU_FLAG_RV_ZVE32F)) + && !(flags & AV_CPU_FLAG_RVF)) { av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "F"); flags |= AV_CPU_FLAG_RVF; } @@ -50,6 +75,11 @@ int ff_get_cpu_flags_riscv(void) ret |= AV_CPU_FLAG_RVF; if (hwcap & HWCAP_RV('D')) ret |= AV_CPU_FLAG_RVD; + + /* The V extension implies all Zve* functional subsets */ + if (hwcap & HWCAP_RV('V')) + ret |= AV_CPU_FLAG_RV_ZVE32X |
AV_CPU_FLAG_RV_ZVE64X + | AV_CPU_FLAG_RV_ZVE32F | AV_CPU_FLAG_RV_ZVE64D; #endif #ifdef __riscv_i @@ -60,6 +90,20 @@ int ff_get_cpu_flags_riscv(void) #if (__riscv_flen >= 64) ret |= AV_CPU_FLAG_RVD; #endif +#endif + + /* If RV-V is enabled statically at compile-time, check the details. */ +#ifdef __riscv_vector + ret |= AV_CPU_FLAG_RV_ZVE32X; +#if __riscv_v_elen >= 64 + ret |= AV_CPU_FLAG_RV_ZVE64X; +#endif +#if __riscv_v_elen_fp >= 32 + ret |= AV_CPU_FLAG_RV_ZVE32F; +#if __riscv_v_elen_fp >= 64 + ret |= AV_CPU_FLAG_RV_ZVE64D; +#endif +#endif #endif return ret; diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index 7730b14d98..c4352f1a16 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -230,9 +230,13 @@ static const struct { { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, #elif ARCH_RISCV - { "RVI", "rvi", AV_CPU_FLAG_RVI }, - { "RVF", "rvf", AV_CPU_FLAG_RVF }, - { "RVD", "rvd", AV_CPU_FLAG_RVD }, + { "RVI", "rvi", AV_CPU_FLAG_RVI }, + { "RVF", "rvf", AV_CPU_FLAG_RVF }, + { "RVD", "rvd", AV_CPU_FLAG_RVD }, + { "RV_Zve32x", "rv_zve32x", AV_CPU_FLAG_RV_ZVE32X }, + { "RV_Zve32f", "rv_zve32f", AV_CPU_FLAG_RV_ZVE32F }, + { "RV_Zve64x", "rv_zve64x", AV_CPU_FLAG_RV_ZVE64X }, + { "RV_Zve64d", "rv_zve64d", AV_CPU_FLAG_RV_ZVE64D }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA }, From patchwork Tue Sep 20 14:39:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38098 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990481pzh; Tue, 20 Sep 2022 07:43:52 -0700 (PDT) X-Google-Smtp-Source: AMsMyM69+m4Ptd3sgXBbblzzP8qNAj/9ORqJo7yv9G+TS8qlcGV38RfOtKBbYX6H6aAe8StrLpBo X-Received: by 2002:a17:907:c05:b0:73d:6e0a:8d22 with SMTP id
ga5-20020a1709070c0500b0073d6e0a8d22mr17168701ejc.646.1663685032305; Tue, 20 Sep 2022 07:43:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663685032; cv=none; d=google.com; s=arc-20160816; b=W6FosEcVHvLiRtzSbc8n3sP42HKBPk46N9rwDBqbu2Me9esiz4oMIxWZ68z/924mPI 4FgtEfmeTdXLWXzjmYD235bFukfU7J1OQkiYz4xa0q2sMb/FbmsEMoQHveAMdSAC7V3X DGiJH3Mw+n0sL36B6RqidmEYuGjiF8NXMQDMszNvSai6nL0EPugrnFOQ0exBocVPzlpV vwx+5KTvsCIfD45yZxk2YKa3utRLsPhAPj5Yi8c8XU7MDWmNc3ZY/GPqs5zCydbBqHWD essS8DEQb99sbLyQBBUSvE3pcxwi3qtZBaArLhqlgYsss0jUyWmQcZMm+8+Hzn2li6zw +0fw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=8YmpSqdLN1v5/pqYFfopHFmlobODxFsnS8G8X4uMxZI=; b=NfkbVUeRA07jU/KlHd7Z+kbIU9VSULGjRlxH9P2mCjjMCkLIqWeAIdA954aojPIdRc jK5YQgOLA5Naadl7s1JIR6bdYKkFXCp5K+epVXsvP9XMemWzSyTcHbDbE7QaHiw7h4hK iTuu3S+etY8QGiN/Vz7Ex7mONx1FX7prjuh3s8Pz7yQcTLlMfn7Z7t9Y6m2QL+L0IvNi bXMjq8sQEXNBncEI/ij5cmUsDbg+nlllOw8J18cwV99b9VzBIiw9kPdQpu47TSskxNrG ga7vMdWsRT7yRDuOEwHqaFemOQ72ZoAaNxdX/1ISWALRg/gKSHTEbvrHD33bQgjjcVO/ N5+g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id cr16-20020a170906d55000b0077404c1e776si1574360ejc.935.2022.09.20.07.43.51; Tue, 20 Sep 2022 07:43:52 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2BFBA68BC0E; Tue, 20 Sep 2022 17:40:46 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 97DB668BB25 for ; Tue, 20 Sep 2022 17:40:19 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 19E0DC00B2 for ; Tue, 20 Sep 2022 17:40:15 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:39:53 +0300 Message-Id: <20220920144013.4959-6-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 06/26] configure: probe RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: Kr0McAOgyAQp From: Rémi Denis-Courmont --- Makefile | 2 +- configure | 15 +++++++++++++++ ffbuild/arch.mak | 2 ++ 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 61f79e27ae..1fb742f390 
100644 --- a/Makefile +++ b/Makefile @@ -91,7 +91,7 @@ ffbuild/.config: $(CONFIGURABLE_COMPONENTS) SUBDIR_VARS := CLEANFILES FFLIBS HOSTPROGS TESTPROGS TOOLS \ HEADERS ARCH_HEADERS BUILT_HEADERS SKIPHEADERS \ ARMV5TE-OBJS ARMV6-OBJS ARMV8-OBJS VFP-OBJS NEON-OBJS \ - ALTIVEC-OBJS VSX-OBJS MMX-OBJS X86ASM-OBJS \ + ALTIVEC-OBJS VSX-OBJS RVV-OBJS MMX-OBJS X86ASM-OBJS \ MIPSFPU-OBJS MIPSDSPR2-OBJS MIPSDSP-OBJS MSA-OBJS \ MMI-OBJS LSX-OBJS LASX-OBJS OBJS SLIBOBJS SHLIBOBJS \ STLIBOBJS HOSTOBJS TESTOBJS diff --git a/configure b/configure index c157338b1f..529fcae41e 100755 --- a/configure +++ b/configure @@ -462,6 +462,7 @@ Optimization options (experts only): --disable-mmi disable Loongson MMI optimizations --disable-lsx disable Loongson LSX optimizations --disable-lasx disable Loongson LASX optimizations + --disable-rvv disable RISC-V Vector optimizations --disable-fast-unaligned consider unaligned accesses slow Developer options (useful when working on FFmpeg itself): @@ -2126,6 +2127,10 @@ ARCH_EXT_LIST_PPC=" vsx " +ARCH_EXT_LIST_RISCV=" + rvv +" + ARCH_EXT_LIST_X86=" $ARCH_EXT_LIST_X86_SIMD cpunop @@ -2135,6 +2140,7 @@ ARCH_EXT_LIST_X86=" ARCH_EXT_LIST=" $ARCH_EXT_LIST_ARM $ARCH_EXT_LIST_PPC + $ARCH_EXT_LIST_RISCV $ARCH_EXT_LIST_X86 $ARCH_EXT_LIST_MIPS $ARCH_EXT_LIST_LOONGSON @@ -2642,6 +2648,8 @@ ppc4xx_deps="ppc" vsx_deps="altivec" power8_deps="vsx" +rvv_deps="riscv" + loongson2_deps="mips" loongson3_deps="mips" mmi_deps_any="loongson2 loongson3" @@ -6110,6 +6118,10 @@ elif enabled ppc; then check_cpp_condition power8 "altivec.h" "defined(_ARCH_PWR8)" fi +elif enabled riscv; then + + enabled rvv && check_inline_asm rvv '".option arch, +v\nvsetivli zero, 0, e8, m1, ta, ma"' + elif enabled x86; then check_builtin rdtsc intrin.h "__rdtsc()" @@ -7596,6 +7608,9 @@ if enabled loongarch; then echo "LSX enabled ${lsx-no}" echo "LASX enabled ${lasx-no}" fi +if enabled riscv; then + echo "RISC-V Vector enabled ${rvv-no}" +fi echo "debug symbols ${debug-no}" echo "strip
symbols ${stripping-no}" echo "optimize for size ${small-no}" diff --git a/ffbuild/arch.mak b/ffbuild/arch.mak index 997e31e85e..39d76ee152 100644 --- a/ffbuild/arch.mak +++ b/ffbuild/arch.mak @@ -15,5 +15,7 @@ OBJS-$(HAVE_LASX) += $(LASX-OBJS) $(LASX-OBJS-yes) OBJS-$(HAVE_ALTIVEC) += $(ALTIVEC-OBJS) $(ALTIVEC-OBJS-yes) OBJS-$(HAVE_VSX) += $(VSX-OBJS) $(VSX-OBJS-yes) +OBJS-$(HAVE_RVV) += $(RVV-OBJS) $(RVV-OBJS-yes) + OBJS-$(HAVE_MMX) += $(MMX-OBJS) $(MMX-OBJS-yes) OBJS-$(HAVE_X86ASM) += $(X86ASM-OBJS) $(X86ASM-OBJS-yes) From patchwork Tue Sep 20 14:39:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38099 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990566pzh; Tue, 20 Sep 2022 07:44:01 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6aiZAMZuK4ZupzpC4XoQMYt4JsqPqTcc/Rl+hflkLJNCLwInZb8dX8uCGt9K3005SXrYR9 X-Received: by 2002:a05:6402:27cf:b0:451:6ccc:4ea0 with SMTP id c15-20020a05640227cf00b004516ccc4ea0mr20583183ede.193.1663685040874; Tue, 20 Sep 2022 07:44:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663685040; cv=none; d=google.com; s=arc-20160816; b=pbTnDXzXAVv5zwVNgPdfgU9cQ6sYZXl4mImBtw2HFrWi71b2dfICwwcjpKM6zIaqzx mnZwSpDJcvQT/ZTxPPYIRXysAMM0fiNnMJdM01r9MTs/R/dXfibjaxt4X9N4R82/9632 Kcv3VXYP5ayYuPyyNj+41v3OGG/wxOcNaMyG8ENhrrb1RG/5UD0yhN5fyStIx/2gU3Ck KDjEhzXV8KIWVZj5E7uFStuvcCLFZXjhTyHsasmjbwB5Me/3e5wjbU6dgfTMvNDa8Uu2 GXp5eSaTHVPRaunFAH7nSv2owYK07SrWMlmTsCkOXeH3m4V9bG9ap2Skfx/nWNckwQTB mHqw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=V4B/YU9eFKm2OOzroHLUVbA+Oo/nZPjwJPGo0pdRm3o=; 
b=HgwfSVokzYBPkR6tT367LHJG5kodXA0bYkZrngF5nQ0yRSaRSRaBkL2qihNqN03iE/ r4UUVJW5K6C4LpRSecFtVRQ4nlUX1as8OycNYXgMHnX/GAjJsWf694X9CoJbiZEOdrga ZcYK2O4Em9472ijkNNKBMNQCHAVcBljSYCoN6awJ7AdWGNbA80JM6XW1nns4WJ/8bj80 nJb7nDDMZBESzE2Tg1sj1xeW8/+I6DVrBLYTChzor0VIYW2wGp8iKPFdEdBsreUFqBs5 DOHOeqGntA6Ug0cAi9fCimPKWrl+ftkrKPy3LJqgqKStrR4NosZUL/sQePwgnboXQTeW 2B3A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id s19-20020a50ab13000000b0044e863a2db4si2948edc.110.2022.09.20.07.44.00; Tue, 20 Sep 2022 07:44:00 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 245B168BBA8; Tue, 20 Sep 2022 17:40:47 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 9E46E68BB70 for ; Tue, 20 Sep 2022 17:40:19 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 432A6C00B3 for ; Tue, 20 Sep 2022 17:40:15 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:39:54 +0300 Message-Id: <20220920144013.4959-7-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: 
[FFmpeg-devel] [PATCH 07/26] lavu/floatdsp: RISC-V V vector_fmul_scalar X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: rtIJIfFE8saH From: Rémi Denis-Courmont This is based on existing code from the VLC git tree with two minor changes to account for the different function prototypes. --- libavutil/float_dsp.c | 2 ++ libavutil/float_dsp.h | 1 + libavutil/riscv/Makefile | 4 +++- libavutil/riscv/float_dsp_init.c | 39 +++++++++++++++++++++++++++++++ libavutil/riscv/float_dsp_rvv.S | 40 ++++++++++++++++++++++++++++++++ 5 files changed, 85 insertions(+), 1 deletion(-) create mode 100644 libavutil/riscv/float_dsp_init.c create mode 100644 libavutil/riscv/float_dsp_rvv.S diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c index 8676c8b0f8..742dd679d2 100644 --- a/libavutil/float_dsp.c +++ b/libavutil/float_dsp.c @@ -156,6 +156,8 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact) ff_float_dsp_init_arm(fdsp); #elif ARCH_PPC ff_float_dsp_init_ppc(fdsp, bit_exact); +#elif ARCH_RISCV + ff_float_dsp_init_riscv(fdsp); #elif ARCH_X86 ff_float_dsp_init_x86(fdsp); #elif ARCH_MIPS diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h index 9c664592bd..7cad9fc622 100644 --- a/libavutil/float_dsp.h +++ b/libavutil/float_dsp.h @@ -205,6 +205,7 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len); void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp); void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp); void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict); +void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp); void ff_float_dsp_init_x86(AVFloatDSPContext *fdsp); void ff_float_dsp_init_mips(AVFloatDSPContext *fdsp); diff --git 
a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile index 1f818043dc..89a8d0d990 100644 --- a/libavutil/riscv/Makefile +++ b/libavutil/riscv/Makefile @@ -1 +1,3 @@ -OBJS += riscv/cpu.o +OBJS += riscv/float_dsp_init.o \ + riscv/cpu.o +RVV-OBJS += riscv/float_dsp_rvv.o diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c new file mode 100644 index 0000000000..de567c50d2 --- /dev/null +++ b/libavutil/riscv/float_dsp_init.c @@ -0,0 +1,39 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavutil/float_dsp.h" + +void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, + int len); + +av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; +#endif +} diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S new file mode 100644 index 0000000000..5095ed5bfc --- /dev/null +++ b/libavutil/riscv/float_dsp_rvv.S @@ -0,0 +1,40 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. 
+ * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "asm.S" + +// (a0) = (a1) * fa0 [0..a2-1] +func ff_vector_fmul_scalar_rvv, zve32f +NOHWF fmv.w.x fa0, a2 +NOHWF mv a2, a3 +1: + vsetvli t0, a2, e32, m1, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + add a1, a1, t1 + vfmul.vf v16, v16, fa0 + sub a2, a2, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc From patchwork Tue Sep 20 14:39:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38100 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990633pzh; Tue, 20 Sep 2022 07:44:10 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7voBl3SDHSv2WiWID1pggLbBHOj3Z96u7mmquWubla5rRHpxKZLiL/aVb2LRS6bTVZDZqx X-Received: by 2002:a17:907:724b:b0:780:49ab:4b66 with SMTP id ds11-20020a170907724b00b0078049ab4b66mr17415038ejc.67.1663685049807; Tue, 20 Sep 2022 07:44:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663685049; cv=none; d=google.com; s=arc-20160816; b=MC3ZQwfSaoI2PliWagBiBlHbs29YAvxdfwx5aktEbGZuUQvauVvyH2r8z5ja1wQE02 V76l5Uipsw390cjOAClDNF44DxS4hbgJxNdljgFsZCN4euSOLh0ungmzSfvdrjdBFYiK 
bax7xF3+g22DwEzfs9s/Cc/l4Dcok5Y3tcTknkaOXOY3zlboGTJju4K+73TMfFShyvT/ V/hKTb30wY0zVd1tbCrTqh2OLZ5VubGbWuSbeiIlRkYzNSDTPbOyqlax/IiZI/T+ykVx lF21kQzZd72GIUH83rIMQd5XXlmvIWEiD4maSVJZV30T1KOAV2BT8vQMeMGb5kpSLjpU /pGA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=JXGO7DRPq1wEya2P5LVBv+8LA9AeKZKgzRq0mFxb5Kc=; b=CEUJAF5je4u6YWvPvsaSwGDn1OPnrOwT8fJBEqm9+cgwI44aTLioOAg3/LfoolFTlR NMpayzF9deU8NXveIO8Vm6r/ToFpQXF44yQf/o00Eweq5MP6yvJXDYjouTljZZnA9Qe+ uDjyMMSuTVZrf38O2Iy77cD7BIU8WrF7t6hDoYyqaUtZ60IkbHJ8y1Kn6LAvXtJdvWsX 9v/NoGFdi8FG3utSFBxm1vwSkSGZAmwboUTfN3GHcROP/90lmX20ugEAaynIg/f5pSKr 8HqDSUmzePZ8GVS9Z/eouZpoFeK/wAPociqmo88pe3C15qffPP28tddGBC0lqk6TXg+8 g1AQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id hc20-20020a170907169400b007803ce94339si1668204ejc.484.2022.09.20.07.44.09; Tue, 20 Sep 2022 07:44:09 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0A31168BC17; Tue, 20 Sep 2022 17:40:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B959068BB76 for ; Tue, 20 Sep 2022 17:40:19 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 6C905C00B4 for ; Tue, 20 Sep 2022 17:40:15 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:39:55 +0300 Message-Id: <20220920144013.4959-8-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 08/26] lavu/floatdsp: RISC-V V vector_dmul_scalar X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: IVD0IekNDZYb From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 9 ++++++++- libavutil/riscv/float_dsp_rvv.S | 18 ++++++++++++++++++ 2 files changed, 26 insertions(+), 1 deletion(-) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index de567c50d2..b829c0f736 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -28,12 +28,19 @@
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
+void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
+                                int len);
+
 av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 {
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+    if (flags & AV_CPU_FLAG_RV_ZVE32F) {
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+
+        if (flags & AV_CPU_FLAG_RV_ZVE64D)
+            fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+    }
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 5095ed5bfc..e82d56ac15 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -38,3 +38,21 @@ NOHWF   mv        a2, a3
 
         ret
 endfunc
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_dmul_scalar_rvv, zve64d
+NOHWD   fmv.d.x   fa0, a2
+NOHWD   mv        a2, a3
+1:
+        vsetvli   t0, a2, e64, m1, ta, ma
+        slli      t1, t0, 3
+        vle64.v   v16, (a1)
+        add       a1, a1, t1
+        vfmul.vf  v16, v16, fa0
+        sub       a2, a2, t0
+        vse64.v   v16, (a0)
+        add       a0, a0, t1
+        bnez      a2, 1b
+
+        ret
+endfunc

From patchwork Tue Sep 20 14:39:56 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38079
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 20 Sep 2022 17:39:56 +0300
Message-Id: <20220920144013.4959-9-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net>
References: <5602865.DvuYhMxLoT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 09/26] lavu/floatdsp: RISC-V V vector_fmul

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c
b/libavutil/riscv/float_dsp_init.c
index b829c0f736..60b79bd59e 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -25,6 +25,8 @@
 #include "libavutil/cpu.h"
 #include "libavutil/float_dsp.h"
 
+void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
+                         int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
@@ -37,6 +39,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
     int flags = av_get_cpu_flags();
 
     if (flags & AV_CPU_FLAG_RV_ZVE32F) {
+        fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D)
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index e82d56ac15..fb2cb54081 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -21,6 +21,24 @@
 #include "config.h"
 #include "asm.S"
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_fmul_rvv, zve32f
+1:
+        vsetvli   t0, a3, e32, m1, ta, ma
+        slli      t1, t0, 2
+        vle32.v   v16, (a1)
+        add       a1, a1, t1
+        vle32.v   v24, (a2)
+        add       a2, a2, t1
+        vfmul.vv  v16, v16, v24
+        sub       a3, a3, t0
+        vse32.v   v16, (a0)
+        add       a0, a0, t1
+        bnez      a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x   fa0, a2

From patchwork Tue Sep 20 14:39:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38081
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 20 Sep 2022 17:39:57 +0300
Message-Id: <20220920144013.4959-10-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net>
References: <5602865.DvuYhMxLoT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 10/26] lavu/floatdsp: RISC-V V vector_dmul

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  6 +++++-
 libavutil/riscv/float_dsp_rvv.S  | 18 ++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 60b79bd59e..6027a67b46 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -30,6 +30,8 @@ void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
+void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
+                         int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -42,8 +44,10 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
-        if (flags & AV_CPU_FLAG_RV_ZVE64D)
+        if (flags & AV_CPU_FLAG_RV_ZVE64D) {
+            fdsp->vector_dmul = ff_vector_dmul_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+        }
     }
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index fb2cb54081..b16c0f3005 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -57,6 +57,24 @@ NOHWF   mv        a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_dmul_rvv, zve64d
+1:
+        vsetvli   t0, a3, e64, m1, ta, ma
+        slli      t1, t0, 3
+        vle64.v   v16, (a1)
+        add       a1, a1, t1
+        vle64.v   v24, (a2)
+        add       a2, a2, t1
+        vfmul.vv  v16, v16, v24
+        sub       a3, a3, t0
+        vse64.v   v16, (a0)
+        add       a0, a0, t1
+        bnez      a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x   fa0, a2

From patchwork Tue Sep 20 14:39:58 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38082
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 20 Sep 2022 17:39:58 +0300
Message-Id: <20220920144013.4959-11-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net>
References: <5602865.DvuYhMxLoT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 11/26] lavu/floatdsp: RISC-V V vector_fmac_scalar

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c
b/libavutil/riscv/float_dsp_init.c
index 6027a67b46..c2d93e0cd7 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -27,6 +27,8 @@
 
 void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
                          int len);
+void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
+                                int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
@@ -42,6 +44,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
     if (flags & AV_CPU_FLAG_RV_ZVE32F) {
         fdsp->vector_fmul = ff_vector_fmul_rvv;
+        fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index b16c0f3005..1c1fa906e6 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -39,6 +39,25 @@ func ff_vector_fmul_rvv, zve32f
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_fmac_scalar_rvv, zve32f
+NOHWF   fmv.w.x   fa0, a2
+NOHWF   mv        a2, a3
+1:
+        vsetvli   t0, a2, e32, m1, ta, ma
+        slli      t1, t0, 2
+        vle32.v   v24, (a1)
+        add       a1, a1, t1
+        vle32.v   v16, (a0)
+        vfmacc.vf v16, fa0, v24
+        sub       a2, a2, t0
+        vse32.v   v16, (a0)
+        add       a0, a0, t1
+        bnez      a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x   fa0, a2

From patchwork Tue Sep 20 14:39:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38080
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 20 Sep 2022 17:39:59 +0300
Message-Id: <20220920144013.4959-12-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net>
References: <5602865.DvuYhMxLoT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 12/26] lavu/floatdsp: RISC-V V vector_dmac_scalar

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c
b/libavutil/riscv/float_dsp_init.c
index c2d93e0cd7..d17d0f66c5 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -34,6 +34,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
+void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul,
+                                int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -49,6 +51,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
+            fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
         }
     }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 1c1fa906e6..0d6fffe235 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -94,6 +94,25 @@ func ff_vector_dmul_rvv, zve64d
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_dmac_scalar_rvv, zve64d
+NOHWD   fmv.d.x   fa0, a2
+NOHWD   mv        a2, a3
+1:
+        vsetvli   t0, a2, e64, m1, ta, ma
+        slli      t1, t0, 3
+        vle64.v   v24, (a1)
+        add       a1, a1, t1
+        vle64.v   v16, (a0)
+        vfmacc.vf v16, fa0, v24
+        sub       a2, a2, t0
+        vse64.v   v16, (a0)
+        add       a0, a0, t1
+        bnez      a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x   fa0, a2

From patchwork Tue Sep 20 14:40:00 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38084
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 20 Sep 2022 17:40:00 +0300
Message-Id: <20220920144013.4959-13-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net>
References: <5602865.DvuYhMxLoT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 13/26] lavu/floatdsp: RISC-V V vector_fmul_add

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 20 ++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c
b/libavutil/riscv/float_dsp_init.c
index d17d0f66c5..2ddd2050f7 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -31,6 +31,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
+void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
+                             const float *src2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
@@ -48,6 +50,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 0d6fffe235..9b68187e01 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -76,6 +76,26 @@ NOHWF   mv        a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) + (a3) [0..a4-1]
+func ff_vector_fmul_add_rvv, zve32f
+1:
+        vsetvli   t0, a4, e32, m1, ta, ma
+        slli      t1, t0, 2
+        vle32.v   v8, (a1)
+        add       a1, a1, t1
+        vle32.v   v16, (a2)
+        add       a2, a2, t1
+        vle32.v   v24, (a3)
+        add       a3, a3, t1
+        vfmadd.vv v8, v16, v24
+        sub       a4, a4, t0
+        vse32.v   v8, (a0)
+        add       a0, a0, t1
+        bnez      a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:

From patchwork Tue Sep 20 14:40:01 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38085
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 20 Sep 2022 17:40:01 +0300
Message-Id: <20220920144013.4959-14-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net>
References: <5602865.DvuYhMxLoT@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 14/26] lavu/floatdsp: RISC-V V butterflies_float

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  2 ++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c
b/libavutil/riscv/float_dsp_init.c index 2ddd2050f7..f164b1308f 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -33,6 +33,7 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); +void ff_butterflies_float_rvv(float *v1, float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -51,6 +52,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; + fdsp->butterflies_float = ff_butterflies_float_rvv; if (flags & AV_CPU_FLAG_RV_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 9b68187e01..0366009213 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -96,6 +96,25 @@ func ff_vector_fmul_add_rvv, zve32f ret endfunc +// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] +func ff_butterflies_float_rvv, zve32f +1: + vsetvli t0, a2, e32, m1, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + vle32.v v24, (a1) + vfadd.vv v0, v16, v24 + vfsub.vv v8, v16, v24 + sub a2, a2, t0 + vse32.v v0, (a0) + add a0, a0, t1 + vse32.v v8, (a1) + add a1, a1, t1 + bnez a2, 1b + + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv, zve64d 1: From patchwork Tue Sep 20 14:40:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38088 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1989515pzh; Tue, 20 Sep 2022 07:42:14 -0700 (PDT) X-Google-Smtp-Source: 
AMsMyM6/vOYjGIhSDdQP7ff9AfjFbEIgMCxEfMUqtX2sIdgHbCav9xgGb/oWKICSQt3do0Ogi4HC X-Received: by 2002:a17:907:980e:b0:77a:6958:5aa1 with SMTP id ji14-20020a170907980e00b0077a69585aa1mr17794070ejc.232.1663684934392; Tue, 20 Sep 2022 07:42:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684934; cv=none; d=google.com; s=arc-20160816; b=D9Ig2+t5h1nY8dZ787s7MLZybf0NRkjMuibquKQW1iWSkwBvC8KYHxmHM/RfvyiQh9 pfLBmv523qGLCHnhU0knxQPvoTjnjRXG7kHciqgIp0rUnfZp9UQeQQ5H+r3Wdu3WBWL8 C+5SOf/VibkkjH4vgMZ5pTQPAhiwC8vTdW4MGaS0lpEbVLSwx6vTwMpAS8Omv/Gx5kQd 5cZsuuwXIkoBDWPFYgjPT2MfnvyofrmPA5qMrQnKGKOSoy3sk/LWNb9aaNTTOHQ/79fQ hAxnMr0et2mUpX+avXUR/AeQVubaw9nnO94iXHOEBgBf/qHKqigM+1Buzqh1jidXrp33 rVzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=KafWOMGWK8X6JFzASheiByq7Z/FAq/fmw8macWyMzZw=; b=hYxm00GRVuKdXkSIj0hyk5E6/BhvL5eZ/OuF2qV4zFkk8EWmHyuhQTG0/ZtAy6mkNa UQ1AanPI/CvhAGRToHKfSfU/n0xSqhBrSQTfyL5Q16pnwNm3RbpSVIKAuYp94I0TI7i7 V2APFdjOexUO/W1/HWcufWASIky+j9nSIqS7KI6MflfS6DErnE7ydGwOjIPXKCzqdSJA VTFjuQs5wHeJdHBqYhRdZ/V7q7fZVDJpdoFNESN64BVejtiipbx6HAY1IUKRmX0Isj6N xodlbzX4DQphxwwxY9DUsNcmseu61yHqjkc8RstFk13eZUK8KSuJeVfN3WPya1vPVmFQ I6IQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id hq37-20020a1709073f2500b0077087274a48si1603979ejc.257.2022.09.20.07.42.13; Tue, 20 Sep 2022 07:42:14 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id EE4D968BBBD; Tue, 20 Sep 2022 17:40:32 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 29E7E68BABC for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 99563C00BB for ; Tue, 20 Sep 2022 17:40:16 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:02 +0300 Message-Id: <20220920144013.4959-15-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 15/26] lavu/floatdsp: RISC-V V vector_fmul_reverse X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: O/TMbXxR2+5t From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 22 ++++++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index f164b1308f..9b8fd9942b 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -33,6 +33,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); +void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, + const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, @@ -52,6 +54,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; + fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; if (flags & AV_CPU_FLAG_RV_ZVE64D) { diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 0366009213..6a1304d24a 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -96,6 +96,28 @@ func ff_vector_fmul_add_rvv, zve32f ret endfunc +// (a0) = (a1) * reverse(a2) [0..a3-1] +func ff_vector_fmul_reverse_rvv, zve32f + add t3, a3, -1 + li t2, -4 // byte stride + slli t3, t3, 2 + add a2, a2, t3 +1: + vsetvli t0, a3, e32, m1, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + add a1, a1, t1 + vlse32.v v24, (a2), t2 + sub a2, a2, t1 + vfmul.vv v16, v16, v24 + sub a3, a3, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc + // (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] func ff_butterflies_float_rvv, zve32f 1: From patchwork Tue Sep 20 14:40:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38087 Delivered-To: 
ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1989413pzh; Tue, 20 Sep 2022 07:42:05 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7NGkDob9Y9VQDYzJ4ed7YgKYxdLod+5Lw4wcUKKd8DUdvMbsDSkXJC7Tt6x0xctXULEGdm X-Received: by 2002:aa7:cd49:0:b0:451:e570:8a82 with SMTP id v9-20020aa7cd49000000b00451e5708a82mr20319523edw.369.1663684924809; Tue, 20 Sep 2022 07:42:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684924; cv=none; d=google.com; s=arc-20160816; b=TXSZIrLDAG7vfDJrnOmo+1GlDdIoXzoIgIh6yYxTlKrT8U/95H4O1KnvOC+ttpzXoD 5kz291xvU8fAdrqQYzVW/8tC3dJF8fM9HxT2EeSQH2/7K8qXz5l480GZ0lM6pjZDuyYN lk7vtSgBmFfnzwfNXcyuMxDnsnI9s0tlT1j8yyg1RB2QJO3zvjV6hfhU6oEvfik/TdAl 3Cn7wYDbUsO+Vhif+mkcn9nyq4TZTbvhn/lUhA8fzhYLEkrvSYbB9b8/ftM5qpPlNecv VSov30O/nBQcigDMy9HHsdpKXeVFfx4mjJzhJeNpMMWXroS+ysRV+1M4HVYHOkBPVcxq vMaQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=DZ4AABOz1h4tg2WV9i+VMik+NfVXbBu/iGZmqne83gw=; b=OwOMBFeXD9k2thDAJBf7V8ScgmITVIOjWle6IWsEbzq9W1V0QBx717j+R5TQ5ZoB6E OfaHvk7X9JbJx3oS34SFFW337TcxeK6aZo/9R1DBWKDLJtnWHTn9/wAAoRLxkno1CNz0 eeCYeRVgBPnPb0qZYFU+E7Prkis16frTuRJI3OjkNY03KcBNQpXHsMMn9GzMbStQ3In1 YQyJI3DYF6CdTpTJdjYiYq/HDaFqkcqlQ6vE9YLGJ7G0S96RGlUvOQVodBtCi1vG6erD M4xMD7pZ42n77zJ+133XjUzdUFrKk9MI5wH+hjPYbKTFlerhljNNsfjEsox43p5+9j4n ZMgw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id sh13-20020a1709076e8d00b0073318653cb0si1624290ejc.759.2022.09.20.07.42.04; Tue, 20 Sep 2022 07:42:04 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F093268BBB4; Tue, 20 Sep 2022 17:40:31 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1EF9768BA9D for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id C2C6DC00BC for ; Tue, 20 Sep 2022 17:40:16 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:03 +0300 Message-Id: <20220920144013.4959-16-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 16/26] lavu/floatdsp: RISC-V V vector_fmul_window X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: sqaMRPnbyXh4 From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 35 ++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 9b8fd9942b..dacd81c08b 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -31,6 +31,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); +void ff_vector_fmul_window_rvv(float *dst, const float *src0, + const float *src1, const float *win, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, @@ -53,6 +55,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul = ff_vector_fmul_rvv; fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + fdsp->vector_fmul_window = ff_vector_fmul_window_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 6a1304d24a..84f675970c 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -76,6 +76,41 @@ NOHWF mv a2, a3 ret endfunc +func ff_vector_fmul_window_rvv, zve32f + // a0: dst, a1: src0, a2: src1, a3: window, a4: length + addi t0, a4, -1 + add t1, t0, a4 + slli t0, t0, 2 + slli t1, t1, 2 + add a2, a2, t0 + add t0, a0, t1 + add t3, a3, t1 + li t1, -4 // byte stride +1: + vsetvli t2, a4, e32, m1, ta, ma + slli t4, t2, 2 + vle32.v v16, (a1) + add a1, a1, t4 + vlse32.v v20, (a2), t1 + sub a2, a2, t4 + vle32.v v24, (a3) + add a3, a3, t4 + vlse32.v v28, (t3), t1 + sub t3, t3, t4 + vfmul.vv v0, v16, v28 + sub a4, a4, t2 + vfmul.vv v8, v16, v24 + vfnmsac.vv v0, v20, v24 + vfmacc.vv v8, v20, v28 + vse32.v v0, (a0) + add a0, a0, t4 + vsse32.v v8, (t0), t1 + sub t0, t0, t4 + bnez a4, 1b + + ret 
+endfunc + // (a0) = (a1) * (a2) + (a3) [0..a4-1] func ff_vector_fmul_add_rvv, zve32f 1: From patchwork Tue Sep 20 14:40:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38089 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1989608pzh; Tue, 20 Sep 2022 07:42:23 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7lbMIsCsgn3jaA2G45P9FwhVTpCxHfPzyLMR7baJxukk0+Zz6LMTWEntP7imEfdU60NYU8 X-Received: by 2002:a05:6402:520c:b0:451:4213:49db with SMTP id s12-20020a056402520c00b00451421349dbmr20659550edd.130.1663684943473; Tue, 20 Sep 2022 07:42:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684943; cv=none; d=google.com; s=arc-20160816; b=CvqopjTPi6HN//9JIrcUwpFSV2barjZEsa0xz6TuqGUD2/wrs6aGcHHDyy8O5zHkOA BF3PoK0bKsdrO/hC1iSYSKlZr80swnwIQ3DGGE9LVzg0Jqb4STpiVhdj7zPzK5i3ai0z yK/SucCJI3RxKGu9d335Hw02aYD5u/3f0Yhze8w6r5lVWfDpsCfdx/aA8jP/a9OaeJAe L+2BN+JrZTwMNjmnAOvhbE376wHHMUyRkYc7j/JrAXIgXzVEfFtPDHI0iqVVJGL+Px67 Jf07zIJjrM+L6KsR9pmzr2wA8IBQ1ivddnOO0bO1FFDV8x0nRz2g4sipwrQwwTM+A2xC 3O4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=wBEZVmdHlBj6k0wTdIOHcOQI+ZJwYhvbMJAgMGbnWqY=; b=IBgX14+u9hPw9dUchTYJMtatz1gGkwjt9NJzB7+oEOolPMiW8G6PmjdFn8DLInIbab y8D7NtwO/i5firRcHTeLAU0PienT0IAzIIkLld7JcAXwrrlWdWwu3DNm5hZ1RT86ffQD XJdfYUOifZDLbZc18VFihmFbv9nGPFlZP81tA2o973nwkSlB8fHQqvnOItzBW6hEKSOK 8rKkDSXVx6q/vGThgwSYtAmeJzRBTF30WPjXzD4w991J+afUAEjbVnY6it9plKhnzSrf ykSloNW6mTz0oS/nHv/NeFJVndStTYlYnoHm9+Jlri5WKB/6dsGKSv2FNambioZx9uqr nG3g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org 
designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id m22-20020a509996000000b0044c28ad39f8si219057edb.238.2022.09.20.07.42.22; Tue, 20 Sep 2022 07:42:23 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E541F68BB07; Tue, 20 Sep 2022 17:40:33 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3084668B800 for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id EC68FC00BD for ; Tue, 20 Sep 2022 17:40:16 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:04 +0300 Message-Id: <20220920144013.4959-17-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 17/26] lavu/floatdsp: RISC-V V scalarproduct_float X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 1tLMUppJjekG From: Rémi Denis-Courmont --- 
libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 21 +++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index dacd81c08b..cc9b7e83dc 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -38,6 +38,7 @@ void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); +float ff_scalarproduct_float_rvv(const float *v1, const float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -59,6 +60,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; + fdsp->scalarproduct_float = ff_scalarproduct_float_rvv; if (flags & AV_CPU_FLAG_RV_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 84f675970c..48a44b8150 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -172,6 +172,27 @@ func ff_butterflies_float_rvv, zve32f ret endfunc +// a0 = (a0).(a1) [0..a2-1] +func ff_scalarproduct_float_rvv, zve32f + vsetvli zero, zero, e32, m1, ta, ma + vmv.s.x v8, zero +1: + vsetvli t0, a2, e32, m1, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + add a0, a0, t1 + vle32.v v24, (a1) + add a1, a1, t1 + vfmul.vv v16, v16, v24 + sub a2, a2, t0 + vfredusum.vs v8, v16, v8 + bnez a2, 1b + + vfmv.f.s fa0, v8 +NOHWF fmv.x.w a0, fa0 + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv, zve64d 1: From patchwork Tue Sep 20 14:40:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: 
=?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38086 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1989332pzh; Tue, 20 Sep 2022 07:41:56 -0700 (PDT) X-Google-Smtp-Source: AMsMyM77qYbTHoNPaCBu/k3LE4mQ3coLv55CXaYNJ+b+/md6pgYvWOBZ2z+pgyrsdBhJwZvW08GX X-Received: by 2002:aa7:d5c8:0:b0:44e:3eb1:a13f with SMTP id d8-20020aa7d5c8000000b0044e3eb1a13fmr20991526eds.220.1663684916362; Tue, 20 Sep 2022 07:41:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684916; cv=none; d=google.com; s=arc-20160816; b=y/2MjW/yVphLHPjN3mdh4eqoVwVPzhMz3lrn75uWB95Mv8sQVM5WAeSUhClstKzZor sQn2BhLgm653RUIsfkaQSLJLOEvIR/5JGzt9YNVh+nWzIgysfWn+61/GosAXTYvghNih Q4EaE4fXUiMUvcoavhvRYablg2Hiqd44TBWOO30T6iJmdwprnkoGze8jolYgJMVGvm7b +4Qp0ojjgxAKdKbLdskrUO5ztFU9lCemaRJBMQ6NHkif88rSd4fLGWC4cLsJSeYVsjGv 09ez70FVeleuhAsWY9gss73N06r+UdWmJ8fdj3ZEUuPLzZKcKSoXO7QwoWsbCJyZW2Sl 1Pjg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=2p2DuqwnqB1JNgvbWwA9ERzunShnKYZuHdGvyCZE67g=; b=d5loDwNumT5JKHgB6pedCfbdJgb2srwtdiGGBNJNuObrqpmAeEStqxME/Eja3HhDBv OVeZxnYrbeiICuT14ktreRjaGWgXdaeShGbfNkOTt8QCxO+gQJl+grMW/CS4SNJf+yXh aDNRB/UD/RI5mvVuczxd5rDegHBTdTvqIFepJSn33TNq1UjZwQLsKBi7P+DZJTAB1W9o R3HabQNRC1lOPsm7Hhaclee3L7xJAPJlfT6O/qFMF5Nn3InvQSTqfK6ILCZlwkSc6MzA kIiUXwczXgEG62wqzJi3g6WIlW9l7LxxDWnj23Jg/+IMXfMLzdPrnL2uNC6ESeypWwV0 3W/A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id nb23-20020a1709071c9700b0077fd47147d9si233280ejc.135.2022.09.20.07.41.55; Tue, 20 Sep 2022 07:41:56 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E367168BB9F; Tue, 20 Sep 2022 17:40:30 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 25AEC68BA9F for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 21D87C00BE for ; Tue, 20 Sep 2022 17:40:17 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:05 +0300 Message-Id: <20220920144013.4959-18-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 18/26] lavu/fixeddsp: RISC-V V butterflies_fixed X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: QJyPuwx2ZI2J From: Rémi Denis-Courmont --- libavutil/fixed_dsp.c | 4 +++- libavutil/fixed_dsp.h | 1 + libavutil/riscv/Makefile | 4 +++- libavutil/riscv/fixed_dsp_init.c | 38 +++++++++++++++++++++++++++++ 
libavutil/riscv/fixed_dsp_rvv.S | 41 ++++++++++++++++++++++++++++++++ 5 files changed, 86 insertions(+), 2 deletions(-) create mode 100644 libavutil/riscv/fixed_dsp_init.c create mode 100644 libavutil/riscv/fixed_dsp_rvv.S diff --git a/libavutil/fixed_dsp.c b/libavutil/fixed_dsp.c index 154f3bc2d3..bc847949dc 100644 --- a/libavutil/fixed_dsp.c +++ b/libavutil/fixed_dsp.c @@ -162,7 +162,9 @@ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int bit_exact) fdsp->butterflies_fixed = butterflies_fixed_c; fdsp->scalarproduct_fixed = scalarproduct_fixed_c; -#if ARCH_X86 +#if ARCH_RISCV + ff_fixed_dsp_init_riscv(fdsp); +#elif ARCH_X86 ff_fixed_dsp_init_x86(fdsp); #endif diff --git a/libavutil/fixed_dsp.h b/libavutil/fixed_dsp.h index fec806ff2d..1217d3a53b 100644 --- a/libavutil/fixed_dsp.h +++ b/libavutil/fixed_dsp.h @@ -161,6 +161,7 @@ typedef struct AVFixedDSPContext { */ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int strict); +void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp); void ff_fixed_dsp_init_x86(AVFixedDSPContext *fdsp); /** diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile index 89a8d0d990..1597154ba5 100644 --- a/libavutil/riscv/Makefile +++ b/libavutil/riscv/Makefile @@ -1,3 +1,5 @@ OBJS += riscv/float_dsp_init.o \ + riscv/fixed_dsp_init.o \ riscv/cpu.o -RVV-OBJS += riscv/float_dsp_rvv.o +RVV-OBJS += riscv/float_dsp_rvv.o \ + riscv/fixed_dsp_rvv.o diff --git a/libavutil/riscv/fixed_dsp_init.c b/libavutil/riscv/fixed_dsp_init.c new file mode 100644 index 0000000000..4075e521f2 --- /dev/null +++ b/libavutil/riscv/fixed_dsp_init.c @@ -0,0 +1,38 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavutil/fixed_dsp.h" + +void ff_butterflies_fixed_rvv(int *v1, int *v2, int len); + +av_cold void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32X) + fdsp->butterflies_fixed = ff_butterflies_fixed_rvv; +#endif +} diff --git a/libavutil/riscv/fixed_dsp_rvv.S b/libavutil/riscv/fixed_dsp_rvv.S new file mode 100644 index 0000000000..9890a980f6 --- /dev/null +++ b/libavutil/riscv/fixed_dsp_rvv.S @@ -0,0 +1,41 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "asm.S" + +// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] +func ff_butterflies_fixed_rvv, zve32x +1: + vsetvli t0, a2, e32, m1, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + vle32.v v24, (a1) + vadd.vv v0, v16, v24 + vsub.vv v8, v16, v24 + sub a2, a2, t0 + vse32.v v0, (a0) + add a0, a0, t1 + vse32.v v8, (a1) + add a1, a1, t1 + bnez a2, 1b + + ret +endfunc From patchwork Tue Sep 20 14:40:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38091 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1989832pzh; Tue, 20 Sep 2022 07:42:42 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4m1WXql/29JC5ypV/DXfrOgf2zGjM64WLgEVIMwDndjcZmWE866qAjWiN8V9O8Z/Arm+Aw X-Received: by 2002:a17:907:2d88:b0:781:44ff:443f with SMTP id gt8-20020a1709072d8800b0078144ff443fmr8751602ejc.358.1663684962231; Tue, 20 Sep 2022 07:42:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684962; cv=none; d=google.com; s=arc-20160816; b=dHhkzpqnk0+fJU30U6pskny0YmuIKHH5zZ44ju0f9unrj6wcoZD55sascjVtn2U3yO bb3CVXMLvs3zSfBJKnDHUxhHeFimvlzzn8BFUbWS8O7MAhM5AYp1iUpoey7uY0etwfAz qn24unnnib1VWDwVvJH+XJ0lwg9fxF9K0qLD1AevAg7oVJwk9t+z4PiMN4biD5WlsV01 ESKsS5xLa7Y8PBV74SAtaOgIfjWr+CSvLM7mO4xYtx+/ynFdVD6o7+/o2bMqXYk+RBNP 0yvcPnmevwHfjnJrJM7ToYObVeQtJYCt/N0TA0kkVwTJ+RjQ7Ru8XgPDgFJHir2C7slA Iyyw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id 
:date:to:from:delivered-to; bh=1RYK1tQgLNxt8Pxer3ctLUAd+AgIEG36x5FafxYTHN4=; b=mELXzkTYUT94Aa1XZmRDohs950wvnfXWowdnpKS6rG8yQR6Sdge8bjllMD9GVkm5od P2IquLhLKH2AJJPx1t4oaUpA9QY7yd3OuVcDg8YR6W7FbQx6GT1MOYzpHL+zzl96ytJM fuUuKNaLco0QAqEnwRp5d/MyE8xLTqJA2SzU9oEWeH36uRb+wel5F0hfzBiqPEsEDAHx dyXdb8HrHFjNfwDGrFNHlD7XICt6AA99vi7E2wFW5Pg1z1OaYREGf04tZsJQl2xlVGgm TCQTJFgW4gBvPqzotI335pdV2atxV6jbkwGZDYuYhUwrgmCd7Mb2ql4L1vdmjhn5UF4n euTA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id t25-20020a170906065900b0073d8830e4c7si1081505ejb.954.2022.09.20.07.42.41; Tue, 20 Sep 2022 07:42:42 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 805B168BA3B; Tue, 20 Sep 2022 17:40:37 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3D05468BA6F for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 4BFB6C00BF for ; Tue, 20 Sep 2022 17:40:17 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:06 +0300 Message-Id: <20220920144013.4959-19-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> 
References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 19/26] lavc/audiodsp: RISC-V V vector_clip_int32 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: y84zkKGNqSag From: Rémi Denis-Courmont --- libavcodec/riscv/Makefile | 1 + libavcodec/riscv/audiodsp_init.c | 9 ++++++++ libavcodec/riscv/audiodsp_rvv.S | 37 ++++++++++++++++++++++++++++++++ 3 files changed, 47 insertions(+) create mode 100644 libavcodec/riscv/audiodsp_rvv.S diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index da07f1fe96..99541b075e 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,4 +1,5 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o +RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c index c5842815d6..ce8b60ee52 100644 --- a/libavcodec/riscv/audiodsp_init.c +++ b/libavcodec/riscv/audiodsp_init.c @@ -18,16 +18,25 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "config.h" + #include "libavutil/attributes.h" #include "libavutil/cpu.h" #include "libavcodec/audiodsp.h" void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); +void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min, + int32_t max, unsigned int len); + av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) { int flags = av_get_cpu_flags(); if (flags & AV_CPU_FLAG_RVF) c->vector_clipf = ff_vector_clipf_rvf; +#if HAVE_RVV + if (flags & AV_CPU_FLAG_RV_ZVE32X) + c->vector_clip_int32 = 
ff_vector_clip_int32_rvv; +#endif } diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S new file mode 100644 index 0000000000..26b3cdffcf --- /dev/null +++ b/libavcodec/riscv/audiodsp_rvv.S @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_vector_clip_int32_rvv, zve32x +1: + vsetvli t0, a4, e32, m1, ta, ma + vle32.v v8, (a1) + slli t1, t0, 2 + vmax.vx v8, v8, a2 + add a1, a1, t1 + vmin.vx v8, v8, a3 + sub a4, a4, t0 + vse32.v v8, (a0) + add a0, a0, t1 + bnez a4, 1b + + ret +endfunc From patchwork Tue Sep 20 14:40:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38095 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990193pzh; Tue, 20 Sep 2022 07:43:17 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4mgdzxJXvBu0b3XKU9xPs2CrsXYvntxofE6trAtFILS0UPNO3UVekZnDpmRvV8ymMDg90O X-Received: by 2002:a17:907:3d8e:b0:77b:fd55:affe with SMTP id he14-20020a1709073d8e00b0077bfd55affemr16746912ejc.498.1663684997707; Tue, 20 Sep 2022 07:43:17 -0700 (PDT) 
ARC-Seal: i=1; a=rsa-sha256; t=1663684997; cv=none; d=google.com; s=arc-20160816; b=u/uhfF/QUjskRba6tW5GaX+kiZGXE/XRmI146bawdY1Nwh+xmVCrDplXqPL8egBoeJ FPVMB7sfCnctU5mP+6uz1lEZZ+N7STwwv+1mgN5EiSfa1gyRb9HeokBdSvCrOoXx2PFe wwQFLXR+LNSDrCrHxxi6eoW4ehPUaEl57m1dQmQqvWB3/EzSVrUW0sES6XOS1JVGqi82 KKppHyoK96SSEuudjM3ZPRI+CYTSYQBgjcQbgxNBvYrJQyXTjHK7+rB7+EzvsWfdxSUu 5Eag3QTKumc3tPnmN8jmRzWL9W+oGmZp7PpR/L18uxlrwhE4sVTyOBexcduqMXO/d/62 mV9w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ADDywSDejfYw0dKAs6QAyLAiDCXPl3RqNuTNw2xmbrk=; b=BFFJWMEpukAGXXLx45emPlvmaYYei7EMif4+gRHEmACJ3frS+4d2qecokv5KlHypOf QrZ2Jazin/RkiBw3a+KyzOEf6O2YMz5wTFCFuQQMETUv0tvY5oc4r0vQlIXUIyjf4V2h KyYCElVbdhdQgLdaAjZVvr+yfD8srsnrMyXF3vH/Ztq/OStkSJUfKivoEYDFmLUbYe/1 38iwavAs6fYCfmbndaJ0pm8SeCxM3yAJiZisGOOmQABrMZh/ESx3fN58c8QoXose8v9N vTpsVDqgoDxL5kMyFgeyECj8mOCOGLH9rt5K2/CsJKxnAv6zwI7PABBm6dSz+gWAnF3D sNoA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id kd4-20020a17090798c400b00780e89aecd1si1094319ejc.849.2022.09.20.07.43.17; Tue, 20 Sep 2022 07:43:17 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3AB6168BAF2; Tue, 20 Sep 2022 17:40:42 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4191668BB07 for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 7564FC00C0 for ; Tue, 20 Sep 2022 17:40:17 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:07 +0300 Message-Id: <20220920144013.4959-20-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 20/26] lavc/audiodsp: RISC-V V vector_clipf X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: JYcBwKZqC+wH From: Rémi Denis-Courmont --- libavcodec/riscv/audiodsp_init.c | 7 ++++++- libavcodec/riscv/audiodsp_rvv.S | 18 ++++++++++++++++++ 2 files changed, 24 insertions(+), 1 deletion(-) diff --git 
a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c index ce8b60ee52..ddd561484f 100644 --- a/libavcodec/riscv/audiodsp_init.c +++ b/libavcodec/riscv/audiodsp_init.c @@ -26,6 +26,7 @@ void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); +void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float max); void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min, int32_t max, unsigned int len); @@ -36,7 +37,11 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) if (flags & AV_CPU_FLAG_RVF) c->vector_clipf = ff_vector_clipf_rvf; #if HAVE_RVV - if (flags & AV_CPU_FLAG_RV_ZVE32X) + if (flags & AV_CPU_FLAG_RV_ZVE32X) { c->vector_clip_int32 = ff_vector_clip_int32_rvv; + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + c->vector_clipf = ff_vector_clipf_rvv; + } #endif } diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S index 26b3cdffcf..e5a09f3b19 100644 --- a/libavcodec/riscv/audiodsp_rvv.S +++ b/libavcodec/riscv/audiodsp_rvv.S @@ -35,3 +35,21 @@ func ff_vector_clip_int32_rvv, zve32x ret endfunc + +func ff_vector_clipf_rvv, zve32f +NOHWF fmv.w.x fa0, a3 +NOHWF fmv.w.x fa1, a4 +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v8, (a1) + slli t1, t0, 2 + vfmax.vf v8, v8, fa0 + add a1, a1, t1 + vfmin.vf v8, v8, fa1 + sub a2, a2, t0 + vse32.v v8, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc From patchwork Tue Sep 20 14:40:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38096 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990259pzh; Tue, 20 Sep 2022 07:43:27 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5pxo1XYqX8Z/Ec1bPmJP0MrNZ2ZAntnyilMx2HpreJTw6R3IrFvQeLR5LTUyOiFWNsv5pK X-Received: by 2002:a17:907:94c6:b0:77d:7ad3:d065 with SMTP id 
dn6-20020a17090794c600b0077d7ad3d065mr17495964ejc.149.1663685006803; Tue, 20 Sep 2022 07:43:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663685006; cv=none; d=google.com; s=arc-20160816; b=btYqWQLkOyXNUEJ9zsr30UuIGKUg/Fx/T8RAu9ZVU4l4Us6tTF8YmuDXqs4jqcNtfx TosR4uvWPOxQMLohEDXwr7H/M3W5fb+Y5Xqfbch87UEAA0tA8iiDIAzQaf7hmJBZdE/j xeSt1Gxh9zQwLLVEa3bkiObRqF/RD01urDCV3ORqRkumc528y5SfAhcjJGIzQoLnzJtT 5jbdBOYRavyO90e9jp2E8U5iuexiYK4/5w5aZtdWAwh7KOyoDS7bFfzvhO8eMrPuFdhZ 6ZMNmeNJt1JyjaqroqCpTmNwTawoVk8g6wj9V68y2gF19MijfaOsx/Lwo/SpL5kqrY9L wQaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=RpiM/MkQbA2qSc2usy+Qo3R1uEm98crrbp5FFbdV3bM=; b=wjWxlnUBcbqY9I0KdqAiM5mcSt19ElcSeQmYGO3LSe4MmO0D4H3GAkF+s6gMw2Z1l/ brVuSG9a69XGeXX/sKi79aPN6mUwStmGYRjFi8lYGPV58Jb9Mu1y0m/arqOA9JsS0Ly/ ip/m7lfKdxtD3wbf7gXuidLop7E5qETGxx8d6gn/8iJxYQLkvmgYW4jRjcq/iP7Rg/+S 2mXo8hIYga/uuu4xdxJDBVROrNhuRBfggAeU3q+N5/fb5q5+JeXdMTumsdmFzZTet38Q BdmhhA41gRFOVnj8Bbvm/YHcMrJrYwAI3sU3bR4ya0PRyLcroH8TigimUxe+t8m70LPx R1Kg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id a22-20020a056402169600b004542c9947c1si226601edv.217.2022.09.20.07.43.26; Tue, 20 Sep 2022 07:43:26 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3A38468BBF2; Tue, 20 Sep 2022 17:40:43 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 489E468BB0F for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 9EB7BC00C1 for ; Tue, 20 Sep 2022 17:40:17 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:08 +0300 Message-Id: <20220920144013.4959-21-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 21/26] lavc/audiodsp: RISC-V V scalarproduct_int16 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: +aCeTzNzRNes From: Rémi Denis-Courmont --- libavcodec/riscv/audiodsp_init.c | 2 ++ libavcodec/riscv/audiodsp_rvv.S | 20 ++++++++++++++++++++ 2 files changed, 22 insertions(+) diff --git a/libavcodec/riscv/audiodsp_init.c 
b/libavcodec/riscv/audiodsp_init.c index ddd561484f..6f38b7bc83 100644 --- a/libavcodec/riscv/audiodsp_init.c +++ b/libavcodec/riscv/audiodsp_init.c @@ -29,6 +29,7 @@ void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float max); void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min, int32_t max, unsigned int len); +int32_t ff_scalarproduct_int16_rvv(const int16_t *v1, const int16_t *v2, int len); av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) { @@ -38,6 +39,7 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) c->vector_clipf = ff_vector_clipf_rvf; #if HAVE_RVV if (flags & AV_CPU_FLAG_RV_ZVE32X) { + c->scalarproduct_int16 = ff_scalarproduct_int16_rvv; c->vector_clip_int32 = ff_vector_clip_int32_rvv; if (flags & AV_CPU_FLAG_RV_ZVE32F) diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S index e5a09f3b19..852ae1dc1f 100644 --- a/libavcodec/riscv/audiodsp_rvv.S +++ b/libavcodec/riscv/audiodsp_rvv.S @@ -20,6 +20,26 @@ #include "libavutil/riscv/asm.S" +func ff_scalarproduct_int16_rvv, zve32x + vsetivli zero, 1, e32, m1, ta, ma + vmv.s.x v8, zero +1: + vsetvli t0, a2, e16, m1, ta, ma + vle16.v v16, (a0) + slli t1, t0, 1 + vle16.v v24, (a1) + sub a2, a2, t0 + vwmul.vv v0, v16, v24 + add a0, a0, t1 + vsetvli zero, t0, e32, m2, ta, ma + vredsum.vs v8, v0, v8 + add a1, a1, t1 + bnez a2, 1b + + vmv.x.s a0, v8 + ret +endfunc + func ff_vector_clip_int32_rvv, zve32x 1: vsetvli t0, a4, e32, m1, ta, ma From patchwork Tue Sep 20 14:40:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38093 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990018pzh; Tue, 20 Sep 2022 07:43:00 -0700 (PDT) X-Google-Smtp-Source:
AMsMyM4SqpQ/Cm0rC67rL7isCIBW3iPNK9tTXGmp3maDxR8Ml7Obwj5ZGdzOb4H/o6DP/HPxfsxE X-Received: by 2002:a17:907:7638:b0:76f:cad4:f176 with SMTP id jy24-20020a170907763800b0076fcad4f176mr17295426ejc.647.1663684980098; Tue, 20 Sep 2022 07:43:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684980; cv=none; d=google.com; s=arc-20160816; b=U5DhOvpuuIuPFpRxHTyWpo0p/GeqdlAt6mgJob3POFQNzqd8bawUwChRE4EMeVfbm5 6bMUtNV0ePJuu7xdxBY3wTsKQe+9+MN/lF9MTrjRZDWXPGPtmSu2a+vEBaSvfotVE2t+ hHAPloBp/4j5rzWS8t5R68l/WoEYrAeyfHumrw0vtZquA9zvGgdMHaB975d+tXhWhdGD DLgOXHYoowg8ag6tVKGKZvgbKrU7F865n0+ImMdx4IrhKaFbqULA+xlqIq7icRoX5V4o osCtik3x3a0MQWG395vNNkWUrVLmBvNrJDNyOocn/b/u4bZnRAmv1an+XIWNwRuRQyWr OZtw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=e8kJLQ9ohdrsVei0me80RixMmWYYZRHxijmjUPEo3uI=; b=yrcvLVP0Fs+OmWkYYkTUfzyYqWZ091T7CDrbBVG87HwxDbZCWOJhKfDicKwB85Wrz8 ewJFLwJaJj1luDw+4wsQ5Ut53f2u/k8TnOAvSN6hogJGgFnkaDQbpRs3ZvvNveIKyLvE IMitzpgwZFzVUbhLqtx/4gbjTwY3AYr6+CZGL5g1d/7Cgu5eGdvosGLxgCFi37nQFW/F Azh9pMFdJWJ5BFgQ2U5BUvnpl4cmjZIY6cShG9o3dzG8UucrXD74JKgA71UWGBG0zqJG 1fQ8YiyvwJP432nFwIJwnNhygJeeE3mnRbyTRCzww54UBb3rPlAl8M5bJqa/aGUnLVzH kKuA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id k15-20020a1709061c0f00b007707edd5487si1268255ejg.947.2022.09.20.07.42.59; Tue, 20 Sep 2022 07:43:00 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id EB62B68BB25; Tue, 20 Sep 2022 17:40:39 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3E26268BAFC for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id C8680C00C2 for ; Tue, 20 Sep 2022 17:40:17 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:09 +0300 Message-Id: <20220920144013.4959-22-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 22/26] lavc/fmtconvert: RISC-V V int32_to_float_fmul_scalar X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: Jb6N5PbT9OBz From: Rémi Denis-Courmont --- libavcodec/fmtconvert.c | 2 ++ libavcodec/fmtconvert.h | 1 + libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/fmtconvert_init.c | 41 ++++++++++++++++++++++++++++++ 
libavcodec/riscv/fmtconvert_rvv.S | 40 +++++++++++++++++++++++++++++ 5 files changed, 86 insertions(+) create mode 100644 libavcodec/riscv/fmtconvert_init.c create mode 100644 libavcodec/riscv/fmtconvert_rvv.S diff --git a/libavcodec/fmtconvert.c b/libavcodec/fmtconvert.c index 00f55f8f1e..8dcef82ee3 100644 --- a/libavcodec/fmtconvert.c +++ b/libavcodec/fmtconvert.c @@ -53,6 +53,8 @@ av_cold void ff_fmt_convert_init(FmtConvertContext *c, AVCodecContext *avctx) ff_fmt_convert_init_arm(c, avctx); #elif ARCH_PPC ff_fmt_convert_init_ppc(c, avctx); +#elif ARCH_RISCV + ff_fmt_convert_init_riscv(c, avctx); #elif ARCH_X86 ff_fmt_convert_init_x86(c, avctx); #endif diff --git a/libavcodec/fmtconvert.h b/libavcodec/fmtconvert.h index b2df7a9629..ca51d0861a 100644 --- a/libavcodec/fmtconvert.h +++ b/libavcodec/fmtconvert.h @@ -61,6 +61,7 @@ void ff_fmt_convert_init(FmtConvertContext *c, AVCodecContext *avctx); void ff_fmt_convert_init_aarch64(FmtConvertContext *c, AVCodecContext *avctx); void ff_fmt_convert_init_arm(FmtConvertContext *c, AVCodecContext *avctx); void ff_fmt_convert_init_ppc(FmtConvertContext *c, AVCodecContext *avctx); +void ff_fmt_convert_init_riscv(FmtConvertContext *c, AVCodecContext *avctx); void ff_fmt_convert_init_x86(FmtConvertContext *c, AVCodecContext *avctx); void ff_fmt_convert_init_mips(FmtConvertContext *c); diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 99541b075e..682174e875 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,5 +1,7 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o +OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o +RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c new file mode 100644 index 0000000000..aa7d44ff5a --- /dev/null +++ 
b/libavcodec/riscv/fmtconvert_init.c @@ -0,0 +1,41 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include <stdint.h> + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/avcodec.h" +#include "libavcodec/fmtconvert.h" + +void ff_int32_to_float_fmul_scalar_rvv(float *dst, const int32_t *src, + float mul, int len); + +av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c, + AVCodecContext *avctx) +{ +#ifdef HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_rvv; +#endif +} diff --git a/libavcodec/riscv/fmtconvert_rvv.S b/libavcodec/riscv/fmtconvert_rvv.S new file mode 100644 index 0000000000..c19b77e38a --- /dev/null +++ b/libavcodec/riscv/fmtconvert_rvv.S @@ -0,0 +1,40 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version.
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_int32_to_float_fmul_scalar_rvv, zve32f +NOHWF fmv.w.x fa0, a2 +NOHWF mv a2, a3 +1: + vsetvli t0, a2, e32, m1, ta, ma + vle32.v v24, (a1) + slli t1, t0, 2 + vfcvt.f.x.v v24, v24 + sub a2, a2, t0 + vfmul.vf v24, v24, fa0 + add a1, a1, t1 + vse32.v v24, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc From patchwork Tue Sep 20 14:40:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38092 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1989941pzh; Tue, 20 Sep 2022 07:42:51 -0700 (PDT) X-Google-Smtp-Source: AMsMyM67MsruxQIF/4b4u1NrfOfOEChOnMeHLszd0pPs2RgxGqE0oITkq14g3CqgKB0fxw62EcH9 X-Received: by 2002:a05:6402:1e92:b0:451:dcf:641d with SMTP id f18-20020a0564021e9200b004510dcf641dmr20728784edf.335.1663684971804; Tue, 20 Sep 2022 07:42:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684971; cv=none; d=google.com; s=arc-20160816; b=lrztVvQ3Q7r9aHUODxkSUN2faqjcjZ0yGeVWTy67/NqNb9AP3oFWF2fs7mS/7AAK98 u9pl1ImG0I4nl1cfgzLRhgmWZqltPKGvYuqGW2WtIc+I7uBDvq+0WDZ/kjWETsF9Qud8 QKCrBhc7SMEZhqTZsG1SPfg3n6mwIJSDI7dp6rQ/AOjDe+tkGV2YNDua1Sf5aV2IPMdJ Abgni/4vc9Z+ysyhjKujMyaIsszWlc0S1M6DPE/k4+cKGW11MKjB07lFsSON6EBClyee IY//jm2c7M/LcLxrvW2J+Nf+OJZ9ikOEbkDM2Gu1XwtfHie0JvYgGRF5EKH4r6zp3jb5 dcpg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; 
d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=5AYvNhvxTd3NJr7LtOocIxpBTiHkHVXQTgTkhhm6x8s=; b=vUI4weokqJLsmkwzDgG4+AHy/t3xpJSXV1myN1UWgbrwzJDCfXC7nIVW8REhXGKijE j/RwHApkbXXOq+/NvfmiMWQ5AQgEO6Q5e5dEthK/zIRR20iapAA8Ch86jHV1/Ezyenjj NoZCsnW6S6oR67R3nu5+VNec80uUVv0lvAXRRfQqWxgUqocscrYIttiSvuttVyx6N0Lc wkkiK/qinfEuG0LbAmUS28YX3aOPYSTuAlqOhyvp+XhRo/brB/ppJl4xg0WHssj4KFig 6qAJIJTPgakeKxv7yocTVP5r8qCdBVZh08w7PtnO20LOA3G+DMUdS4SgNOtlbB40NQuP hXTQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id s7-20020a056402520700b0044efcaeec2asi277991edd.167.2022.09.20.07.42.50; Tue, 20 Sep 2022 07:42:51 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id CF29168B73A; Tue, 20 Sep 2022 17:40:38 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3E76F68BB01 for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id F2578C00C3 for ; Tue, 20 Sep 2022 17:40:17 +0300 (EEST) 
From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:10 +0300 Message-Id: <20220920144013.4959-23-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 23/26] lavc/fmtconvert: RISC-V V int32_to_float_fmul_array8 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: /ka+0ivnda93 From: Rémi Denis-Courmont --- libavcodec/riscv/fmtconvert_init.c | 7 ++++++- libavcodec/riscv/fmtconvert_rvv.S | 29 +++++++++++++++++++++++++++++ 2 files changed, 35 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c index aa7d44ff5a..1bc1dfda9a 100644 --- a/libavcodec/riscv/fmtconvert_init.c +++ b/libavcodec/riscv/fmtconvert_init.c @@ -28,6 +28,9 @@ void ff_int32_to_float_fmul_scalar_rvv(float *dst, const int32_t *src, float mul, int len); +void ff_int32_to_float_fmul_array8_rvv(FmtConvertContext *c, float *dst, + const int32_t *src, const float *mul, + int len); av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c, AVCodecContext *avctx) @@ -35,7 +38,9 @@ av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c, #ifdef HAVE_RVV int flags = av_get_cpu_flags(); - if (flags & AV_CPU_FLAG_RV_ZVE32F) + if (flags & AV_CPU_FLAG_RV_ZVE32F) { c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_rvv; + c->int32_to_float_fmul_array8 = ff_int32_to_float_fmul_array8_rvv; + } #endif } diff --git a/libavcodec/riscv/fmtconvert_rvv.S b/libavcodec/riscv/fmtconvert_rvv.S index c19b77e38a..b472be0505 100644 --- a/libavcodec/riscv/fmtconvert_rvv.S +++ b/libavcodec/riscv/fmtconvert_rvv.S 
@@ -38,3 +38,32 @@ NOHWF mv a2, a3 ret endfunc + +func ff_int32_to_float_fmul_array8_rvv, zve32f + srai a4, a4, 3 + +1: vsetvli t0, a4, e32, m1, ta, ma + vle32.v v24, (a3) + slli t1, t0, 2 + vlseg8e32.v v16, (a2) + slli t2, t0, 2 + 3 + vsetvli t3, zero, e32, m8, ta, ma + vfcvt.f.x.v v16, v16 + add a3, a3, t1 + vsetvli t0, a4, e32, m1, ta, ma + vfmul.vv v16, v16, v24 + add a2, a2, t2 + vfmul.vv v17, v17, v24 + sub a4, a4, t0 + vfmul.vv v18, v18, v24 + vfmul.vv v19, v19, v24 + vfmul.vv v20, v20, v24 + vfmul.vv v21, v21, v24 + vfmul.vv v22, v22, v24 + vfmul.vv v23, v23, v24 + vsseg8e32.v v16, (a1) + add a1, a1, t2 + bnez a4, 1b + + ret +endfunc From patchwork Tue Sep 20 14:40:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38094 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990116pzh; Tue, 20 Sep 2022 07:43:08 -0700 (PDT) X-Google-Smtp-Source: AMsMyM67UtAA1Kz2aC7vJZSmbQb8gv44PZao2GpSx7fQQFRA2l17IONOAoYVBDVtAlDj2aluM+9Y X-Received: by 2002:a05:6402:f0f:b0:451:1ecd:a61f with SMTP id i15-20020a0564020f0f00b004511ecda61fmr20161952eda.125.1663684988471; Tue, 20 Sep 2022 07:43:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663684988; cv=none; d=google.com; s=arc-20160816; b=i8TMXMEnGJoO1BWCwZ2HBaIG3OnCOAWF/slMUFHHYu028kK/h21vY3oI8ZXdtYqiTu ssyDaLN59tEWwVywVssbKsxAmYTJ1LRUHoFe1mOxmEIxGBVwBGwArtBh85Qn+ZgdJAyZ 0FqzziwquuJM8ds6wfkSSjKhUvqENmGaziXSBOAf8tsGgX9ASKktV6WlvVrdKVwzNrU9 C6Dk9M9gqquvVGQ11xRF4okMnMMq10hHjUWhsofF05kgSdd+N8a/Zw3xGbQSEyJTTStJ IKM5SBKbyX3Tbd+pKu5+70zWt8DNgOEOiw3tZpomWncIBG60is9RfgdXhPTfllMv4GQv BC+A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id 
:precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=V/VGqwrFqa1YlaWY99T8g1nHqBeR9R+P8203NiUHIMw=; b=OT/AS83ItlJ94rnOVAnyQJZX0HXf3wtJoiABJWuY8HxPDfdHupJA2NmTK9DhUFNivp hBWQ+RVPKmtWc2yHk0L2QbURhuulaM444YYuXgS+E48puaAMmcymrdDt8OdttzltCloU NCH+c2MSXvlvu0BLZyg83EXHy5dVdQAw0BZ32ED5GPpgPqQFZ7ErmuEnL4hbucyqSF5x CQBmo2NZ4TYFuzt4YeJBAsCxHyacUExl8OoXrd3i0ALGwz2uJ+oS251HL/DiH3ww0Cc4 OdcXGyLSCUEEdOxMp1387YEWmCZDcF7+chxEDx6FGECNZS1MYxCRPkpVTMNrGodNhNHJ yzSg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id gn12-20020a1709070d0c00b00781e9949008si663735ejc.453.2022.09.20.07.43.08; Tue, 20 Sep 2022 07:43:08 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 373B968BB41; Tue, 20 Sep 2022 17:40:41 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3FDBD68B957 for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 27DB3C00C4 for ; Tue, 20 Sep 2022 17:40:18 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:11 +0300 Message-Id: <20220920144013.4959-24-remi@remlab.net> X-Mailer: 
git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: <5602865.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 24/26] lavc/vorbisdsp: RISC-V V inverse_coupling X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 9miVy44kBbqG From: Rémi Denis-Courmont This vectorises the following branch-free formulation of the coupling loop, in which both results are computed from the original value of ang[i]: for (i = 0; i < blocksize; i++) { float a = ang[i]; ang[i] = mag[i] - copysignf(fmaxf(a, 0.f), mag[i]); mag[i] = mag[i] - copysignf(fminf(a, 0.f), mag[i]); } --- libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/vorbisdsp_init.c | 37 +++++++++++++++++++++++++ libavcodec/riscv/vorbisdsp_rvv.S | 45 +++++++++++++++++++++++++++++++ libavcodec/vorbisdsp.c | 2 ++ libavcodec/vorbisdsp.h | 1 + 5 files changed, 87 insertions(+) create mode 100644 libavcodec/riscv/vorbisdsp_init.c create mode 100644 libavcodec/riscv/vorbisdsp_rvv.S diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 682174e875..03a95301d7 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -5,3 +5,5 @@ OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ riscv/pixblockdsp_rvi.o +OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o +RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o diff --git a/libavcodec/riscv/vorbisdsp_init.c b/libavcodec/riscv/vorbisdsp_init.c new file mode 100644 index 0000000000..d8432bc0f8 --- /dev/null +++ b/libavcodec/riscv/vorbisdsp_init.c @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg.
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/vorbisdsp.h" + +void ff_vorbis_inverse_coupling_rvv(float *mag, float *ang, + ptrdiff_t blocksize); + +av_cold void ff_vorbisdsp_init_riscv(VorbisDSPContext *c) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + c->vorbis_inverse_coupling = ff_vorbis_inverse_coupling_rvv; +#endif +} diff --git a/libavcodec/riscv/vorbisdsp_rvv.S b/libavcodec/riscv/vorbisdsp_rvv.S new file mode 100644 index 0000000000..0a3f225149 --- /dev/null +++ b/libavcodec/riscv/vorbisdsp_rvv.S @@ -0,0 +1,45 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "../libavutil/riscv/asm.S"
+
+func ff_vorbis_inverse_coupling_rvv, zve32f
+        fmv.w.x     ft0, zero
+1:
+        vsetvli     t0, a2, e32, m1, ta, ma
+        vle32.v     v16, (a1)
+        slli        t1, t0, 2
+        vle32.v     v24, (a0)
+        sub         a2, a2, t0
+        vfmax.vf    v8, v16, ft0
+        vfmin.vf    v16, v16, ft0
+        vfsgnj.vv   v8, v8, v24
+        vfsgnj.vv   v16, v16, v24
+        vfsub.vv    v8, v24, v8
+        vfsub.vv    v24, v24, v16
+        vse32.v     v8, (a1)
+        add         a1, a1, t1
+        vse32.v     v24, (a0)
+        add         a0, a0, t1
+        bnez        a2, 1b
+
+        ret
+endfunc
diff --git a/libavcodec/vorbisdsp.c b/libavcodec/vorbisdsp.c
index 693c44dfcb..70022bd262 100644
--- a/libavcodec/vorbisdsp.c
+++ b/libavcodec/vorbisdsp.c
@@ -53,6 +53,8 @@ av_cold void ff_vorbisdsp_init(VorbisDSPContext *dsp)
     ff_vorbisdsp_init_arm(dsp);
 #elif ARCH_PPC
     ff_vorbisdsp_init_ppc(dsp);
+#elif ARCH_RISCV
+    ff_vorbisdsp_init_riscv(dsp);
 #elif ARCH_X86
     ff_vorbisdsp_init_x86(dsp);
 #endif
diff --git a/libavcodec/vorbisdsp.h b/libavcodec/vorbisdsp.h
index 1775a92cf2..5c369ecf22 100644
--- a/libavcodec/vorbisdsp.h
+++ b/libavcodec/vorbisdsp.h
@@ -34,5 +34,6 @@ void ff_vorbisdsp_init_aarch64(VorbisDSPContext *dsp);
 void ff_vorbisdsp_init_x86(VorbisDSPContext *dsp);
 void ff_vorbisdsp_init_arm(VorbisDSPContext *dsp);
 void ff_vorbisdsp_init_ppc(VorbisDSPContext *dsp);
+void ff_vorbisdsp_init_riscv(VorbisDSPContext *dsp);
 
 #endif /* AVCODEC_VORBISDSP_H */

From patchwork Tue Sep 20 14:40:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38097
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990326pzh; Tue, 20 Sep 2022 07:43:34 -0700 (PDT)
X-Google-Smtp-Source:
AMsMyM4POkKEzn8kOXao9b4rApofFY0aOJOvd6TvpCxY3j/CB+lYWAX2RzuqClPNVot6rYXHydak X-Received: by 2002:a05:6402:524a:b0:450:bab6:cd5f with SMTP id t10-20020a056402524a00b00450bab6cd5fmr20421189edd.233.1663685014800; Tue, 20 Sep 2022 07:43:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663685014; cv=none; d=google.com; s=arc-20160816; b=mgOSjANTVs+Uv30hRjkno1+Df3+2MA3h4LFymYeGEhhs+BJr/FQKrBzHNXfnyLKZEm cMwoCP4OcqNy3CyC6qgZPnNCb/MdgJ0Js71xhmbtObZvG5NhDfMyCOR07wyV5fRQzrNO HC/anV9A9rHfEoESJA7g4VQE2vwS8iIx75Q3TeZ01Z/kc5t6ALJjPpqfWvf2IAWrEM2W jzpEUFWtCNO2HJSgg9cXLKMpnUa7T3vDJAT2ZGbvbIvwzbcY79hdUILz+wgVpUioljYV 9KMqs3Hb0DAuNgEto5HR6tG9bRWTwZeBHexFndO7hx72SG0FVoFb6u+sn2VrhFZpKKoR aUxw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=mhlXtiugDJ4YBvBN+Jt1f6tm9IJvv05UD6ng2wYAL6Y=; b=Zl6YGCysdccK5kpn4mukQbiBkjERNHAQdtgFDtLj7uQco/WNs2d4c05iYL/gi2FqOZ mnxpgWlrlOBaonwY7ppjOjh2WsjaP4RogqJzIyva2wMIDbVt418/qZ20nJwD5YRiVtVm socvoQZwuG9dvAE96PfxQdBVB8eneygTXAyJp0iDVxwJWeN6KkRZQ0mnVqjzuoCo/e89 uZB97Qg7817GWr5XfgyrW0MMVqc5a8EDIccTRZZaOk3bWyxu2w42HhONktN0t2F51GMM W205YwU9tlbFvEB4u+xuCOri+YFutcOWWjH63c6uppd1bauyKSKQkEy7DrBKPepZ0LrB RY/A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id c2-20020aa7df02000000b00445dba83badsi187908edy.395.2022.09.20.07.43.34; Tue, 20 Sep 2022 07:43:34 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2A04768BB2E; Tue, 20 Sep 2022 17:40:44 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5484B68BA58 for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 5204FC00C5 for ; Tue, 20 Sep 2022 17:40:18 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 20 Sep 2022 17:40:12 +0300
Message-Id: <20220920144013.4959-25-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net>
References: <5602865.DvuYhMxLoT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 25/26] lavc/aacpsdsp: RISC-V V add_squares
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: bAJFjIm4pHe0

From: Rémi Denis-Courmont

---
 libavcodec/aacpsdsp.h            |  1 +
 libavcodec/aacpsdsp_template.c   |  2 ++
 libavcodec/riscv/Makefile        |  2 ++
 libavcodec/riscv/aacpsdsp_init.c | 37 ++++++++++++++++++++++++++++++
 libavcodec/riscv/aacpsdsp_rvv.S  | 39 ++++++++++++++++++++++++++++++++
 5 files changed, 81 insertions(+)
 create mode 100644 libavcodec/riscv/aacpsdsp_init.c
 create mode 100644 libavcodec/riscv/aacpsdsp_rvv.S

diff --git a/libavcodec/aacpsdsp.h b/libavcodec/aacpsdsp.h
index 917ac5303f..8b32761bdb 100644
--- a/libavcodec/aacpsdsp.h
+++ b/libavcodec/aacpsdsp.h
@@ -55,6 +55,7 @@ void AAC_RENAME(ff_psdsp_init)(PSDSPContext *s);
 void ff_psdsp_init_arm(PSDSPContext *s);
 void ff_psdsp_init_aarch64(PSDSPContext *s);
 void ff_psdsp_init_mips(PSDSPContext *s);
+void ff_psdsp_init_riscv(PSDSPContext *s);
 void ff_psdsp_init_x86(PSDSPContext *s);
 
 #endif /* AVCODEC_AACPSDSP_H */
diff --git a/libavcodec/aacpsdsp_template.c b/libavcodec/aacpsdsp_template.c
index e644037587..31ff718420 100644
--- a/libavcodec/aacpsdsp_template.c
+++ b/libavcodec/aacpsdsp_template.c
@@ -227,6 +227,8 @@ av_cold void AAC_RENAME(ff_psdsp_init)(PSDSPContext *s)
     ff_psdsp_init_aarch64(s);
 #elif ARCH_MIPS
     ff_psdsp_init_mips(s);
+#elif ARCH_RISCV
+    ff_psdsp_init_riscv(s);
 #elif ARCH_X86
     ff_psdsp_init_x86(s);
 #endif
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 03a95301d7..829a1823d2 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,3 +1,5 @@
+OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o
+RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o
 OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
                            riscv/audiodsp_rvf.o
 RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
new file mode 100644
index 0000000000..525fc9aa38
--- /dev/null
+++ b/libavcodec/riscv/aacpsdsp_init.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavcodec/aacpsdsp.h"
+
+void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n);
+
+av_cold void ff_psdsp_init_riscv(PSDSPContext *c)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+        c->add_squares = ff_ps_add_squares_rvv;
+#endif
+}
diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S
new file mode 100644
index 0000000000..cedaab0cf0
--- /dev/null
+++ b/libavcodec/riscv/aacpsdsp_rvv.S
@@ -0,0 +1,39 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_ps_add_squares_rvv, zve32f
+1:
+        vsetvli     t0, a2, e32, m1, ta, ma
+        vlseg2e32.v v24, (a1)
+        slli        t1, t0, 3
+        vle32.v     v16, (a0)
+        slli        t2, t0, 2
+        vfmacc.vv   v16, v24, v24
+        sub         a2, a2, t0
+        vfmacc.vv   v16, v25, v25
+        add         a1, a1, t1
+        vse32.v     v16, (a0)
+        add         a0, a0, t2
+        bnez        a2, 1b
+
+        ret
+endfunc

From patchwork Tue Sep 20 14:40:13 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38078
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1990396pzh; Tue, 20 Sep 2022 07:43:43 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM7GovVNv1Zq0hiMbr+tWnJhvdyU8ta+3GsIIBUXCpVF9/D6vLAxnzhP0srxkvhW90CwYPQt
X-Received: by 2002:a17:907:86a9:b0:780:191:b7d2 with SMTP id qa41-20020a17090786a900b007800191b7d2mr17141607ejc.766.1663685023211; Tue, 20 Sep 2022 07:43:43 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1663685023; cv=none; d=google.com; s=arc-20160816;
 b=KUvxGaQej1NRq9EPRWFjgXXafSHbUF0/YlUJzgH1l8VuA2GPOnZfPXs1qB8p6UJSph
 V325SnrVtzrapNMe0uAdZT0mAuoHQ/dNSurNazr3OxuC8sA/2E0XZmq3GHcD/BLF6M6v
 w1r+C5QdpUx/ENDLOdjkIfoQyExSu+6jh8rlGNPWGrig0icPMwljbOSFbCnRNxG0Pn+x
 3Wf7QBUFTzZQfIzFydDENyNgZHDVpCicaoBakC8BFS8YNqmp2eneXRU7hsFaqP39FL5o
 MSsSyQP6Wk+ikphSCl02OSPjrtvIqHbSDFyvX9re+3kfOtyKqBfr88AUVnveD/d/pPOv
 Whcg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
 h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe
 :list-help:list-post:list-archive:list-unsubscribe:list-id
 :precedence:subject:mime-version:references:in-reply-to:message-id
 :date:to:from:delivered-to;
bh=f7MENQAWtVkOKvsPZJ1+0OaEwSxEgHCcuOEa+gSNQ44=; b=tuk5EbR1mlVMKyQBVb86VIMUMI9Mtd0p1vtY6Wat6cNXLiZKkjKnO3AUpChjIbHbe1 CejMDqXOnMNrSgmslwraGCqIAIx8UF2wm/nu5/f3uPVd5ORoJ1+WOq+9tTDw7VwT1tiG MN9DWcEnn2OvO8wlLNz6pEtyTMydj4CroGQ66VhX891//chFlbvpJencfaYIEW1jo3XL i4ZYROfaBF5bs+kVMbz8i5BTUkgJG6YpAdZZPUtGg7mHFfAT/74hqw1RpGtGHHW2cqBJ DtNEoP1mCiPGV3jBudGyAkV7SJQbZg/eDVtovxV/BCHyqOHVpnnk5PE8yx5X6mlSRbdT r5RA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id nc12-20020a1709071c0c00b0077c3374a293si1316362ejc.142.2022.09.20.07.43.42; Tue, 20 Sep 2022 07:43:43 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 29C7268BC02; Tue, 20 Sep 2022 17:40:45 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5B80668BB1F for ; Tue, 20 Sep 2022 17:40:20 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 7C279C00C6 for ; Tue, 20 Sep 2022 17:40:18 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 20 Sep 2022 17:40:13 +0300 Message-Id: <20220920144013.4959-26-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602865.DvuYhMxLoT@basile.remlab.net> References: 
<5602865.DvuYhMxLoT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 26/26] lavc/aacpsdsp: RISC-V V mul_pair_single
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: azyyp/zvwJSK

From: Rémi Denis-Courmont

---
 libavcodec/riscv/aacpsdsp_init.c |  6 +++++-
 libavcodec/riscv/aacpsdsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c
index 525fc9aa38..90c9c501c3 100644
--- a/libavcodec/riscv/aacpsdsp_init.c
+++ b/libavcodec/riscv/aacpsdsp_init.c
@@ -25,13 +25,17 @@
 #include "libavcodec/aacpsdsp.h"
 
 void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n);
+void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1,
+                               int n);
 
 av_cold void ff_psdsp_init_riscv(PSDSPContext *c)
 {
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+    if (flags & AV_CPU_FLAG_RV_ZVE32F) {
         c->add_squares = ff_ps_add_squares_rvv;
+        c->mul_pair_single = ff_ps_mul_pair_single_rvv;
+    }
 #endif
 }
diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S
index cedaab0cf0..1c174cd110 100644
--- a/libavcodec/riscv/aacpsdsp_rvv.S
+++ b/libavcodec/riscv/aacpsdsp_rvv.S
@@ -37,3 +37,22 @@ func ff_ps_add_squares_rvv, zve32f
 
         ret
 endfunc
+
+func ff_ps_mul_pair_single_rvv, zve32f
+1:
+        vsetvli     t0, a3, e32, m1, ta, ma
+        slli        t1, t0, 3
+        vlseg2e32.v v24, (a1)
+        slli        t2, t0, 2
+        vle32.v     v16, (a2)
+        sub         a3, a3, t0
+        vfmul.vv    v24, v24, v16
+        add         a1, a1, t1
+        vfmul.vv    v25, v25, v16
+        add         a2, a2, t2
+        vsseg2e32.v v24, (a0)
+        add         a0, a0, t1
+        bnez        a3, 1b
+
+        ret
+endfunc
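
Patches 25/26 and 26/26 carry no description, so here is a rough scalar
C model of the two loops in aacpsdsp_rvv.S. It is only a sketch and not
part of the series. It assumes the vector routines are meant to match
the generic C versions (dst[i] += re*re + im*im for add_squares, and
each interleaved pair scaled by src1[i] for mul_pair_single). The names
VLMAX_MODEL, model_vsetvli, model_add_squares and model_mul_pair_single
are made up for illustration, and the fixed VLMAX_MODEL stands in for
whatever vector length the hardware grants at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the number of 32-bit elements per vector
 * register; real hardware reports this through vsetvli. */
enum { VLMAX_MODEL = 4 };

/* Simplified model of "vsetvli t0, aN, e32, m1, ta, ma": grant at most
 * VLMAX_MODEL elements per iteration. */
static int model_vsetvli(int avl)
{
    return avl < VLMAX_MODEL ? avl : VLMAX_MODEL;
}

/* Scalar model of ff_ps_add_squares_rvv (assumed semantics). */
static void model_add_squares(float *dst, const float (*src)[2], int n)
{
    assert(n >= 0);
    do {                                  /* loop test sits at the bottom */
        int vl = model_vsetvli(n);

        for (int k = 0; k < vl; k++) {
            float acc = dst[k];           /* vle32.v   v16, (a0)      */
            acc += src[k][0] * src[k][0]; /* vfmacc.vv v16, v24, v24  */
            acc += src[k][1] * src[k][1]; /* vfmacc.vv v16, v25, v25  */
            dst[k] = acc;                 /* vse32.v   v16, (a0)      */
        }
        src += vl;                        /* vl * 8 bytes: pairs      */
        dst += vl;                        /* vl * 4 bytes: singles    */
        n   -= vl;
    } while (n != 0);
}

/* Scalar model of ff_ps_mul_pair_single_rvv (assumed semantics). */
static void model_mul_pair_single(float (*dst)[2], float (*src0)[2],
                                  const float *src1, int n)
{
    assert(n >= 0);
    do {
        int vl = model_vsetvli(n);

        for (int k = 0; k < vl; k++) {
            float s = src1[k];            /* vle32.v  v16, (a2)       */
            dst[k][0] = src0[k][0] * s;   /* vfmul.vv v24, v24, v16   */
            dst[k][1] = src0[k][1] * s;   /* vfmul.vv v25, v25, v16   */
        }
        dst  += vl;
        src0 += vl;
        src1 += vl;
        n    -= vl;
    } while (n != 0);
}
```

The segment load (vlseg2e32.v) is what splits the interleaved pairs into
two registers, which is why the pair pointers advance by 8 bytes per
element and the plain float pointers by 4. The hardware vfmacc is a
fused multiply-add, so the last bit of a real result may differ from
this plain C arithmetic.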