From patchwork Thu Sep 22 18:36:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38156 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp514715pzh; Thu, 22 Sep 2022 11:37:35 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5XKFh6g1OOs0sDbgqbcT1i9g7U8MXcVE9lx8CIMhb3swSm8X6kNDvixsTiyd3sKmFJMH8K X-Received: by 2002:a17:907:2cd3:b0:77c:3e23:7bec with SMTP id hg19-20020a1709072cd300b0077c3e237becmr4089098ejc.380.1663871855554; Thu, 22 Sep 2022 11:37:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871855; cv=none; d=google.com; s=arc-20160816; b=c9FJTNSUUb8EKlih1movk8uz71zu6adIhandtLgOYdCV4a031qtBpA3PFnXJkhzSQ8 I6yoFLBBtZJEoU7p2d5XfjzP604hD0DCLZm+Sw1TnXgi0Sd0oQEQugpnWhbEfSTpoj3g XEJ+/BODMCrlAJ3Exjc5BR9R+/NQWuoLTF1bbmmQno0nCuS8KCxgqtuSaOG6juwtKBpP w5huKDnwo58v4Rj4B2cOBhU7DZvxtN1uMQU9qs2atC7R63kIpTo1sD/0PYTxyEgIC6Xb duRF2F7H8FYwmLlb0SfN5gQlDLmCGKp6ns0HqLgc5okPsRbfI5N0QBN7RaauLif9s12P W4bw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=QPkPwEg5LVYUltOuzxNAHVL4r5yF8kObHCjvJOi8v9w=; b=WXuSVwIE8nWgZ1162BuRH4rS6glLBR7WyKWthnGJ0yJQ/5vcyo9WhVv5SQcr80/HLY 2FQvyNb4Z+/RjWR/Ux2dCM5HIwHWk5kXs7QU2g0MqIsBc2TpH1d3GFuUwEeAF50BoNiX Ef/8ItXKmC+jhyG46uFjOPUhKsPTOG4TR1JtJg9QAqMctfENsJYA4iBATiPrX+KBLVMN raU7UioMcJEl1O06ZCz+MCizWdwgatVemF1kaIIQSki7J4b+nlVYfxSEX5YWBdlovhFU 69JO9ouSjuMl7M/YlbMpp6n4zIMej4ygbwhHz2kEgiiv4j/i69feFDsP/f4J1yDdOjtD mQXg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org 
Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id a3-20020a170906468300b00780f607444asi5593821ejr.868.2022.09.22.11.37.35; Thu, 22 Sep 2022 11:37:35 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 18AF668B94F; Thu, 22 Sep 2022 21:37:33 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id AD30E68B808 for ; Thu, 22 Sep 2022 21:37:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 43972C0072 for ; Thu, 22 Sep 2022 21:37:26 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:36:58 +0300 Message-Id: <20220922183726.38624-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 01/29] lavu/cpu: detect RISC-V base extensions X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 8dBUQyqWdNBG From: Rémi Denis-Courmont This introduces compile-time and run-time CPU detection on RISC-V. 
In practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of I, F and D extensions, and if it does, it probably won't have run-time detection. So the flags are essentially always set. But as things stand, checkasm wants them that way. Compare the ARMV8 flag on AArch64. We are nowhere near running short on CPU flag bits. --- libavutil/cpu.c | 9 ++++++ libavutil/cpu.h | 5 +++ libavutil/cpu_internal.h | 3 ++ libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 66 +++++++++++++++++++++++++++++++++++++++ tests/checkasm/checkasm.c | 4 +++ 6 files changed, 88 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..78e92a1bf6 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -95,6 +97,9 @@ void av_force_cpu_flags(int arg){ arg |= AV_CPU_FLAG_MMX; } +#if ARCH_RISCV + arg = ff_force_cpu_flags_riscv(arg); +#endif atomic_store_explicit(&cpu_flags, arg, memory_order_relaxed); } @@ -178,6 +183,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV + { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" }, + { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" }, + { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..9aae2ccc7a 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,11 @@ #define AV_CPU_FLAG_LSX (1 << 0) 
#define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V extensions +#define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank) +#define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP) +#define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP) + /** * Return the flags which specify extensions supported by the CPU. * The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..9ddf11488b 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,9 +48,12 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); +int ff_force_cpu_flags_riscv(int flags); + size_t ff_get_cpu_max_align_mips(void); size_t ff_get_cpu_max_align_aarch64(void); size_t ff_get_cpu_max_align_arm(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..fec1f7822a --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,66 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "libavutil/log.h" +#include "config.h" + +#if HAVE_GETAUXVAL +#include <sys/auxv.h> +#define HWCAP_RV(letter) (1ul << ((letter) - 'A')) +#endif + +int ff_force_cpu_flags_riscv(int flags) +{ + if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "F"); + flags |= AV_CPU_FLAG_RVF; + } + + return flags; +} + +int ff_get_cpu_flags_riscv(void) +{ + int ret = 0; +#if HAVE_GETAUXVAL + const unsigned long hwcap = getauxval(AT_HWCAP); + + if (hwcap & HWCAP_RV('I')) + ret |= AV_CPU_FLAG_RVI; + if (hwcap & HWCAP_RV('F')) + ret |= AV_CPU_FLAG_RVF; + if (hwcap & HWCAP_RV('D')) + ret |= AV_CPU_FLAG_RVD; +#endif + +#ifdef __riscv_i + ret |= AV_CPU_FLAG_RVI; +#endif +#if defined (__riscv_flen) && (__riscv_flen >= 32) + ret |= AV_CPU_FLAG_RVF; +#if (__riscv_flen >= 64) + ret |= AV_CPU_FLAG_RVD; +#endif +#endif + + return ret; +} diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index 8fd9bba0b0..e1135a84ac 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -232,6 +232,10 @@ static const struct { { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC }, { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, +#elif ARCH_RISCV + { "RVI", "rvi", AV_CPU_FLAG_RVI }, + { "RVF", "rvf", AV_CPU_FLAG_RVF }, + { "RVD", "rvd", AV_CPU_FLAG_RVD }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA }, From patchwork Thu Sep 22 18:36:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38157 Delivered-To: ffmpegpatchwork2@gmail.com 
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp514798pzh; Thu, 22 Sep 2022 11:37:44 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4vrFGqQj1Z82Tjhg28mi8howbWYwcykxiPDUGbS6OgFqb+9cJ2mc7BEUxJKewAgmw7CQlQ X-Received: by 2002:a17:907:75d4:b0:77a:fcb7:a2cc with SMTP id jl20-20020a17090775d400b0077afcb7a2ccmr4008589ejc.480.1663871864374; Thu, 22 Sep 2022 11:37:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871864; cv=none; d=google.com; s=arc-20160816; b=Yia17aUa9rwm6vbiEQiW3WyRyXMFyANUb7WOJ8uwmam5wtSdZTDizZ5GjNl97ZFJ/c 3oPyD4ySk4SmnxekWy5mIiotyO1eDYORndV6ZV9JKXfN6UYhx3KeFhpq6cInOMZLjkir VZz7tmj6Vs/8v9BkTuuE0Qhc98vazcvmOGU5ORpvSlxpt0W/uTLHiY95r9LAl+Lgtyti YKzL6rPAUFNtreWB5EjyqLVXAbTBZP8yftlxWEqf6GSyll5ulTOLSNPCWO6oYqI88miA 1wCUaxbRcFx0iR8tchQuePDSgjqOpBl2NecGeuPBPQ1WkXfoIiyadyXNV5C/iTrZDqrn 1BDA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=O+HgSwCNmXwf3ppSUZbEowfy1qCjm51aUTfBxBrqgAo=; b=QQGLKejpQKbg5bdijObDDTiiI7bIJ9ROSk9Yc/gsmSfFdiHb+WuQ8GWnEzWZgqUTss 9AzIVhwSA4RH5Ttumg+DRQysmziBXPRMHfJoZ0snFNmZ2wuIXTtXqk92kUTZsNpMBfRo umzww/h4KTumGg91LPNkgwMcQoefn4fstuIYdrXoN1V9wUNqz3ZfY929+52QjktI5soc Jsb3JjIJlATGAMSZI/v5t4VtdrSaCuU0bBzolWmhoeOI7vxXI9HU4V20KK50hBFMUm9j sOBadc4wuSswlQx4pSQ3arrWxNNvFVlUF/diVyf90KWVIOXzqnwOi+ci6K70epj1l510 JNfA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id rh28-20020a17090720fc00b00740110c599bsi4526069ejb.146.2022.09.22.11.37.43; Thu, 22 Sep 2022 11:37:44 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 18DD968BBD3; Thu, 22 Sep 2022 21:37:34 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B7DAA68B89E for ; Thu, 22 Sep 2022 21:37:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 70C39C007A for ; Thu, 22 Sep 2022 21:37:26 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:36:59 +0300 Message-Id: <20220922183726.38624-2-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 02/29] lavu/riscv: initial common header for assembler macros X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 3KFAUdAm+d7B From: Rémi Denis-Courmont --- libavutil/riscv/asm.S | 77 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) create mode 100644 libavutil/riscv/asm.S diff --git 
a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S new file mode 100644 index 0000000000..dbd97f40a4 --- /dev/null +++ b/libavutil/riscv/asm.S @@ -0,0 +1,77 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * Loosely based on earlier work copyrighted by Måns Rullgård, 2008. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#if defined (__riscv_float_abi_soft) +#define NOHWF +#define NOHWD +#define HWF # +#define HWD # +#elif defined (__riscv_float_abi_single) +#define NOHWF # +#define NOHWD +#define HWF +#define HWD # +#else +#define NOHWF # +#define NOHWD # +#define HWF +#define HWD +#endif + + .macro func sym, ext= + .text + .align 2 + + .option push + .ifnb \ext + .option arch, +\ext + .endif + + .global \sym + .hidden \sym + .type \sym, %function + \sym: + + .macro endfunc + .size \sym, . - \sym + .option pop + .previous + .purgem endfunc + .endm + .endm + + .macro const sym, align=3, relocate=0 + .if \relocate + .pushsection .data.rel.ro + .else + .pushsection .rodata + .endif + .align \align + \sym: + + .macro endconst + .size \sym, . 
- \sym + .popsection + .purgem endconst + .endm + .endm From patchwork Thu Sep 22 18:37:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38158 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp514864pzh; Thu, 22 Sep 2022 11:37:53 -0700 (PDT) X-Google-Smtp-Source: AMsMyM58QjRFmaXd2pRzNYzYbwbWeAS98gNzzGXTkTT7QsTGvXds1MCuaYWtV0XKYP6VruyZFemG X-Received: by 2002:a17:906:4fca:b0:782:2484:6d72 with SMTP id i10-20020a1709064fca00b0078224846d72mr4122800ejw.150.1663871873454; Thu, 22 Sep 2022 11:37:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871873; cv=none; d=google.com; s=arc-20160816; b=lOQsmAf9VVCQwuUTh0eLbG1egemZ+Ne4MPB92RooJHPbPEKdYck1x639wSBeNbbvgr WUOsOOsCg7Z6vKJGzF9/yB3eKZByUaLlNluka85Dmul8GgpaBWt4d7fDKU0stm9r5HZC 9I3JIzvBvTUgHgz2toGP7WounF4cOVQz1HX1zQzkCGweJ0uJpTKxysQOzhDW3ie2Ve0q 4m1TMgQHZ0aNpUMeUFcuBs6W3kHUBObIN4QHYiadOeqcNbySaZ+9qXUsAQHO/375F3CI 6Kt8Sq2MbVcCUG9vNzmKLRgI8xPuq8pd+nDfU8v/5Fx4rIVGQ7JLmY5QDrUyUStH7Iid 6Lfg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=Hd++iMz3VwjtbZCtNQUPsT5V+P6VHH6YU5sCfBL1v18=; b=eGeqHV5HIHJ1CcUrz7AuJIdsyg+1Z7JFVAJIr5Fg3EZOmPdUQUaq5h+TfD0qdrmZc1 jwDqHx4VZvJR/I+VbJeyC7Y1p3oGYLQ9RdL5rxpJEpSjPBJmaXKsRYbPCTBN72/Xupfw +up/NWKePtrL/c/RDCR6NmL4Pp6TJi+OBNP6RVNoKJmKYn3JiFt/94KtZlcAbk0k6Drd FLHQiIjfDia13e+mD5oFbLcV5vIrj5S5+0Log9wF0R45q+3Lq+N+IRYK2EOW5XXglTVy UTzFWMiF0uo6zw6kcJaM3mBjS9904RAHxq6i3D4rc28GByM5I/OfGO1kb8CGnBIbsd6I Wg9w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted 
sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id v25-20020a50d099000000b00443088b40b7si5263699edd.123.2022.09.22.11.37.53; Thu, 22 Sep 2022 11:37:53 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1B4A668BBDB; Thu, 22 Sep 2022 21:37:35 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E124868B930 for ; Thu, 22 Sep 2022 21:37:26 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 9EA00C00AF for ; Thu, 22 Sep 2022 21:37:26 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:00 +0300 Message-Id: <20220922183726.38624-3-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 03/29] lavc/audiodsp: RISC-V F vector_clipf X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: S6qhTT/N250f From: Rémi Denis-Courmont RV64G supports MIN & MAX instructions natively only on 
floating point registers, not general purpose ones. The latter would require the Zbb extension. Due to that, it is actually faster to perform the clipping "properly" in FPU. Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech): audiodsp.vector_clipf_c: 29551.5 audiodsp.vector_clipf_rvf: 17871.0 Also tried unrolling with 2 or 8 elements but it gets worse either way. --- libavcodec/audiodsp.c | 2 ++ libavcodec/audiodsp.h | 1 + libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/audiodsp_init.c | 33 +++++++++++++++++++++ libavcodec/riscv/audiodsp_rvf.S | 49 ++++++++++++++++++++++++++++++++ 5 files changed, 87 insertions(+) create mode 100644 libavcodec/riscv/Makefile create mode 100644 libavcodec/riscv/audiodsp_init.c create mode 100644 libavcodec/riscv/audiodsp_rvf.S diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c index ff43e87dce..eba6e809fd 100644 --- a/libavcodec/audiodsp.c +++ b/libavcodec/audiodsp.c @@ -113,6 +113,8 @@ av_cold void ff_audiodsp_init(AudioDSPContext *c) ff_audiodsp_init_arm(c); #elif ARCH_PPC ff_audiodsp_init_ppc(c); +#elif ARCH_RISCV + ff_audiodsp_init_riscv(c); #elif ARCH_X86 ff_audiodsp_init_x86(c); #endif diff --git a/libavcodec/audiodsp.h b/libavcodec/audiodsp.h index aa6fa7898b..485b512839 100644 --- a/libavcodec/audiodsp.h +++ b/libavcodec/audiodsp.h @@ -55,6 +55,7 @@ typedef struct AudioDSPContext { void ff_audiodsp_init(AudioDSPContext *c); void ff_audiodsp_init_arm(AudioDSPContext *c); void ff_audiodsp_init_ppc(AudioDSPContext *c); +void ff_audiodsp_init_riscv(AudioDSPContext *c); void ff_audiodsp_init_x86(AudioDSPContext *c); #endif /* AVCODEC_AUDIODSP_H */ diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile new file mode 100644 index 0000000000..414a9e9bd8 --- /dev/null +++ b/libavcodec/riscv/Makefile @@ -0,0 +1,2 @@ +OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ + riscv/audiodsp_rvf.o diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c new file mode 100644 index 
0000000000..c5842815d6 --- /dev/null +++ b/libavcodec/riscv/audiodsp_init.c @@ -0,0 +1,33 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/audiodsp.h" + +void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); + +av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RVF) + c->vector_clipf = ff_vector_clipf_rvf; +} diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S new file mode 100644 index 0000000000..2ec8a11691 --- /dev/null +++ b/libavcodec/riscv/audiodsp_rvf.S @@ -0,0 +1,49 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_vector_clipf_rvf, f +NOHWF fmv.w.x fa0, a3 +NOHWF fmv.w.x fa1, a4 +1: + flw ft0, (a1) + flw ft1, 4(a1) + fmax.s ft0, ft0, fa0 + flw ft2, 8(a1) + fmax.s ft1, ft1, fa0 + flw ft3, 12(a1) + fmax.s ft2, ft2, fa0 + addi a2, a2, -4 + fmax.s ft3, ft3, fa0 + addi a1, a1, 16 + fmin.s ft0, ft0, fa1 + fmin.s ft1, ft1, fa1 + fsw ft0, (a0) + fmin.s ft2, ft2, fa1 + fsw ft1, 4(a0) + fmin.s ft3, ft3, fa1 + fsw ft2, 8(a0) + fsw ft3, 12(a0) + addi a0, a0, 16 + bnez a2, 1b + + ret +endfunc From patchwork Thu Sep 22 18:37:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38163 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515246pzh; Thu, 22 Sep 2022 11:38:36 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4TWzt1QCMfH7pMkFa2ZEvi4BtSO3XftMY2fU0ivyX/p6Ou6ZcAl7uw+n7HqYbuj3GZ+cnL X-Received: by 2002:a17:907:270c:b0:76f:afae:2705 with SMTP id w12-20020a170907270c00b0076fafae2705mr3851991ejk.463.1663871916285; Thu, 22 Sep 2022 11:38:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871916; cv=none; d=google.com; s=arc-20160816; b=lZ/ZooCOQ39r5rRAJfM2vVTEqX0tdGiFjnUV9REWANBD3ujcCoUTbonFqlEOHF9Umb Ub26bG2L2iXfdAh03f/BcNcru4MrZb1x5+10p9wQkNz/vR6wlkHGc/aO/k7vRyVNq8/+ x13vmFKERDbZAN9tJsOTpxk+ePkfavlQ+xc4KN4Z04iQF3b3tOtWbwF7QHyJpcFrQOfJ u6l9AwCgLuRCWK9DZCFYjFmEp7cCat9mYjWrBHZ1++bfPshek0w9MXelcZKKWuYpTJbN 
JdYED0vJRT2NX8dtAs2h3AFNdHyxyjc1vSSDhkCNzd6CXKYY6Se4KBE2Cp4WNF4RCvGl 78HA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=/Ga+4PJJevRhqo8rey9k1goQZJwqX8UDtM5memtoZcY=; b=uv1vMtRGM3Q01t9by/HFAuQk1InfLDzYXVW2SwW8zsKdI0TEv3vQ51PtxSwlTnlOvT gDgvERL2stMMotVkTAw5mYS5QlJEHkWRNlh7Lqp0o88JK0nSiSqVm1j81UpU+eLiQNAk Nj0xSLu3vAZL9y1KGVNqQFVmZ0KWDG0VSpvCvQJ4czJMNohOwlvxlrKiPl0qyAEPUqcq zlgJcMRKZYPcAdcB0BH/VwlMM5GXJznYesKqqpfUZKRAMgqPlFTrWL/a76CpPikLKbfj pgJWp87RYsv+HEjy6BwWSJdL2ccmpL2LyT7pteQpFi2JuzXgBbpMThprpCdC0NFJeA8t Lb3g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id o6-20020a1709064f8600b0078166fe80dasi4810214eju.443.2022.09.22.11.38.35; Thu, 22 Sep 2022 11:38:36 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A04C968BB9F; Thu, 22 Sep 2022 21:37:41 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 22CDE68B9F1 for ; Thu, 22 Sep 2022 21:37:27 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id CC7C8C00B0 for ; Thu, 22 Sep 2022 21:37:26 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:01 +0300 Message-Id: <20220922183726.38624-4-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 04/29] lavc/pixblockdsp: RISC-V I get_pixels X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: QSCUZhLtSZXL From: Rémi Denis-Courmont Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech): get_pixels_c: 180.0 get_pixels_rvi: 136.7 --- libavcodec/pixblockdsp.c | 2 + libavcodec/pixblockdsp.h | 2 + 
libavcodec/riscv/Makefile | 2 + libavcodec/riscv/pixblockdsp_init.c | 45 ++++++++++++++++++++++ libavcodec/riscv/pixblockdsp_rvi.S | 59 +++++++++++++++++++++++++++++ 5 files changed, 110 insertions(+) create mode 100644 libavcodec/riscv/pixblockdsp_init.c create mode 100644 libavcodec/riscv/pixblockdsp_rvi.S diff --git a/libavcodec/pixblockdsp.c b/libavcodec/pixblockdsp.c index 17c487da1e..4294075cee 100644 --- a/libavcodec/pixblockdsp.c +++ b/libavcodec/pixblockdsp.c @@ -109,6 +109,8 @@ av_cold void ff_pixblockdsp_init(PixblockDSPContext *c, AVCodecContext *avctx) ff_pixblockdsp_init_arm(c, avctx, high_bit_depth); #elif ARCH_PPC ff_pixblockdsp_init_ppc(c, avctx, high_bit_depth); +#elif ARCH_RISCV + ff_pixblockdsp_init_riscv(c, avctx, high_bit_depth); #elif ARCH_X86 ff_pixblockdsp_init_x86(c, avctx, high_bit_depth); #elif ARCH_MIPS diff --git a/libavcodec/pixblockdsp.h b/libavcodec/pixblockdsp.h index 07c2ec4f40..9b002aa3d6 100644 --- a/libavcodec/pixblockdsp.h +++ b/libavcodec/pixblockdsp.h @@ -52,6 +52,8 @@ void ff_pixblockdsp_init_arm(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); void ff_pixblockdsp_init_ppc(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); +void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, AVCodecContext *avctx, + unsigned high_bit_depth); void ff_pixblockdsp_init_x86(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); void ff_pixblockdsp_init_mips(PixblockDSPContext *c, AVCodecContext *avctx, diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 414a9e9bd8..da07f1fe96 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,2 +1,4 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o +OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ + riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c new file mode 100644 index 0000000000..04bf52649f --- /dev/null 
+++ b/libavcodec/riscv/pixblockdsp_init.c @@ -0,0 +1,45 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include <stdint.h> + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/avcodec.h" +#include "libavcodec/pixblockdsp.h" + +void ff_get_pixels_8_rvi(int16_t *block, const uint8_t *pixels, + ptrdiff_t stride); +void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels, + ptrdiff_t stride); + +av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, + AVCodecContext *avctx, + unsigned high_bit_depth) +{ + int cpu_flags = av_get_cpu_flags(); + + if (cpu_flags & AV_CPU_FLAG_RVI) { + if (high_bit_depth) + c->get_pixels = ff_get_pixels_16_rvi; + else + c->get_pixels = ff_get_pixels_8_rvi; + } +} diff --git a/libavcodec/riscv/pixblockdsp_rvi.S b/libavcodec/riscv/pixblockdsp_rvi.S new file mode 100644 index 0000000000..93ece4405e --- /dev/null +++ b/libavcodec/riscv/pixblockdsp_rvi.S @@ -0,0 +1,59 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "../libavutil/riscv/asm.S"
+
+func ff_get_pixels_8_rvi
+.irp row, 0, 1, 2, 3, 4, 5, 6, 7
+        ld      t0, (a1)
+        add     a1, a1, a2
+        sd      zero, ((\row * 16) + 0)(a0)
+        addi    t6, t6, -1
+        sd      zero, ((\row * 16) + 8)(a0)
+        srli    t1, t0, 8
+        sb      t0, ((\row * 16) + 0)(a0)
+        srli    t2, t0, 16
+        sb      t1, ((\row * 16) + 2)(a0)
+        srli    t3, t0, 24
+        sb      t2, ((\row * 16) + 4)(a0)
+        srli    t4, t0, 32
+        sb      t3, ((\row * 16) + 6)(a0)
+        srli    t1, t0, 40
+        sb      t4, ((\row * 16) + 8)(a0)
+        srli    t2, t0, 48
+        sb      t1, ((\row * 16) + 10)(a0)
+        srli    t3, t0, 56
+        sb      t2, ((\row * 16) + 12)(a0)
+        sb      t3, ((\row * 16) + 14)(a0)
+.endr
+        ret
+endfunc
+
+func ff_get_pixels_16_rvi
+.irp row, 0, 1, 2, 3, 4, 5, 6, 7
+        ld      t0, 0(a1)
+        ld      t1, 8(a1)
+        add     a1, a1, a2
+        sd      t0, ((\row * 16) + 0)(a0)
+        sd      t1, ((\row * 16) + 8)(a0)
+.endr
+        ret
+endfunc

From patchwork Thu Sep 22 18:37:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38175
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515996pzh; Thu, 22 Sep 2022 11:40:10 -0700 (PDT)
X-Google-Smtp-Source:
AMsMyM5/L1tgD0986gLxm6Uso8PHZCwklSlxL1LGH4A/lwL2xEsG8GukaXG7oXZCedU3v3T4Ya9q X-Received: by 2002:aa7:d556:0:b0:451:f7e6:5121 with SMTP id u22-20020aa7d556000000b00451f7e65121mr4776887edr.188.1663872010660; Thu, 22 Sep 2022 11:40:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663872010; cv=none; d=google.com; s=arc-20160816; b=RWMdna417JfDPnkLtgDyR0YB3qGGAXM8JNs9R4X8wKbioo0nJONGK/1EbqG+FOizXc Mjh4hwTaBqS9W4fnk5JZVyNYXgatfvdwCWX45YKigJX2pAts31SxMmBMMA71QBLLqAJL NpFAGH42PgQ4Fmu6g2bcv8/lJywvqXf/AwJIhcCt4NYp/G2BhtgS4o2HzBWgfSgi3oRK m/Qh8tejnkPsbU6/uoxt0W7uo0nRdmb+B3t9gmFUhWCse52Y4mfw5awqdn7YRWUHsjd5 /e3BCS4yhGnw1EONmSvgFYpVH515DEnUukKaTJMIFvhWOkcuBqBVCQUncAWd6+VyfQyd chuw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=Fsx2+NODpi3eETDrDwdD+38okKfrybhG2e87KFZEfow=; b=tGmKHBgBrTFMOHCPFYa7Bqtc44Z9/mZCUOKK5k5Kw/8+Rkn1p637pgbmPuwWRoC0Zu XJlC1dtwcI2ot4EYGtQbrk3+XzaaBHYKr+yyBzNI8dfbBxMkhM9Xcc+xjxzHGu/pNXOW 9eKNnbaoK1bpemVwk9XrK3KJxw9kO6Xx1Or00QtTIt929tQ4o+cnHZMvZFV5+eljgcFJ Dg/XiSIFiSg6PIuO+1Iih4RCfYCwJAlNkOz5X8oKrz0gBaeoTgvTFsxtkuJWFDwFfDkc YJ+ulByOWWvjI++uTZ2QqkuXLEdAeUbljty8c+vhrY3pEGy6kKbeUO48QY+UYovRRljD 6Ynw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id y16-20020a056402441000b0045138471d7csi7254752eda.375.2022.09.22.11.40.10; Thu, 22 Sep 2022 11:40:10 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id EAD5768BC0F; Thu, 22 Sep 2022 21:37:51 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5070868BB2D for ; Thu, 22 Sep 2022 21:37:27 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 06323C00B1 for ; Thu, 22 Sep 2022 21:37:26 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 22 Sep 2022 21:37:02 +0300
Message-Id: <20220922183726.38624-5-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net>
References: <12078904.O9o76ZdvQC@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 05/29] lavu/cpu: CPU flags for the RISC-V Vector extension
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: KbpPyh8HGOVV

From: Rémi Denis-Courmont

RVV defines a total of 12 different extensions, including:

- 5 different instruction subsets:
  - Zve32x: 8-, 16- and 32-bit integers,
  - Zve32f: Zve32x plus single precision floats,
  - Zve64x: Zve32x plus 64-bit integers,
  - Zve64f: Zve32f plus Zve64x,
  - Zve64d: Zve64f plus double precision floats.

- 6 different vector lengths:
  - Zvl32b (embedded only),
  - Zvl64b (embedded only),
  - Zvl128b,
  - Zvl256b,
  - Zvl512b,
  - Zvl1024b,

- and the V extension proper: equivalent to Zve64d and Zvl128b.

In total, there are 6 different possible sets of supported instructions
(including the empty set), but for convenience we allocate one bit for
each type set: up-to-32-bit ints (ZVE32X), floats (ZVE32F),
64-bit ints (ZVE64X) and doubles (ZVE64D).

Whenever the vector size is needed, it can be retrieved by reading the
unprivileged read-only vlenb CSR. This should probably be a separate
helper macro if needed at a later point.
---
 libavutil/cpu.c           |  4 ++++
 libavutil/cpu.h           |  4 ++++
 libavutil/riscv/cpu.c     | 46 ++++++++++++++++++++++++++++++++++++++-
 tests/checkasm/checkasm.c | 10 ++++++---
 4 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 78e92a1bf6..58ae4858b4 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -187,6 +187,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
         { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" },
         { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" },
         { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" },
+        { "rvve32", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE32X}, .unit = "flags" },
+        { "rvvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE32F}, .unit = "flags" },
+        { "rvve64", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE64X}, .unit = "flags" },
+        { "rvv", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE64D}, .unit = "flags" },
 #endif
         { NULL },
     };
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 9aae2ccc7a..00698e30ef 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -82,6 +82,10 @@
 #define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank)
 #define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP)
 #define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP)
+#define AV_CPU_FLAG_RV_ZVE32X (1 << 3) ///< Vectors of 8/16/32-bit int's
+#define AV_CPU_FLAG_RV_ZVE32F (1 << 4) ///< Vectors of float's
+#define AV_CPU_FLAG_RV_ZVE64X (1 << 5) ///< Vectors of 64-bit int's
+#define AV_CPU_FLAG_RV_ZVE64D (1 << 6) ///< Vectors of double's
 
 /**
  * Return the flags which specify extensions supported by the CPU.
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
index fec1f7822a..6f862635b3 100644
--- a/libavutil/riscv/cpu.c
+++ b/libavutil/riscv/cpu.c
@@ -30,7 +30,32 @@
 
 int ff_force_cpu_flags_riscv(int flags)
 {
-    if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) {
+    if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RV_ZVE64X)) {
+        av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n",
+               "_ZVE64X");
+        flags |= AV_CPU_FLAG_RV_ZVE64X;
+    }
+
+    if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RV_ZVE32F)) {
+        av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n",
+               "_ZVE32F");
+        flags |= AV_CPU_FLAG_RV_ZVE32F;
+    }
+
+    if ((flags & (AV_CPU_FLAG_RV_ZVE64X | AV_CPU_FLAG_RV_ZVE32F))
+        && !(flags & AV_CPU_FLAG_RV_ZVE32X)) {
+        av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n",
+               "_ZVE32X");
+        flags |= AV_CPU_FLAG_RV_ZVE32X;
+    }
+
+    if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RVD)) {
+        av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "D");
+        flags |= AV_CPU_FLAG_RVD;
+    }
+
+    if ((flags & (AV_CPU_FLAG_RVD | AV_CPU_FLAG_RV_ZVE32F))
+        && !(flags & AV_CPU_FLAG_RVF)) {
         av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "F");
         flags |= AV_CPU_FLAG_RVF;
     }
@@ -50,6 +75,11 @@ int ff_get_cpu_flags_riscv(void)
         ret |= AV_CPU_FLAG_RVF;
     if (hwcap & HWCAP_RV('D'))
         ret |= AV_CPU_FLAG_RVD;
+
+    /* The V extension implies all Zve* functional subsets */
+    if (hwcap & HWCAP_RV('V'))
+        ret |= AV_CPU_FLAG_RV_ZVE32X | AV_CPU_FLAG_RV_ZVE64X
+             | AV_CPU_FLAG_RV_ZVE32F | AV_CPU_FLAG_RV_ZVE64D;
 #endif
 
 #ifdef __riscv_i
@@ -60,6 +90,20 @@ int ff_get_cpu_flags_riscv(void)
 #if (__riscv_flen >= 64)
     ret |= AV_CPU_FLAG_RVD;
 #endif
+#endif
+
+    /* If RV-V is enabled statically at compile-time, check the details. */
+#ifdef __riscv_vector
+    ret |= AV_CPU_FLAG_RV_ZVE32X;
+#if __riscv_v_elen >= 64
+    ret |= AV_CPU_FLAG_RV_ZVE64X;
+#endif
+#if __riscv_v_elen_fp >= 32
+    ret |= AV_CPU_FLAG_RV_ZVE32F;
+#if __riscv_v_elen_fp >= 64
+    ret |= AV_CPU_FLAG_RV_ZVE64D;
+#endif
+#endif
 #endif
 
     return ret;
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index e1135a84ac..f7d108e8ea 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -233,9 +233,13 @@ static const struct {
     { "VSX", "vsx", AV_CPU_FLAG_VSX },
     { "POWER8", "power8", AV_CPU_FLAG_POWER8 },
 #elif ARCH_RISCV
-    { "RVI", "rvi", AV_CPU_FLAG_RVI },
-    { "RVF", "rvf", AV_CPU_FLAG_RVF },
-    { "RVD", "rvd", AV_CPU_FLAG_RVD },
+    { "RVI", "rvi", AV_CPU_FLAG_RVI },
+    { "RVF", "rvf", AV_CPU_FLAG_RVF },
+    { "RVD", "rvd", AV_CPU_FLAG_RVD },
+    { "RV_Zve32x", "rv_zve32x", AV_CPU_FLAG_RV_ZVE32X },
+    { "RV_Zve32f", "rv_zve32f", AV_CPU_FLAG_RV_ZVE32F },
+    { "RV_Zve64x", "rv_zve64x", AV_CPU_FLAG_RV_ZVE64X },
+    { "RV_Zve64d", "rv_zve64d", AV_CPU_FLAG_RV_ZVE64D },
 #elif ARCH_MIPS
     { "MMI", "mmi", AV_CPU_FLAG_MMI },
     { "MSA", "msa", AV_CPU_FLAG_MSA },

From patchwork Thu Sep 22 18:37:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38161
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515077pzh; Thu, 22 Sep 2022 11:38:20 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM4ltFN11lh+FRaiLK68trE0QdYd3Rhc81v1fi8S/z6aIirqQ7keYGOdOTqqNsXCcFSYElzS
X-Received: by 2002:a17:907:7b93:b0:770:1d4f:4de9 with SMTP id
ne19-20020a1709077b9300b007701d4f4de9mr3999525ejc.201.1663871900107; Thu, 22 Sep 2022 11:38:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871900; cv=none; d=google.com; s=arc-20160816; b=TqWA1mPjCqg4u7/y7XsHiaFWFb4a8GBCMSJ/74MZ/f5Xm6PQJ3Ks61UwS/BRcOR19v YFcDtaN9aB10rUp3MVY5QkLPRIacafEZhTN6OfvB0DZue9Efnv0b1XOUkX34tT7YBhGf egwClTn9kBHN3dZ2mFr/BDotluRkQ3VFmWlrpucLXNAV2j5undydsx2ARkOKqk0vZ/kE fGKoqZOt6tETFJQaiAEAdcJ8YYoINdodX7JSbJxof5TCbzC1wY4a2brtIUr620L7u0QA IdUBV7rRXwOTDZ0EHVb02SsLrb7fTRS5K8Imqd41H8PCPWVx0LBQ+HD+D/9/aMrM8hIo w3NA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=8YmpSqdLN1v5/pqYFfopHFmlobODxFsnS8G8X4uMxZI=; b=r+uUVOGLTw8y3Mg8dq29Fep9HhpMqJNRrycnJgTAQg+p7Y6AFMVJADWZeBfwVW0FXz 1AN3dREZjENFbjfZBpXn0chfPyjHI64nT6cg2llRd0ySCQwpzG3pGCukA5sQaqthKx4N FOhpNaclGr4Oq5irQa4H5xY92pTiDwRxjepqiwEv6sTiJgf7sGn8hJJn0l2PRxuRYVav lExElYhzFNhW4mCepmvQ2Xn4n156doCOQXcWjBGE5m6ED1qBwymVg91EPT3mhtqm/XTa IrKb6Z2DhtOPiljawctTWXTqduTt57PZNvwWpdBaHnDsqGqT+xnzvmUl/ihbauWgvOnC y94A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id n18-20020a1709067b5200b007814add3a40si5212752ejo.981.2022.09.22.11.38.19; Thu, 22 Sep 2022 11:38:20 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6E27D68BBCE; Thu, 22 Sep 2022 21:37:39 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 020B768B9A4 for ; Thu, 22 Sep 2022 21:37:31 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 335D4C00B2 for ; Thu, 22 Sep 2022 21:37:27 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:03 +0300 Message-Id: <20220922183726.38624-6-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 06/29] configure: probe RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: fApPO52pqjn5 From: Rémi Denis-Courmont --- Makefile | 2 +- configure | 15 +++++++++++++++ ffbuild/arch.mak | 2 ++ 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 
61f79e27ae..1fb742f390 100644
--- a/Makefile
+++ b/Makefile
@@ -91,7 +91,7 @@ ffbuild/.config: $(CONFIGURABLE_COMPONENTS)
 SUBDIR_VARS := CLEANFILES FFLIBS HOSTPROGS TESTPROGS TOOLS \
                HEADERS ARCH_HEADERS BUILT_HEADERS SKIPHEADERS \
                ARMV5TE-OBJS ARMV6-OBJS ARMV8-OBJS VFP-OBJS NEON-OBJS \
-               ALTIVEC-OBJS VSX-OBJS MMX-OBJS X86ASM-OBJS \
+               ALTIVEC-OBJS VSX-OBJS RVV-OBJS MMX-OBJS X86ASM-OBJS \
                MIPSFPU-OBJS MIPSDSPR2-OBJS MIPSDSP-OBJS MSA-OBJS \
                MMI-OBJS LSX-OBJS LASX-OBJS OBJS SLIBOBJS SHLIBOBJS \
                STLIBOBJS HOSTOBJS TESTOBJS
diff --git a/configure b/configure
index c157338b1f..529fcae41e 100755
--- a/configure
+++ b/configure
@@ -462,6 +462,7 @@ Optimization options (experts only):
   --disable-mmi disable Loongson MMI optimizations
   --disable-lsx disable Loongson LSX optimizations
   --disable-lasx disable Loongson LASX optimizations
+  --disable-rvv disable RISC-V Vector optimizations
   --disable-fast-unaligned consider unaligned accesses slow
 
 Developer options (useful when working on FFmpeg itself):
@@ -2126,6 +2127,10 @@ ARCH_EXT_LIST_PPC="
     vsx
 "
 
+ARCH_EXT_LIST_RISCV="
+    rvv
+"
+
 ARCH_EXT_LIST_X86="
     $ARCH_EXT_LIST_X86_SIMD
     cpunop
@@ -2135,6 +2140,7 @@ ARCH_EXT_LIST_X86="
 ARCH_EXT_LIST="
     $ARCH_EXT_LIST_ARM
     $ARCH_EXT_LIST_PPC
+    $ARCH_EXT_LIST_RISCV
     $ARCH_EXT_LIST_X86
     $ARCH_EXT_LIST_MIPS
     $ARCH_EXT_LIST_LOONGSON
@@ -2642,6 +2648,8 @@ ppc4xx_deps="ppc"
 vsx_deps="altivec"
 power8_deps="vsx"
 
+rvv_deps="riscv"
+
 loongson2_deps="mips"
 loongson3_deps="mips"
 mmi_deps_any="loongson2 loongson3"
@@ -6110,6 +6118,10 @@ elif enabled ppc; then
         check_cpp_condition power8 "altivec.h" "defined(_ARCH_PWR8)"
     fi
 
+elif enabled riscv; then
+
+    enabled rvv && check_inline_asm rvv '".option arch, +v\nvsetivli zero, 0, e8, m1, ta, ma"'
+
 elif enabled x86; then
 
     check_builtin rdtsc intrin.h "__rdtsc()"
@@ -7596,6 +7608,9 @@ if enabled loongarch; then
     echo "LSX enabled ${lsx-no}"
     echo "LASX enabled ${lasx-no}"
 fi
+if enabled riscv; then
+    echo "RISC-V Vector enabled ${rvv-no}"
+fi
 echo "debug symbols
${debug-no}" echo "strip symbols ${stripping-no}" echo "optimize for size ${small-no}" diff --git a/ffbuild/arch.mak b/ffbuild/arch.mak index 997e31e85e..39d76ee152 100644 --- a/ffbuild/arch.mak +++ b/ffbuild/arch.mak @@ -15,5 +15,7 @@ OBJS-$(HAVE_LASX) += $(LASX-OBJS) $(LASX-OBJS-yes) OBJS-$(HAVE_ALTIVEC) += $(ALTIVEC-OBJS) $(ALTIVEC-OBJS-yes) OBJS-$(HAVE_VSX) += $(VSX-OBJS) $(VSX-OBJS-yes) +OBJS-$(HAVE_RVV) += $(RVV-OBJS) $(RVV-OBJS-yes) + OBJS-$(HAVE_MMX) += $(MMX-OBJS) $(MMX-OBJS-yes) OBJS-$(HAVE_X86ASM) += $(X86ASM-OBJS) $(X86ASM-OBJS-yes) From patchwork Thu Sep 22 18:37:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38159 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp514925pzh; Thu, 22 Sep 2022 11:38:02 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5ODabcduX6BiRjgwb30C9F909NNo3YuNukQyXcwbJgQbYLOqjo9Ey3yIWpL2nokFTZKbTc X-Received: by 2002:a17:906:9753:b0:781:be0a:5b7 with SMTP id o19-20020a170906975300b00781be0a05b7mr4084330ejy.752.1663871882532; Thu, 22 Sep 2022 11:38:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871882; cv=none; d=google.com; s=arc-20160816; b=IEFG41m43hAI44C8TnwuSSK2A3aXfq5TNKNUEqtSWO+8GDE7MoxMwTk48pBtVopxHu IFjKdVwHjkZPhUv0jywFkuI7LN5Q1XuQpwmLwfQVsPoOwnkLJzCMbXVI7S+ZqUUmRLDc kiHoxcjFj5U2vCp/y82zF1I3wjVYXLkeBNqVDWSo68FY8TZ/66DR0s6s+UZ3DGnNMuj8 eogmfg3YRoXyLxE3cW9ZI4F9ELPLbjQGlRhbNomRXvy67mpd6dwsjaVVqToYoismWory YY0hC7fYDu93qZcgIEDRg5dxdF/gqQl0mtFq1yG3WAuY0YZP/8JwBKZ62ulPJe7czrht HlHg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=V4B/YU9eFKm2OOzroHLUVbA+Oo/nZPjwJPGo0pdRm3o=; 
b=aDWNwCeSXECrsWA1RZXdHLfpsGCBPf0y+Q7WZudegisnLrd7g6XtQXtSCN4Km/8M+X 2NcFLJBpl01/ShL2c3wWh5nGIAr2hDBstCQ40HgRULvX4cvjRmi97UcxAJplOLcJt54q nI/pe7xLfeEZdcAIZDIVq6AIT99+vVIa+pXlxTY3iziXHaGVxCmc13h9gcVGgW0cFNfh kAa0ZCgFYvULdJELrK0MNSobioQMQQA+1uim+Qh8waQuP1hl61D2p2eEr1q/14RjYAeV DBPeB1eBSVqnHAMog0faQhNRatrU+CYYBS95EItKaFflI6CBqTO1obh8OYi9x/ESAYJh aBBg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id m9-20020a056402510900b00451b2764a9asi6531560edd.387.2022.09.22.11.38.02; Thu, 22 Sep 2022 11:38:02 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1E87668BB86; Thu, 22 Sep 2022 21:37:37 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F3C6068B94F for ; Thu, 22 Sep 2022 21:37:31 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 60FD4C00B3 for ; Thu, 22 Sep 2022 21:37:27 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:04 +0300 Message-Id: <20220922183726.38624-7-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 
Subject: [FFmpeg-devel] [PATCH 07/29] lavu/floatdsp: RISC-V V vector_fmul_scalar
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: ghrdTI8BhDX9

From: Rémi Denis-Courmont

This is based on existing code from the VLC git tree with two minor
changes to account for the different function prototypes.
---
 libavutil/float_dsp.c            |  2 ++
 libavutil/float_dsp.h            |  1 +
 libavutil/riscv/Makefile         |  4 +++-
 libavutil/riscv/float_dsp_init.c | 39 +++++++++++++++++++++++++++++++
 libavutil/riscv/float_dsp_rvv.S  | 40 ++++++++++++++++++++++++++++++++
 5 files changed, 85 insertions(+), 1 deletion(-)
 create mode 100644 libavutil/riscv/float_dsp_init.c
 create mode 100644 libavutil/riscv/float_dsp_rvv.S

diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c
index 8676c8b0f8..742dd679d2 100644
--- a/libavutil/float_dsp.c
+++ b/libavutil/float_dsp.c
@@ -156,6 +156,8 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact)
     ff_float_dsp_init_arm(fdsp);
 #elif ARCH_PPC
     ff_float_dsp_init_ppc(fdsp, bit_exact);
+#elif ARCH_RISCV
+    ff_float_dsp_init_riscv(fdsp);
 #elif ARCH_X86
     ff_float_dsp_init_x86(fdsp);
 #elif ARCH_MIPS
diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h
index 9c664592bd..7cad9fc622 100644
--- a/libavutil/float_dsp.h
+++ b/libavutil/float_dsp.h
@@ -205,6 +205,7 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len);
 void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict);
+void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_x86(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_mips(AVFloatDSPContext *fdsp);
 
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
index 1f818043dc..89a8d0d990 100644
--- a/libavutil/riscv/Makefile
+++ b/libavutil/riscv/Makefile
@@ -1 +1,3 @@
-OBJS += riscv/cpu.o
+OBJS += riscv/float_dsp_init.o \
+        riscv/cpu.o
+RVV-OBJS += riscv/float_dsp_rvv.o
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
new file mode 100644
index 0000000000..de567c50d2
--- /dev/null
+++ b/libavutil/riscv/float_dsp_init.c
@@ -0,0 +1,39 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/float_dsp.h"
+
+void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
+                               int len);
+
+av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+        fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+#endif
+}
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
new file mode 100644
index 0000000000..5095ed5bfc
--- /dev/null
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -0,0 +1,40 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_fmul_scalar_rvv, zve32f
+NOHWF   fmv.w.x fa0, a2
+NOHWF   mv      a2, a3
+1:
+        vsetvli t0, a2, e32, m1, ta, ma
+        slli    t1, t0, 2
+        vle32.v v16, (a1)
+        add     a1, a1, t1
+        vfmul.vf v16, v16, fa0
+        sub     a2, a2, t0
+        vse32.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a2, 1b
+
+        ret
+endfunc

From patchwork Thu Sep 22 18:37:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38160
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp514969pzh; Thu, 22 Sep 2022 11:38:11 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM4omwEY9hKxfQkDVdduxFJD8Jgc+eQsZaJ+GBGZop1y8wF5RGOJweZfolmuG/qHyvxd0uEC
X-Received: by 2002:a17:906:cc56:b0:779:ed37:b59e with SMTP id mm22-20020a170906cc5600b00779ed37b59emr3942916ejb.536.1663871891127; Thu, 22 Sep 2022 11:38:11 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1663871891; cv=none; d=google.com; s=arc-20160816; b=mggfylcvqGUfnUwsI6/LtDsYeNLMuaTaiNw617K7Hn8+n6Dxta+RQpPd80QIGykpuy j/ferl9QxK2+PriUaiM4MVAc7mczHqxKHazvZvsxMTr+wGnIaKvnVmm5nSQgx1yPaTR3
ZdqsU4AXwEE3soUFXaoQbFOZOE0/xzT8zudAB7eStO8tIou17I6NL7oA3fOPeJcK+HHc vkUuf58ADkJG5dtxMF0szS+W+B6+9u+6MMOn0RWEp51y233ZIX5Oo3mX/PK29LMkA6rX U+fCLjPW8APhz9ItjB3y/zdEO0ytUaAOJjsZuTp2LeF2sLEpBJA7iKzU+uzuPXRzR/aH hwHQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=JXGO7DRPq1wEya2P5LVBv+8LA9AeKZKgzRq0mFxb5Kc=; b=a4Bi3MjH4BaEVbxLM3J80DqiXDEAZRspEOIl2CeqwrtXK7Bd/1lhNLAwZEL/43Ukfu cFfnmSlj9Ut7i0AmYSubBW2X6cjFRkpOtZyLYBy0g/+93YaRCqG2LIxivi23fKaU0WLk 185bz97uAqGV/MujKWKHUsqBUN8VpAIT/xrXSSqJ6cjAwIxNwjFglyx+tlUdmIiUIRE4 kYc0bVUWVwcU+gNp3TfxbtZMYuv8TcXdsVvm3iIVnNV2jNtBi//rMSZR97fZR7aS0Pto 4WwtYJeE12WC5NOAYkJ4oNeCVLTENBlWDwHrMw/4hyOE52NOgkhqMI+R1YmK8H1sZ7v2 lhow== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id ca10-20020aa7cd6a000000b00448b72be304si5147449edb.64.2022.09.22.11.38.10; Thu, 22 Sep 2022 11:38:11 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 71C1468BBDE; Thu, 22 Sep 2022 21:37:38 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F38E968B89E for ; Thu, 22 Sep 2022 21:37:31 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 8E0CDC00B4 for ; Thu, 22 Sep 2022 21:37:27 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:05 +0300 Message-Id: <20220922183726.38624-8-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 08/29] lavu/floatdsp: RISC-V V vector_dmul_scalar X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: kr9Lfx3hdW15 From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 9 ++++++++- libavutil/riscv/float_dsp_rvv.S | 18 ++++++++++++++++++ 2 files changed, 26 insertions(+), 1 deletion(-) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index de567c50d2..b829c0f736 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -28,12 +28,19 @@
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 
+void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
+                               int len);
+
 av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 {
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+    if (flags & AV_CPU_FLAG_RV_ZVE32F) {
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+
+        if (flags & AV_CPU_FLAG_RV_ZVE64D)
+            fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+    }
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 5095ed5bfc..e82d56ac15 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -38,3 +38,21 @@ NOHWF mv a2, a3
 
         ret
 endfunc
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_dmul_scalar_rvv, zve64d
+NOHWD   fmv.d.x fa0, a2
+NOHWD   mv      a2, a3
+1:
+        vsetvli t0, a2, e64, m1, ta, ma
+        slli    t1, t0, 3
+        vle64.v v16, (a1)
+        add     a1, a1, t1
+        vfmul.vf v16, v16, fa0
+        sub     a2, a2, t0
+        vse64.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a2, 1b
+
+        ret
+endfunc

From patchwork Thu Sep 22 18:37:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38162
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515155pzh; Thu, 22 Sep 2022 11:38:28 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM6OyCpS8Tu1ay9S950qTi1UDgrTZ47W4mLm3D16HMFmZW6bjn0yXXiMCzYMLHHsWbOYXdFt
X-Received: by 2002:a05:6402:3508:b0:451:db83:b2a7 with SMTP id b8-20020a056402350800b00451db83b2a7mr4874500edd.266.1663871908013; Thu, 22 Sep 2022 11:38:28 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1663871908; cv=none; d=google.com; s=arc-20160816;
b=dta/BFUwswGqaqJ8b9ej24Wvdr57PjXAuWP5Q7x56rXJ228jl1H8QlB0wIIdPCw9Qk bNlukqvc0PdS+eeYAGusN+FocCqqrSYSSvjSxm5iGX2LqNfcPZ0bt+Nc+3kfmvNKoZDy mH/eCzn4DZUSk6BHbAUTBpZF1cavP5/HXcId5IY8vx3JOP4MJoEUslOnyJjojD6A9zfr FXIosjR4MqK2iGnJ2Se+PbpkxVhS+u3WHuJvOVUrWVeLcR0rAGcym5XIcU5+RYPk/GR8 ckXWCT7b5cMJ8AFu2dHTPEpCHm442o4ep2wUnmPn3j5FDFCiEzMQXRNI2+MJoFiYdP4V wBIw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=QyBRIH53/VMOcSa7p2MOUzP++O/Gel9fNf0MaGPHY6k=; b=SE339CUVerDgwuW2oPjBrlRRa2GlLQ7ciR12YPQtFr+eLWhHDmS8TmmuGoVhWC+4c+ mpXVkdY+lI4IKMifSDzg7BHMPOEWi0d39N2qm/ytMqYq8jGMh23XqTtJaC5lJL2NSTQo zif7uQYHH8S9HRq/bJKrHXobwuGntE4iVD4pe0eXH8af35zPveExwDsAwfKsclxl1hAv UjXATRfTA+uaNQeT5fqUiTaqmRX9J3Op/z47SBUamHTXPiV7EAtiMFVVyBYo0F3XVP1T ROXXuyIXuXVqGBvZKzVSYgyESWFUt1LtaG9107pA4zqUM4XYKg19YDdPMnELYNnmHPV/ GDsA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id o11-20020a170906974b00b00732fa13e848si7878289ejy.597.2022.09.22.11.38.27; Thu, 22 Sep 2022 11:38:28 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8F31B68BBE3; Thu, 22 Sep 2022 21:37:40 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 03DCF68B9A5 for ; Thu, 22 Sep 2022 21:37:31 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id BBC81C00B5 for ; Thu, 22 Sep 2022 21:37:27 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:06 +0300 Message-Id: <20220922183726.38624-9-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 09/29] lavu/floatdsp: RISC-V V vector_fmul X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: nZrNW6qe0fkb From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 18 ++++++++++++++++++ 2 files changed, 21 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c 
b/libavutil/riscv/float_dsp_init.c
index b829c0f736..60b79bd59e 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -25,6 +25,8 @@
 #include "libavutil/cpu.h"
 #include "libavutil/float_dsp.h"
 
+void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
+                        int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 
@@ -37,6 +39,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
     int flags = av_get_cpu_flags();
 
     if (flags & AV_CPU_FLAG_RV_ZVE32F) {
+        fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D)
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index e82d56ac15..fb2cb54081 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -21,6 +21,24 @@
 #include "config.h"
 #include "asm.S"
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_fmul_rvv, zve32f
+1:
+        vsetvli t0, a3, e32, m1, ta, ma
+        slli    t1, t0, 2
+        vle32.v v16, (a1)
+        add     a1, a1, t1
+        vle32.v v24, (a2)
+        add     a2, a2, t1
+        vfmul.vv v16, v16, v24
+        sub     a3, a3, t0
+        vse32.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x fa0, a2
From patchwork Thu Sep 22 18:37:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38168 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515390pzh; Thu, 22 Sep 2022 11:38:53 -0700 (PDT) X-Google-Smtp-Source: AMsMyM63h2dMpk8Ym/Xj5q4UWlyNkhf7zVd5OBBKS9pX3i2PHairtRc4IVfgVkGXqX7gO1ai2pHJ X-Received: by 2002:a17:907:d0e:b0:782:6565:33b3 with SMTP id gn14-20020a1709070d0e00b00782656533b3mr2834091ejc.52.1663871933471; Thu, 22 Sep 2022 11:38:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256;
t=1663871933; cv=none; d=google.com; s=arc-20160816; b=GLT1gWRqM7e+3cvRKKR7+5G2AZE0jvjHF+kxtRqsneAnOlHqSyFTpPmcY6XzRguCVv 0VK+y/KkkOfBSSUd9qHXF68PW98bv05rB35LW8g0q6IeHz1NsMgP9utntCv5HsRrY9sR 6JnZLZUh2cvqCTQPh8caswOwHmTL+9QYkd9sajBfBQvoJfCaa+1D+UxEZPzfCpx1yF7h 957A1xB9tAYjTPmlHNkkLD9GY/D5F2QI/AUVIOvg73lYFQeIqdrX4H8XD7VS9g7pqY+1 Qd3pXas2abP3SYeUEj3SOaZBHvl0c/+TyJoH0NPBT2kmfYxS/lJbKc+ijFMrIW3RubzJ VQ4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=P+A/WKltgFK4PsdFlukmok1qWOwBvwczYTynlwOXN7Q=; b=0FypVeKnx0qcm9rEGUJ2HrtXAB0Vg69AI8dCnmxSqQsMVbXvlFIGZSCskc7LSTZfbK QpjVT1VzlgKuvbBnBTBKUNhqwPkQfwU85rQ8ZefmTklNpgW6m5ErpQtgs7TDhqZ+N+Cy adtEEQBhpSs9JtdcgJDzwze0QBdQmHCXQkY4vN+imPKmXOqhHStQBoJy3Kapp2bBNF9I Q5Fv94QsmFLuKXVWQN7YFOmRciu5p7Fhxl7OzvjzEu6yzxHriICBaLGsEivRnclHCci7 0K3lQIHpgxhSUzqDiDThP2YkDOwKFtPA+B8Z/SPIm8mwokZeDp5jNXFYALYHlkSvaMLO gNuA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id l19-20020a170906795300b0073cd3a15f42si6635195ejo.394.2022.09.22.11.38.53; Thu, 22 Sep 2022 11:38:53 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id DD50468BBF0; Thu, 22 Sep 2022 21:37:43 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 33F8368B89E for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id E8966C00B6 for ; Thu, 22 Sep 2022 21:37:27 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:07 +0300 Message-Id: <20220922183726.38624-10-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 10/29] lavu/floatdsp: RISC-V V vector_dmul X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 6JbXi16WiMMw From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 6 +++++- libavutil/riscv/float_dsp_rvv.S | 18 ++++++++++++++++++ 2 files changed, 23 insertions(+), 1 deletion(-) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 60b79bd59e..6027a67b46 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -30,6 +30,8 @@ void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 
+void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
+                        int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                int len);
 
@@ -42,8 +44,10 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
-        if (flags & AV_CPU_FLAG_RV_ZVE64D)
+        if (flags & AV_CPU_FLAG_RV_ZVE64D) {
+            fdsp->vector_dmul = ff_vector_dmul_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+        }
     }
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index fb2cb54081..b16c0f3005 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -57,6 +57,24 @@ NOHWF   mv      a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_dmul_rvv, zve64d
+1:
+        vsetvli t0, a3, e64, m1, ta, ma
+        slli    t1, t0, 3
+        vle64.v v16, (a1)
+        add     a1, a1, t1
+        vle64.v v24, (a2)
+        add     a2, a2, t1
+        vfmul.vv v16, v16, v24
+        sub     a3, a3, t0
+        vse64.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x fa0, a2
From patchwork Thu Sep 22 18:37:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38171 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515573pzh; Thu, 22 Sep 2022 11:39:18 -0700 (PDT) X-Google-Smtp-Source:
AMsMyM7zNpZv1GLcW09z8QPS9JrHNaO8u+hFPVcL6AzQzCMoiY4bUZfOWYJtFvjULqByUwQ2KRSM X-Received: by 2002:a05:6402:2748:b0:454:762b:154b with SMTP id z8-20020a056402274800b00454762b154bmr4877915edd.27.1663871958050; Thu, 22 Sep 2022 11:39:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871958; cv=none; d=google.com; s=arc-20160816; b=Jf+Ec6drYR6ZmyCz+9fqPwN18N0x+85s/IREhV2lkU1QXv+JuNLI3mQyXLPGY3e3kX J5q64cb/ZssQb8+4dFRZ/vieso3HWnm2kSW3vHCK1QeyHLgXGc6zN/3PLMt0crBfMc95 0MSnlzebkrvC9XNyV7oup66QC21tMvPBXqKgrFD6t+9/SE6XeQ170S+iRaM1F+hAiROZ G12w9xIkVNkBwxXCeXTl5TkD8e6sZ7guHvDrefDM6xB6lvnYl3PiXdK1Mxal9W3MjbTC 6jn25eMBOMsYEbp3lMMAzWZSKvThklGiJmxCrWU1Y1MlTatCbALsDNV8MIq/Bo6/ADpm 5+3Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=i21Fx4VCS0cDWgq9dnMuzpUjGHQbPqE35hnNyf/p7bE=; b=BhLqB3LekQfL57Vc6rqHBqBBb2uN9pCz614CSX4p4pyGPUi/uvjXFf+yjNUkiqfs8r 0iY7t3c7dh06FSn3Q1geRaNa1FfD4Y/RwJX6/nCY38iCaQrmUX9vEdP8ejC2ONZrMu9J l+bTHF67YlK1kD4l4pX/uClWXygjxd2IOJ3yW3WddFPXq/8T/4hZndTA5CjqENemhMws CWFUAQt+YJIjj3dlKQ5RwEgGOPOq5LkXH+H2HdcsB4hzbXBOzWKvAuDM+5dHW83l5ZVT G0aL4Alyh0Wdy7fAgyfOQ7Ee41JQu48UkIy/z1o8lfcuHSYqCO119H/stk3MY5zMhtZM ASJw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id sg14-20020a170907a40e00b00780bff83216si6269327ejc.52.2022.09.22.11.39.17; Thu, 22 Sep 2022 11:39:18 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 936F368BBFF; Thu, 22 Sep 2022 21:37:46 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3746D68BA30 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 225C7C00B7 for ; Thu, 22 Sep 2022 21:37:28 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:08 +0300 Message-Id: <20220922183726.38624-11-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 11/29] lavu/floatdsp: RISC-V V vector_fmac_scalar X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: RuUdUye1LUkS From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 19 +++++++++++++++++++ 2 files changed, 22 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c 
b/libavutil/riscv/float_dsp_init.c
index 6027a67b46..c2d93e0cd7 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -27,6 +27,8 @@
 
 void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
                         int len);
+void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
+                               int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 
@@ -42,6 +44,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
     if (flags & AV_CPU_FLAG_RV_ZVE32F) {
         fdsp->vector_fmul = ff_vector_fmul_rvv;
+        fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index b16c0f3005..1c1fa906e6 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -39,6 +39,25 @@ func ff_vector_fmul_rvv, zve32f
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_fmac_scalar_rvv, zve32f
+NOHWF   fmv.w.x fa0, a2
+NOHWF   mv      a2, a3
+1:
+        vsetvli t0, a2, e32, m1, ta, ma
+        slli    t1, t0, 2
+        vle32.v v24, (a1)
+        add     a1, a1, t1
+        vle32.v v16, (a0)
+        vfmacc.vf v16, fa0, v24
+        sub     a2, a2, t0
+        vse32.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x fa0, a2
From patchwork Thu Sep 22 18:37:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38165 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515331pzh; Thu, 22 Sep 2022 11:38:45 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4WgX8uBupgFnDfBQElIWm282KTJrFmSrWZcfUoxUu2kFZV9Y+nkdS3rgN6Y3dBPIkdRld4 X-Received: by 2002:a17:907:1c91:b0:782:496f:271f with SMTP id
nb17-20020a1709071c9100b00782496f271fmr4158581ejc.718.1663871925205; Thu, 22 Sep 2022 11:38:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871925; cv=none; d=google.com; s=arc-20160816; b=nl/dgv1uW8NfNskv+wGHasr1gvEuG4YfcjxOeHWkS1PvKkwZKM0JV2JWxecHaDCyZf rgHoijRYE4FAdx4z1U6GSdD7TjYtrP3xDXMRnly8aEROEwm/NIrLPVbDsqMUH8Tb+66G x1cgiGRn80FP3DY+p4xklHfJRZZtqF5GAVaogoRdDg+XeCAGVW4HyBb2oi/9LnDnVS7t pUopCqCt8a+T+TRkpUpudgpUyk1ijP2oI8E3z/Si2Bhn83B0w7FuACCZ5tSWPuU8a2N5 p4YtzyHWprg9WLLzl2T8+U7dO68GveQPo8EVw5A/OajRgAS9P7hNhaUYcnyOpb+LdN2I oqVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=M1PW3H3E68urE1/hCZRWDxjN3yXRtNvAg/+lUCKbQrs=; b=rgZszVmkkLsePHcTu3irzZjqhp0WyGHH5geJIb7I5MbuQOzOL4OJQZHd4wrqvaMXNs +Z7hfeTqd+XeYRHl4xOtNYXg5j/jSOWF3ZJ5gM64W18+3EijyHe5fXTRk1a2Z9dJ2dM5 DdvCtEtdpeQ8ppnhnn56PtI8mcjXzbmX+ouMv0w+KIIXDgjHo9ff0WBmyoD0wjs+wNdN wl9MUAQVBEYrTg5YjZ5y1Bar987DD1P6un0Sb5UDfCHqqfT7vqZ0dOQcy4/HZw/E435X 0NvvYEK1DV4+rsEEJ/u9ei1GnCEX5FT35K4QhAVmKXGv4XZug0cEdcjk2JWc2hfzGI3d MP4w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id nc24-20020a1709071c1800b00778d193ca81si4624035ejc.550.2022.09.22.11.38.44; Thu, 22 Sep 2022 11:38:45 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id C304568BBEC; Thu, 22 Sep 2022 21:37:42 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 342F568BA05 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 4F747C00B8 for ; Thu, 22 Sep 2022 21:37:28 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:09 +0300 Message-Id: <20220922183726.38624-12-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 12/29] lavu/floatdsp: RISC-V V vector_dmac_scalar X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: eYogY2M65YfY From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 19 +++++++++++++++++++ 2 files changed, 22 insertions(+) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index c2d93e0cd7..d17d0f66c5 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -34,6 +34,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
+void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul,
+                               int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                int len);
 
@@ -49,6 +51,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
+            fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
         }
     }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 1c1fa906e6..0d6fffe235 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -94,6 +94,25 @@ func ff_vector_dmul_rvv, zve64d
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_dmac_scalar_rvv, zve64d
+NOHWD   fmv.d.x fa0, a2
+NOHWD   mv      a2, a3
+1:
+        vsetvli t0, a2, e64, m1, ta, ma
+        slli    t1, t0, 3
+        vle64.v v24, (a1)
+        add     a1, a1, t1
+        vle64.v v16, (a0)
+        vfmacc.vf v16, fa0, v24
+        sub     a2, a2, t0
+        vse64.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x fa0, a2
From patchwork Thu Sep 22 18:37:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38170 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515508pzh; Thu, 22 Sep 2022 11:39:10 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5HevlbOVhcLyq3na8/9zOcdxo4r9DBOnqadoS3/sKwWBXcJrCVmfGdcrqEEXXZxWAZbsT/ X-Received: by
2002:a17:907:86a2:b0:781:eff0:999a with SMTP id qa34-20020a17090786a200b00781eff0999amr4031009ejc.71.1663871950029; Thu, 22 Sep 2022 11:39:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871950; cv=none; d=google.com; s=arc-20160816; b=qfX9d7ILlbH+E7Z/JJVmvwLlwpbGiVJGyQ0p4FUShuLwhzH/kxxRipAJ+qPAtiOY4K rutPemARdzIIT4uXMCHwVCaXwp8rBMGQBo8q6dEJ6lERIYy92xXrnHPOqUy2BIOX6wc2 o2+xmRWE/ioh97IDFMqt+9CG1T8MC4Rm2VDHXEmp5FDfJKu4ko3JepIEMYv5OXAeIHRv 177oGElqmrzvU31AvQkRd6VS8aJNQ3kwJaENZBGSM48U87QkOCo9tG9dhx260QAy9t7f RXFiWkWGg1hwEo8FNrTByB4puE7WCbwcDhlXVAA5W+tCgikn5yTMcUjYDVkP+RtbJ8fw znSQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=N9W1Iw/cJGC7mwurs1lZC+8AmwYw66grXENAYS36XII=; b=FX2r9GXR1c4lhaj9aiYToZTZh5QfXfRNrjFun+s2kkkaHXrt/ikFmDSAD2g96k3Zqs wIwfqP0f//FiQbeQQbWLIxSEm6sZd/MgKcd9QfHHTobSGmKWfDlCk/r0aqpOb7C5S9/D B0tdTxJoodqDe4AVNim+LGMY721mp6zlHcYhImmU/k2mwmFVIgNTTy7k8gLSvvUzE8NE lerMh36WMWtC5dWdCnRqHxaZkBADU9ICsxLZvqh1/LvAfB/W0q4Knz920iMy83wRTZDG 7OMykpVtDElDNlq3TWjO5yIs1LwE64QUA2Zlc0XLgRPI2QunTySssDZyzOrLkhj2B0W9 J88A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id hr23-20020a1709073f9700b00781e984151esi6299645ejc.232.2022.09.22.11.39.09; Thu, 22 Sep 2022 11:39:10 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id C7CEF68BB5A; Thu, 22 Sep 2022 21:37:45 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 344FE68B9FE for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 7D2FAC00B9 for ; Thu, 22 Sep 2022 21:37:28 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:10 +0300 Message-Id: <20220922183726.38624-13-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 13/29] lavu/floatdsp: RISC-V V vector_fmul_add X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: sexcoKXrENHI From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 20 ++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c 
b/libavutil/riscv/float_dsp_init.c
index d17d0f66c5..2ddd2050f7 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -31,6 +31,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
+void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
+                            const float *src2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
@@ -48,6 +50,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 0d6fffe235..9b68187e01 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -76,6 +76,26 @@ NOHWF   mv      a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) + (a3) [0..a4-1]
+func ff_vector_fmul_add_rvv, zve32f
+1:
+        vsetvli t0, a4, e32, m1, ta, ma
+        slli    t1, t0, 2
+        vle32.v v8, (a1)
+        add     a1, a1, t1
+        vle32.v v16, (a2)
+        add     a2, a2, t1
+        vle32.v v24, (a3)
+        add     a3, a3, t1
+        vfmadd.vv v8, v16, v24
+        sub     a4, a4, t0
+        vse32.v v8, (a0)
+        add     a0, a0, t1
+        bnez    a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:
From patchwork Thu Sep 22 18:37:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38169 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515453pzh; Thu, 22 Sep 2022 11:39:02 -0700 (PDT) X-Google-Smtp-Source:
AMsMyM45Uln7CslqYj0bCIqVRXxeJq72Sev5LjR3uIyDfXo3P+8AawYnyoPnd1C7o29WaokbJ/7z X-Received: by 2002:a05:6402:2894:b0:453:b17b:d540 with SMTP id eg20-20020a056402289400b00453b17bd540mr4817976edb.178.1663871942385; Thu, 22 Sep 2022 11:39:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871942; cv=none; d=google.com; s=arc-20160816; b=DEVu5IlOkyoBX8iS0cDz7/R+UNtEXQ0TxVp3LaHekh+rEbLTh8WI3FVPkWlJb33RY2 OUwQPLjio9LNtiaCqEd3bep1ozS/xoY01BSwmMWXO9eB0MNI7RYKhm3Rj5JYiQePRe60 atzF/ojVbERCDqK7GECM8I47aGIs8w32a9muiKo3P375lml2Ag4jUkVNdb239IqW0DGO xBSYuHy+xatwbY2bxgFqYAcgpYa9R7k/X014YdWv7+d4d2aUT9SXlT3Oo7XKg+W2SvHZ /1s9h3LyctIvNRuC2Gf+EI9dRvSO2GFnfV8BrjBG1eUgt0ENcl+X62fZRNaBAbLbYXEf LgQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=m+iYu17eYDStbCcyU4pKfVecoF7rqEky1WwbPJkJ1t4=; b=XhoaHOMGTNoI0P77AMnyRsKVAnGXqz9aNNOPZdlJJ4F50/mffRw4F2jDezkU3tnKAA MI10wVTFVFDvLg2vAU26wOeskJIxSYWQ79LOJQpShB1T2Fqo7Wh6mmaK5PYHwDoy5A12 92c6xAnb5ulxJIgIDS1cjhaQ5ad1Y0303M2Lrv/lvAg2bfUrSwXEID/CE8RnyqLZvhWw Dg9ONrpSWsCfcA/uhOIDVWSzJLfLmUSywwHWfeNNA6WslxoNzs+8yO/wDGF+359R476E Cz9UIWqr0RaQ+6JuNouc7IZRI7GB4xhb1fXfWZvPUhB4rCpRfvdLNDjv2uwi32VYJKI3 vp1A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id g5-20020a170906198500b00781bfe698d6si4699059ejd.602.2022.09.22.11.39.01; Thu, 22 Sep 2022 11:39:02 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id D8AAD68BBED; Thu, 22 Sep 2022 21:37:44 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3467768BA11 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id AA59AC00BA for ; Thu, 22 Sep 2022 21:37:28 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:11 +0300 Message-Id: <20220922183726.38624-14-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 14/29] lavu/floatdsp: RISC-V V butterflies_float X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: Y8+abXyXElhC From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 19 +++++++++++++++++++ 2 files changed, 21 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c 
b/libavutil/riscv/float_dsp_init.c
index 2ddd2050f7..f164b1308f 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -33,6 +33,7 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                             const float *src2, int len);
+void ff_butterflies_float_rvv(float *v1, float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
@@ -51,6 +52,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
+        fdsp->butterflies_float = ff_butterflies_float_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 9b68187e01..0366009213 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -96,6 +96,25 @@ func ff_vector_fmul_add_rvv, zve32f
         ret
 endfunc
 
+// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
+func ff_butterflies_float_rvv, zve32f
+1:
+        vsetvli t0, a2, e32, m1, ta, ma
+        slli    t1, t0, 2
+        vle32.v v16, (a0)
+        vle32.v v24, (a1)
+        vfadd.vv v0, v16, v24
+        vfsub.vv v8, v16, v24
+        sub     a2, a2, t0
+        vse32.v v0, (a0)
+        add     a0, a0, t1
+        vse32.v v8, (a1)
+        add     a1, a1, t1
+        bnez    a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:
From patchwork Thu Sep 22 18:37:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38172 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515625pzh; Thu, 22 Sep 2022 11:39:26 -0700 (PDT) X-Google-Smtp-Source:
AMsMyM7xddSGbMyq0g1QJ1ByzFHAiT85yBxF/xNcBLcXpCvtvfZ1gma/J53DoTg7rDICAPqvy/tb X-Received: by 2002:aa7:d51a:0:b0:453:9086:fc37 with SMTP id y26-20020aa7d51a000000b004539086fc37mr4826952edq.174.1663871966568; Thu, 22 Sep 2022 11:39:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871966; cv=none; d=google.com; s=arc-20160816; b=m+48QU5Wb+r35H/HaOzG1ZyNnNltm5Vcc9qHM7I+m1DvWSsfnEyEnDrvMbvuL/tpEy U6YXDb5S/hjumAhg40pVym2mLL7p6CYbsw0OyWpFVfmnvV5/ag3VQUTe6hZA0VVL7PEo tLNGLDmueU3Zl9SafOR5HjqQn0t6eTE0gdfcEodWzPDjSwOhG9li621yKlGQcJJJ7Mns 7vHUo90ygnoPdIAUrY7tWTjAlYByGVPZkXw6zKrc8oAAdC6GRUuvfdX0awW9bb3QwmNT Np+CfNAv8yDGQEbedginJGo20PqmGUzP37n/tWLy5u5HX7MXkDOyiWw2lQwaRcnQC7Xz 9Tlw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=KafWOMGWK8X6JFzASheiByq7Z/FAq/fmw8macWyMzZw=; b=rkw4Z8itxzc5qZt2S0V5zTuIxEJRNncKTlhw3uQK+9Wsf4QjtsNR7k4rOvhXHz0+sC jAj2QqMZOZ47UFhmIVQtJ3PZT26vyN4gnN1s6w5iYoFlJJ1hjKL6QXI/niYvVX/mauBE p/j03y/+FRFc/IBzrDWW5D8zRe6h10yjmNRHI8TQ01YYoKFhP4Hx0K3A7Zjj/2pcHg50 u5TgJHUTlWU9SgjgJYGyy28MK2BLJT2qCDItYXKEWjSS8WS6gek97stwhBAIObteT7+p r2/ir0dbzBPgp7+t/8yV6Psg/+lLR/7db2Xm4jU7FgHPL8til+W5CNR2JJ4kYUuBORHr Doxg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id i22-20020a1709064fd600b0072b7fac8a7asi6340250ejw.926.2022.09.22.11.39.26; Thu, 22 Sep 2022 11:39:26 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5562F68BC02; Thu, 22 Sep 2022 21:37:47 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3A3F468BA31 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id D8028C00BB for ; Thu, 22 Sep 2022 21:37:28 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:12 +0300 Message-Id: <20220922183726.38624-15-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 15/29] lavu/floatdsp: RISC-V V vector_fmul_reversed X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: KZegJbvU8oDs From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 22 ++++++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index f164b1308f..9b8fd9942b 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -33,6 +33,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                              const float *src2, int len);
+void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
+                                 const float *src1, int len);
 void ff_butterflies_float_rvv(float *v1, float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
@@ -52,6 +54,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
+        fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 0366009213..6a1304d24a 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -96,6 +96,28 @@ func ff_vector_fmul_add_rvv, zve32f
         ret
 endfunc
 
+// (a0) = (a1) * reverse(a2) [0..a3-1]
+func ff_vector_fmul_reverse_rvv, zve32f
+        add          t3, a3, -1
+        li           t2, -4 // byte stride
+        slli         t3, t3, 2
+        add          a2, a2, t3
+1:
+        vsetvli      t0, a3, e32, m1, ta, ma
+        slli         t1, t0, 2
+        vle32.v      v16, (a1)
+        add          a1, a1, t1
+        vlse32.v     v24, (a2), t2
+        sub          a2, a2, t1
+        vfmul.vv     v16, v16, v24
+        sub          a3, a3, t0
+        vse32.v      v16, (a0)
+        add          a0, a0, t1
+        bnez         a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
 func ff_butterflies_float_rvv, zve32f
 1:

From patchwork Thu Sep 22 18:37:13 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38173
Delivered-To:
ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515692pzh; Thu, 22 Sep 2022 11:39:35 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4e2E9d4rYqtCjdsyw8JGsZLreogNd6Zl4QLtXY8HCFUEvnf1kSnW4YaOY5Cv8lll0NxZyY X-Received: by 2002:a05:6402:d05:b0:425:b5c8:faeb with SMTP id eb5-20020a0564020d0500b00425b5c8faebmr4652620edb.273.1663871975640; Thu, 22 Sep 2022 11:39:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871975; cv=none; d=google.com; s=arc-20160816; b=FdnHoa+Ri6BF1t8UVW5yzYwSJE80NcUDRBcU1OEgY+/8GV09JTgj9AKg9jQu8Vtwq2 yR9v10pxPwVyAueTvv55utRxB39yVkQU0yqf9rMwtfOMelI6E1XqYoWbx8hTVJP9KqF6 qlcyMT7PmibeSA7IrQ/fiR1Zuv+5zuZAz4uTJIoJKL8H/p9SuPaNEIQezHjmxmR+fY1Z quDgYfTxLzK07DUU+ILUD0pt/HuOGdBE1qyUVFZ4uHI27Zu7dXqGN7OV9LonHtbNQeTx DN5F0PtQk4wi40v0nVqUdJm8wSJToHrNVcuVSkHtZUK9yDkjRj3duhudgpdCX7wt1AuF JA3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=DZ4AABOz1h4tg2WV9i+VMik+NfVXbBu/iGZmqne83gw=; b=BruiWo5e5noZVle4S7TtXA4+O9EsWkeRQz4mEbOZ8LBnSuDuBQ2aFtNUdv2hDjQxqh 9bW2xPOplVG9CwINgLtxlS/o+H0qfgK/msb1kk5wu7Bs+lYANsOEmUdLt9S/JZxgxVuH XTUbveyISh3O5HZMHJcdgpNytIViarhdPSPqNfBv9QvDy7q7w0c78Q22/eKG+F3ji4tp 16UTX4Cfcdqt1nC6xk5cPno1D4iq4QnM3am/sSBb42hAjt12sAXhawgaoMxQqzzfgIx/ lPAApuPY7Xce5MUv3ykgGTBHFAe2iCiR17r9IqRQ+vinU7vZe/hOv/dbmuaLFbK7CxFC Uxew== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id dy16-20020a05640231f000b004548a7454f7si5271348edb.448.2022.09.22.11.39.35; Thu, 22 Sep 2022 11:39:35 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 15A9A68B7D1; Thu, 22 Sep 2022 21:37:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3E07268BB10 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 10CBEC00BC for ; Thu, 22 Sep 2022 21:37:29 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:13 +0300 Message-Id: <20220922183726.38624-16-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 16/29] lavu/floatdsp: RISC-V V vector_fmul_window X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: NNlbF9uikXcG From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 35 ++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff --git 
a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 9b8fd9942b..dacd81c08b 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -31,6 +31,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
+void ff_vector_fmul_window_rvv(float *dst, const float *src0,
+                                const float *src1, const float *win, int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                              const float *src2, int len);
 void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
@@ -53,6 +55,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_window = ff_vector_fmul_window_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
         fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 6a1304d24a..84f675970c 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -76,6 +76,41 @@ NOHWF   mv      a2, a3
         ret
 endfunc
 
+func ff_vector_fmul_window_rvv, zve32f
+        // a0: dst, a1: src0, a2: src1, a3: window, a4: length
+        addi         t0, a4, -1
+        add          t1, t0, a4
+        slli         t0, t0, 2
+        slli         t1, t1, 2
+        add          a2, a2, t0
+        add          t0, a0, t1
+        add          t3, a3, t1
+        li           t1, -4 // byte stride
+1:
+        vsetvli      t2, a4, e32, m1, ta, ma
+        slli         t4, t2, 2
+        vle32.v      v16, (a1)
+        add          a1, a1, t4
+        vlse32.v     v20, (a2), t1
+        sub          a2, a2, t4
+        vle32.v      v24, (a3)
+        add          a3, a3, t4
+        vlse32.v     v28, (t3), t1
+        sub          t3, t3, t4
+        vfmul.vv     v0, v16, v28
+        sub          a4, a4, t2
+        vfmul.vv     v8, v16, v24
+        vfnmsac.vv   v0, v20, v24
+        vfmacc.vv    v8, v20, v28
+        vse32.v      v0, (a0)
+        add          a0, a0, t4
+        vsse32.v     v8, (t0), t1
+        sub          t0, t0, t4
+        bnez         a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) + (a3) [0..a4-1]
 func ff_vector_fmul_add_rvv, zve32f
 1:

From patchwork Thu Sep 22 18:37:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38167
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515850pzh; Thu, 22 Sep 2022 11:39:53 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM7T3tiBFoHEn+hVMRxDtW7MErVIGsAe/qboFvbq4PmAH9pg1HiM7QADPurI4IUjEvu7kkvl
X-Received: by 2002:a17:907:2bef:b0:77d:e0f3:81e6 with SMTP id gv47-20020a1709072bef00b0077de0f381e6mr4056948ejc.513.1663871992879; Thu, 22 Sep 2022 11:39:52 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1663871992; cv=none; d=google.com; s=arc-20160816; b=dZ1cegtXx+q0QsBwVJibbbm6irxDfRGKGE/eBFvddWAeTZOtHsC30y8whc/eSPTnNJ risNXu+1jq2621kDWXxd7Y/WDeFlm05SqQku17trcRyRvgH8j1N+a4I/VkB3ufItzNPo 7qz7D0w703H+B+pPNV+PGd5wXC++fAzkBrlEJEgUFE3IvNIeJ/mLGlI/31lnQLXHyCeV V9R3fnSWa8nGSY24U+yU7hJfPphfAkYw0eBZMIkcbPuJBGdxtyH5A9kG9Gz1mVMMRuBN 9S5xXr47H+p0Xv0ET9OCLZCUjjQ9ERSE5irUnF4NKVl0qrMkjCZb9aqPOZOAf1auTGyu 30Zw==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=wBEZVmdHlBj6k0wTdIOHcOQI+ZJwYhvbMJAgMGbnWqY=; b=e9kF+H4WliyKd6GVu03YitrcQMgo6Wd7ygzGxEK7czTW/QR7uxBi0dFoEXNVKcNqPs swpTC1EVIIyJNT/7RVLgSpvOW9+fmYpT+yEdUWfwGnXilncEcpj3BkQ6nNd+TSu9kGjg OSJ5C0+/nJ7DtxbZt6QogZY13dKFbwPwPa3PkVqjzuMZ+eIUJE6o+nMmxw6EpXKDQZHu j6YbU1gJjJXvFh+ho9z5ZwvX/cZqmBHFBxwNX7SgA45AGN1GRRhsXzPCi5g0+HnxPQUt eOsAICQdslAvPZl8dtENbdGoQSHwPy8arM+Vr4+3Xb77mRZ8+BVneDZhbJ2ZSVJxC6SC VXEg==
ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org
designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id qa13-20020a170907868d00b0077ec350b0fbsi6020527ejc.272.2022.09.22.11.39.52; Thu, 22 Sep 2022 11:39:52 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id D7FBC68BBA4; Thu, 22 Sep 2022 21:37:49 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 41B5C68BB19 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 3DD73C00BD for ; Thu, 22 Sep 2022 21:37:29 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:14 +0300 Message-Id: <20220922183726.38624-17-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 17/29] lavu/floatdsp: RISC-V V scalarproduct_float X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: mwS1ms7vZi1T From: Rémi Denis-Courmont --- 
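
For reference, a minimal scalar sketch of what the routine added by this patch is expected to compute, assuming the plain dot-product semantics suggested by the `a0 = (a0).(a1) [0..a2-1]` comment in the assembly. The helper name `scalarproduct_float_ref` is hypothetical and is not part of FFmpeg; it only illustrates the arithmetic.

```c
/*
 * Hypothetical reference helper (not part of this patch or of FFmpeg):
 * sequential dot product of two float arrays of length len.
 */
static float scalarproduct_float_ref(const float *v1, const float *v2, int len)
{
    float sum = 0.0f;

    /* Accumulate v1[i] * v2[i] in index order. */
    for (int i = 0; i < len; i++)
        sum += v1[i] * v2[i];

    return sum;
}
```

The vector code accumulates with `vfredusum.vs`, which is an unordered reduction, so its rounding may differ slightly from this strictly sequential sum; any comparison against such a reference would likely need a small tolerance.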
 libavutil/riscv/float_dsp_init.c |  2 ++
 libavutil/riscv/float_dsp_rvv.S  | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index dacd81c08b..cc9b7e83dc 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -38,6 +38,7 @@ void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
 void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
                                  const float *src1, int len);
 void ff_butterflies_float_rvv(float *v1, float *v2, int len);
+float ff_scalarproduct_float_rvv(const float *v1, const float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
@@ -59,6 +60,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
         fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
+        fdsp->scalarproduct_float = ff_scalarproduct_float_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 84f675970c..48a44b8150 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -172,6 +172,27 @@ func ff_butterflies_float_rvv, zve32f
         ret
 endfunc
 
+// a0 = (a0).(a1) [0..a2-1]
+func ff_scalarproduct_float_rvv, zve32f
+        vsetvli      zero, zero, e32, m1, ta, ma
+        vmv.s.x      v8, zero
+1:
+        vsetvli      t0, a2, e32, m1, ta, ma
+        slli         t1, t0, 2
+        vle32.v      v16, (a0)
+        add          a0, a0, t1
+        vle32.v      v24, (a1)
+        add          a1, a1, t1
+        vfmul.vv     v16, v16, v24
+        sub          a2, a2, t0
+        vfredusum.vs v8, v16, v8
+        bnez         a2, 1b
+
+        vfmv.f.s     fa0, v8
+NOHWF   fmv.x.w      a0, fa0
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:

From patchwork Thu Sep 22 18:37:15 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter:
=?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38166 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515781pzh; Thu, 22 Sep 2022 11:39:44 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4FPjtHjcloAkvxI8DJ+25O8cDkKZsfK3f1+TxlsXnhZkbfY8/K9kzEY//ywfvjU3fptg2D X-Received: by 2002:a17:907:2c77:b0:77c:59aa:c011 with SMTP id ib23-20020a1709072c7700b0077c59aac011mr4021660ejc.724.1663871984433; Thu, 22 Sep 2022 11:39:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663871984; cv=none; d=google.com; s=arc-20160816; b=lv2F/qVF9to/oSnXpk7pz8LfQqFUc+DBnqlXJG+j/yw12TEn7xyk/dVUfdzVL+k5K4 0Cr+zU61Osv/iv6aG1jcccUO/SDmMNGNKuMl6ia4zyxmSQ1X64JogishtpkqHk2bAwuI sWBNuIZENsY0Wx5TtB1LkBR+LUlqh9A/lKFG0FeUaTaVAc3BkZaK6PDcDM8pqdmfUz7y WhL/9JlGM+YaL822WnUsuqheECXGs6MCsYyc1eFJG40D7+dX75INglYURzzj5ZQBICP2 U+NZkoYWnvoOe8V3K5XDDVqBjOul/fbLD4+QtA3Dw+3ZaGYfSdMNdw7yrDOIWmQPQs1F +DuQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=2p2DuqwnqB1JNgvbWwA9ERzunShnKYZuHdGvyCZE67g=; b=jhmJtRP3pGJIAFLknw5e1TPjYmeh+jUS9+g9GVS4XniNMGX6rkpfGcnS/rJA3sw/vH ibH1tahNTfHDwEYHM/na4uVAXvaA20hIQQFr7ae73i1PDg/5a8VkaXLmnaQNAGKYpkmo sCjOxMIdmlutpIVcMcpMJUckOqNUIw+lsKtPykh0NqBLKMLH8HzPJltKxEUF3G94ibJL L89YHhk44QXHbZUPynaCgWKkzrXxByKNyf3P8U0j3wIQbO9F2uGEYLqOej+ruwWjGC0w oASWmfZXQuM5jIeOniVPq5Yi1y1K/PyK0RfFL45wK8IoDDJw/OvwfHdwKoRzqfUqL8UW C0iA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id jg11-20020a170907970b00b00780bc3725a0si6326665ejc.700.2022.09.22.11.39.44; Thu, 22 Sep 2022 11:39:44 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id DDDCF68BBA2; Thu, 22 Sep 2022 21:37:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4176068BB15 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 6BC2FC00BE for ; Thu, 22 Sep 2022 21:37:29 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:15 +0300 Message-Id: <20220922183726.38624-18-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 18/29] lavu/fixeddsp: RISC-V V butterflies_fixed X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: VAnVD6jacVu8 From: Rémi Denis-Courmont --- libavutil/fixed_dsp.c | 4 +++- libavutil/fixed_dsp.h | 1 + libavutil/riscv/Makefile | 4 +++- libavutil/riscv/fixed_dsp_init.c | 38 +++++++++++++++++++++++++++++ 
 libavutil/riscv/fixed_dsp_rvv.S  | 41 ++++++++++++++++++++++++++++++++
 5 files changed, 86 insertions(+), 2 deletions(-)
 create mode 100644 libavutil/riscv/fixed_dsp_init.c
 create mode 100644 libavutil/riscv/fixed_dsp_rvv.S

diff --git a/libavutil/fixed_dsp.c b/libavutil/fixed_dsp.c
index 154f3bc2d3..bc847949dc 100644
--- a/libavutil/fixed_dsp.c
+++ b/libavutil/fixed_dsp.c
@@ -162,7 +162,9 @@ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int bit_exact)
     fdsp->butterflies_fixed = butterflies_fixed_c;
     fdsp->scalarproduct_fixed = scalarproduct_fixed_c;
 
-#if ARCH_X86
+#if ARCH_RISCV
+    ff_fixed_dsp_init_riscv(fdsp);
+#elif ARCH_X86
     ff_fixed_dsp_init_x86(fdsp);
 #endif
 
diff --git a/libavutil/fixed_dsp.h b/libavutil/fixed_dsp.h
index fec806ff2d..1217d3a53b 100644
--- a/libavutil/fixed_dsp.h
+++ b/libavutil/fixed_dsp.h
@@ -161,6 +161,7 @@ typedef struct AVFixedDSPContext {
  */
 AVFixedDSPContext * avpriv_alloc_fixed_dsp(int strict);
 
+void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp);
 void ff_fixed_dsp_init_x86(AVFixedDSPContext *fdsp);
 
 /**
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
index 89a8d0d990..1597154ba5 100644
--- a/libavutil/riscv/Makefile
+++ b/libavutil/riscv/Makefile
@@ -1,3 +1,5 @@
 OBJS +=     riscv/float_dsp_init.o \
+            riscv/fixed_dsp_init.o \
             riscv/cpu.o
-RVV-OBJS += riscv/float_dsp_rvv.o
+RVV-OBJS += riscv/float_dsp_rvv.o \
+            riscv/fixed_dsp_rvv.o
diff --git a/libavutil/riscv/fixed_dsp_init.c b/libavutil/riscv/fixed_dsp_init.c
new file mode 100644
index 0000000000..4075e521f2
--- /dev/null
+++ b/libavutil/riscv/fixed_dsp_init.c
@@ -0,0 +1,38 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/fixed_dsp.h"
+
+void ff_butterflies_fixed_rvv(int *v1, int *v2, int len);
+
+av_cold void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RV_ZVE32X)
+        fdsp->butterflies_fixed = ff_butterflies_fixed_rvv;
+#endif
+}
diff --git a/libavutil/riscv/fixed_dsp_rvv.S b/libavutil/riscv/fixed_dsp_rvv.S
new file mode 100644
index 0000000000..9890a980f6
--- /dev/null
+++ b/libavutil/riscv/fixed_dsp_rvv.S
@@ -0,0 +1,41 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
+func ff_butterflies_fixed_rvv, zve32x
+1:
+        vsetvli      t0, a2, e32, m1, ta, ma
+        slli         t1, t0, 2
+        vle32.v      v16, (a0)
+        vle32.v      v24, (a1)
+        vadd.vv      v0, v16, v24
+        vsub.vv      v8, v16, v24
+        sub          a2, a2, t0
+        vse32.v      v0, (a0)
+        add          a0, a0, t1
+        vse32.v      v8, (a1)
+        add          a1, a1, t1
+        bnez         a2, 1b
+
+        ret
+endfunc

From patchwork Thu Sep 22 18:37:16 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38174
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp515916pzh; Thu, 22 Sep 2022 11:40:01 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM4EiYQBwpYyCWCbBa3E88YXgphxsoR1hSlbxtZzrq7CJ70w1F8Rcb/bfUGZVFyrzoIVC1PW
X-Received: by 2002:a17:907:75c1:b0:72f:248d:5259 with SMTP id jl1-20020a17090775c100b0072f248d5259mr4012286ejc.227.1663872001752; Thu, 22 Sep 2022 11:40:01 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1663872001; cv=none; d=google.com; s=arc-20160816; b=g0BIx6+IciZvCakME20ojTgAB6Ho5xgw7+BDR6iWnOqvMKYNCxaUqx2fpS4k1NMF43 lAW1xLC6wVQyw80TzCVf8UJpoB2Q10YyEQa55xPzQTM1EsJ3WXxFSQdJ08rj2aZhb8I8 RZD4yKQpImSeecaxISFDcVBpCeTSk3mVeUr9mH/Q9CUjcOn3L14jaJBumLEAX2eJKPlb FsxmAwQP0v9+2QWBOYf+OYmWRNwME+AI+pytpM9iwS+OmBgHVLRHNka+KsgDrjzO4098 Qp1i1fu2s6jxHcPfpWWvapIa54lFWpVWsS8uZK7hdQgtCDJveIUfbMyLQdzRW3/jfHOx 9x3g==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id
:date:to:from:delivered-to; bh=1RYK1tQgLNxt8Pxer3ctLUAd+AgIEG36x5FafxYTHN4=; b=034lAGcWWNpcJzoq0A3L6XxjbS6ilInbnyv9f1uTFMeQ9yfH2s5nFr/Sr0m3tsukEn tK4MdXFUmuSKAQoJzP+x+CkEDaZO2f54bwRR54iExGvPZAZWlaMh4p2Tb7Xe9Uwn7CRA ZI1zTcjOxynhxZY5yjzCVkL6AVskCLtKUhZ4J+z0r2/p1F0mATBYKKNsUHmLWKgTHE1G 2qDcESfELCWKawGQzl2dvDzksHMqNMC5OUThut3ScDXfcDB4eLHSatbR3TsarCDlaCAt TYkI6u9smsxAc2pJOzoAqkjo3qBxb/NGYU+L3YLO8TT0qGHdE00xaT30c+5Rbe0l38tB b0HA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id dt9-20020a170907728900b0077f2c399241si6759609ejc.243.2022.09.22.11.40.01; Thu, 22 Sep 2022 11:40:01 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E55FA68BB49; Thu, 22 Sep 2022 21:37:50 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 452F468B9A5 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 99864C00BF for ; Thu, 22 Sep 2022 21:37:29 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:16 +0300 Message-Id: <20220922183726.38624-19-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: 
<12078904.O9o76ZdvQC@basile.remlab.net>
References: <12078904.O9o76ZdvQC@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 19/29] lavc/audiodsp: RISC-V V vector_clip_int32
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 3QgnmAV1lOeB

From: Rémi Denis-Courmont

---
 libavcodec/riscv/Makefile        |  1 +
 libavcodec/riscv/audiodsp_init.c |  9 ++++++++
 libavcodec/riscv/audiodsp_rvv.S  | 37 ++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+)
 create mode 100644 libavcodec/riscv/audiodsp_rvv.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index da07f1fe96..99541b075e 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,4 +1,5 @@
 OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
                            riscv/audiodsp_rvf.o
+RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
 OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
                               riscv/pixblockdsp_rvi.o
diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index c5842815d6..ce8b60ee52 100644
--- a/libavcodec/riscv/audiodsp_init.c
+++ b/libavcodec/riscv/audiodsp_init.c
@@ -18,16 +18,25 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
+#include "config.h"
+
 #include "libavutil/attributes.h"
 #include "libavutil/cpu.h"
 #include "libavcodec/audiodsp.h"
 
 void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max);
 
+void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min,
+                              int32_t max, unsigned int len);
+
 av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
 {
     int flags = av_get_cpu_flags();
 
     if (flags & AV_CPU_FLAG_RVF)
         c->vector_clipf = ff_vector_clipf_rvf;
+#if HAVE_RVV
+    if (flags & AV_CPU_FLAG_RV_ZVE32X)
+        c->vector_clip_int32 = ff_vector_clip_int32_rvv;
+#endif
 }
diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S
new file mode 100644
index 0000000000..26b3cdffcf
--- /dev/null
+++ b/libavcodec/riscv/audiodsp_rvv.S
@@ -0,0 +1,37 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_vector_clip_int32_rvv, zve32x
+1:
+        vsetvli      t0, a4, e32, m1, ta, ma
+        vle32.v      v8, (a1)
+        slli         t1, t0, 2
+        vmax.vx      v8, v8, a2
+        add          a1, a1, t1
+        vmin.vx      v8, v8, a3
+        sub          a4, a4, t0
+        vse32.v      v8, (a0)
+        add          a0, a0, t1
+        bnez         a4, 1b
+
+        ret
+endfunc

From patchwork Thu Sep 22 18:37:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38181
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp516565pzh; Thu, 22 Sep 2022 11:41:11 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM6MQMuNgh8iqShhGlzuxn+ujWxf5ytJUAPA8yQ11966bBSQ4jSu1JFPxiWkrWPqf3oC9JiF
X-Received: by 2002:a17:907:94d0:b0:77e:c2e5:a35e with SMTP id
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 22 Sep 2022 21:37:17 +0300
Message-Id: <20220922183726.38624-20-remi@remlab.net>
In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net>
References: <12078904.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 20/29] lavc/audiodsp: RISC-V V vector_clipf

From: Rémi Denis-Courmont

---
 libavcodec/riscv/audiodsp_init.c |  7 ++++++-
 libavcodec/riscv/audiodsp_rvv.S  | 18 ++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index ce8b60ee52..ddd561484f 100644
--- a/libavcodec/riscv/audiodsp_init.c
+++ b/libavcodec/riscv/audiodsp_init.c
@@ -26,6 +26,7 @@
 
 void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max);
 
+void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float max);
 void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min,
                               int32_t max, unsigned int len);
 
@@ -36,7 +37,11 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
     if (flags & AV_CPU_FLAG_RVF)
         c->vector_clipf = ff_vector_clipf_rvf;
 #if HAVE_RVV
-    if (flags & AV_CPU_FLAG_RV_ZVE32X)
+    if (flags & AV_CPU_FLAG_RV_ZVE32X) {
         c->vector_clip_int32 = ff_vector_clip_int32_rvv;
+
+        if (flags & AV_CPU_FLAG_RV_ZVE32F)
+            c->vector_clipf = ff_vector_clipf_rvv;
+    }
 #endif
 }
diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S
index 26b3cdffcf..e5a09f3b19 100644
--- a/libavcodec/riscv/audiodsp_rvv.S
+++ b/libavcodec/riscv/audiodsp_rvv.S
@@ -35,3 +35,21 @@ func ff_vector_clip_int32_rvv, zve32x
 
         ret
 endfunc
+
+func ff_vector_clipf_rvv, zve32f
+NOHWF   fmv.w.x  fa0, a3
+NOHWF   fmv.w.x  fa1, a4
+1:
+        vsetvli  t0, a2, e32, m1, ta, ma
+        vle32.v  v8, (a1)
+        slli     t1, t0, 2
+        vfmax.vf v8, v8, fa0
+        add      a1, a1, t1
+        vfmin.vf v8, v8, fa1
+        sub      a2, a2, t0
+        vse32.v  v8, (a0)
+        add      a0, a0, t1
+        bnez     a2, 1b
+
+        ret
+endfunc
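The init change in this patch orders the selection so that the scalar RVF routine is picked first and the RVV float routine replaces it only when the build has RVV support and the CPU reports both Zve32x and Zve32f. The sketch below models that ordering in isolation. Everything prefixed demo_ or DEMO_ is hypothetical: the flag values are stand-ins for the AV_CPU_FLAG_* constants (whose real values are not shown in this thread), and the have_rvv parameter stands in for the HAVE_RVV build-time switch.

```c
/* Hypothetical stand-ins for AV_CPU_FLAG_RVF, AV_CPU_FLAG_RV_ZVE32X and
 * AV_CPU_FLAG_RV_ZVE32F; the real values are defined by libavutil. */
enum {
    DEMO_FLAG_RVF    = 1 << 0,
    DEMO_FLAG_ZVE32X = 1 << 1,
    DEMO_FLAG_ZVE32F = 1 << 2,
};

/* Which implementation a function pointer would end up pointing at. */
enum demo_impl { DEMO_IMPL_C, DEMO_IMPL_RVF, DEMO_IMPL_RVV };

struct demo_dsp {
    enum demo_impl vector_clipf;
    enum demo_impl vector_clip_int32;
};

/* Mirrors the control flow of ff_audiodsp_init_riscv after this patch. */
static void demo_audiodsp_init(struct demo_dsp *c, int flags, int have_rvv)
{
    c->vector_clipf      = DEMO_IMPL_C;
    c->vector_clip_int32 = DEMO_IMPL_C;

    if (flags & DEMO_FLAG_RVF)
        c->vector_clipf = DEMO_IMPL_RVF;

    if (have_rvv && (flags & DEMO_FLAG_ZVE32X)) {
        c->vector_clip_int32 = DEMO_IMPL_RVV;

        /* The float vector routine is only considered inside the
         * Zve32x branch, as in the patch. */
        if (flags & DEMO_FLAG_ZVE32F)
            c->vector_clipf = DEMO_IMPL_RVV;
    }
}
```

One consequence of the nesting, visible in the model, is that a flag set with Zve32f but without Zve32x never selects the vector float routine.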
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 22 Sep 2022 21:37:18 +0300
Message-Id: <20220922183726.38624-21-remi@remlab.net>
In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net>
References: <12078904.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 21/29] lavc/audiodsp: RISC-V V scalarproduct_int16

From: Rémi Denis-Courmont

---
 libavcodec/riscv/audiodsp_init.c |  2 ++
 libavcodec/riscv/audiodsp_rvv.S  | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c
index ddd561484f..6f38b7bc83 100644
--- a/libavcodec/riscv/audiodsp_init.c
+++ b/libavcodec/riscv/audiodsp_init.c
@@ -29,6 +29,7 @@ void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float
 void ff_vector_clipf_rvv(float *dst, const float *src, int len, float min, float max);
 void ff_vector_clip_int32_rvv(int32_t *dst, const int32_t *src, int32_t min,
                               int32_t max, unsigned int len);
+int32_t ff_scalarproduct_int16_rvv(const int16_t *v1, const int16_t *v2, int len);
 
 av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
 {
@@ -38,6 +39,7 @@ av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c)
         c->vector_clipf = ff_vector_clipf_rvf;
 #if HAVE_RVV
     if (flags & AV_CPU_FLAG_RV_ZVE32X) {
+        c->scalarproduct_int16 = ff_scalarproduct_int16_rvv;
         c->vector_clip_int32 = ff_vector_clip_int32_rvv;
 
         if (flags & AV_CPU_FLAG_RV_ZVE32F)
diff --git a/libavcodec/riscv/audiodsp_rvv.S b/libavcodec/riscv/audiodsp_rvv.S
index e5a09f3b19..852ae1dc1f 100644
--- a/libavcodec/riscv/audiodsp_rvv.S
+++ b/libavcodec/riscv/audiodsp_rvv.S
@@ -20,6 +20,26 @@
 
 #include "libavutil/riscv/asm.S"
 
+func ff_scalarproduct_int16_rvv, zve32x
+        vsetvli     zero, zero, e16, m1, ta, ma
+        vmv.s.x     v8, zero
+1:
+        vsetvli     t0, a2, e16, m1, ta, ma
+        vle16.v     v16, (a0)
+        slli        t1, t0, 1
+        vle16.v     v24, (a1)
+        sub         a2, a2, t0
+        vwmul.vv    v0, v16, v24
+        add         a0, a0, t1
+        vsetvli     zero, t0, e32, m2, ta, ma
+        vredsum.vs  v8, v0, v8
+        add         a1, a1, t1
+        bnez        a2, 1b
+
+        vmv.x.s     a0, v8
+        ret
+endfunc
+
 func ff_vector_clip_int32_rvv, zve32x
 1:
         vsetvli t0, a4, e32, m1, ta, ma
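In ff_scalarproduct_int16_rvv above, each pair of 16-bit inputs is multiplied into a 32-bit product (vwmul.vv), which is why the vector configuration switches from e16, m1 to e32, m2 before the products are folded into the running sum held in element 0 of v8 (vredsum.vs). A scalar C model of the intended result is sketched below. The name scalarproduct_int16_model is hypothetical, and the use of an unsigned accumulator to get well-defined 32-bit wrap-around is a choice made for this sketch, not something stated in the patch.

```c
#include <stdint.h>

/* Hypothetical scalar model of ff_scalarproduct_int16_rvv (not FFmpeg code). */
static int32_t scalarproduct_int16_model(const int16_t *v1, const int16_t *v2,
                                         int len)
{
    uint32_t sum = 0;   /* plays the role of element 0 of v8 */

    for (int i = 0; i < len; i++) {
        /* 16x16 -> 32-bit signed product, as vwmul.vv produces */
        int32_t prod = (int32_t)v1[i] * (int32_t)v2[i];

        /* modulo 2^32 accumulation, as vredsum.vs performs */
        sum += (uint32_t)prod;
    }
    return (int32_t)sum;
}
```

The product of two int16_t values always fits in an int32_t, so only the accumulation can wrap.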
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 22 Sep 2022 21:37:19 +0300
Message-Id: <20220922183726.38624-22-remi@remlab.net>
In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net>
References: <12078904.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 22/29] lavc/fmtconvert: RISC-V V int32_to_float_fmul_scalar

From: Rémi Denis-Courmont

---
 libavcodec/fmtconvert.c            |  2 ++
 libavcodec/fmtconvert.h            |  1 +
 libavcodec/riscv/Makefile          |  2 ++
 libavcodec/riscv/fmtconvert_init.c | 39 +++++++++++++++++++++++++++++
 libavcodec/riscv/fmtconvert_rvv.S  | 40 ++++++++++++++++++++++++++++++
 5 files changed, 84 insertions(+)
 create mode 100644 libavcodec/riscv/fmtconvert_init.c
 create mode 100644 libavcodec/riscv/fmtconvert_rvv.S

diff --git a/libavcodec/fmtconvert.c b/libavcodec/fmtconvert.c
index cedfd61138..d889e61aca 100644
--- a/libavcodec/fmtconvert.c
+++ b/libavcodec/fmtconvert.c
@@ -52,6 +52,8 @@ av_cold void ff_fmt_convert_init(FmtConvertContext *c)
     ff_fmt_convert_init_arm(c);
 #elif ARCH_PPC
     ff_fmt_convert_init_ppc(c);
+#elif ARCH_RISCV
+    ff_fmt_convert_init_riscv(c);
 #elif ARCH_X86
     ff_fmt_convert_init_x86(c);
 #endif
diff --git a/libavcodec/fmtconvert.h b/libavcodec/fmtconvert.h
index da244e05a5..1cb4628a64 100644
--- a/libavcodec/fmtconvert.h
+++ b/libavcodec/fmtconvert.h
@@ -61,6 +61,7 @@ void ff_fmt_convert_init(FmtConvertContext *c);
 void ff_fmt_convert_init_aarch64(FmtConvertContext *c);
 void ff_fmt_convert_init_arm(FmtConvertContext *c);
 void ff_fmt_convert_init_ppc(FmtConvertContext *c);
+void ff_fmt_convert_init_riscv(FmtConvertContext *c);
 void ff_fmt_convert_init_x86(FmtConvertContext *c);
 void ff_fmt_convert_init_mips(FmtConvertContext *c);
 
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 99541b075e..682174e875 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,5 +1,7 @@
 OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
                            riscv/audiodsp_rvf.o
 RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o
+OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o
+RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o
 OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
                               riscv/pixblockdsp_rvi.o
diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c
new file mode 100644
index 0000000000..fd2f58d060
--- /dev/null
+++ b/libavcodec/riscv/fmtconvert_init.c
@@ -0,0 +1,39 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavcodec/fmtconvert.h"
+
+void ff_int32_to_float_fmul_scalar_rvv(float *dst, const int32_t *src,
+                                       float mul, int len);
+
+av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+        c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_rvv;
+#endif
+}
diff --git a/libavcodec/riscv/fmtconvert_rvv.S b/libavcodec/riscv/fmtconvert_rvv.S
new file mode 100644
index 0000000000..c19b77e38a
--- /dev/null
+++ b/libavcodec/riscv/fmtconvert_rvv.S
@@ -0,0 +1,40 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "../libavutil/riscv/asm.S"
+
+func ff_int32_to_float_fmul_scalar_rvv, zve32f
+NOHWF   fmv.w.x     fa0, a2
+NOHWF   mv          a2, a3
+1:
+        vsetvli     t0, a2, e32, m1, ta, ma
+        vle32.v     v24, (a1)
+        slli        t1, t0, 2
+        vfcvt.f.x.v v24, v24
+        sub         a2, a2, t0
+        vfmul.vf    v24, v24, fa0
+        add         a1, a1, t1
+        vse32.v     v24, (a0)
+        add         a0, a0, t1
+        bnez        a2, 1b
+
+        ret
+endfunc

From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 22 Sep 2022 21:37:20 +0300
Message-Id: <20220922183726.38624-23-remi@remlab.net>
In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net>
References: <12078904.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 23/29] lavc/fmtconvert: RISC-V V int32_to_float_fmul_array8

From: Rémi Denis-Courmont

---
 libavcodec/riscv/fmtconvert_init.c |  7 ++++++-
 libavcodec/riscv/fmtconvert_rvv.S  | 29 +++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/fmtconvert_init.c b/libavcodec/riscv/fmtconvert_init.c
index fd2f58d060..1796717a1c 100644
--- a/libavcodec/riscv/fmtconvert_init.c
+++ b/libavcodec/riscv/fmtconvert_init.c
@@ -27,13 +27,18 @@
 
 void ff_int32_to_float_fmul_scalar_rvv(float *dst, const int32_t *src,
                                        float mul, int len);
+void ff_int32_to_float_fmul_array8_rvv(FmtConvertContext *c, float *dst,
+                                       const int32_t *src, const float *mul,
+                                       int len);
 
 av_cold void ff_fmt_convert_init_riscv(FmtConvertContext *c)
 {
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+    if (flags & AV_CPU_FLAG_RV_ZVE32F) {
         c->int32_to_float_fmul_scalar = ff_int32_to_float_fmul_scalar_rvv;
+        c->int32_to_float_fmul_array8 = ff_int32_to_float_fmul_array8_rvv;
+    }
 #endif
 }
diff --git a/libavcodec/riscv/fmtconvert_rvv.S b/libavcodec/riscv/fmtconvert_rvv.S
index c19b77e38a..b472be0505 100644
--- a/libavcodec/riscv/fmtconvert_rvv.S
+++ b/libavcodec/riscv/fmtconvert_rvv.S
@@ -38,3 +38,32 @@ NOHWF   mv          a2, a3
 
         ret
 endfunc
+
+func ff_int32_to_float_fmul_array8_rvv, zve32f
+        srai        a4, a4, 3
+
+1:      vsetvli     t0, a4, e32, m1, ta, ma
+        vle32.v     v24, (a3)
+        slli        t1, t0, 2
+        vlseg8e32.v v16, (a2)
+        slli        t2, t0, 2 + 3
+        vsetvli     t3, zero, e32, m8, ta, ma
+        vfcvt.f.x.v v16, v16
+        add         a3, a3, t1
+        vsetvli     t0, a4, e32, m1, ta, ma
+        vfmul.vv    v16, v16, v24
+        add         a2, a2, t2
+        vfmul.vv    v17, v17, v24
+        sub         a4, a4, t0
+        vfmul.vv    v18, v18, v24
+        vfmul.vv    v19, v19, v24
+        vfmul.vv    v20, v20, v24
+        vfmul.vv    v21, v21, v24
+        vfmul.vv    v22, v22, v24
+        vfmul.vv    v23, v23, v24
+        vsseg8e32.v v16, (a1)
+        add         a1, a1, t2
+        bnez        a4, 1b
+
+        ret
+endfunc
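The two fmtconvert routines above can be read as follows. The scalar variant converts each int32 sample to float and multiplies it by one factor. The array8 variant shifts len right by 3 to get a count of 8-sample blocks, loads one factor per block into v24, and uses the segment load (vlseg8e32.v) so that sample k of every block lands in register v16+k, which lets a single vfmul.vv scale the same position of many blocks at once. A scalar C model is sketched below. Both function names are hypothetical, the FmtConvertContext argument of the real array8 prototype is left out because the assembly never reads it, and len is assumed to be a multiple of 8 in the array8 case.

```c
#include <stdint.h>

/* Hypothetical scalar model of ff_int32_to_float_fmul_scalar_rvv. */
static void int32_to_float_fmul_scalar_model(float *dst, const int32_t *src,
                                             float mul, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = (float)src[i] * mul;   /* vfcvt.f.x.v, then vfmul.vf */
}

/* Hypothetical scalar model of ff_int32_to_float_fmul_array8_rvv:
 * block b (samples 8*b .. 8*b+7) is scaled by mul[b].
 * Assumes len is a multiple of 8. */
static void int32_to_float_fmul_array8_model(float *dst, const int32_t *src,
                                             const float *mul, int len)
{
    for (int i = 0; i < len; i += 8)
        int32_to_float_fmul_scalar_model(&dst[i], &src[i], mul[i / 8], 8);
}
```

With 16 input samples and two factors, the first eight outputs use mul[0] and the last eight use mul[1].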
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 22 Sep 2022 21:37:21 +0300
Message-Id: <20220922183726.38624-24-remi@remlab.net>
In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net>
References: <12078904.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 24/29] lavc/vorbisdsp: RISC-V V inverse_coupling

From: Rémi Denis-Courmont

This uses the following vectorisation, where both right-hand sides use
the values of mag[i] and ang[i] loaded before either store:

    for (i = 0; i < blocksize; i++) {
        float m = mag[i], a = ang[i];

        ang[i] = m - copysignf(fmaxf(a, 0.f), m);
        mag[i] = m - copysignf(fminf(a, 0.f), m);
    }
---
 libavcodec/riscv/Makefile         |  2 ++
 libavcodec/riscv/vorbisdsp_init.c | 37 +++++++++++++++++++++++++
 libavcodec/riscv/vorbisdsp_rvv.S  | 45 +++++++++++++++++++++++++++++++
 libavcodec/vorbisdsp.c            |  2 ++
 libavcodec/vorbisdsp.h            |  1 +
 5 files changed, 87 insertions(+)
 create mode 100644 libavcodec/riscv/vorbisdsp_init.c
 create mode 100644 libavcodec/riscv/vorbisdsp_rvv.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 682174e875..03a95301d7 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -5,3 +5,5 @@ OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_init.o
 RVV-OBJS-$(CONFIG_FMTCONVERT) += riscv/fmtconvert_rvv.o
 OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
                               riscv/pixblockdsp_rvi.o
+OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o
+RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o
diff --git a/libavcodec/riscv/vorbisdsp_init.c b/libavcodec/riscv/vorbisdsp_init.c
new file mode 100644
index 0000000000..d8432bc0f8
--- /dev/null
+++ b/libavcodec/riscv/vorbisdsp_init.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavcodec/vorbisdsp.h"
+
+void ff_vorbis_inverse_coupling_rvv(float *mag, float *ang,
+                                    ptrdiff_t blocksize);
+
+av_cold void ff_vorbisdsp_init_riscv(VorbisDSPContext *c)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RV_ZVE32F)
+        c->vorbis_inverse_coupling = ff_vorbis_inverse_coupling_rvv;
+#endif
+}
diff --git a/libavcodec/riscv/vorbisdsp_rvv.S b/libavcodec/riscv/vorbisdsp_rvv.S
new file mode 100644
index 0000000000..0a3f225149
--- /dev/null
+++ b/libavcodec/riscv/vorbisdsp_rvv.S
@@ -0,0 +1,45 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "../libavutil/riscv/asm.S"
+
+func ff_vorbis_inverse_coupling_rvv, zve32f
+        fmv.w.x   ft0, zero
+1:
+        vsetvli   t0, a2, e32, m1, ta, ma
+        vle32.v   v16, (a1)
+        slli      t1, t0, 2
+        vle32.v   v24, (a0)
+        sub       a2, a2, t0
+        vfmax.vf  v8, v16, ft0
+        vfmin.vf  v16, v16, ft0
+        vfsgnj.vv v8, v8, v24
+        vfsgnj.vv v16, v16, v24
+        vfsub.vv  v8, v24, v8
+        vfsub.vv  v24, v24, v16
+        vse32.v   v8, (a1)
+        add       a1, a1, t1
+        vse32.v   v24, (a0)
+        add       a0, a0, t1
+        bnez      a2, 1b
+
+        ret
+endfunc
diff --git a/libavcodec/vorbisdsp.c b/libavcodec/vorbisdsp.c
index 693c44dfcb..70022bd262 100644
--- a/libavcodec/vorbisdsp.c
+++ b/libavcodec/vorbisdsp.c
@@ -53,6 +53,8 @@ av_cold void ff_vorbisdsp_init(VorbisDSPContext *dsp)
     ff_vorbisdsp_init_arm(dsp);
 #elif ARCH_PPC
     ff_vorbisdsp_init_ppc(dsp);
+#elif ARCH_RISCV
+    ff_vorbisdsp_init_riscv(dsp);
 #elif ARCH_X86
     ff_vorbisdsp_init_x86(dsp);
 #endif
diff --git a/libavcodec/vorbisdsp.h b/libavcodec/vorbisdsp.h
index 1775a92cf2..5c369ecf22 100644
--- a/libavcodec/vorbisdsp.h
+++ b/libavcodec/vorbisdsp.h
@@ -34,5 +34,6 @@ void ff_vorbisdsp_init_aarch64(VorbisDSPContext *dsp);
 void ff_vorbisdsp_init_x86(VorbisDSPContext *dsp);
 void ff_vorbisdsp_init_arm(VorbisDSPContext *dsp);
 void ff_vorbisdsp_init_ppc(VorbisDSPContext *dsp);
+void ff_vorbisdsp_init_riscv(VorbisDSPContext *dsp);
 
 #endif /* AVCODEC_VORBISDSP_H */
AMsMyM6RM/34ylOjKmCGcXEm17F9RXk7/ZmyKa99H7mf7iUKAsMwtB1wQcZOQ/RzmOk1WgwXnlAM X-Received: by 2002:a17:907:6e1d:b0:781:f24f:4bb with SMTP id sd29-20020a1709076e1d00b00781f24f04bbmr4004747ejc.712.1663872046256; Thu, 22 Sep 2022 11:40:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663872046; cv=none; d=google.com; s=arc-20160816; b=sp8rNarBBAsCZ0ZD09naPUO2j5126gB59FvkaNOXvNbEDofnTQxRuQxx60pm0tvHvV gUnR3zRINbR9sg2diTW5T5voL3s/igcOd4RUjvZXkTI3vI+dHyQszWbVu4xmRf/+yebw cmz0TcIoXkdbtOFiyJArBXlgDgh25p0A0FgrUPCYf6aY4lLeRddX+2dUeMqtp0RFaFp3 k9xHDRaLiPs46W28h+tINTgbofYE6/Hsm3VZ6j3fQpIft8gWzjizgIPfhQiqzIXctEK0 bOlbAJRwDXgXhAHNUrIMTJ/7zTleZhCy3vMW09BCkoQEG8W3z23pH8JDkZxYW4mpF2pi 4CMQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=wyZSLlLcH5ho6gvKbNgFxuTzFOgoAYE8nveU9p6fUYU=; b=LOkrnonuAJ0F1akrLa+isMjpDYdsDcHWx2DSJDZw1PPiQYRrLV57oFzUjPAgLVhcU2 JW172fXimqrwpJxa6m3CAHxvWrgmbh+OTGvqooRI9L2eLlzrGRnZkRRWhWagKfs9OfZc on9eorMyUFrC9z3gxIaCx+CZ30DN89+0Iz/3kB/Phs73TyR7ftLr+fe+wTtj/HdPCegZ RXRvG0U8durtdBBLcnfCWXsRqkSUKXJsql0hT6D0cYsLBUCo8CHrKkxZ6AmjhgwmKEgd kTYr2/2GqHd2OMOJZh7yMOgaNTbpw/AnPqGZrFyZnVYeEodFVOzBWGDdjH9RQb3ehHFO yr1Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id wg7-20020a17090705c700b00782525053dbsi2709408ejb.699.2022.09.22.11.40.45; Thu, 22 Sep 2022 11:40:46 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id D662B68BC21; Thu, 22 Sep 2022 21:37:55 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 55AA468BB56 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id B6098C00C5 for ; Thu, 22 Sep 2022 21:37:30 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:22 +0300 Message-Id: <20220922183726.38624-25-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 25/29] lavc/aacpsdsp: RISC-V V add_squares X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: jxdMDYtJhYiM From: Rémi Denis-Courmont --- libavcodec/aacpsdsp.h | 1 + libavcodec/aacpsdsp_template.c | 2 ++ libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/aacpsdsp_init.c | 37 ++++++++++++++++++++++++++++++ 
libavcodec/riscv/aacpsdsp_rvv.S | 39 ++++++++++++++++++++++++++++++++ 5 files changed, 81 insertions(+) create mode 100644 libavcodec/riscv/aacpsdsp_init.c create mode 100644 libavcodec/riscv/aacpsdsp_rvv.S diff --git a/libavcodec/aacpsdsp.h b/libavcodec/aacpsdsp.h index 917ac5303f..8b32761bdb 100644 --- a/libavcodec/aacpsdsp.h +++ b/libavcodec/aacpsdsp.h @@ -55,6 +55,7 @@ void AAC_RENAME(ff_psdsp_init)(PSDSPContext *s); void ff_psdsp_init_arm(PSDSPContext *s); void ff_psdsp_init_aarch64(PSDSPContext *s); void ff_psdsp_init_mips(PSDSPContext *s); +void ff_psdsp_init_riscv(PSDSPContext *s); void ff_psdsp_init_x86(PSDSPContext *s); #endif /* AVCODEC_AACPSDSP_H */ diff --git a/libavcodec/aacpsdsp_template.c b/libavcodec/aacpsdsp_template.c index e3cbf3feec..c063788b89 100644 --- a/libavcodec/aacpsdsp_template.c +++ b/libavcodec/aacpsdsp_template.c @@ -230,6 +230,8 @@ av_cold void AAC_RENAME(ff_psdsp_init)(PSDSPContext *s) ff_psdsp_init_aarch64(s); #elif ARCH_MIPS ff_psdsp_init_mips(s); +#elif ARCH_RISCV + ff_psdsp_init_riscv(s); #elif ARCH_X86 ff_psdsp_init_x86(s); #endif diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 03a95301d7..829a1823d2 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,3 +1,5 @@ +OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o +RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o RVV-OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_rvv.o diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c new file mode 100644 index 0000000000..525fc9aa38 --- /dev/null +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -0,0 +1,37 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/aacpsdsp.h" + +void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); + +av_cold void ff_psdsp_init_riscv(PSDSPContext *c) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RV_ZVE32F) + c->add_squares = ff_ps_add_squares_rvv; +#endif +} diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S new file mode 100644 index 0000000000..cedaab0cf0 --- /dev/null +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -0,0 +1,39 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_ps_add_squares_rvv, zve32f +1: + vsetvli t0, a2, e32, m1, ta, ma + vlseg2e32.v v24, (a1) + slli t1, t0, 3 + vle32.v v16, (a0) + slli t2, t0, 2 + vfmacc.vv v16, v24, v24 + sub a2, a2, t0 + vfmacc.vv v16, v25, v25 + add a1, a1, t1 + vse32.v v16, (a0) + add a0, a0, t2 + bnez a2, 1b + + ret +endfunc From patchwork Thu Sep 22 18:37:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38183 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp516752pzh; Thu, 22 Sep 2022 11:41:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4dMkzFSqMBQKelmdYmAvTLCfTr2sVTXQH8Hu/66cQuY2AMmTuB2uqR1+CEiP5iYnah0cSj X-Received: by 2002:a05:6402:254b:b0:451:2b1d:d82c with SMTP id l11-20020a056402254b00b004512b1dd82cmr4823873edb.343.1663872088891; Thu, 22 Sep 2022 11:41:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663872088; cv=none; d=google.com; s=arc-20160816; b=Kxbjt9CRqqs2RaKscYVlEwGIetUcnugHNYMYAY+8FRi6svf56yAgh9MTgqu3vxwr1Q Y4dfHIbkRY+Ey77s+nrGF7g1aLO6QN3O5Z5vGTDH81wYNvhH0JujN00q9OlbL7iCQk3h MgDJbXPNgnAXXNABKWI2F6obPgH8e6I5hh67A5PgMDRtCUjFHzC9K5pk4w6pPMweW7Hf S+V/CYlWsEBRDc2X73JeI5NDDVbgGV7qskb6vjY2G1Sgxqn3siY4WgOIsMZZn54SdiBT 6M9CSCH1t3DU1IFDjVO5u5HxkoKQNcqdDHGCi0BLQZDfeVZhJ471D04O3Z8uzDwbN/rm 8qVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; 
bh=f7MENQAWtVkOKvsPZJ1+0OaEwSxEgHCcuOEa+gSNQ44=; b=xAmHEhfuLVhGSh81AAPAxrFfxsIkmCac8P8Cy6kjFgk8vOmS9Ofmqj/5hFlOhPV52k qaQgKvj7kfN42vDWEa4N/9AYHDrElEfvxh5E0n0KI/dQYAzEB6Jej95WxOXpdoceIt9Q jKFpzHJxRCiSJmoZibZ00PefDNPXufl31m4uCoI/Lk73tdwd0m3yQstFRRRmd32FdR6X p6/T3nfnhpBx+FUFNnRzVbxQqmfmBpFqzU//NJoY79Uv58q15V9r+G0791AjX2k5m8RP hVMtTh7Zqe16jQPC00WGKyZYL/7YggNMgw8vHJqEJMI3JcJyTetxBe0EGIHYgaSn/MOL +T3w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id g23-20020a056402321700b004539b045326si5929048eda.417.2022.09.22.11.41.28; Thu, 22 Sep 2022 11:41:28 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B5F2468BC3E; Thu, 22 Sep 2022 21:38:00 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6FA4B68B900 for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id E3EAAC00C6 for ; Thu, 22 Sep 2022 21:37:30 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:23 +0300 Message-Id: <20220922183726.38624-26-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: 
<12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 26/29] lavc/aacpsdsp: RISC-V V mul_pair_single X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: ut8KzdfuHmKd From: Rémi Denis-Courmont --- libavcodec/riscv/aacpsdsp_init.c | 6 +++++- libavcodec/riscv/aacpsdsp_rvv.S | 19 +++++++++++++++++++ 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 525fc9aa38..90c9c501c3 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -25,13 +25,17 @@ #include "libavcodec/aacpsdsp.h" void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); +void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1, + int n); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { #if HAVE_RVV int flags = av_get_cpu_flags(); - if (flags & AV_CPU_FLAG_RV_ZVE32F) + if (flags & AV_CPU_FLAG_RV_ZVE32F) { c->add_squares = ff_ps_add_squares_rvv; + c->mul_pair_single = ff_ps_mul_pair_single_rvv; + } #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index cedaab0cf0..1c174cd110 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -37,3 +37,22 @@ func ff_ps_add_squares_rvv, zve32f ret endfunc + +func ff_ps_mul_pair_single_rvv, zve32f +1: + vsetvli t0, a3, e32, m1, ta, ma + slli t1, t0, 3 + vlseg2e32.v v24, (a1) + slli t2, t0, 2 + vle32.v v16, (a2) + sub a3, a3, t0 + vfmul.vv v24, v24, v16 + add a1, a1, t1 + vfmul.vv v25, v25, v16 + add a2, a2, t2 + vsseg2e32.v v24, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc From patchwork Thu Sep 22 18:37:24 2022 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38179 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp516378pzh; Thu, 22 Sep 2022 11:40:55 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5SdpA+gibxldgriIC+5ntXa2zgkDxmfkqxU6M8/i43JMJ0r7k26VdhrmFXNr4wQCcpMEiU X-Received: by 2002:a17:907:7da3:b0:776:a0ae:5147 with SMTP id oz35-20020a1709077da300b00776a0ae5147mr3883545ejc.662.1663872054996; Thu, 22 Sep 2022 11:40:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663872054; cv=none; d=google.com; s=arc-20160816; b=Np34l8CH5J0fk2NMZ1sfOTRpkAIIhs5zojuTOZMnmFDQ36vNPSRnZVG5q8FGo8jK+w E/X9wqzlzXYZaVLFheSjX1haSYP5KfXAxeIo1ffv0OiAazNNzF7sbS3OPOnI1texMPdQ mZBZ9T52nIpAfPjHtm8Flf+MUh45k6h+sTyI08ZAopvjjAbb+B2oBx7D301rREZx3CPf xAOLgwRxhtkQtLi1NdF/jl6Kz509k9nd1uX/cmo1mzJAveRuOqzcL61R5i2/No5OBK2s 24X16qdumHWtSNyHbSPk6rq63wvdv6YWKScOTKtVHGyeI6baxjT+ls0JNAYwnhiYLSUl K27w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=9Nk1Rd1k8Seso4XufWid69AM+OB9MO4FyUsBZmZSz9A=; b=fuRkzEYWG4T0yrI3lRClXqH2xwU7v4TQjTi0x8Ms6x0agwB0GBUUzhqDsjb2wHev9G q9aJQ7V8VzV1+0iJc3S/m9A6/X+r8RLN3QG9n4OUI4i+F7724iU0WxdeNaLPaDsltdHB RfeDRsAleZY448jYGu8BCWLUBTbAoupg6cVs4rxfsrbv3Qjeue+K9iBIXVFHspk3vYzb 9kuuqeg4Sosum7BSCu0z2eEOLBcDKkYwUeIO5/we5kewDvvbhZZzjh6qN7N5A9DdYdCs IZV9I+kUXOODd0G3nEpRiJeCYiIoGQgwrymg3l81h0UnrEmpVCec/qlGSEwrxIPVn4sA KgSw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu 
(ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id cs19-20020a170906dc9300b00773db106d4dsi6445454ejc.588.2022.09.22.11.40.54; Thu, 22 Sep 2022 11:40:54 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id C9D2668BC26; Thu, 22 Sep 2022 21:37:56 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5706868BB5C for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 1D7D3C00C7 for ; Thu, 22 Sep 2022 21:37:31 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:24 +0300 Message-Id: <20220922183726.38624-27-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 27/29] lavc/aacpsdsp: RISC-V V hybrid_analysis X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 7yxFwcX5yKkr From: Rémi Denis-Courmont This starts with one-time initialisation of the 26 constant factors like 08edacc248bce3f8946d75e97188d189c74a6de6. That is done with the scalar instruction set. 
While the formula can readily be vectorised, the gains would (probably) be more than lost in transferring the results back to FP registers (or suitably reshuffling them into vector registers). Note that the main loop could likely be scheduled slightly better by expanding the filter macro and interleaving loads with arithmetic. It is not clear yet if that would be relevant for vector processing (as opposed to traditional SIMD). We could also use fewer vectors, but there is not much point in sparing them (they are *all* callee-clobbered). --- libavcodec/riscv/aacpsdsp_init.c | 3 + libavcodec/riscv/aacpsdsp_rvv.S | 97 ++++++++++++++++++++++++++++++++ 2 files changed, 100 insertions(+) diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 90c9c501c3..6222d6f787 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -27,6 +27,8 @@ void ff_ps_add_squares_rvv(float *dst, const float (*src)[2], int n); void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1, int n); +void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2], + const float (*filter)[8][2], ptrdiff_t, int n); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { @@ -36,6 +38,7 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c) if (flags & AV_CPU_FLAG_RV_ZVE32F) { c->add_squares = ff_ps_add_squares_rvv; c->mul_pair_single = ff_ps_mul_pair_single_rvv; + c->hybrid_analysis = ff_ps_hybrid_analysis_rvv; } #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index 1c174cd110..993462de29 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -56,3 +56,100 @@ func ff_ps_mul_pair_single_rvv, zve32f ret endfunc + +func ff_ps_hybrid_analysis_rvv, zve32f + /* We need 26 FP registers, for 20 scratch ones. Spill fs0-fs5.
*/ + addi sp, sp, -32 + .irp n, 0, 1, 2, 3, 4, 5 + fsw fs\n, (4 * \n)(sp) + .endr + + .macro input, j, fd0, fd1, fd2, fd3 + flw \fd0, (4 * ((\j * 2) + 0))(a1) + flw fs4, (4 * (((12 - \j) * 2) + 0))(a1) + flw \fd1, (4 * ((\j * 2) + 1))(a1) + fsub.s \fd3, \fd0, fs4 + flw fs5, (4 * (((12 - \j) * 2) + 1))(a1) + fadd.s \fd2, \fd1, fs5 + fadd.s \fd0, \fd0, fs4 + fsub.s \fd1, \fd1, fs5 + .endm + + // re0, re1, im0, im1 + input 0, ft0, ft1, ft2, ft3 + input 1, ft4, ft5, ft6, ft7 + input 2, ft8, ft9, ft10, ft11 + input 3, fa0, fa1, fa2, fa3 + input 4, fa4, fa5, fa6, fa7 + input 5, fs0, fs1, fs2, fs3 + flw fs4, (4 * ((6 * 2) + 0))(a1) + flw fs5, (4 * ((6 * 2) + 1))(a1) + + add a2, a2, 6 * 2 * 4 // point to filter[i][6][0] + li t4, 8 * 2 * 4 // filter byte stride + slli a3, a3, 3 // output byte stride +1: + .macro filter, vs0, vs1, fo0, fo1, fo2, fo3 + vfmacc.vf v8, \fo0, \vs0 + vfmacc.vf v9, \fo2, \vs0 + vfnmsac.vf v8, \fo1, \vs1 + vfmacc.vf v9, \fo3, \vs1 + .endm + + vsetvli t0, a4, e32, m1, ta, ma + /* + * The filter (a2) has 16 segments, of which 13 need to be extracted. + * R-V V supports only up to 8 segments, so unrolling is unavoidable. 
+ */ + addi t1, a2, -48 + vlse32.v v22, (a2), t4 + addi t2, a2, -44 + vlse32.v v16, (t1), t4 + addi t1, a2, -40 + vfmul.vf v8, v22, fs4 + vlse32.v v24, (t2), t4 + addi t2, a2, -36 + vfmul.vf v9, v22, fs5 + vlse32.v v17, (t1), t4 + addi t1, a2, -32 + vlse32.v v25, (t2), t4 + addi t2, a2, -28 + filter v16, v24, ft0, ft1, ft2, ft3 + vlse32.v v18, (t1), t4 + addi t1, a2, -24 + vlse32.v v26, (t2), t4 + addi t2, a2, -20 + filter v17, v25, ft4, ft5, ft6, ft7 + vlse32.v v19, (t1), t4 + addi t1, a2, -16 + vlse32.v v27, (t2), t4 + addi t2, a2, -12 + filter v18, v26, ft8, ft9, ft10, ft11 + vlse32.v v20, (t1), t4 + addi t1, a2, -8 + vlse32.v v28, (t2), t4 + addi t2, a2, -4 + filter v19, v27, fa0, fa1, fa2, fa3 + vlse32.v v21, (t1), t4 + sub a4, a4, t0 + vlse32.v v29, (t2), t4 + slli t1, t0, 3 + 1 + 2 // ctz(8 * 2 * 4) + add a2, a2, t1 + filter v20, v28, fa4, fa5, fa6, fa7 + filter v21, v29, fs0, fs1, fs2, fs3 + + add t2, a0, 4 + vsse32.v v8, (a0), a3 + mul t0, t0, a3 + vsse32.v v9, (t2), a3 + add a0, a0, t0 + bnez a4, 1b + + .irp n, 5, 4, 3, 2, 1, 0 + flw fs\n, (4 * \n)(sp) + .endr + addi sp, sp, 32 + ret + .purgem input + .purgem filter +endfunc From patchwork Thu Sep 22 18:37:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38180 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp516479pzh; Thu, 22 Sep 2022 11:41:04 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5RVJRHOcVfsE+VWWeBsugEUpeo2xXRKdKHNs9LH0ZoIDMDgaDLEgohu02X2CMFQjER6eZv X-Received: by 2002:a05:6402:3211:b0:453:ba03:9dee with SMTP id g17-20020a056402321100b00453ba039deemr4780040eda.351.1663872064407; Thu, 22 Sep 2022 11:41:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663872064; cv=none; d=google.com; s=arc-20160816; b=DTyYSmR3/yX8XG8nwDSKOHMV2hrRkra3W6Xdd3jJo0qC5TRFs8whJ9NPrqmNKrMQdB 
14gVsP++pyGQdAroQ6OHRtokMr1ADgiOq3Q6Zd8JAComIMmVpEq2TMDgMfFN0Zf4pCUl y2OBmZj1DK/Oo2oAtrPlmg5/1Oa/X8ao28RWL79aqNcRxhaM5lxPFeX/XjLVuDyMMlN/ YiAb0yLoY4Ir2arhq94Zp78VmbepVeiW9gA9oSrftKcp4t68QWPuT/Ck5pOhnupx1yt2 aFwoCVHktgI9TQ1bBpDwjTHY2uMDNBSpVzpK6XqgHlh25NEg1/EzKTWJHj0c5Q3OywzA tvUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=wsFAluOYe+7WthxiTu8Pcrlldz+WlDurmWC9DZutTOU=; b=llIN63xtUZ3D2+KG1ESK3piwuihXuaPNfKhWcp/6oeVGLz2/wlhybc/kFcQnGa0nuI ++sqrr312mqTZiVc46spL11KKHivJhDPAO7hLGND9mxvJ6OEDSn6vycF7mJjZPR90BTg OVDI183iJyBE9wpK3wRr9xtSo5lhnEw+YnKe84/0HwEQJp/Ush5Q7Up9XcaPVP4aYUeV Wf0MJZu/ewazD+gLOiNPrUC+SjLcOAWAl1utuHYFWJ35UzA0NEjReZnxPY9MuBm00F48 RAgvOIv6m22bX856fSiMVZKoUtj7YciskYGOKd7bT/scbbvCzVesnp8/TLPwbgf6DQZT MkUg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id sh13-20020a1709076e8d00b0073dced7204bsi5553453ejc.767.2022.09.22.11.41.02; Thu, 22 Sep 2022 11:41:04 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id CF6B568BB56; Thu, 22 Sep 2022 21:37:57 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 59A7068BB6F for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 4A620C00C8 for ; Thu, 22 Sep 2022 21:37:31 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:25 +0300 Message-Id: <20220922183726.38624-28-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 28/29] lavc/aacpsdsp: RISC-V V hybrid_analysis_ileave X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: b+KqSoxYrfcn From: Rémi Denis-Courmont --- libavcodec/riscv/aacpsdsp_init.c | 14 ++++++++---- libavcodec/riscv/aacpsdsp_rvv.S | 37 ++++++++++++++++++++++++++++++++ 2 files changed, 47 insertions(+), 4 deletions(-) 
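Note for reviewers: the scalar behaviour that ff_ps_hybrid_analysis_ileave_rvv has to reproduce can be sketched in C as below. This is a sketch reconstructed from the addressing in the assembly of this patch (out advanced by i * 32 * 2 * 4 bytes, L advanced by i * 4 bytes, row stride 64 * 4 bytes, second plane at 38 * 64 * 4 bytes), not a copy of the existing C reference. The function name hybrid_analysis_ileave_ref and the assert() bounds are assumptions made for illustration only.

```c
#include <assert.h>

/*
 * Hypothetical scalar model of the interleave: for every band from i up
 * to 63, gather the two planes L[0] and L[1] (walked with a stride of one
 * row, i.e. 64 floats) into consecutive pairs of the output.
 */
static void hybrid_analysis_ileave_ref(float (*out)[32][2],
                                       float L[2][38][64],
                                       int i, int len)
{
    /* Assumed bounds: each out[i] holds 32 pairs, and i indexes 64 bands. */
    assert(i >= 0 && i <= 64);
    assert(len >= 0 && len <= 32);

    for (; i < 64; i++) {              /* outer loop: label 1 in the assembly */
        for (int j = 0; j < len; j++) { /* inner loop: label 2, strip-mined */
            out[i][j][0] = L[0][j][i];  /* strided load from (t1) */
            out[i][j][1] = L[1][j][i];  /* strided load from (t4) */
        }
    }
}
```

The inner loop corresponds to two vlse32.v strided loads followed by one vsseg2e32.v segment store, which is why only integer vector support (Zve32x) is needed: the floats are moved, never computed on.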
diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 6222d6f787..76f55502ee 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -29,16 +29,22 @@ void ff_ps_mul_pair_single_rvv(float (*dst)[2], float (*src0)[2], float *src1, int n); void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2], const float (*filter)[8][2], ptrdiff_t, int n); +void ff_ps_hybrid_analysis_ileave_rvv(float (*out)[32][2], float L[2][38][64], + int i, int len); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { #if HAVE_RVV int flags = av_get_cpu_flags(); - if (flags & AV_CPU_FLAG_RV_ZVE32F) { - c->add_squares = ff_ps_add_squares_rvv; - c->mul_pair_single = ff_ps_mul_pair_single_rvv; - c->hybrid_analysis = ff_ps_hybrid_analysis_rvv; + if (flags & AV_CPU_FLAG_RV_ZVE32X) { + c->hybrid_analysis_ileave = ff_ps_hybrid_analysis_ileave_rvv; + + if (flags & AV_CPU_FLAG_RV_ZVE32F) { + c->add_squares = ff_ps_add_squares_rvv; + c->mul_pair_single = ff_ps_mul_pair_single_rvv; + c->hybrid_analysis = ff_ps_hybrid_analysis_rvv; + } } #endif } diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index 993462de29..9c7bda1098 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -153,3 +153,40 @@ func ff_ps_hybrid_analysis_rvv, zve32f .purgem input .purgem filter endfunc + +func ff_ps_hybrid_analysis_ileave_rvv, zve32x /* no needs for zve32f here */ + slli t0, a2, 5 + 1 + 2 // ctz(32 * 2 * 4) + slli t1, a2, 2 + add a0, a0, t0 + add a1, a1, t1 + addi a2, a2, -64 + li t1, 38 * 64 * 4 + li t6, 64 * 4 // (uint8_t *)L[x][j+1][i] - L[x][j][i] + add a4, a1, t1 // &L[1] + beqz a2, 3f +1: + mv t0, a0 + mv t1, a1 + mv t3, a3 + mv t4, a4 + addi a2, a2, 1 +2: + vsetvli t5, t3, e32, m1, ta, ma + vlse32.v v16, (t1), t6 + sub t3, t3, t5 + vlse32.v v17, (t4), t6 + mul t2, t5, t6 + vsseg2e32.v v16, (t0) + add t1, t1, t2 + add t4, t4, t2 + slli t2, t5, 1 + 2 + add t0, t0, t2 + bnez t3, 2b + + add 
a0, a0, 32 * 2 * 4 + add a1, a1, 4 + add a4, a4, 4 + bnez a2, 1b +3: + ret +endfunc From patchwork Thu Sep 22 18:37:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38184 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp516847pzh; Thu, 22 Sep 2022 11:41:37 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6YdTC2ho+DVYSRyD8+8ZC1h0JoeerJxDmsBDwMSMxvj8uPSJfOh9v5f9z7r5ffiyXMkPAr X-Received: by 2002:a17:907:d27:b0:782:cc3:8e87 with SMTP id gn39-20020a1709070d2700b007820cc38e87mr3815183ejc.94.1663872097339; Thu, 22 Sep 2022 11:41:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663872097; cv=none; d=google.com; s=arc-20160816; b=hn4yyT7kheoLws1d7pXXYRpAUKq8WCIXSDy+okeLM+GwbqHZrevhEAcqlLmZZzZVP1 hq4wZwumGHKhh+AFa5PjDCZ3WaQl5qe9jENpPhC9MciheLkk9O5vV/+3/Gw/gzhje0TK 5tcAtuLpltVcoH77Q2AIwVZ5x/PLHlZYBen627mG9PGLdidl8dEbO7UyIiG6Nb5kooA3 AQ5WJkXLKz2KX9yjQq//+dg/8kOYEn3/eiUlfqTQ6BSmktW8kgahiKuuZyHBrU56AUFI hIftknDaPN5vWalH4SfwZ3ktSYgRDnAxk42VWpQ6lM/SVKKP2WE9RaUldzOKdOBb+SiL HMBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=70iOXyEb7MZQLjIidtWvVI/U7OdWjC/DHcGb+pgyGzY=; b=OJbDPXs5WbwzeEfWd/WgsG+qmJfKb3A1+h1pKkemCdf41q7smz7Y8S5SzjeqFrZMbJ 2Iww8hGL+OKfTeE2AEg6QyWMlJysK609NjZ15VqQTV9nw7KDtIVs4rsjzivciFeccmBk hxK0UNgAQOXT3taAPwlFwYWNyLVbQGkMoB6b1h/NHl4M1Q3cojkF0SeyLi3MmdJeHAO6 sQPQxuI1rwKSO3WsZEl2a8d7cznyXLWZpeiQ+kkwTKQqfIutR7FTwJYu7SrAzDVyMRzL nmLRe6gIYN2k7M4Kc9TtMf7zlhZaZ2hnAAhqBvqrS5CUns/y5mtSd0bIenYnr4QCscII 5VKQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 
79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id qw20-20020a170906fcb400b0072f38ecf74asi4923199ejb.794.2022.09.22.11.41.36; Thu, 22 Sep 2022 11:41:37 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BAD1168BBBC; Thu, 22 Sep 2022 21:38:01 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6F82668B89E for ; Thu, 22 Sep 2022 21:37:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 77067C00C9 for ; Thu, 22 Sep 2022 21:37:31 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Thu, 22 Sep 2022 21:37:26 +0300 Message-Id: <20220922183726.38624-29-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12078904.O9o76ZdvQC@basile.remlab.net> References: <12078904.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 29/29] lavc/aacpsdsp: RISC-V V hybrid_synthesis_deint X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: xFjFTJeXtNR0 From: Rémi Denis-Courmont --- 
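Note for reviewers: this routine is the mirror image of hybrid_analysis_ileave. A minimal C sketch of the intended scalar behaviour follows, again inferred from the pointer arithmetic in the assembly of this patch rather than taken from the C reference. The name hybrid_synthesis_deint_ref and the assert() bounds are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical scalar model of the deinterleave: for every band from i up
 * to 63, split the (re, im)-style pairs of in[i] into the two planes
 * out[0] and out[1], writing with a stride of one row (64 floats).
 */
static void hybrid_synthesis_deint_ref(float out[2][38][64],
                                       float (*in)[32][2],
                                       int i, int len)
{
    /* Assumed bounds: each in[i] holds 32 pairs, and i indexes 64 bands. */
    assert(i >= 0 && i <= 64);
    assert(len >= 0 && len <= 32);

    for (; i < 64; i++) {               /* outer loop: label 1 */
        for (int n = 0; n < len; n++) { /* inner loop: label 2 */
            out[0][n][i] = in[i][n][0]; /* field 0 of vlseg2e32.v */
            out[1][n][i] = in[i][n][1]; /* field 1 of vlseg2e32.v */
        }
    }
}
```

Here the segment load (vlseg2e32.v) does the deinterleaving and the two vsse32.v strided stores scatter each field into its plane, so no floating-point arithmetic is involved and Zve32x suffices.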
libavcodec/riscv/aacpsdsp_init.c | 3 +++ libavcodec/riscv/aacpsdsp_rvv.S | 37 ++++++++++++++++++++++++++++++++ 2 files changed, 40 insertions(+) diff --git a/libavcodec/riscv/aacpsdsp_init.c b/libavcodec/riscv/aacpsdsp_init.c index 76f55502ee..20b1a12741 100644 --- a/libavcodec/riscv/aacpsdsp_init.c +++ b/libavcodec/riscv/aacpsdsp_init.c @@ -31,6 +31,8 @@ void ff_ps_hybrid_analysis_rvv(float (*out)[2], float (*in)[2], const float (*filter)[8][2], ptrdiff_t, int n); void ff_ps_hybrid_analysis_ileave_rvv(float (*out)[32][2], float L[2][38][64], int i, int len); +void ff_ps_hybrid_synthesis_deint_rvv(float out[2][38][64], float (*in)[32][2], + int i, int len); av_cold void ff_psdsp_init_riscv(PSDSPContext *c) { @@ -39,6 +41,7 @@ av_cold void ff_psdsp_init_riscv(PSDSPContext *c) if (flags & AV_CPU_FLAG_RV_ZVE32X) { c->hybrid_analysis_ileave = ff_ps_hybrid_analysis_ileave_rvv; + c->hybrid_synthesis_deint = ff_ps_hybrid_synthesis_deint_rvv; if (flags & AV_CPU_FLAG_RV_ZVE32F) { c->add_squares = ff_ps_add_squares_rvv; diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S index 9c7bda1098..1b410ce5d8 100644 --- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -190,3 +190,40 @@ func ff_ps_hybrid_analysis_ileave_rvv, zve32x /* no needs for zve32f here */ 3: ret endfunc + +func ff_ps_hybrid_synthesis_deint_rvv, zve32x + slli t0, a2, 2 + slli t1, a2, 5 + 1 + 2 + add a0, a0, t0 + add a1, a1, t1 + addi a2, a2, -64 + li t1, 38 * 64 * 4 + li t6, 64 * 4 + add a4, a0, t1 + beqz a2, 3f +1: + mv t0, a0 + mv t1, a1 + mv t3, a3 + mv t4, a4 + addi a2, a2, 1 +2: + vsetvli t5, t3, e32, m1, ta, ma + vlseg2e32.v v16, (t1) + sub t3, t3, t5 + vsse32.v v16, (t0), t6 + slli t2, t5, 1 + 2 + vsse32.v v17, (t4), t6 + add t1, t1, t2 + mul t2, t5, t6 + add t0, t0, t2 + add t4, t4, t2 + bnez t3, 2b + + add a0, a0, 4 + add a1, a1, 32 * 2 * 4 + add a4, a4, 4 + bnez a2, 1b +3: + ret +endfunc
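
Note on the inner loop: the strip-mining pattern used at label 2 can be modelled in plain C as below, for a single band. This is only an illustrative sketch. set_vl() is a hypothetical and simplified stand-in for vsetvli (real hardware may grant a different vl than min(remaining, VLMAX) when the remaining count lies between VLMAX and 2 * VLMAX), and deint_band_stripmined and vlmax are names invented here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of vsetvli: grant at most vlmax elements. */
static size_t set_vl(size_t remaining, size_t vlmax)
{
    return remaining < vlmax ? remaining : vlmax;
}

/*
 * Strip-mined model of the deinterleave for one band:
 * in points at in[i][0][0], out0 at out[0][0][i], out1 at out[1][0][i].
 */
static void deint_band_stripmined(float *out0, float *out1, const float *in,
                                  size_t len, size_t vlmax)
{
    const size_t stride = 64; /* floats between consecutive rows (t6 / 4) */

    assert(vlmax > 0);
    while (len > 0) {
        size_t vl = set_vl(len, vlmax);

        /* One pass of this loop stands for one set of vector instructions. */
        for (size_t k = 0; k < vl; k++) {
            out0[k * stride] = in[2 * k];     /* v16: field 0, vsse32.v */
            out1[k * stride] = in[2 * k + 1]; /* v17: field 1, vsse32.v */
        }
        len  -= vl;          /* sub t3, t3, t5 */
        in   += 2 * vl;      /* slli t2, t5, 1 + 2 (in bytes) */
        out0 += vl * stride; /* mul t2, t5, t6 (in bytes) */
        out1 += vl * stride;
    }
}
```

Because vl is recomputed on every pass, the tail (len not a multiple of the vector length) needs no separate scalar loop, which is the main structural difference from fixed-width SIMD versions.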