From patchwork Sat Sep 17 12:45:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37978 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp191433pzh; Sat, 17 Sep 2022 05:45:47 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6yQkGgrOCIFz6Lg5f4+KsnKuLV8HSXBoctWrnAY5GOU1DsHiNlfOPa1pe2tZnT8IjJ9B1y X-Received: by 2002:aa7:cd49:0:b0:451:e570:8a82 with SMTP id v9-20020aa7cd49000000b00451e5708a82mr7651446edw.369.1663418747221; Sat, 17 Sep 2022 05:45:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663418747; cv=none; d=google.com; s=arc-20160816; b=Is6mcgAciYYPYQT7hpU6+gtHgrujhHT6jYPNdnq7gif8y1Z9zuM8Je1EOyimDwrN4u upSd7wQG3shSioAECoUMPL60u9NN8qT0E7WguGSYBNHAYQSa6uOPOJVrgMYmr3p3IArb efkmtb97HdKcI50F3fXuxk+XdZtSWhjeupbh0QHYadXaMiq1l5daLZhEc7kiJ4qyrlj2 kBLvofS3xCBeDIkYt/V6SIZngoRHEFcpgy1r7fcUHQsrAA3pTa5N2A5tebm7laYpJPZX HlL0sfahyovKfxM3y00lKTi8XSDbmIVH7jPQgLS5EkhJGYavyqgKWiLe08sHiu5SUBi0 3psA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=hD9vYB2Ruv2QW4YwTpRpcXkzeA6lCws5bs5ImUcYMuA=; b=wZnN0xmJLkSYiMpJf65fsSREdvNwbuWoVRbRNB8b/d7Hk9y8Xp9K2H3sOgo0/B6+Nn Lj0V8GDKTFxxqSNMvRSomrOv2iqrRSX7UMzJxZJ5fSryOZfj2RveBxp9YmWNOo647gIK 61KqTsmtXvTyuHl7xXL2oOzaPvWA2pboY22TdJxxbBkOupWeheQ1+AiIYribwrJ+TgUW q37fNbJycv06n+9DNmz7uobfbQUi5qz41WqD2lTQKAT1SyTIIMmZeZ+PyXgSdmRwrPxr vkCNjNE6jdgBrrEHeVJ72W/vuUZPPNng9Lv7PHF7BT8q9JEsLsQTOzXMlmid/BWwwHMv BRfA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id y19-20020a056402171300b0045215610d99si3762504edu.460.2022.09.17.05.45.46; Sat, 17 Sep 2022 05:45:47 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 932D968BBF2; Sat, 17 Sep 2022 15:45:44 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 808E168BBCA for ; Sat, 17 Sep 2022 15:45:37 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 3A079C0003 for ; Sat, 17 Sep 2022 15:45:37 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sat, 17 Sep 2022 15:45:32 +0300 Message-Id: <20220917124537.66238-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602047.DvuYhMxLoT@basile.remlab.net> References: <5602047.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 1/6] lavu/cpu: detect RISC-V base extensions X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: EY5wz7H89jdG From: Rémi Denis-Courmont This introduces compile-time and run-time CPU detection on RISC-V. In practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of I, F and D extensions, and if it does, it probably won't have run-time detection. So the flags are essentially always set. But as things stand, checkasm wants them that way. Compare the ARMV8 flag on AArch64. We are nowhere near running short on CPU flag bits. --- libavutil/cpu.c | 9 ++++++ libavutil/cpu.h | 5 +++ libavutil/cpu_internal.h | 3 ++ libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 64 +++++++++++++++++++++++++++++++++++++++ tests/checkasm/checkasm.c | 4 +++ 6 files changed, 86 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..78e92a1bf6 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -95,6 +97,9 @@ void av_force_cpu_flags(int arg){ arg |= AV_CPU_FLAG_MMX; } +#if ARCH_RISCV + arg = ff_force_cpu_flags_riscv(arg); +#endif atomic_store_explicit(&cpu_flags, arg, memory_order_relaxed); } @@ -178,6 +183,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV + { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" }, + { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" }, + { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..9aae2ccc7a 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,11 @@ #define AV_CPU_FLAG_LSX (1 << 0) #define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V extensions +#define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank) +#define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP) +#define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP) + /** * Return the flags which specify extensions supported by the CPU. * The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..9ddf11488b 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,9 +48,12 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); +int ff_force_cpu_flags_riscv(int flags); + size_t ff_get_cpu_max_align_mips(void); size_t ff_get_cpu_max_align_aarch64(void); size_t ff_get_cpu_max_align_arm(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..b382e8fa07 --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,64 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "libavutil/log.h" +#include "config.h" + +#if HAVE_GETAUXVAL +#include +#define HWCAP_RV(letter) (1ul << ((letter) - 'A')) +#endif + +int ff_force_cpu_flags_riscv(int flags) +{ + if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "F"); + flags |= AV_CPU_FLAG_RVF; + } + + return flags; +} + +int ff_get_cpu_flags_riscv(void) +{ + int ret = 0; +#if HAVE_GETAUXVAL + const unsigned long hwcap = getauxval(AT_HWCAP); + + if (hwcap & HWCAP_RV('I')) + ret |= AV_CPU_FLAG_RVI; + if (hwcap & HWCAP_RV('F')) + ret |= AV_CPU_FLAG_RVF; + if (hwcap & HWCAP_RV('D')) + ret |= AV_CPU_FLAG_RVD; +#endif + +#ifdef __riscv_i + ret |= AV_CPU_FLAG_RVI; +#endif +#if defined (__riscv_flen) && (__riscv_flen >= 32) + ret |= AV_CPU_FLAG_RVF; +#if (__riscv_flen >= 64) + ret |= AV_CPU_FLAG_RVD; +#endif +#endif + + return ret; +} diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index e56fd3850e..ea25fbad75 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -226,6 +226,10 @@ static const struct { { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC }, { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, +#elif ARCH_RISCV + { "RVI", "rvi", AV_CPU_FLAG_RVI }, + { "RVF", "rvf", AV_CPU_FLAG_RVF }, + { "RVD", "rvd", AV_CPU_FLAG_RVD }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA }, From patchwork Sat Sep 17 12:45:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37979 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp191484pzh; Sat, 17 Sep 2022 05:45:55 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5BSsMH1IfIg/F2bBt3WvRbD2C2aKFd31R4eL7QPprE+SiD5vlwHc8+4+KIjFnJTIywZ+2N X-Received: by 2002:a05:6402:5106:b0:451:787c:9fcc with SMTP id m6-20020a056402510600b00451787c9fccmr7608122edd.164.1663418755412; Sat, 17 Sep 2022 05:45:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663418755; cv=none; d=google.com; s=arc-20160816; b=gigQ72WwchTlasS6+eoBaA1k5wcP7QqRngnJsLC1c+dH3Nr/JRCwPFo2ZWBvofk2cp /Fbs363WEnjmzh0RBOZGDorsOP4SG7BsONNP8UnnTX45/iFnqVT0rdDhzluTMGYgNPMO hG1DtoU9QIwiRuZQybNapH+Sz8Cfq9ySjP0QGOLis2Y2ywwXS9OE9WPaZcnYQPdfuJ/G M2dPTpOKwa2cWKF9l2duaGS0wLOhB+wLK3NchoLM2GKO+KIK2aYZjGe7dNLYTG/zpGwc KmCoXMUpcWR27E07La05a68NSoQZeMg8ZaFcfk749NEzEzU2ee1bNe3Nl2jM3roi4PfL Qhzw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=+HS15qHCU/B5j1DXiBLCdbprLe31/OTuqZJwetio+tY=; b=yUAwEYVGBR6JKVSqRgz5wqT0i3b64XhLfdpB2WkvqGVa3RULdX1Cg0j2Id3cPOimTi Pizm8CqnU1VMjjjiulFAL6in++reJkmWd+ZAmZz5XN97brVi7krJVi6Zie/tdgP9smVX 2SaEWuI+3uprKRGvM+/5OsT3jIIlRcTdlldu73n7OjDEkbwsig4amhu3L/4rLg1xUPyv 46SQuK8qLj3Rtq+XIZ5OYgDfuj5C5r0UO1Jb/D1eODkKERkxBrV7tfRajRR9JfgSvW5A cMpC4tl7cRRP9OsA6F9hwrFoHho9pLzz9hduhMWp61dOr10GLykcTS5QYy+aNXtACA+q hYWQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id nd27-20020a170907629b00b0073d62820e42si20392623ejc.288.2022.09.17.05.45.54; Sat, 17 Sep 2022 05:45:55 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 84BB768BC01; Sat, 17 Sep 2022 15:45:45 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id C3F3C68BBCC for ; Sat, 17 Sep 2022 15:45:37 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 6824CC0029 for ; Sat, 17 Sep 2022 15:45:37 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sat, 17 Sep 2022 15:45:33 +0300 Message-Id: <20220917124537.66238-2-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602047.DvuYhMxLoT@basile.remlab.net> References: <5602047.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 2/6] lavu/cpu: CPU flags for the RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: eQa7CsoQbtkA From: Rémi Denis-Courmont RVV defines a total of 12 different extensions, including: - 5 different instruction subsets: - Zve32x: 8-, 16- and 32-bit integers, - Zve32f: Zve32x plus single precision floats, - Zve64x: Zve32x plus 64-bit integers, - Zve64f: Zve32f plus Zve64x, - Zve64d: Zve64f plus double precision floats. - 6 different vector lengths: - Zvl32b (embedded only), - Zvl64b (embedded only), - Zvl128b, - Zvl256b, - Zvl512b, - Zvl1024b, - and the V extension proper: equivalent to Zve64f and Zvl128b. In total, there are 6 different possible sets of supported instructions (including the empty set), but for convenience we allocate one bit for each type sets: up-to-32-bit ints (ZVE32X), floats (ZV32F), 64-bit ints (ZV64X) and doubles (ZVE64D). Whence the vector size is needed, it can be retrieved by reading the unprivileged read-only vlenb CSR. This should probably be a separate helper macro if needed at a later point. --- libavutil/cpu.c | 4 ++++ libavutil/cpu.h | 4 ++++ libavutil/riscv/cpu.c | 46 ++++++++++++++++++++++++++++++++++++++- tests/checkasm/checkasm.c | 10 ++++++--- 4 files changed, 60 insertions(+), 4 deletions(-) diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 78e92a1bf6..58ae4858b4 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -187,6 +187,10 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) { "rvi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVI }, .unit = "flags" }, { "rvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVF }, .unit = "flags" }, { "rvd", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVD }, .unit = "flags" }, + { "rvve32", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE32X}, .unit = "flags" }, + { "rvvf", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE32F}, .unit = "flags" }, + { "rvve64", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE64X}, .unit = "flags" }, + { "rvv", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVE64D}, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9aae2ccc7a..00698e30ef 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -82,6 +82,10 @@ #define AV_CPU_FLAG_RVI (1 << 0) ///< I (full GPR bank) #define AV_CPU_FLAG_RVF (1 << 1) ///< F (single precision FP) #define AV_CPU_FLAG_RVD (1 << 2) ///< D (double precision FP) +#define AV_CPU_FLAG_RV_ZVE32X (1 << 3) ///< Vectors of 8/16/32-bit int's */ +#define AV_CPU_FLAG_RV_ZVE32F (1 << 4) ///< Vectors of float's */ +#define AV_CPU_FLAG_RV_ZVE64X (1 << 5) ///< Vectors of 64-bit int's */ +#define AV_CPU_FLAG_RV_ZVE64D (1 << 6) ///< Vectors of double's /** * Return the flags which specify extensions supported by the CPU. diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c index b382e8fa07..3e6c99819b 100644 --- a/libavutil/riscv/cpu.c +++ b/libavutil/riscv/cpu.c @@ -28,7 +28,32 @@ int ff_force_cpu_flags_riscv(int flags) { - if ((flags & AV_CPU_FLAG_RVD) && !(flags & AV_CPU_FLAG_RVF)) { + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RV_ZVE64X)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE64X"); + flags |= AV_CPU_FLAG_RV_ZVE64X; + } + + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RV_ZVE32F)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE32F"); + flags |= AV_CPU_FLAG_RV_ZVE32F; + } + + if ((flags & (AV_CPU_FLAG_RV_ZVE64X | AV_CPU_FLAG_RV_ZVE32F)) + && !(flags & AV_CPU_FLAG_RV_ZVE32X)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", + "_ZVE32X"); + flags |= AV_CPU_FLAG_RV_ZVE32X; + } + + if ((flags & AV_CPU_FLAG_RV_ZVE64D) && !(flags & AV_CPU_FLAG_RVD)) { + av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "D"); + flags |= AV_CPU_FLAG_RVD; + } + + if ((flags & (AV_CPU_FLAG_RVD | AV_CPU_FLAG_RV_ZVE32F)) + && !(flags & AV_CPU_FLAG_RVF)) { av_log(NULL, AV_LOG_WARNING, "RV%s implied by specified flags\n", "F"); flags |= AV_CPU_FLAG_RVF; } @@ -48,6 +73,11 @@ int ff_get_cpu_flags_riscv(void) ret |= AV_CPU_FLAG_RVF; if (hwcap & HWCAP_RV('D')) ret |= AV_CPU_FLAG_RVD; + + /* The V extension implies all Zve* functional subsets */ + if (hwcap & HWCAP_RV('V')) + ret |= AV_CPU_FLAG_RV_ZVE32X | AV_CPU_FLAG_RV_ZVE64X + | AV_CPU_FLAG_RV_ZVE32F | AV_CPU_FLAG_RV_ZVE64D; #endif #ifdef __riscv_i @@ -58,6 +88,20 @@ int ff_get_cpu_flags_riscv(void) #if (__riscv_flen >= 64) ret |= AV_CPU_FLAG_RVD; #endif +#endif + + /* If RV-V is enabled statically at compile-time, check the details. */ +#ifdef __riscv_vectors + ret |= AV_CPU_FLAG_RV_ZVE32X; +#if __riscv_v_elen >= 64 + ret |= AV_CPU_FLAG_RV_ZVE64X; +#endif +#if __riscv_v_elen_fp >= 32 + ret |= AV_CPU_FLAG_RV_ZVE32F; +#if __riscv_v_elen_fp >= 64 + ret |= AV_CPU_FLAG_RV_ZVE64F; +#endif +#endif #endif return ret; diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index ea25fbad75..2f863c9a8a 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -227,9 +227,13 @@ static const struct { { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, #elif ARCH_RISCV - { "RVI", "rvi", AV_CPU_FLAG_RVI }, - { "RVF", "rvf", AV_CPU_FLAG_RVF }, - { "RVD", "rvd", AV_CPU_FLAG_RVD }, + { "RVI", "rvi", AV_CPU_FLAG_RVI }, + { "RVF", "rvf", AV_CPU_FLAG_RVF }, + { "RVD", "rvd", AV_CPU_FLAG_RVD }, + { "RV_Zve32x", "rv_zve32x", AV_CPU_FLAG_RV_ZVE32X }, + { "RV_Zve32f", "rv_zve32f", AV_CPU_FLAG_RV_ZVE32F }, + { "RV_Zve64x", "rv_zve64x", AV_CPU_FLAG_RV_ZVE64X }, + { "RV_Zve64d", "rv_zve64d", AV_CPU_FLAG_RV_ZVE64D }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA }, From patchwork Sat Sep 17 12:45:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37980 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp191538pzh; Sat, 17 Sep 2022 05:46:03 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7av8MeQc+ky30EcFcJk2pDtxdEMagG/VUmuCmYi8ZNrMlsSJv+36iGjtopQlS+sOfyK0B/ X-Received: by 2002:a05:6402:34cc:b0:451:62bf:c816 with SMTP id w12-20020a05640234cc00b0045162bfc816mr7836202edc.213.1663418763101; Sat, 17 Sep 2022 05:46:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663418763; cv=none; d=google.com; s=arc-20160816; b=kcDyn2DseM0J6y0sAImvID/ofuQQPEDNsB3LDWXco8rrPsBbTY+O/5SarH5b7u0bm3 4kk1iECD/pKuJtRYJfMwnHypf2inAg5nlNnK7VmiXrQvFvjYY0qz7Rhn3T7EiDUprJZV AHKtOyb0fJtYrmJ7TaCum8QBp/PkW0ckTrNaq2zHbSBez92Sfu3LFTZVzeQpe9Wbzttg vjBCYybe+776lbxQb/L4Zkl3X8S2vw9RRRLkhT3cWribaGIWPpjQW1kWkX45HkSK+Xvb 7hlLwWY5F5lJt9Y/tdHLlO7qmjxJjh0J1b02pA1/XcLbh6tBtY/u0rxxEArhW7Fjr5dh mA7g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ivniDczS8t7xMqLh97E7eh9mvIibgXQJEMfU0+AQJhY=; b=yhGPkXbFhqH8Yc3wKvPul+OfgS6iw62Rp/F+kSVreaFS7VwydUgCYoFG5gZbT78ogx s5rT2x+AZRCxsoCrAqMxJUSYhkdX9ZucMEFpKZQy/9vjCy8eDZpgr3T0i2hj+lJeI8TF UZzdg97yojwVThIFl7nVwbzhOAKWPwshMFuP2XFLMFUmMHz/wIBrRC0Kew2afek4xkTF S44KuURlwYDxsJC6tfq1fSeeiofSQv1jl508MSFNG45RdmPVCK+4hDQG8KHr/OE+UfWw 8s0v2s84Y1ZZk7l6LvtOaoQlUn9ojYMrvYdWIHPsGNY8edIqRaZvh9o4zqWchfr942B6 rljA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id t23-20020a50ab57000000b004525f31edd4si3742577edc.223.2022.09.17.05.46.02; Sat, 17 Sep 2022 05:46:03 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 7EA5668BC0B; Sat, 17 Sep 2022 15:45:46 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id DD0D968BBD2 for ; Sat, 17 Sep 2022 15:45:37 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 95852C00AF for ; Sat, 17 Sep 2022 15:45:37 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sat, 17 Sep 2022 15:45:34 +0300 Message-Id: <20220917124537.66238-3-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602047.DvuYhMxLoT@basile.remlab.net> References: <5602047.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 3/6] configure: probe RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: MIEhN63H2jRs From: Rémi Denis-Courmont --- Makefile | 2 +- configure | 15 +++++++++++++++ ffbuild/arch.mak | 2 ++ 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 61f79e27ae..1fb742f390 100644 --- a/Makefile +++ b/Makefile @@ -91,7 +91,7 @@ ffbuild/.config: $(CONFIGURABLE_COMPONENTS) SUBDIR_VARS := CLEANFILES FFLIBS HOSTPROGS TESTPROGS TOOLS \ HEADERS ARCH_HEADERS BUILT_HEADERS SKIPHEADERS \ ARMV5TE-OBJS ARMV6-OBJS ARMV8-OBJS VFP-OBJS NEON-OBJS \ - ALTIVEC-OBJS VSX-OBJS MMX-OBJS X86ASM-OBJS \ + ALTIVEC-OBJS VSX-OBJS RVV-OBJS MMX-OBJS X86ASM-OBJS \ MIPSFPU-OBJS MIPSDSPR2-OBJS MIPSDSP-OBJS MSA-OBJS \ MMI-OBJS LSX-OBJS LASX-OBJS OBJS SLIBOBJS SHLIBOBJS \ STLIBOBJS HOSTOBJS TESTOBJS diff --git a/configure b/configure index 240ae942d1..32be5ad625 100755 --- a/configure +++ b/configure @@ -462,6 +462,7 @@ Optimization options (experts only): --disable-mmi disable Loongson MMI optimizations --disable-lsx disable Loongson LSX optimizations --disable-lasx disable Loongson LASX optimizations + --disable-rvv disable RISC-V Vector optimizations --disable-fast-unaligned consider unaligned accesses slow Developer options (useful when working on FFmpeg itself): @@ -2126,6 +2127,10 @@ ARCH_EXT_LIST_PPC=" vsx " +ARCH_EXT_LIST_RISCV=" + rvv +" + ARCH_EXT_LIST_X86=" $ARCH_EXT_LIST_X86_SIMD cpunop @@ -2135,6 +2140,7 @@ ARCH_EXT_LIST_X86=" ARCH_EXT_LIST=" $ARCH_EXT_LIST_ARM $ARCH_EXT_LIST_PPC + $ARCH_EXT_LIST_RISCV $ARCH_EXT_LIST_X86 $ARCH_EXT_LIST_MIPS $ARCH_EXT_LIST_LOONGSON @@ -2642,6 +2648,8 @@ ppc4xx_deps="ppc" vsx_deps="altivec" power8_deps="vsx" +rvv_deps="riscv" + loongson2_deps="mips" loongson3_deps="mips" mmi_deps_any="loongson2 loongson3" @@ -6112,6 +6120,10 @@ elif enabled ppc; then check_cpp_condition power8 "altivec.h" "defined(_ARCH_PWR8)" fi +elif enabled riscv; then + + enabled rvv && check_inline_asm rvv '".option arch, +v\nvsetivli zero, 0, e8, m1, ta, ma"' + elif enabled x86; then check_builtin rdtsc intrin.h "__rdtsc()" @@ -7598,6 +7610,9 @@ if enabled loongarch; then echo "LSX enabled ${lsx-no}" echo "LASX enabled ${lasx-no}" fi +if enabled riscv; then + echo "RISC-V Vector enabled ${riscv-no}" +fi echo "debug symbols ${debug-no}" echo "strip symbols ${stripping-no}" echo "optimize for size ${small-no}" diff --git a/ffbuild/arch.mak b/ffbuild/arch.mak index 997e31e85e..39d76ee152 100644 --- a/ffbuild/arch.mak +++ b/ffbuild/arch.mak @@ -15,5 +15,7 @@ OBJS-$(HAVE_LASX) += $(LASX-OBJS) $(LASX-OBJS-yes) OBJS-$(HAVE_ALTIVEC) += $(ALTIVEC-OBJS) $(ALTIVEC-OBJS-yes) OBJS-$(HAVE_VSX) += $(VSX-OBJS) $(VSX-OBJS-yes) +OBJS-$(HAVE_RVV) += $(RVV-OBJS) $(RVV-OBJS-yes) + OBJS-$(HAVE_MMX) += $(MMX-OBJS) $(MMX-OBJS-yes) OBJS-$(HAVE_X86ASM) += $(X86ASM-OBJS) $(X86ASM-OBJS-yes) From patchwork Sat Sep 17 12:45:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37982 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp191679pzh; Sat, 17 Sep 2022 05:46:20 -0700 (PDT) X-Google-Smtp-Source: AMsMyM77KcI1vBG0hrVIoxVHN2SorYBBguj9Q/lWKmkobWQXwojsK02XyxFsxL22rFVTBpyUm+8k X-Received: by 2002:a17:907:7621:b0:741:6656:bd14 with SMTP id jy1-20020a170907762100b007416656bd14mr6563644ejc.298.1663418780255; Sat, 17 Sep 2022 05:46:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663418780; cv=none; d=google.com; s=arc-20160816; b=TUzJSodaPFqePdNVCkzycq9OK/O9ab6psE1Z/L/dFnEnuRJ/UX7gYjkXm6nXo5Vn5R 282qR9ntKp+eqdRa/z+ju95f0slf3puY9cTfEZgRcjpY8eh5xCLxnkIsejBGYER5rNB3 IG1ApkoKrRn2GVxfYpcu5astinB6Pv+Rpvl6ilyfiAdyqTpvBU9RRBYoDxJqLz0f2Gma MwS5g96TqTFEz0g7OIOUG/7MIuCN8tcSJUaa9WK6ApNJGCQf24PQ52FEd1f6OfbcdvTI 7+Flux2JVj3NL390uBVdIJQmOUt22vLxpm4zKECNnYv4BjShqNkFgAyhuF8lO6WLDVX4 Nk5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=eaEou0KQyOFm/1LpkQQhDLfhyGIV6wIp/FCAO/Z/+CM=; b=QAOIFH/MtqQh5EPJ87TrpZYRNGl0RRLnP1eE3emYowcGxYs1gLO03gO34IlLlWMoj4 moTpeIwBQSJi4cCplEOHZR466Hu6v+VQDjkE47alserxepUcZtlCmPiHg5S/WjXTeNr8 6NHDEVx5UG4kpgIRDtskNi6ZNPCnW2eRH8J9v2lw5xLauP0GdgJOnkiU/BDKt2CypQVB AdMpuanH5PU3XbtOfuS72T8laxvts5v/m+IiMSzIXV87cU8LvlmRWB0jCphTed6jYzS/ o4BRt1g6e/Yky9T2NhjZIyYAR/gjEzAL+l4wIsvhERStLrYKrFfw1Adi7heAg95/7tI3 fEDg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id z18-20020a05640240d200b0045161c9a31fsi5496688edb.69.2022.09.17.05.46.19; Sat, 17 Sep 2022 05:46:20 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 743F968BC13; Sat, 17 Sep 2022 15:45:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1145868BBCC for ; Sat, 17 Sep 2022 15:45:38 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id C29A0C00B0 for ; Sat, 17 Sep 2022 15:45:37 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sat, 17 Sep 2022 15:45:35 +0300 Message-Id: <20220917124537.66238-4-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602047.DvuYhMxLoT@basile.remlab.net> References: <5602047.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 4/6] lavu/riscv: initial common header for assembler macros X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: B5DryPOHs53x From: Rémi Denis-Courmont --- libavutil/riscv/asm.S | 74 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 libavutil/riscv/asm.S diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S new file mode 100644 index 0000000000..7623c161cf --- /dev/null +++ b/libavutil/riscv/asm.S @@ -0,0 +1,74 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#if defined (__riscv_float_abi_soft) +#define NOHWF +#define NOHWD +#define HWF # +#define HWD # +#elif defined (__riscv_float_abi_single) +#define NOHWF # +#define NOHWD +#define HWF +#define HWD # +#else +#define NOHWF # +#define NOHWD # +#define HWF +#define HWD +#endif + + .macro func sym, ext= + .text + .align 2 + + .option push + .ifnb \ext + .option arch, +\ext + .endif + + .global \sym + .hidden \sym + .type \sym, %function + \sym: + + .macro endfunc + .size \sym, . - \sym + .option pop + .previous + .purgem endfunc + .endm + .endm + + .macro const sym, align=3, relocate=0 + .if \relocate + .pushsection .data.rel.ro + .else + .pushsection .rodata + .endif + .align \align + \sym: + + .macro endconst + .size \sym, . - \sym + .popsection + .purgem endconst + .endm + .endm From patchwork Sat Sep 17 12:45:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37983 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp191731pzh; Sat, 17 Sep 2022 05:46:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7zQSISvzTI7BByCG0CuVjgslwPrk/+i+QvLTl3eMpyy5JRvqvrvSES/uS4bKwplncE5PPE X-Received: by 2002:a17:907:6da2:b0:77c:52c4:882f with SMTP id sb34-20020a1709076da200b0077c52c4882fmr6648846ejc.246.1663418788885; Sat, 17 Sep 2022 05:46:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663418788; cv=none; d=google.com; s=arc-20160816; b=BhFHPP9NSp8M7geQVxKzJJS866a8J9vGjBAawQkmKFzDi6MbNZ+5maAURom5CpT+WR xV8YMK94dGmXERkyr1I+q5Gyop/LK229t6Pp8szQhQcC5QXQd7NUyIUhby1l49S0UQUM XWZ4072DtK8szJJDenYGGtMPu3bw3VquBss3AeqXKbNpmDI6KeL9I4UZjYED8KBlvy3y qpIlrjD7DKvr4n3RGFrYconz2xgS7R9XX2wV93Nu8EaMoyxTIwveGvoEDACnnwUlxtFy oRqtyVH6ExalJ8TQ4+IHk7yDC5MWMHn1AxkCqvr6fphsHqUBsgipV7HetU9PkERa5bdM 8hLg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=KZ1f13WJk1Tc5kbxHNWTZyOpWIaUqnENIUd3VJA9tDQ=; b=SIrmadYtjr+81mwPk+6dpKZdxZm7ErszSbZ7hEHVLaXwWEn2CkumvQo4+kjGKgpszu WOx6Pd5wOB/u5MNsNrz25EzNrr+pE8ZeIRRq3XTQr7j4blgMWmPAFWK8F520vdnySy8c iZTmI8YcK8E+8laR0CTWlOe5YnymlPIQkqXGC5cAyrTJWedj6e3G+c8Di1qVUDV0kvwe hsjM5o9V1kU6gpmIPndi7vIfUAfHPD0HMPPAgG6ST/I+CluIFNIEs5aWkXNrQH4Bls+w Bb2ykTRx1gpXE8rgnoKjMNaTzX7HQ68IyO+Pa2aD/S4KEuh5Fn8H3FfyUxiQuOIf3bnj qw/w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id x2-20020a05640225c200b004538b587ff6si2250781edb.405.2022.09.17.05.46.28; Sat, 17 Sep 2022 05:46:28 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 611B868BC2B; Sat, 17 Sep 2022 15:45:49 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 434A568BBCC for ; Sat, 17 Sep 2022 15:45:38 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id F0575C00B1 for ; Sat, 17 Sep 2022 15:45:37 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sat, 17 Sep 2022 15:45:36 +0300 Message-Id: <20220917124537.66238-5-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602047.DvuYhMxLoT@basile.remlab.net> References: <5602047.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 5/6] lavc/audiodsp: add RISC-V F float vector clip X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: ZGBLyHzrkvhJ From: Rémi Denis-Courmont RV64G supports MIN & MAX instructions natively only on floating point registers, not general purpose ones. The later would require the Zbb extension. Due to that, it is actually faster to perform the clipping "properly" in FPU. Benchmarked on SiFive U74-MC: audiodsp.vector_clipf_c: 29551.5 audiodsp.vector_clipf_rvf: 17871.0 Also tried unrolling with 2 or 8 elements but it gets worse either way. --- libavcodec/audiodsp.c | 2 ++ libavcodec/audiodsp.h | 1 + libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/audiodsp_init.c | 31 +++++++++++++++++++++ libavcodec/riscv/audiodsp_rvf.S | 46 ++++++++++++++++++++++++++++++++ 5 files changed, 82 insertions(+) create mode 100644 libavcodec/riscv/Makefile create mode 100644 libavcodec/riscv/audiodsp_init.c create mode 100644 libavcodec/riscv/audiodsp_rvf.S diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c index ff43e87dce..eba6e809fd 100644 --- a/libavcodec/audiodsp.c +++ b/libavcodec/audiodsp.c @@ -113,6 +113,8 @@ av_cold void ff_audiodsp_init(AudioDSPContext *c) ff_audiodsp_init_arm(c); #elif ARCH_PPC ff_audiodsp_init_ppc(c); +#elif ARCH_RISCV + ff_audiodsp_init_riscv(c); #elif ARCH_X86 ff_audiodsp_init_x86(c); #endif diff --git a/libavcodec/audiodsp.h b/libavcodec/audiodsp.h index aa6fa7898b..485b512839 100644 --- a/libavcodec/audiodsp.h +++ b/libavcodec/audiodsp.h @@ -55,6 +55,7 @@ typedef struct AudioDSPContext { void ff_audiodsp_init(AudioDSPContext *c); void ff_audiodsp_init_arm(AudioDSPContext *c); void ff_audiodsp_init_ppc(AudioDSPContext *c); +void ff_audiodsp_init_riscv(AudioDSPContext *c); void ff_audiodsp_init_x86(AudioDSPContext *c); #endif /* AVCODEC_AUDIODSP_H */ diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile new file mode 100644 index 0000000000..414a9e9bd8 --- /dev/null +++ b/libavcodec/riscv/Makefile @@ -0,0 +1,2 @@ +OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ + riscv/audiodsp_rvf.o diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c new file mode 100644 index 0000000000..ebd008a311 --- /dev/null +++ b/libavcodec/riscv/audiodsp_init.c @@ -0,0 +1,31 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/audiodsp.h" + +void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); + +av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RVF) + c->vector_clipf = ff_vector_clipf_rvf; +} diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S new file mode 100644 index 0000000000..d2c042bb26 --- /dev/null +++ b/libavcodec/riscv/audiodsp_rvf.S @@ -0,0 +1,46 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_vector_clipf_rvf, f +NOHWF fmv.w.x fa0, a3 +NOHWF fmv.w.x fa1, a4 +1: + flw ft0, (a1) + flw ft1, 4(a1) + fmax.s ft0, ft0, fa0 + flw ft2, 8(a1) + fmax.s ft1, ft1, fa0 + flw ft3, 12(a1) + fmax.s ft2, ft2, fa0 + addi a2, a2, -4 + fmax.s ft3, ft3, fa0 + addi a1, a1, 16 + fmin.s ft0, ft0, fa1 + fmin.s ft1, ft1, fa1 + fsw ft0, (a0) + fmin.s ft2, ft2, fa1 + fsw ft1, 4(a0) + fmin.s ft3, ft3, fa1 + fsw ft2, 8(a0) + fsw ft3, 12(a0) + addi a0, a0, 16 + bnez a2, 1b + ret +endfunc From patchwork Sat Sep 17 12:45:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37981 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp191615pzh; Sat, 17 Sep 2022 05:46:11 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7MHX1TsOBYtL18R0j2bi3U+j4F3ArDfuPs3jDh1PnQZ9IgJB9WgpI8LhErN+PCB3r30dFC X-Received: by 2002:a05:6402:448b:b0:43b:5ec6:8863 with SMTP id er11-20020a056402448b00b0043b5ec68863mr7643524edb.377.1663418771601; Sat, 17 Sep 2022 05:46:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663418771; cv=none; d=google.com; s=arc-20160816; b=b8hlVzX7TfcqJk9ZYoI9cp/wLySZqXBArImtG6cQWtAfnZmhFTeZORH1NFMHdYTg2n +ZrOhRAFvYSfvtlcMMBlaFs5Mk0MzVk6MwNvjALHQ131hAM+Rzu9OmcUrgD0VzGzd4u+ xZ3wuIIT61ZJ7WriP6nMa3hvZ8pfe1aMh8VTP66c6zQHXkqIXpRFRd47vnZdhyez8RwN YciVttto9Dtgcso/+fJFcPAHTg/Mt+AWhK2pasGQvwWCWpMfsSe53LREYuZVND1BDWOn umGnPwpvQaYf9NxWcffk/3WbVo6AI8LYuLYgAmRjDnR2yJv6jufofwtmlLsFreg+NKrr VcjA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=wbJVGtyH8TMGlGzDsbiCry5CHrVWPjUQEPwvgCeEf4c=; b=CRyn2X9Bqwg4VbTSbDLPp8PiuyfIAPQIpIAC8Rc9o3J3lARmZhaibkiLHsKo+3NQek l7HcMnTGiHmBI1kxru4RGyNzF/eEjqIWhBSxjKHU9rWBv2hFO532aPGI6IrHvklBlRaE LJehoeEH8jwuKSnwgTnVmWPzvtGWhnFBqHs5iTlR3XZRi+SRmScrH7EooLsBJkOqzuwA Rc9NLMuVeR9o839URSCTjuSeY+xr/beI080pncZQgConOPaXzGNcsKwos2BH6VXlzBqP EwuJeRZNpLLY/jRnw8BNtq76otvID2PkCqraJeaQzCIwIWHg2QE+zNgsGU+D3t0OwN7w Z8vw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id e12-20020a17090658cc00b0077c2e4f3c39si6722474ejs.26.2022.09.17.05.46.11; Sat, 17 Sep 2022 05:46:11 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8958F68BBD9; Sat, 17 Sep 2022 15:45:47 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0F7A668BBCA for ; Sat, 17 Sep 2022 15:45:42 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 29518C00B2 for ; Sat, 17 Sep 2022 15:45:38 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sat, 17 Sep 2022 15:45:37 +0300 Message-Id: <20220917124537.66238-6-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5602047.DvuYhMxLoT@basile.remlab.net> References: <5602047.DvuYhMxLoT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 6/6] lavc/pixblockdsp: RISC-V scalar optimisations X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: FNcg+mlaZ5xL From: Rémi Denis-Courmont Benchmarks: get_pixels_c: 180.0 get_pixels_rvi: 136.7 --- libavcodec/pixblockdsp.c | 2 + libavcodec/pixblockdsp.h | 2 + libavcodec/riscv/Makefile | 2 + libavcodec/riscv/pixblockdsp_init.c | 43 ++++++++++++++++++++++ libavcodec/riscv/pixblockdsp_rvi.S | 57 +++++++++++++++++++++++++++++ 5 files changed, 106 insertions(+) create mode 100644 libavcodec/riscv/pixblockdsp_init.c create mode 100644 libavcodec/riscv/pixblockdsp_rvi.S diff --git a/libavcodec/pixblockdsp.c b/libavcodec/pixblockdsp.c index 17c487da1e..4294075cee 100644 --- a/libavcodec/pixblockdsp.c +++ b/libavcodec/pixblockdsp.c @@ -109,6 +109,8 @@ av_cold void ff_pixblockdsp_init(PixblockDSPContext *c, AVCodecContext *avctx) ff_pixblockdsp_init_arm(c, avctx, high_bit_depth); #elif ARCH_PPC ff_pixblockdsp_init_ppc(c, avctx, high_bit_depth); +#elif ARCH_RISCV + ff_pixblockdsp_init_riscv(c, avctx, high_bit_depth); #elif ARCH_X86 ff_pixblockdsp_init_x86(c, avctx, high_bit_depth); #elif ARCH_MIPS diff --git a/libavcodec/pixblockdsp.h b/libavcodec/pixblockdsp.h index 07c2ec4f40..9b002aa3d6 100644 --- a/libavcodec/pixblockdsp.h +++ b/libavcodec/pixblockdsp.h @@ -52,6 +52,8 @@ void ff_pixblockdsp_init_arm(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); void ff_pixblockdsp_init_ppc(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); +void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, AVCodecContext *avctx, + unsigned high_bit_depth); void ff_pixblockdsp_init_x86(PixblockDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth); void ff_pixblockdsp_init_mips(PixblockDSPContext *c, AVCodecContext *avctx, diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile index 414a9e9bd8..da07f1fe96 100644 --- a/libavcodec/riscv/Makefile +++ b/libavcodec/riscv/Makefile @@ -1,2 +1,4 @@ OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ riscv/audiodsp_rvf.o +OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \ + riscv/pixblockdsp_rvi.o diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c new file mode 100644 index 0000000000..f489ec528b --- /dev/null +++ b/libavcodec/riscv/pixblockdsp_init.c @@ -0,0 +1,43 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/avcodec.h" +#include "libavcodec/pixblockdsp.h" + +void ff_get_pixels_8_rvi(int16_t *block, const uint8_t *pixels, + ptrdiff_t stride); +void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels, + ptrdiff_t stride); + +av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c, + AVCodecContext *avctx, + unsigned high_bit_depth) +{ + int cpu_flags = av_get_cpu_flags(); + + if (cpu_flags & AV_CPU_FLAG_RVI) { + if (high_bit_depth) + c->get_pixels = ff_get_pixels_16_rvi; + else + c->get_pixels = ff_get_pixels_8_rvi; + } +} diff --git a/libavcodec/riscv/pixblockdsp_rvi.S b/libavcodec/riscv/pixblockdsp_rvi.S new file mode 100644 index 0000000000..dbf51b0ad9 --- /dev/null +++ b/libavcodec/riscv/pixblockdsp_rvi.S @@ -0,0 +1,57 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "../libavutil/riscv/asm.S" + +func ff_get_pixels_8_rvi +.irp row, 0, 1, 2, 3, 4, 5, 6, 7 + ld t0, (a1) + add a1, a1, a2 + sd zero, ((\row * 16) + 0)(a0) + addi t6, t6, -1 + sd zero, ((\row * 16) + 8)(a0) + srli t1, t0, 8 + sb t0, ((\row * 16) + 0)(a0) + srli t2, t0, 16 + sb t1, ((\row * 16) + 2)(a0) + srli t3, t0, 24 + sb t2, ((\row * 16) + 4)(a0) + srli t4, t0, 32 + sb t3, ((\row * 16) + 6)(a0) + srli t1, t0, 40 + sb t4, ((\row * 16) + 8)(a0) + srli t2, t0, 48 + sb t1, ((\row * 16) + 10)(a0) + srli t3, t0, 56 + sb t2, ((\row * 16) + 12)(a0) + sb t3, ((\row * 16) + 14)(a0) +.endr + ret +endfunc + +func ff_get_pixels_16_rvi +.irp row, 0, 1, 2, 3, 4, 5, 6, 7 + ld t0, 0(a1) + ld t1, 8(a1) + add a1, a1, a2 + sd t0, ((\row * 16) + 0)(a0) + sd t1, ((\row * 16) + 8)(a0) +.endr + ret +endfunc