From patchwork Sat Sep 3 19:01:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37634 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp1715614pzh; Sat, 3 Sep 2022 12:02:06 -0700 (PDT) X-Google-Smtp-Source: AA6agR7+uOSGKSYHQxEZeynXuj+iGIVteudz7iDsyTYpni/K7j48QIO9qeB0WLR4SADEsNlmsFM0 X-Received: by 2002:a05:6402:4511:b0:43b:a182:8a0a with SMTP id ez17-20020a056402451100b0043ba1828a0amr37763126edb.410.1662231726224; Sat, 03 Sep 2022 12:02:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662231726; cv=none; d=google.com; s=arc-20160816; b=ym/VU/WMWDzdyrOPaaNwsUvAu5UJFS1lD+2PeSFMq9sOPy7jus4EcfINLZ77GE3rbS tBaw7BzL6UYLUlKNfTlsncP5OHYKdhZ0lW9wSNpE/wwiGn0YGH+3TUYfbuIzGRjrVIuc q9cV1OVYPai67ZUBLEdetWrWOUrI/oLzUBnXZe8Th9euHrc7VtgzTDC74O8Tmo4+muw4 7fae+H/4P9Y7pceGHqAb5nrWuvGHpnvrZxQQ2INmoHnQCKVJadMOyUsENansnXMboWth LVU9BxpnItQb6N9S2W762QX6nh5nL/bFURgUZw4SLj4lsdujumTmyJdpwsVy3mBWvrz2 LUxQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=5+3LiGx1k7/UW/ObT1HyQYXZXlHyIF0PqEM+xmA8yg8=; b=0a9Ijnq0uJgOxdlJZv1sS4Vc7s0lLgfhbEBJfHpMQWi5mIQ1+7DVgZg88Lze2GfRm8 SOseFIDzyAWz0+mdsUK6Oc/mDt6IqTdPKYc3QwlOOAtd7yri+12D+7G2dJBO5EAWfJeb VVYgWAK4KbFt2K8GCc+J8IyZ4/X6L74hhG1PFsLXYUKRkL9OpR+VbVOaLPkETccxiwC1 JWmHP1UYE+3IxMGfbfAP65T91aFnCNhMKPQYgNx6NVYBntPQhmbwIuG42mqElnMzVYmK 2BA/5Q6CXyOZsOCi4vcnphMNbi++B+FPO7ED9xv2uJbwkyiGV2mToH1Vin5L8fuHuleh Hbww== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id z9-20020a170906814900b00730983c4646si3742954ejw.508.2022.09.03.12.02.05; Sat, 03 Sep 2022 12:02:06 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 7E73368BA12; Sat, 3 Sep 2022 22:01:56 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 35DD968B325 for ; Sat, 3 Sep 2022 22:01:48 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id E19AFC0003 for ; Sat, 3 Sep 2022 22:01:47 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Sat, 3 Sep 2022 22:01:45 +0300 Message-Id: <20220903190147.196927-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12046336.O9o76ZdvQC@basile.remlab.net> References: <12046336.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 1/3] riscv: add CPU flags for the RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: BRYl0GSe5flx From: RĂ©mi Denis-Courmont RVV defines a total of 12 different extensions: V, Zvl32b, Zvl64b, Zvl128b, Zvl256b, Zvl512b, Zvl1024b, Zve32x, Zve32f, Zve64x, Zve64f and Zve64d. At this stage, we don't care about the vector length extensions Zvl*, as most or all optimisations will be running in a loop that is independent on the data set size. Zve64f is equivalent to Zve32f plus Zve64x, so it is exposed as a convenience flag, but not tracked internally. Likewise V is the equivalent of Zve64d plus Zvl128b. Technically, Zve32f and Zve64x are both implied by Zve64d and both imply Zve32x, leaving only 5 possibilities (including no vector support), but we keep 4 separate bits for easy run-time checks as on other instruction set architectures. --- libavutil/cpu.c | 14 ++++++++++ libavutil/cpu.h | 6 +++++ libavutil/cpu_internal.h | 1 + libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 58 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 80 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..83bf513cf2 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -178,6 +180,18 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV +#define AV_CPU_FLAG_ZVE32F_M (AV_CPU_FLAG_ZVE32F | AV_CPU_FLAG_ZVE32X) +#define AV_CPU_FLAG_ZVE64X_M (AV_CPU_FLAG_ZVE64X | AV_CPU_FLAG_ZVE32X) +#define AV_CPU_FLAG_ZVE64D_M (AV_CPU_FLAG_ZVE64D | AV_CPU_FLAG_ZVE64F_M) +#define AV_CPU_FLAG_ZVE64F_M (AV_CPU_FLAG_ZVE32F_M | AV_CPU_FLAG_ZVE64X_M) +#define AV_CPU_FLAG_VECTORS AV_CPU_FLAG_ZVE64D_M + { "vectors", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VECTORS }, .unit = "flags" }, + { "zve32x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32X }, .unit = "flags" }, + { "zve32f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32F_M }, .unit = "flags" }, + { "zve64x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64X_M }, .unit = "flags" }, + { "zve64f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64F_M }, .unit = "flags" }, + { "zve64d", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64D_M }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..44836e50d6 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,12 @@ #define AV_CPU_FLAG_LSX (1 << 0) #define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V Vector extension +#define AV_CPU_FLAG_ZVE32X (1 << 0) /* 8-, 16-, 32-bit integers */ +#define AV_CPU_FLAG_ZVE32F (1 << 1) /* single precision scalars */ +#define AV_CPU_FLAG_ZVE64X (1 << 2) /* 64-bit integers */ +#define AV_CPU_FLAG_ZVE64D (1 << 3) /* double precision scalars */ + /** * Return the flags which specify extensions supported by the CPU. * The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..634f28bac4 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,6 +48,7 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..96726f2f85 --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,58 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "config.h" + +#if HAVE_GETAUXVAL +#include +#endif + +#define HWCAP_RV(letter) (1ul << ((letter) - 'A')) +#define ZVE_UP_TO(cap) ((2 * (cap)) - 1) + +int ff_get_cpu_flags_riscv(void) +{ + int ret = 0; + + /* If RV-V is enabled statically at compile-time, check the details. */ +#ifdef __riscv_vectors + ret |= AV_CPU_FLAG_ZVE32X; +#if __riscv_v_elen >= 64 + ret |= AV_CPU_FLAG_ZVE64X; +#endif +#if __riscv_v_elen_fp >= 32 + ret |= AV_CPU_FLAG_ZVE32F; +#endif +#if __riscv_v_elen_fp >= 64 + ret |= AV_CPU_FLAG_ZVE32F; +#endif +#endif + +#if HAVE_GETAUXVAL + const unsigned long hwcap = getauxval(AT_HWCAP); + + /* The V extension implies all subsets */ + if (hwcap & HWCAP_RV('V')) + ret |= AV_CPU_FLAG_ZVE32X | AV_CPU_FLAG_ZVE64X + | AV_CPU_FLAG_ZVE32F | AV_CPU_FLAG_ZVE64D; +#endif + + return ret; +}