From patchwork Wed Sep 14 17:50:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37913 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1475683pzh; Wed, 14 Sep 2022 10:50:41 -0700 (PDT) X-Google-Smtp-Source: AA6agR76CLDRHqDuPUvMEeG9C3i97w4BrBS7gI9t3uX4F2pDmhlPxu8R4Qq1BSRrhvgiYU8Yr6Ex X-Received: by 2002:aa7:c74d:0:b0:44f:d34:affb with SMTP id c13-20020aa7c74d000000b0044f0d34affbmr30779795eds.143.1663177840982; Wed, 14 Sep 2022 10:50:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663177840; cv=none; d=google.com; s=arc-20160816; b=IifCpciSaQSOrawSDLUpEB424o2HXaqkoPrqssaGAOecaq6E/Cbuv966I05z6wTcC7 z/daH80apyK5MU0MH9IbicinUeU4xQ7MjABBx+3tJUptRUuDeNvuG/3QHljByZjoC/Ai JtHKtDldN1YYHXYgc8h3uQUe5scJlLq697Uh0Wk5+AzViK0Zte+pFNIp27cUYzVPctrS 5NSi1V/mD9ZDjsm+BPSed4hXof+kETQQ8pzpudpsh3oQK3fMyO19Ipp7wbfCWTBdbNvw AMd7SPYwmJmHk+/vxCYl3D3m8lC5VYbDtt+xmFN6Fa20cphMk7IzYwaNRCbxHxAG/peI 6bGA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=7r4vVuYpylXLHT0CDbyWifq5wzTeqIeKyz9gPdWiENk=; b=YinyxxmTMxDVAWE/z36L53p+T5l4r/Oih+oYPE+FnuldJ9/PUQEf/9XebXr1ipOXFR yuty0lkfv32QPRXie8CZb5z6AwmXpLcExfBNWuIXKt7cu13iAvOl1ahIxKHklDO045hJ IeQjthArbfbzp0xcyYmqY6qBmjQuMi9NaE0KHgN/kANbh2aycZ11kA9njgtQoiSNE3nz pzSfOYiJ5T9yYgNIhX7BwG9o75IQkThJrpyiGI/ST0uJ3SAymgPA3ls51yUeNoTxoOme RddmvMQiiqwYRWXBLvShz1MAB7480qw3ozzWOgVMYyngndNNGTVBtWiNwO8zzsEXLiyO X7gg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org 
Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id h4-20020a0564020e0400b0044e8e0dc87fsi11782777edh.362.2022.09.14.10.50.40; Wed, 14 Sep 2022 10:50:40 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0E80668BB6F; Wed, 14 Sep 2022 20:50:38 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F111968BB66 for ; Wed, 14 Sep 2022 20:50:31 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id A453BC003A for ; Wed, 14 Sep 2022 20:50:31 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Wed, 14 Sep 2022 20:50:29 +0300 Message-Id: <20220914175031.162194-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <4768066.31r3eYUQgx@basile.remlab.net> References: <4768066.31r3eYUQgx@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 1/3] lavu: detect RISC-V F extension (i.e. float) X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: EdT+9MumpszT From: Rémi Denis-Courmont This introduces compile-time and run-time CPU detection on RISC-V. 
In practice, I doubt that FFmpeg will ever see a RISC-V CPU without the F extension, and if it does, it probably won't have run-time detection. So the flag is essentially always set. But as things stand, checkasm wants it that way, and we are nowhere near running short on CPU flag bits on that platform. --- libavutil/cpu.c | 4 ++++ libavutil/cpu.h | 3 +++ libavutil/cpu_internal.h | 1 + libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 44 +++++++++++++++++++++++++++++++++++++++ tests/checkasm/checkasm.c | 2 ++ 6 files changed, 55 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..6e9b8c5f58 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -178,6 +180,8 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV + { "float", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_F }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..71ae70bcbd 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,9 @@ #define AV_CPU_FLAG_LSX (1 << 0) #define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V F extension (single-precision floating point) +#define AV_CPU_FLAG_F (1 << 0) + /** * Return the flags which specify extensions supported by the CPU. 
* The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..634f28bac4 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,6 +48,7 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..6fc30f73c6 --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,44 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "config.h" + +#if HAVE_GETAUXVAL +#include <sys/auxv.h> +#endif + +#define HWCAP_RV(letter) (1ul << ((letter) - 'A')) + +int ff_get_cpu_flags_riscv(void) +{ + int ret = 0; +#if HAVE_GETAUXVAL + const unsigned long hwcap = getauxval(AT_HWCAP); + + if (hwcap & HWCAP_RV('F')) + ret |= AV_CPU_FLAG_F; +#endif + +#ifdef __riscv_flen + ret |= AV_CPU_FLAG_F; +#endif + + return ret; +} diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index e56fd3850e..4f6edfe6a3 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -226,6 +226,8 @@ static const struct { { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC }, { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, +#elif ARCH_RISCV + { "F", "f", AV_CPU_FLAG_F }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA }, From patchwork Wed Sep 14 17:50:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37914 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1475740pzh; Wed, 14 Sep 2022 10:50:49 -0700 (PDT) X-Google-Smtp-Source: AA6agR7nhoa8lVF5TndLiAvtTR9T5tTgZQc7mp3ALdfss+q1XL3xlAReA3xvtil96UnDIiFoVI+T X-Received: by 2002:a17:906:30c8:b0:73c:81a9:f8e1 with SMTP id b8-20020a17090630c800b0073c81a9f8e1mr26379012ejb.649.1663177849050; Wed, 14 Sep 2022 10:50:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663177849; cv=none; d=google.com; s=arc-20160816; b=D9ryaXs5yAIGrTWE5qnqAi1OWB2yAthrxvzejr8NWFY2kch99cn4Q3YC0hSJYJsUwl +XqnQabyOo9ZJ/NO1xypgRzCDblLDAenQuYsIKEi9SYRAt5irsnLsnMMEHphVDwG9xl1 
KjL5cjy04pv0P/6vQE0iNe2AYaQAAfYf7RSDnwM6hsExSf5gsHWP4DYOzrBTpp6OeWCV QeFWCKKKyK56jPhPaSynrckKqGMkPTZS8xySkxVRHTwBddCMGLqMhfVRAvjU9THr3WkZ AKimAqZo3VJodqiDSpDYflaWgRd+80HxOdx1cyOK1EnF32HAkeNolcXRgTnJ9Lg/sfEK Lo+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=eaEou0KQyOFm/1LpkQQhDLfhyGIV6wIp/FCAO/Z/+CM=; b=pwEyMaGzcyWZeHBu+BzYl0cv+LGWk3bB3E525soHzfXRsEWwuNmRDUo/IPnwnZQxNF P8oLE35idZwan1c/naOcOz/jcT2oHwqd7IeyTB7O+HlFyXhnDiRzf9C7/zh2hQ9+KUfX L5FI1TjTAlsvm6X77+CN3MC9KfvY4ss+ZtIBNmPwxfxvmG5KXSadvBnmlrfxdn9n7cMB xylxmtC2px7TyXTfncMKSS+nW+adVb+AqGvlHAd7SOR4wq+HLZkShDCJbZa1fbj+4aRC UWRFFa+14DJ/iH8DU59zsXdfhifZpe/jk6bc239wg0Q9iKVDULL/zESl9KeX6damNTz5 UC5g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id f11-20020a056402194b00b00452e7ae21e6si837560edz.286.2022.09.14.10.50.48; Wed, 14 Sep 2022 10:50:49 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0EC4668BB73; Wed, 14 Sep 2022 20:50:39 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 31B0F68BB67 for ; Wed, 14 Sep 2022 20:50:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id D866BC0090 for ; Wed, 14 Sep 2022 20:50:31 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Wed, 14 Sep 2022 20:50:30 +0300 Message-Id: <20220914175031.162194-2-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <4768066.31r3eYUQgx@basile.remlab.net> References: <4768066.31r3eYUQgx@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 2/3] lavu/riscv: initial common header for assembler macros X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: xamNjjHlawfq From: Rémi Denis-Courmont --- libavutil/riscv/asm.S | 74 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 libavutil/riscv/asm.S diff --git 
a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S new file mode 100644 index 0000000000..7623c161cf --- /dev/null +++ b/libavutil/riscv/asm.S @@ -0,0 +1,74 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#if defined (__riscv_float_abi_soft) +#define NOHWF +#define NOHWD +#define HWF # +#define HWD # +#elif defined (__riscv_float_abi_single) +#define NOHWF # +#define NOHWD +#define HWF +#define HWD # +#else +#define NOHWF # +#define NOHWD # +#define HWF +#define HWD +#endif + + .macro func sym, ext= + .text + .align 2 + + .option push + .ifnb \ext + .option arch, +\ext + .endif + + .global \sym + .hidden \sym + .type \sym, %function + \sym: + + .macro endfunc + .size \sym, . - \sym + .option pop + .previous + .purgem endfunc + .endm + .endm + + .macro const sym, align=3, relocate=0 + .if \relocate + .pushsection .data.rel.ro + .else + .pushsection .rodata + .endif + .align \align + \sym: + + .macro endconst + .size \sym, . 
- \sym + .popsection + .purgem endconst + .endm + .endm From patchwork Wed Sep 14 17:50:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37915 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp1475786pzh; Wed, 14 Sep 2022 10:50:56 -0700 (PDT) X-Google-Smtp-Source: AA6agR7gZJ/ZyEQ7N0JmSPI9pg/xNkjeOLHP6Q4vVhea28+CyVab6Iuqp8uBpYAKZdeal8EpuUHE X-Received: by 2002:aa7:d6c7:0:b0:452:2604:ae8b with SMTP id x7-20020aa7d6c7000000b004522604ae8bmr10320199edr.94.1663177856800; Wed, 14 Sep 2022 10:50:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663177856; cv=none; d=google.com; s=arc-20160816; b=yOE9vCSksVypRUKQ26wu/y9aCGOVgPrHtwiOU6KBLIAZTtgyLqX5zHwfADq3ctE+7v 1kjPOzbqGU/lRT/m66vN1l0XulJZWO0LQKpyFdj3lssQacnUs7XHuYehQ9OlV/CJfcaR lUpupCaQI7HNlHRrHOJtWKWVK3RobbaPt8njegtEG5nweEh9Op6blwSbk2RRvq4mSfu/ nTkRHx8ksYBE7+lucjC7wbIBxt1PUeIeLecraP8G70jTNqFf4nc/mYE+K0QMMdwOELb1 cBMURBiLO/XERZPihj6whZza9+Y6/YysZYp+nSWg04wyHnsmUFceQRAhndUq8qeU6MTL Jz3g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=kWroH1ToWN/2PTn2vzep/Qgh8b/AQEqhMnfzzSzgmzY=; b=lOzFYBOreVfBEzUjJjs0tE3w/Jj0/kpIiq/KOAq7LYtMrcQc+4GiYhCXeFM5tUQBZh eT+7vqPn4IVMMpetoxt1fSoi/uoovX61Bb0cldguixzBcM/51IN4hMA/6kA3espFJekE qqRmsb6/hFEbiul30rpQfY+Q55oD3YMr7ZGy3t2ogk8f3CVZZUBL9tqJpF6CbAWh3+hC TMntvDtiwvw9tnMVmtXjXdPMKVgQXykBVJcPsCfe9JS+DKs/+ZKgb7kiCsE2KuaOsyYh qvAgxqF5kn/CdPPWTN+JXU9J2MLZxX5RQPl0I5RzxgfwGIo20eOPTbv3xqOJ1SiipDNs b5Uw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) 
smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id sb12-20020a1709076d8c00b0073832e13344si11252987ejc.86.2022.09.14.10.50.56; Wed, 14 Sep 2022 10:50:56 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E212868BB78; Wed, 14 Sep 2022 20:50:39 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 62B0868BB5B for ; Wed, 14 Sep 2022 20:50:32 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 17298C00AF for ; Wed, 14 Sep 2022 20:50:32 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Wed, 14 Sep 2022 20:50:31 +0300 Message-Id: <20220914175031.162194-3-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <4768066.31r3eYUQgx@basile.remlab.net> References: <4768066.31r3eYUQgx@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 3/3] lavc/audiodsp: add RISC-V F float vector clip X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: pVvkpqmZZ4br From: Rémi Denis-Courmont RV64G supports MIN & MAX instructions natively only on 
floating point registers, not general purpose ones. The latter would require the Zbb extension. Due to that, it is actually faster to perform the clipping "properly" in FPU. Benchmarked on SiFive U74-MC: audiodsp.vector_clipf_c: 29551.5 audiodsp.vector_clipf_f: 17871.0 Also tried unrolling with 2 or 8 elements but it gets worse either way. --- libavcodec/audiodsp.c | 2 ++ libavcodec/audiodsp.h | 1 + libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/audiodsp_init.c | 31 +++++++++++++++++++++ libavcodec/riscv/audiodsp_rvf.S | 46 ++++++++++++++++++++++++++++++++ 5 files changed, 82 insertions(+) create mode 100644 libavcodec/riscv/Makefile create mode 100644 libavcodec/riscv/audiodsp_init.c create mode 100644 libavcodec/riscv/audiodsp_rvf.S diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c index ff43e87dce..eba6e809fd 100644 --- a/libavcodec/audiodsp.c +++ b/libavcodec/audiodsp.c @@ -113,6 +113,8 @@ av_cold void ff_audiodsp_init(AudioDSPContext *c) ff_audiodsp_init_arm(c); #elif ARCH_PPC ff_audiodsp_init_ppc(c); +#elif ARCH_RISCV + ff_audiodsp_init_riscv(c); #elif ARCH_X86 ff_audiodsp_init_x86(c); #endif diff --git a/libavcodec/audiodsp.h b/libavcodec/audiodsp.h index aa6fa7898b..485b512839 100644 --- a/libavcodec/audiodsp.h +++ b/libavcodec/audiodsp.h @@ -55,6 +55,7 @@ typedef struct AudioDSPContext { void ff_audiodsp_init(AudioDSPContext *c); void ff_audiodsp_init_arm(AudioDSPContext *c); void ff_audiodsp_init_ppc(AudioDSPContext *c); +void ff_audiodsp_init_riscv(AudioDSPContext *c); void ff_audiodsp_init_x86(AudioDSPContext *c); #endif /* AVCODEC_AUDIODSP_H */ diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile new file mode 100644 index 0000000000..a1f67ed55b --- /dev/null +++ b/libavcodec/riscv/Makefile @@ -0,0 +1,2 @@ +OBJS += riscv/audiodsp_init.o \ + riscv/audiodsp_rvf.o diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c new file mode 100644 index 0000000000..7ffd7e8162 --- /dev/null +++ 
b/libavcodec/riscv/audiodsp_init.c @@ -0,0 +1,31 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/audiodsp.h" + +void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); + +av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_F) + c->vector_clipf = ff_vector_clipf_rvf; +} diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S new file mode 100644 index 0000000000..148af96ea2 --- /dev/null +++ b/libavcodec/riscv/audiodsp_rvf.S @@ -0,0 +1,46 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_vector_clipf_rvf, f +NOHWF fmv.w.x fa0, a3 +NOHWF fmv.w.x fa1, a4 +1: + flw ft0, (a1) + flw ft1, 4(a1) + fmax.s ft0, ft0, fa0 + flw ft2, 8(a1) + fmax.s ft1, ft1, fa0 + flw ft3, 12(a1) + fmax.s ft2, ft2, fa0 + addi a2, a2, -4 + fmax.s ft3, ft3, fa0 + addi a1, a1, 16 + fmin.s ft0, ft0, fa1 + fmin.s ft1, ft1, fa1 + fsw ft0, (a0) + fmin.s ft2, ft2, fa1 + fsw ft1, 4(a0) + fmin.s ft3, ft3, fa1 + fsw ft2, 8(a0) + fsw ft3, 12(a0) + addi a0, a0, 16 + bnez a2, 1b + ret +endfunc
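
For review purposes, the loop above can be read as the following scalar C. This is only a sketch and not part of the series: the name vector_clipf_ref is hypothetical, and it assumes finite inputs (NaN handling of fmax.s/fmin.s is not modelled) and min <= max. The assembly additionally assumes that len is a positive multiple of 4, since it handles four elements per iteration and exits only when the counter reaches exactly zero; the sketch has no such restriction.

```c
/* Hypothetical reference helper, not part of the patch series. */
static void vector_clipf_ref(float *dst, const float *src, int len,
                             float min, float max)
{
    for (int i = 0; i < len; i++) {
        float v = src[i];

        if (v < min)    /* corresponds to fmax.s with the lower bound */
            v = min;
        if (v > max)    /* corresponds to fmin.s with the upper bound */
            v = max;
        dst[i] = v;
    }
}
```

Comparing the output of ff_vector_clipf_rvf against such a loop on a buffer whose length is a multiple of 4 is roughly what a checkasm-style test would do.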