From patchwork Tue Sep 6 16:50:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37712 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp3385257pzh; Tue, 6 Sep 2022 09:50:32 -0700 (PDT) X-Google-Smtp-Source: AA6agR7MdpfDGppMJoVt6G/YmvPsWWcT9hV4z4SYeHhCXejVPX8B7dZC87HAVVrX5mkFprypms6J X-Received: by 2002:a17:907:8a09:b0:731:610:ff8d with SMTP id sc9-20020a1709078a0900b007310610ff8dmr39630544ejc.399.1662483032248; Tue, 06 Sep 2022 09:50:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662483032; cv=none; d=google.com; s=arc-20160816; b=Po7o6Ggfp7LbF/apwSPSYlg1dAl3RVwtJ/x/Jtv7mylPgfnShEX5crnPlqU7Pnqq85 gVtPPrES42wxm8yspuwLtP41nHiTbY0/N3E4kVClWfYeDtZuNb3qSG4oOaXjqxBgpZ6z t6aIiqBtEgeS/8gfA36RJYsyLXxIow/+fv0rdUYvu/vnpE2TmsW4Cq1KZxm/0S28BC2d LUWZ0RW+ULhUJBSSv3ZyFLTvWgoRGKmvGfRf3p8Wd0IiYnaZ1fYxjPHU2iDWT8B11RqV R0YiKtJvH1ksUX6IYIfn+0zaPC/KLA5+anRn1GwCzbCmmm5BRESYWlcr4T/nKxzSb1MO cfEA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=6PlyFkHqfrV5ybdML8P46hrfUGGhmQk9yJv6encpIKM=; b=T/Rq9hSLKp+BCkD3JKrjwgXXg8hS+pnctGGnbEPC1zxIKBt6u+GJ8OGlHVVt48/0yh 8vgLtnYpT5/p1pqJ/Lj3LIPvLNwtlZ6oPRiXF9/VfCNT/WtMal+PNa6WXJVY1kjeauns rqPfDOUuLIXlOQGffHPF3YII0NTDO+H+syyxtb33hE/yaEcsxgjVfH40wAlBNad9Cqxj sHmP3sZwcum3KW/CORWEK8At5QP0v9sj4S4JphUhUQtKdW1xjuLuFtu9NeB4LNxyGKcG 42BYq46chUO4c8jLrVLs/Eu7IrIaQv8ytDG7XNxUdjanzNLr9vww1N0HPXcryb26IcPS fa3Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org 
Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id dm17-20020a170907949100b0073d6d81733bsi11440044ejc.678.2022.09.06.09.50.31; Tue, 06 Sep 2022 09:50:32 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A493768BB0C; Tue, 6 Sep 2022 19:50:29 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B6DB168BAFD for ; Tue, 6 Sep 2022 19:50:27 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 6B404C00AD for ; Tue, 6 Sep 2022 19:50:27 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 6 Sep 2022 19:50:23 +0300 Message-Id: <20220906165027.91347-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12048155.O9o76ZdvQC@basile.remlab.net> References: <12048155.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 1/5] doc: reference the RISC-V specification X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: nHlllqZjfzqm From: Rémi Denis-Courmont --- doc/optimization.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/doc/optimization.txt 
b/doc/optimization.txt index 974e2f9af2..3ed29fe38c 100644 --- a/doc/optimization.txt +++ b/doc/optimization.txt @@ -267,6 +267,11 @@ CELL/SPU: http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/30B3520C93F437AB87257060006FFE5E/$file/Language_Extensions_for_CBEA_2.4.pdf http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/9F820A5FFA3ECE8C8725716A0062585F/$file/CBE_Handbook_v1.1_24APR2007_pub.pdf +RISC-V-specific: +---------------- +The RISC-V Instruction Set Manual, Volume 1, Unprivileged ISA: +https://riscv.org/technical/specifications/ + GCC asm links: -------------- official doc but quite ugly From patchwork Tue Sep 6 16:50:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37713 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp3385322pzh; Tue, 6 Sep 2022 09:50:41 -0700 (PDT) X-Google-Smtp-Source: AA6agR7XUwRkfYImM6cjdo1DJZbiF54QnHMl+wn78kHUeiJFGlzqpACFMIEl0JKIPwKVL6HrkfB+ X-Received: by 2002:a05:6402:354f:b0:448:2385:b998 with SMTP id f15-20020a056402354f00b004482385b998mr39394638edd.57.1662483040730; Tue, 06 Sep 2022 09:50:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662483040; cv=none; d=google.com; s=arc-20160816; b=Rgj5h5vRNNw9oD2GmfIsPLbrkrG/T7jyofsTrwnhLENQHnaM02Tye5q7MgRytFNS+5 zn8gPS8Ue1n8ykkQRI8ChlXrtgivUxEVkS+Kz6XuC+pa7NpcG1zcTM9dvtCNCNiJuq9S hoEXojptu6jCVeZ3xL5FJ2OzCc1Kr0THyf1AOP/1P3Eh8LcF0sYF1z0HMurVMTi5CeuX jfjXWcwJtrSl+5YNaTz0tqSXrvij/WTcmchvjJn7cMBsgbOSmUfYzcz8mD0L7rn8jfQQ 63corsrjNiz7UulubrLocNnR/GyWBOslfg3PONHs25P0FhASPVQC0g7EcklUPXK2HNlc 6Sbw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id 
:date:to:from:delivered-to; bh=tY9v+MnUl5BAcjQ+7Po7fOjiEFSK1GpAdNAxjQs6RDo=; b=XE49CEnr6N8U1aQUMaCz/OxFltSUkTHu6b1DLGLV3+vP+lSY7lcybxjef/M/XZHnos u5hWxe3gobSX8ejA2H4EgUc3fVICuyQcMyXzwZ1bScqjm0oh0GKbD+a0fg0sO6O+tMA4 0CYuGm0e6zlOwDkTsqq6nLww5fiQqsX7EnirkPdu8EV6IHz76KHcDkWgeNhgTAepCukw HRdn+OHEsYz2fT6gdnUSVTSKmaTmOkwIaI2r2QFpYZ0XWtNl1//FbrNhwatNcPu67Hja 1RGEYSpLQC855qwri7862Xi+DgSYns4nQq7uWLHqgp90n0vAiqKPnAHM5cBFVyPYmiME vjgg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id h11-20020a170906854b00b007309e3ce06csi8129057ejy.647.2022.09.06.09.50.40; Tue, 06 Sep 2022 09:50:40 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A72DA68BB18; Tue, 6 Sep 2022 19:50:30 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 28BF368BAFD for ; Tue, 6 Sep 2022 19:50:28 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 94277C00AE for ; Tue, 6 Sep 2022 19:50:27 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 6 Sep 2022 19:50:24 +0300 Message-Id: <20220906165027.91347-2-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12048155.O9o76ZdvQC@basile.remlab.net> 
References: <12048155.O9o76ZdvQC@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 2/5] lavu/riscv: AV_READ_TIME cycle counter
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 9cL0vsLKzpiP

From: Rémi Denis-Courmont

This uses the architected RISC-V 64-bit cycle counter from the RISC-V
unprivileged instruction set. In 64-bit and 128-bit, this is a
straightforward CSR read. In 32-bit mode, the 64-bit value is exposed
as two CSRs, which cannot be read atomically, so a loop is necessary to
detect and fix up the race condition where the bottom half wraps exactly
between the two reads.
---
 libavutil/riscv/timer.h | 53 +++++++++++++++++++++++++++++++++++++++++
 libavutil/timer.h       |  2 ++
 2 files changed, 55 insertions(+)
 create mode 100644 libavutil/riscv/timer.h

diff --git a/libavutil/riscv/timer.h b/libavutil/riscv/timer.h
new file mode 100644
index 0000000000..a34157a566
--- /dev/null
+++ b/libavutil/riscv/timer.h
@@ -0,0 +1,53 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_RISCV_TIMER_H
+#define AVUTIL_RISCV_TIMER_H
+
+#include "config.h"
+
+#if HAVE_INLINE_ASM
+#include <stdint.h>
+
+static inline uint64_t rdcycle64(void)
+{
+#if (__riscv_xlen >= 64)
+    uintptr_t cycles;
+
+    __asm__ volatile ("rdcycle %0" : "=r"(cycles));
+
+#else
+    uint64_t cycles;
+    uint32_t hi, lo, check;
+
+    __asm__ volatile (
+        "1: rdcycleh %0\n"
+        "   rdcycle  %1\n"
+        "   rdcycleh %2\n"
+        "   bne %0, %2, 1b\n" : "=r" (hi), "=r" (lo), "=r" (check));
+
+    cycles = (((uint64_t)hi) << 32) | lo;
+
+#endif
+    return cycles;
+}
+
+#define AV_READ_TIME rdcycle64
+
+#endif
+#endif /* AVUTIL_RISCV_TIMER_H */
diff --git a/libavutil/timer.h b/libavutil/timer.h
index 48e576739f..d3db5a27ef 100644
--- a/libavutil/timer.h
+++ b/libavutil/timer.h
@@ -57,6 +57,8 @@
 # include "arm/timer.h"
 #elif ARCH_PPC
 # include "ppc/timer.h"
+#elif ARCH_RISCV
+# include "riscv/timer.h"
 #elif ARCH_X86
 # include "x86/timer.h"
 #endif

From patchwork Tue Sep 6 16:50:25 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37714
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp3385385pzh; Tue, 6 Sep 2022 09:50:49 -0700 (PDT)
X-Google-Smtp-Source: AA6agR61pt0NubUPAnm7QVwFKxpM2dKypILGwVB56unepB+VVJT/w0vEhH1/QYpBXbtvV5EvHL0j
X-Received: by 2002:a05:6402:50ca:b0:447:3355:76e3 with SMTP id h10-20020a05640250ca00b00447335576e3mr47849002edb.72.1662483049097; Tue, 06 Sep 2022 09:50:49 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1662483049; cv=none; d=google.com; s=arc-20160816; b=UOdv8gPMVwjzdgKdIDNbdzemx7YQXXHyucmbw2jaxlR6fvA579ke+0GC3pqPwGjT4p yDFmS2tJOgETsy5LFhIgm/Mn6qGxcgimRCEr41l85VszXgS6UPJ/N439o9WXRxjwLeqE
sQfJgR7+2EB8i27dR5vurRLoPLI4aTnoy9uHLZZoDah6olCCld5FzeMM/7mYh1V112hp dVQM0LlA9dKPYlJY71aT4FegVkfx5q2u9FR8oUqIPd+w5XRBDqSBCDlMEwHZKzKwPHfZ iARRNOqYNIIF6kAa9AM/SncV3D94Qp3j4Zx3FCG4NtBRDAtieGkvgRAPy+OVu7qrGng4 sl/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ONZBQHtoJvrfNV/QO568SnsxRZmyZnJQQeRezT+R7iE=; b=GZ7RLLJcuJPG/LHuRZll1sdRomOPIeJAwrfI+CCPRqdRX3mvGqf107RTxYrEm/C/RU GTF2Uuf76pyZn5TkdGwjUX/ASNVvbpZ3+dck2fSq6JIdkFgdjGq59JudWWdgQvfBq7aW YFP2Rn2EH9c8jocIiNcu2kwLve1MM74wXEYSTheKjp9/PZxW7v7biQCwuGkXRhLol2Ej ka9R5cbjy6jKt2GQaelNx1bn2WCmjiecegsJFbhHstjlG6ThCnwd/IwKXWlkJi8YED98 OfmGAIB7SmJsrTYtp/wXLiWG9nNQ/XNXKlQIVjDgv+XxTrjZKmn+5soMEgNP8NtRaPwg DhbA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id e18-20020a056402191200b0044852b45dd2si679059edz.206.2022.09.06.09.50.48; Tue, 06 Sep 2022 09:50:49 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B237768BB2B; Tue, 6 Sep 2022 19:50:31 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 37D0E68BB0C for ; Tue, 6 Sep 2022 19:50:28 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id C6736C00AF for ; Tue, 6 Sep 2022 19:50:27 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 6 Sep 2022 19:50:25 +0300
Message-Id: <20220906165027.91347-3-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <12048155.O9o76ZdvQC@basile.remlab.net>
References: <12048155.O9o76ZdvQC@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 3/5] configure/riscv: detect fast CLZ
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: vFP1q5PX2JcJ

From: Rémi Denis-Courmont

RISC-V defines the CLZ instruction as part of the ratified Zbb subset of the (not yet ratified) bit manipulation extension (B). We can detect it from the __riscv_zbb predefined constant.
At least GCC 12 already supports this correctly. Note that the macro will be non-zero if supported, zero if enabled in the compiler flags (e.g. -march=rv64gzbb) but not known to the compiler, and undefined otherwise. --- configure | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/configure b/configure index 9e51abd0d3..b7dc1d8656 100755 --- a/configure +++ b/configure @@ -5334,6 +5334,12 @@ elif enabled ppc; then ;; esac +elif enabled riscv; then + + if test_cpp_condition stddef.h "__riscv_zbb"; then + enable fast_clz + fi + elif enabled sparc; then case $cpu in From patchwork Tue Sep 6 16:50:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37716 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp3385436pzh; Tue, 6 Sep 2022 09:50:57 -0700 (PDT) X-Google-Smtp-Source: AA6agR4o1zR3ZxtCxYj5joCB5/ryUvnK6/NVp0CfkZDC9ma2wr6jxO3z6GheLUSa0kWbA7QEkhDH X-Received: by 2002:a05:6402:248f:b0:440:9bb3:5936 with SMTP id q15-20020a056402248f00b004409bb35936mr49458691eda.178.1662483057128; Tue, 06 Sep 2022 09:50:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662483057; cv=none; d=google.com; s=arc-20160816; b=yjoZ7jz2zI5TZyKd3Ug5/66ho6GZlR9X/vE7j012M6+6cu5s+TtT4UjPeElhTpCbay 82Dd7cAnkik9WNrpXOShOZR/80PglO8hBtalh3nPTQhXIl29H2s2FidOXG900xYW0tLu zlTyth7OzlsAQpQfmDrsbxL+eZ97rD19peam7Ecn1s4PUEkMWUgaJvUFeJ/7vTPpCjSh NjlAaYq891kyEsV0pUqERmzkEH5pls67eGHOf8HkxX/6/81Nt0KmGBCz7KgSA4tzh3Le yZcy5aGDNFmVUZfSTtciHYWpZIiVLsYyAVcO8LrW/1G+eVIzWX5F8VhA2Zce38JTKZOu jfng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; 
bh=jVp59uuj9jWqE1sqKr9zVxd+9IcgkUK8L4g/icvyTF4=; b=sH0ud+9hO11nOziwIwo3OUP3hOkj1skdnUjsVoTQR5+l3XOI2JctZz44gkADO7aV8H +dKHABtsRcYcUVwKFMPSHWLk69kgE9n4jWjEwezOti1XwP3zzqNl46qIBwMh4ZyG2TDs 32NXJgVUPkfLEfdSrnI3Hi96rYdN4mWZ1r46cW0BrHkaAoG1zHgL6Uqnib0n0h1BYb/P C87AshSPhwua07TsYazB6MkeHcufPfvUhVA6trZh8WBjShCwsW6//m3ommhteo6Nf1iM fS4vWSHM34yv1Qumsj9vHDHZAiXE9hSne/ez7IhDPKfOdSzOizI5tOu7fF7+EfD2kaeK zkcw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id ne3-20020a1709077b8300b007312789a037si11145939ejc.144.2022.09.06.09.50.56; Tue, 06 Sep 2022 09:50:57 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B793168BB35; Tue, 6 Sep 2022 19:50:32 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 41EAE68BB10 for ; Tue, 6 Sep 2022 19:50:28 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id EF780C00AD for ; Tue, 6 Sep 2022 19:50:27 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 6 Sep 2022 19:50:26 +0300 Message-Id: <20220906165027.91347-4-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12048155.O9o76ZdvQC@basile.remlab.net> References: 
<12048155.O9o76ZdvQC@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 4/5] lavu/riscv: byte-swap operations
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: gqMqTrW6rz4g

From: Rémi Denis-Courmont

If the target supports the Basic bit-manipulation (Zbb) extension, then
the REV8 instruction is available to reverse byte order. Note that this
instruction only exists at the "XLEN" register size, so we need to right
shift the result down to the data width. If Zbb is not supported, then
this patchset does nothing. Support for run-time detection is left for
the future. Currently, there are no bits in auxv/ELF HWCAP for
Z-extensions, so there are no clean ways to do this.
---
 libavutil/bswap.h       |  2 ++
 libavutil/riscv/bswap.h | 74 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+)
 create mode 100644 libavutil/riscv/bswap.h

diff --git a/libavutil/bswap.h b/libavutil/bswap.h
index 91cb79538d..4840ab433f 100644
--- a/libavutil/bswap.h
+++ b/libavutil/bswap.h
@@ -40,6 +40,8 @@
 # include "arm/bswap.h"
 #elif ARCH_AVR32
 # include "avr32/bswap.h"
+#elif ARCH_RISCV
+# include "riscv/bswap.h"
 #elif ARCH_SH4
 # include "sh4/bswap.h"
 #elif ARCH_X86
diff --git a/libavutil/riscv/bswap.h b/libavutil/riscv/bswap.h
new file mode 100644
index 0000000000..de1429c0f7
--- /dev/null
+++ b/libavutil/riscv/bswap.h
@@ -0,0 +1,74 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_RISCV_BSWAP_H
+#define AVUTIL_RISCV_BSWAP_H
+
+#include <stdint.h>
+#include "config.h"
+#include "libavutil/attributes.h"
+
+#if defined (__riscv_zbb) && (__riscv_zbb > 0) && HAVE_INLINE_ASM
+
+static av_always_inline av_const uintptr_t av_bswap_xlen(uintptr_t x)
+{
+    uintptr_t y;
+
+    __asm__("rev8 %0, %1" : "=r" (y) : "r" (x));
+    return y;
+}
+
+#define av_bswap16 av_bswap16
+
+static av_always_inline av_const uint_fast16_t av_bswap16(uint_fast16_t x)
+{
+    return av_bswap_xlen(x) >> (__riscv_xlen - 16);
+}
+
+#if (__riscv_xlen == 32)
+#define av_bswap32 av_bswap_xlen
+#define av_bswap64 av_bswap64
+
+static av_always_inline av_const uint64_t av_bswap64(uint64_t x)
+{
+    return (((uint64_t)av_bswap32(x)) << 32) | av_bswap32(x >> 32);
+}
+
+#else
+#define av_bswap32 av_bswap32
+
+static av_always_inline av_const uint_fast32_t av_bswap32(uint_fast32_t x)
+{
+    return av_bswap_xlen(x) >> (__riscv_xlen - 32);
+}
+
+#if (__riscv_xlen == 64)
+#define av_bswap64 av_bswap_xlen
+
+#else
+#define av_bswap64 av_bswap64
+
+static av_always_inline av_const uint_fast64_t av_bswap64(uint_fast64_t x)
+{
+    return av_bswap_xlen(x) >> (__riscv_xlen - 64);
+}
+
+#endif /* __riscv_xlen > 64 */
+#endif /* __riscv_xlen > 32 */
+#endif /* __riscv_zbb */
+#endif /* AVUTIL_RISCV_BSWAP_H */

From patchwork Tue Sep 6 16:53:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37715
Delivered-To:
ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp3386951pzh; Tue, 6 Sep 2022 09:54:09 -0700 (PDT) X-Google-Smtp-Source: AA6agR6716VHy2zbkbrp7XZax94ZRqqKgTfIBbXet5oppl0LRMLwqDsZXCFRD8UaE+8JGeOBtbWy X-Received: by 2002:a17:906:9752:b0:738:364a:4ac with SMTP id o18-20020a170906975200b00738364a04acmr40417727ejy.759.1662483249270; Tue, 06 Sep 2022 09:54:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662483249; cv=none; d=google.com; s=arc-20160816; b=Ndh8dAwi4LyBgfiySdFZhOevCZWmyh+UiNFFUst9qcNCXvWOHNMl96u/y4qsIvILtv XZTXAy2wSHuGkJKYWtX86y9/p4MRzFN6BIwE98tog/rEqe9vis9ZJXxdTihKcP6SOwVo BZ8El1/FVbiLOcpBjDH/w2KjTeEncRHgQP6wdqvYNOpbOiIa2ZaOERMooOy4d1JA704a zKNL1V1NFHvawseWTiwFGY3Z29nHsMksYTrWg2HiDHDox6TvC7+oraIDBki/Uu2Oyu0I 1+VZYO2B3pbq0x5+EMRoQfbRnKexLNddlVdj9UJ3ssVspxWo2RRotn+n/wKmh+5Ej4nz 2tYQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ikcL3R1PzkMsXVWEU9BTjRlmSX53KYr8dVHwo7KLukk=; b=kPKC+N+TmXlHBFny8/7cmA5HtF8mhFdX9q9vT9jhktZvlAIaESHleLj4rMGZOvQn13 tBQYdWQTq5Ykml0H7MO+UbFxxsPLpOo6CuADvLhleuUztUMqUbCXzQzIIEPVz9LV54fv 4P6NHLcVZ9n6SqtZ/4psGeHcygQioSrBl3G6/7ZjFx9MJaPgq4XPDpm+y5hsieysjE1A 1r8bJUsXYF9b3eCEQfeX+kCAha1oz7I+eFgHAEVGrTxcQefTihZmSO+XoVPX5LFhM9pI gMiZpYzZ90gCMYNthxwetL/Tguc5rDD+nttER+/9uQNEwCasWADfbGekw5p0NqatUNDM TIeQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id kk24-20020a170907767800b00731010c202dsi564136ejc.764.2022.09.06.09.54.08; Tue, 06 Sep 2022 09:54:09 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8C2BD68BB2E; Tue, 6 Sep 2022 19:54:06 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 83F5268BA95 for ; Tue, 6 Sep 2022 19:53:59 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 3CBFFC00AD for ; Tue, 6 Sep 2022 19:53:59 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Tue, 6 Sep 2022 19:53:59 +0300 Message-Id: <20220906165359.91560-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <12048155.O9o76ZdvQC@basile.remlab.net> References: <12048155.O9o76ZdvQC@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 5/5] lavu/riscv: add optimisations X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 8xUT5TwX0Suz From: Rémi Denis-Courmont This provides some micro-optimisations for signed integer clipping, and support for bit weight with the Zbb extension. 
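
The clipping trick can be illustrated in portable C. This is only a minimal sketch of the idea, not the code from this patch: the function names below are hypothetical, and it assumes a 32-bit int, arithmetic right shift of negative values, and modular narrowing conversions (all implementation-defined in C, but true for GCC and Clang). The value is truncated and sign-extended; if that changes it, the input was out of range and the saturated result is derived from the sign bit.

```c
#include <stdint.h>

/* Hypothetical name: clip to [-32768, 32767] by sign-extending the low 16 bits. */
static int clip_int16_sext(int a)
{
    /* Truncate to 16 bits, then sign-extend back to int. */
    int s = (int16_t)(uint16_t)a;

    if (a != s)                  /* value did not survive the round trip */
        a = (a >> 31) ^ 0x7FFF;  /* 0 -> 0x7FFF, -1 -> -0x8000 */
    return a;
}

/* Hypothetical name: clip to [-(1 << p), (1 << p) - 1], i.e. p + 1 signed bits. */
static int clip_intp2_sext(int a, int p)
{
    const int shift = 31 - p;    /* keep the low p + 1 bits */
    /* Shift as unsigned to avoid signed overflow, then sign-extend. */
    int b = (int)((unsigned)a << shift) >> shift;

    if (a != b)
        b = (a >> 31) ^ ((1 << p) - 1);
    return b;
}
```

For example, clip_int16_sext(40000) saturates to 32767, and clip_intp2_sext(2, 2) leaves 2 untouched because it fits in three signed bits.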
---
 libavutil/intmath.h       |   5 +-
 libavutil/riscv/intmath.h | 103 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 106 insertions(+), 2 deletions(-)
 create mode 100644 libavutil/riscv/intmath.h

diff --git a/libavutil/intmath.h b/libavutil/intmath.h
index 9573109e9d..c54d23b7bf 100644
--- a/libavutil/intmath.h
+++ b/libavutil/intmath.h
@@ -28,8 +28,9 @@
 #if ARCH_ARM
 # include "arm/intmath.h"
-#endif
-#if ARCH_X86
+#elif ARCH_RISCV
+# include "riscv/intmath.h"
+#elif ARCH_X86
 # include "x86/intmath.h"
 #endif
diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h
new file mode 100644
index 0000000000..78f7ba930a
--- /dev/null
+++ b/libavutil/riscv/intmath.h
@@ -0,0 +1,103 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_RISCV_INTMATH_H
+#define AVUTIL_RISCV_INTMATH_H
+
+#include <stdint.h>
+
+#include "config.h"
+#include "libavutil/attributes.h"
+
+/*
+ * The compiler is forced to sign-extend the result anyhow, so it is faster to
+ * compute it explicitly and use it.
+ */
+#define av_clip_int8 av_clip_int8_rvi
+static av_always_inline av_const int8_t av_clip_int8_rvi(int a)
+{
+    union { uint8_t u; int8_t s; } u = { .u = a };
+
+    if (a != u.s)
+        a = ((a >> 31) ^ 0x7F);
+    return a;
+}
+
+#define av_clip_int16 av_clip_int16_rvi
+static av_always_inline av_const int16_t av_clip_int16_rvi(int a)
+{
+    union { uint16_t u; int16_t s; } u = { .u = a };
+
+    if (a != u.s)
+        a = ((a >> 31) ^ 0x7FFF);
+    return a;
+}
+
+#define av_clipl_int32 av_clipl_int32_rvi
+static av_always_inline av_const int32_t av_clipl_int32_rvi(int64_t a)
+{
+    union { uint32_t u; int32_t s; } u = { .u = a };
+
+    if (a != u.s)
+        a = ((a >> 63) ^ 0x7FFFFFFF);
+    return a;
+}
+
+#define av_clip_intp2 av_clip_intp2_rvi
+static av_always_inline av_const int av_clip_intp2_rvi(int a, int p)
+{
+    const int shift = 31 - p;
+    int b = ((int)(((unsigned)a) << shift)) >> shift;
+
+    if (a != b)
+        b = (a >> 31) ^ ((1 << p) - 1);
+    return b;
+}
+
+#if defined (__riscv_zbb) && (__riscv_zbb > 0) && HAVE_INLINE_ASM
+
+#define av_popcount av_popcount_rvb
+static av_always_inline av_const int av_popcount_rvb(uint32_t x)
+{
+    int ret;
+
+#if (__riscv_xlen >= 64)
+    __asm__ ("cpopw %0, %1\n" : "=r" (ret) : "r" (x));
+#else
+    __asm__ ("cpop %0, %1\n" : "=r" (ret) : "r" (x));
+#endif
+    return ret;
+}
+
+#if (__riscv_xlen >= 64)
+#define av_popcount64 av_popcount64_rvb
+static av_always_inline av_const int av_popcount64_rvb(uint64_t x)
+{
+    int ret;
+
+#if (__riscv_xlen >= 128)
+    __asm__ ("cpopd %0, %1\n" : "=r" (ret) : "r" (x));
+#else
+    __asm__ ("cpop %0, %1\n" : "=r" (ret) : "r" (x));
+#endif
+    return ret;
+}
+#endif /* __riscv_xlen >= 64 */
+#endif /* __riscv_zbb */
+
+#endif /* AVUTIL_RISCV_INTMATH_H */
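
As a note on patch 4/5: the right shift after REV8 can be checked in portable C. The sketch below is an illustration under assumptions, not code from the series. rev8_xlen64 is a hypothetical software stand-in for REV8 on a 64-bit register (XLEN is assumed to be 64), and the two wrappers are likewise hypothetical names. Reversing all eight bytes moves the swapped 16-bit or 32-bit value into the top of the register, so it has to be shifted down by XLEN minus the data width.

```c
#include <stdint.h>

/* Hypothetical stand-in for REV8 with XLEN = 64: reverse all eight bytes. */
static uint64_t rev8_xlen64(uint64_t x)
{
    uint64_t y = 0;

    for (int i = 0; i < 8; i++) {
        y = (y << 8) | (x & 0xFF);  /* lowest byte of x ends up highest in y */
        x >>= 8;
    }
    return y;
}

/* The swapped 16-bit value lands in bits 63..48, so shift down by 64 - 16. */
static uint16_t bswap16_via_rev8(uint16_t x)
{
    return (uint16_t)(rev8_xlen64(x) >> (64 - 16));
}

/* Same idea for 32 bits: the result lands in bits 63..32. */
static uint32_t bswap32_via_rev8(uint32_t x)
{
    return (uint32_t)(rev8_xlen64(x) >> (64 - 32));
}
```

With these helpers, bswap16_via_rev8(0x1234) yields 0x3412; without the shift, the swapped bytes would sit in the upper 16 bits of the 64-bit result.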