From patchwork Mon Sep 12 15:53:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37858 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp162851pzh; Mon, 12 Sep 2022 08:53:45 -0700 (PDT) X-Google-Smtp-Source: AA6agR5OhnbkLh63HTs6j4xvTV7PwLhFImR7Kb1SdqmiA/rE+MGWOn39Au2KgJa+FpipSXDcH1SL X-Received: by 2002:a17:906:8468:b0:77a:5905:81aa with SMTP id hx8-20020a170906846800b0077a590581aamr10145914ejc.481.1662998025472; Mon, 12 Sep 2022 08:53:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998025; cv=none; d=google.com; s=arc-20160816; b=hWJxuGpqdJTJ3VmCwv+zxGE0NkCGCiax3S8zFWqHOTPob8QAxoTtlAA8PhL3/eanJy DcxQ+o5PvYdYmjySbpsu2rajaCjtpL/Hv9ZTpFakQ1ue6IzSa9dHgD3HcsbtzzhxMCyC azOuD1gGf9AvtNM/OKw4HLgdy1TPu7jbn3HzcnzjPJ1qiIiIr8fDZ/ItQEkfasA+bnlj lfVd8BNzwwHlgsDquGoVcemxebAqfZsz+CZQwrK05unR3QFyLau1nIFjxD26El84ot9K yCwwqpjdNqzsXhJsCRYdwnSWOFxf+9O7pkSB/3Dv3jJdQYjHhAMaBR7LmsxTUpixl0Ip 5NaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=6PlyFkHqfrV5ybdML8P46hrfUGGhmQk9yJv6encpIKM=; b=jbsRXp3XocamcrfPiVREAAYluBt9sbUNFT91aZw6NSYiwgRTz+40HiwOAFxuhk5EdI woyZsoO4Q9oi60MAG5B/kuPL8Dp71EwyGf7+4SvjwCUKGkINA4a3y8flLg90MrUcKuM+ y5iKxXE8cGFKbF6EBgFNkK40p4rtGV9l/ujVbioZJ9D7d1HkE9j4Es9ueQcCXDv0TfOS sUxWDOPymJ2dwg2PfSIg7n751ZYpGHopiHxhcUl8slRpKYfXVGInXw2/pAm142VUjF6m 2S4jWvFyNF2tXbbvQikCSTFMVuyG++5WZdqO39j7mLcrIvUttzmPW5CwQf5x8IbEpHa8 OYMQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org 
Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id z12-20020a05640235cc00b0044f0687e7f8si8713989edc.493.2022.09.12.08.53.44; Mon, 12 Sep 2022 08:53:45 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 70F9268BB49; Mon, 12 Sep 2022 18:53:41 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BC32C68BA81 for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 0D937C0014 for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:16 +0300 Message-Id: <20220912155333.59843-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 01/18] doc: reference the RISC-V specification X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: KV2yyl4qVeFD From: Rémi Denis-Courmont --- doc/optimization.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/doc/optimization.txt 
b/doc/optimization.txt index 974e2f9af2..3ed29fe38c 100644 --- a/doc/optimization.txt +++ b/doc/optimization.txt @@ -267,6 +267,11 @@ CELL/SPU: http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/30B3520C93F437AB87257060006FFE5E/$file/Language_Extensions_for_CBEA_2.4.pdf http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/9F820A5FFA3ECE8C8725716A0062585F/$file/CBE_Handbook_v1.1_24APR2007_pub.pdf +RISC-V-specific: +---------------- +The RISC-V Instruction Set Manual, Volume 1, Unprivileged ISA: +https://riscv.org/technical/specifications/ + GCC asm links: -------------- official doc but quite ugly From patchwork Mon Sep 12 15:53:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37859 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp162945pzh; Mon, 12 Sep 2022 08:53:55 -0700 (PDT) X-Google-Smtp-Source: AA6agR5W9zQUOgYxgNua2xjt1lFqyXFWR+zq0OUi4lvZQ4M4bEGAsrGDpt5qnZVRks5zuuXTBThM X-Received: by 2002:a17:906:fe0b:b0:730:3646:d177 with SMTP id wy11-20020a170906fe0b00b007303646d177mr18360789ejb.688.1662998035566; Mon, 12 Sep 2022 08:53:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998035; cv=none; d=google.com; s=arc-20160816; b=WwPRQPKL8Uv32HubRLphKA1Hyypaejn/zHbqtbazPdO6hLM8Ls86LWsxncbDaREh6Q OJVdTmn9c9VeVOtmlG1I7m2WUNGVUs49j0yhyxlASBDEADk/klH/V4Xq1i/rXaDnrJ0W KgT+ar28WGmEhKT3jRM+036+wu5FVVCHTjb4t0eYhLRyxUk3aIoHJgeRyqFG6XndchiG FyOUrLFM8hVW/4r7VVera9UBT41JnzBibP/mhPACHxbTPtkOsLjtB5PGjb00BVycmUC9 +1IlX6INTLd4q20tCrZSZcGyUM3k6K2VpFtPoUc752uT8d459jmkEAkbePTkUuE9IDcA X+Bg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id 
:date:to:from:delivered-to; bh=tY9v+MnUl5BAcjQ+7Po7fOjiEFSK1GpAdNAxjQs6RDo=; b=n3p0vsZjstg9zGBjgfJ1kx7zQjEjFalq+TmSvIOwPd3rsPxlgUv6UhjNUBrBaLnfEB TqKBJCxX/Re/qD3nwfPjGGcvHwVIusW+e2l2NkBqRNxJ1lDngJd2Pj67MqQ/Ixejf+pQ FxTedCCNHhfB0SGEl4gpyHhCYMgsI2Tda89108sHO98ZTm04iHrNrRro1v71cp3l7SD0 uhB0NWID/h/L9m3DQLJxaPevslrNywohbiPub5btI9nmeYg8kt0rjxmALIk8KZ8E4eHi T88kGkcYSTX1ZdzhXPY5i/fN8hpuPhHA/2g5z5mCH+bMFqm+BMg0sRtD8gL2JfdXg/x7 QuTg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id i31-20020a0564020f1f00b0044e4e32ed3asi4891475eda.259.2022.09.12.08.53.55; Mon, 12 Sep 2022 08:53:55 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8D69D68BB47; Mon, 12 Sep 2022 18:53:42 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BF1C568BB2E for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 40F2FC001B for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:17 +0300 Message-Id: <20220912155333.59843-2-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> 
References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 02/18] lavu/riscv: AV_READ_TIME cycle counter X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: tfjbI5dTMF3F From: Rémi Denis-Courmont This uses the architected RISC-V 64-bit cycle counter from the RISC-V unprivileged instruction set. In 64-bit and 128-bit, this is a straightforward CSR read. In 32-bit mode, the 64-bit value is exposed as two CSRs, which cannot be read atomically, so a loop is necessary to detect and fix up the race condition where the bottom half wraps exactly between the two reads. --- libavutil/riscv/timer.h | 53 +++++++++++++++++++++++++++++++++++++++++ libavutil/timer.h | 2 ++ 2 files changed, 55 insertions(+) create mode 100644 libavutil/riscv/timer.h diff --git a/libavutil/riscv/timer.h b/libavutil/riscv/timer.h new file mode 100644 index 0000000000..a34157a566 --- /dev/null +++ b/libavutil/riscv/timer.h @@ -0,0 +1,53 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_RISCV_TIMER_H +#define AVUTIL_RISCV_TIMER_H + +#include "config.h" + +#if HAVE_INLINE_ASM +#include + +static inline uint64_t rdcycle64(void) +{ +#if (__riscv_xlen >= 64) + uintptr_t cycles; + + __asm__ volatile ("rdcycle %0" : "=r"(cycles)); + +#else + uint64_t cycles; + uint32_t hi, lo, check; + + __asm__ volatile ( + "1: rdcycleh %0\n" + " rdcycle %1\n" + " rdcycleh %2\n" + " bne %0, %2, 1b\n" : "=r" (hi), "=r" (lo), "=r" (check)); + + cycles = (((uint64_t)hi) << 32) | lo; + +#endif + return cycles; +} + +#define AV_READ_TIME rdcycle64 + +#endif +#endif /* AVUTIL_RISCV_TIMER_H */ diff --git a/libavutil/timer.h b/libavutil/timer.h index 48e576739f..d3db5a27ef 100644 --- a/libavutil/timer.h +++ b/libavutil/timer.h @@ -57,6 +57,8 @@ # include "arm/timer.h" #elif ARCH_PPC # include "ppc/timer.h" +#elif ARCH_RISCV +# include "riscv/timer.h" #elif ARCH_X86 # include "x86/timer.h" #endif From patchwork Mon Sep 12 15:53:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37860 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp163021pzh; Mon, 12 Sep 2022 08:54:04 -0700 (PDT) X-Google-Smtp-Source: AA6agR7rRT1ARtbRbhfoapXQVPSXv9g/rTUCb+PrDQXYwlQijMAMXwHpRGQHNtlvGhXIWRFRaWvd X-Received: by 2002:aa7:cc8a:0:b0:446:7668:2969 with SMTP id p10-20020aa7cc8a000000b0044676682969mr6507393edt.206.1662998044242; Mon, 12 Sep 2022 08:54:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998044; cv=none; d=google.com; s=arc-20160816; b=AFSUWaPkR9c6IF5elcpH09vKP71eX9l91nL5TrIveLuVmrGYG3y34Ma1qnDgsylziI Or43I/jlQR6+bRr3ySOl3mMZ1s2/Tm6/v8rqaMrLyD2rwsD5SMslf64HB/grv/gxryQJ 
9HLIZP4keTF1vp5aN3e6n21DLBb3pInntaSISEfSAwuUD2BXQzWykbvniJW8ss7BPSUe 4rg6wpMtaK1CAUl7u4Kzccv6R3YqDZd08NKSVBcJLxytjqhW/vrtZdv8SsFkF869r7TI KDYwWGnP+m3buO3f78GAyb2ScPiwkfTM53nrpiPgR0Wll3SzVR/rVr2B+7QOx7gZJwoC ydUg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ONZBQHtoJvrfNV/QO568SnsxRZmyZnJQQeRezT+R7iE=; b=EqZezQw24PBeK5pa2937qN4G4yrPgoc788fEx4bd4UVOmywO3b/ywaSuSJXaqkUghe ZQHp5BjTWu5+2nDktilDqAXJsEg+pvgrbYnCgvhqze98VfJ3lLfMkpkAbNjSLas5ww37 OGzk4H1JNR1T3hN+zgd/sHmfJ9FaJkoZ7LVKmY5jrTmMITlBOIBRFRJEmAScMzcdbq5u T9nEijeU/v2hc+c6Mdp2YNAAxMA6Ah6couVnUPEvvwUKcJuANs/UYZrLZ1hDRxKUQeIh nsR2aS4UAOXXRuO16dpckdv9i1XMIFHZJEmAXQdbxOAHtRPkSfiKZqQ41btMIVVtRQED 2k0A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id dt12-20020a170907728c00b0076f0a1c1501si3902490ejc.698.2022.09.12.08.54.03; Mon, 12 Sep 2022 08:54:04 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B190F68BB4C; Mon, 12 Sep 2022 18:53:43 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id C217A68BB38 for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 74580C00AF for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:18 +0300 Message-Id: <20220912155333.59843-3-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 03/18] configure/riscv: detect fast CLZ X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: kVXV1hJT7sfP From: Rémi Denis-Courmont RISC-V defines the CLZ instruction as part of the ratified Zbb subset of the (not yet ratified) bit manipulation extension (B). We can detect it from the __riscv_zbb predefined constant. 
At least GCC 12 already supports this correctly. Note that the macro will be non-zero if supported, zero if enabled in the compiler flags (e.g. -march=rv64gzbb) but not known to the compiler, and undefined otherwise. --- configure | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/configure b/configure index 9e51abd0d3..b7dc1d8656 100755 --- a/configure +++ b/configure @@ -5334,6 +5334,12 @@ elif enabled ppc; then ;; esac +elif enabled riscv; then + + if test_cpp_condition stddef.h "__riscv_zbb"; then + enable fast_clz + fi + elif enabled sparc; then case $cpu in From patchwork Mon Sep 12 15:53:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37861 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp163087pzh; Mon, 12 Sep 2022 08:54:12 -0700 (PDT) X-Google-Smtp-Source: AA6agR7O69JTROczL2PZrPNs2fF5wplZ2vFwZxiIfkjyC6tdtbxS4pY5XQEGsIlvOzOXhI9Sum2H X-Received: by 2002:a17:906:cc4a:b0:779:ed37:b5a3 with SMTP id mm10-20020a170906cc4a00b00779ed37b5a3mr10768127ejb.626.1662998052242; Mon, 12 Sep 2022 08:54:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998052; cv=none; d=google.com; s=arc-20160816; b=zNLWRLdhB8n3GKbMkfB6zvsfOeB+I9GHRBu9l4D7KL/Ej8o+QeeuxND9aV5wJIPIh3 CljqzgjSoKBevaAft04pgwoCtjNjyuoYHvVIXCz6A+t1FAk33wgyNAZXpTQWQP96o58X +O30NItq77YueAAMwMLgNbhfaOYgn4E5n1DqHc5jgfVEdCba0ZBpjt5WwlcxhRwq9MZU zg+TTQlU1w/XHM0XRu6FJ7UkMN2WTbwdXrZ6qEirD2jIp27iB0YGMiWseAGDZnti2whE UoyduJEGg0vWrTeeRInMF8KJJj07hmOOQSPIdBC/5P5ytsYXqVzp4oc+BtuzCnorHI+M BgrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; 
bh=jVp59uuj9jWqE1sqKr9zVxd+9IcgkUK8L4g/icvyTF4=; b=ErTnEHSwr7e33LFSWro3NYw2Z9dhAbN9Vmg48PpQN3KTnpXPjX1EFPra0AVB+0Jbi3 Ib4hn1bfae2Pz/VNsls0Tm82Z2yO5MCKckwv/mozxnErf/46oVEfkAU24xNGXzBSXlqW 62l2UlBgrBIjP/tv5W82baONzhVv7DMDT78Vw7wl4M4DReCXItkMCxJC4to7CfXlLKe+ aPhtIKPVT3kHK1Huy/UAija51tVDglJbtGFd6AVO2KB05U3oQWB9Uxq5LCWuiYcvJXsK d4BCJ7StdeNCTvnv/Y6MyzYRCB2BtVu4UnUJCosYIuLGfhxFMrIP86kw6fJHoVqNVw1S aQfA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id e16-20020a056402191000b0044fc3c0e505si8144176edz.318.2022.09.12.08.54.11; Mon, 12 Sep 2022 08:54:12 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id C267A68BB55; Mon, 12 Sep 2022 18:53:44 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id ECD1868BB2E for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 9DAD8C00B0 for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:19 +0300 Message-Id: <20220912155333.59843-4-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: 
<2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 04/18] lavu/riscv: byte-swap operations X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: LUTiQRs1F+ON From: Rémi Denis-Courmont If the target supports the Basic bit-manipulation (Zbb) extension, then the REV8 instruction is available to reverse byte order. Note that this instruction only exists at the "XLEN" register size, so we need to right shift the result down to the data width. If Zbb is not supported, then this patchset does nothing. Support for run-time detection is left for the future. Currently, there are no bits in auxv/ELF HWCAP for Z-extensions, so there are no clean ways to do this. --- libavutil/bswap.h | 2 ++ libavutil/riscv/bswap.h | 74 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 76 insertions(+) create mode 100644 libavutil/riscv/bswap.h diff --git a/libavutil/bswap.h b/libavutil/bswap.h index 91cb79538d..4840ab433f 100644 --- a/libavutil/bswap.h +++ b/libavutil/bswap.h @@ -40,6 +40,8 @@ # include "arm/bswap.h" #elif ARCH_AVR32 # include "avr32/bswap.h" +#elif ARCH_RISCV +# include "riscv/bswap.h" #elif ARCH_SH4 # include "sh4/bswap.h" #elif ARCH_X86 diff --git a/libavutil/riscv/bswap.h b/libavutil/riscv/bswap.h new file mode 100644 index 0000000000..de1429c0f7 --- /dev/null +++ b/libavutil/riscv/bswap.h @@ -0,0 +1,74 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_RISCV_BSWAP_H +#define AVUTIL_RISCV_BSWAP_H + +#include +#include "config.h" +#include "libavutil/attributes.h" + +#if defined (__riscv_zbb) && (__riscv_zbb > 0) && HAVE_INLINE_ASM + +static av_always_inline av_const uintptr_t av_bswap_xlen(uintptr_t x) +{ + uintptr_t y; + + __asm__("rev8 %0, %1" : "=r" (y) : "r" (x)); + return y; +} + +#define av_bswap16 av_bswap16 + +static av_always_inline av_const uint_fast16_t av_bswap16(uint_fast16_t x) +{ + return av_bswap_xlen(x) >> (__riscv_xlen - 16); +} + +#if (__riscv_xlen == 32) +#define av_bswap32 av_bswap_xlen +#define av_bswap64 av_bswap64 + +static av_always_inline av_const uint64_t av_bswap64(uint64_t x) +{ + return (((uint64_t)av_bswap32(x)) << 32) | av_bswap32(x >> 32); +} + +#else +#define av_bswap32 av_bswap32 + +static av_always_inline av_const uint_fast32_t av_bswap32(uint_fast32_t x) +{ + return av_bswap_xlen(x) >> (__riscv_xlen - 32); +} + +#if (__riscv_xlen == 64) +#define av_bswap64 av_bswap_xlen + +#else +#define av_bswap64 av_bswap64 + +static av_always_inline av_const uint_fast64_t av_bswap64(uint_fast64_t x) +{ + return av_bswap_xlen(x) >> (__riscv_xlen - 64); +} + +#endif /* __riscv_xlen > 64 */ +#endif /* __riscv_xlen > 32 */ +#endif /* __riscv_zbb */ +#endif /* AVUTIL_RISCV_BSWAP_H */ From patchwork Mon Sep 12 15:53:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37862 Delivered-To: 
ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp163160pzh; Mon, 12 Sep 2022 08:54:21 -0700 (PDT) X-Google-Smtp-Source: AA6agR6qpzzw861U726NthZs9BRuMFcF/ZV1AgV+W3nE2a/YlN8NK6oIaO3b27Mnnk3DFE8/VXE5 X-Received: by 2002:a17:907:7205:b0:770:86f1:ae6e with SMTP id dr5-20020a170907720500b0077086f1ae6emr18326681ejc.396.1662998061168; Mon, 12 Sep 2022 08:54:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998061; cv=none; d=google.com; s=arc-20160816; b=b8rbMhPCOmAtFkQTvm5MBsaRjAXENnuGCI9I1SXaOTAxkoQcAnQgnhCnM00SIBDxSc sE6Y7j9QbAiwhf6+uByat9U8lOVTsSPPs8hXRJ0TsFHE5tl0L5xQgOiUmdLPPWldcbbx VnsvaDKGVFO6KqwzZocd3Jqa0uULkY0YwYbijQBaiNjvckBJftmP1coFEE4UTfbskGMB lwuQKrjgwhsbqD03Up9LDCroxi4RrN63eXvZsskvx3YLXBl5cB9v5UucapqTJyuNrPPa X+16pLoTXFBdmiXwwPshVYjrqQyfPFEEtGW8lnSBgvVhYxdu1IussGpkvj76iqsbpotw 11pg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=ikcL3R1PzkMsXVWEU9BTjRlmSX53KYr8dVHwo7KLukk=; b=YZOX82M1RedEM/ZDRucWsdu3AuYQ4X5yptGG6HCShGSLr7zxxmrzXPmYQhh//9a8KJ mbefCVXc3rPb4SvwCb4RJlFXd/zXredgW5+N8JA522cFSMr2/GdwMWvU9JbiDNTthI0A 1jiIWNW3747r62IgMtXWTzxKmCXoggXLPpIBBEXWamtLwE4+ezwKcf02HfDwq5Tcl0g/ +UjRUwNOhH7GZ+ONmj2siICIXRuT5Z2Wg9kJIGGJU+NNLHeEyoYGuohiJemO41Zjm0XK zlY4lT9yu8D5VvjqTNnF0R9s2A9n55jehuCV1ajIA2AKiW5+NFwugOtAOvFEJGP7DAFN o/rw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id i1-20020a05640242c100b0044dbb9afe1fsi8616799edc.467.2022.09.12.08.54.20; Mon, 12 Sep 2022 08:54:21 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id C467868BB6B; Mon, 12 Sep 2022 18:53:45 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 28AB568BB2E for ; Mon, 12 Sep 2022 18:53:35 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id D0EEBC00B1 for ; Mon, 12 Sep 2022 18:53:34 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:20 +0300 Message-Id: <20220912155333.59843-5-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 05/18] lavu/riscv: add optimisations X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 6kiudngf/ng/ From: Rémi Denis-Courmont This provides some micro-optimisations for signed integer clipping, and support for bit weight with the Zbb extension. 
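As a rough illustration of the clipping idea mentioned above, the following is a minimal portable sketch, not the code from this patch. The names clip_int8_sketch and clip_int16_sketch are hypothetical. It assumes a 32-bit int, an arithmetic right shift for negative values, and union type punning as supported by GCC and Clang. The value is truncated and sign-extended; if that changes it, the input was out of range and is replaced by the saturated bound derived from its sign.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): clip an int to the int8_t range. */
static inline int8_t clip_int8_sketch(int a)
{
    /* Truncate to 8 bits, then read the same bits back as a signed value. */
    union { uint8_t u; int8_t s; } u = { .u = (uint8_t)a };

    if (a != u.s)
        /* Assumes arithmetic shift: a >> 31 is 0 or -1, so the XOR
         * yields 127 for positive overflow and -128 for negative. */
        a = (a >> 31) ^ 0x7F;
    return (int8_t)a;
}

/* Hypothetical helper (not part of the patch): same idea at 16 bits. */
static inline int16_t clip_int16_sketch(int a)
{
    union { uint16_t u; int16_t s; } u = { .u = (uint16_t)a };

    if (a != u.s)
        /* 0x7FFF is the 16-bit positive bound; XOR with -1 gives -32768. */
        a = (a >> 31) ^ 0x7FFF;
    return (int16_t)a;
}
```

In-range values pass through unchanged; out-of-range values saturate to the nearest bound of the target type.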
--- libavutil/intmath.h | 5 +- libavutil/riscv/intmath.h | 103 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 106 insertions(+), 2 deletions(-) create mode 100644 libavutil/riscv/intmath.h diff --git a/libavutil/intmath.h b/libavutil/intmath.h index 9573109e9d..c54d23b7bf 100644 --- a/libavutil/intmath.h +++ b/libavutil/intmath.h @@ -28,8 +28,9 @@ #if ARCH_ARM # include "arm/intmath.h" -#endif -#if ARCH_X86 +#elif ARCH_RISCV +# include "riscv/intmath.h" +#elif ARCH_X86 # include "x86/intmath.h" #endif diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h new file mode 100644 index 0000000000..78f7ba930a --- /dev/null +++ b/libavutil/riscv/intmath.h @@ -0,0 +1,103 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_RISCV_INTMATH_H +#define AVUTIL_RISCV_INTMATH_H + +#include + +#include "config.h" +#include "libavutil/attributes.h" + +/* + * The compiler is forced to sign-extend the result anyhow, so it is faster to + * compute it explicitly and use it. 
+ */ +#define av_clip_int8 av_clip_int8_rvi +static av_always_inline av_const int8_t av_clip_int8_rvi(int a) +{ + union { uint8_t u; int8_t s; } u = { .u = a }; + + if (a != u.s) + a = ((a >> 31) ^ 0x7F); + return a; +} + +#define av_clip_int16 av_clip_int16_rvi +static av_always_inline av_const int16_t av_clip_int16_rvi(int a) +{ + union { uint16_t u; int16_t s; } u = { .u = a }; + + if (a != u.s) + a = ((a >> 31) ^ 0x7FFF); + return a; +} + +#define av_clipl_int32 av_clipl_int32_rvi +static av_always_inline av_const int32_t av_clipl_int32_rvi(int64_t a) +{ + union { uint32_t u; int32_t s; } u = { .u = a }; + + if (a != u.s) + a = ((a >> 63) ^ 0x7FFFFFFF); + return a; +} + +#define av_clip_intp2 av_clip_intp2_rvi +static av_always_inline av_const int av_clip_intp2_rvi(int a, int p) +{ + const int shift = 32 - p; + int b = (a << shift) >> shift; + + if (a != b) + b = (a >> 31) ^ ((1 << p) - 1); + return b; +} + +#if defined (__riscv_zbb) && (__riscv_zbb > 0) && HAVE_INLINE_ASM + +#define av_popcount av_popcount_rvb +static av_always_inline av_const int av_popcount_rvb(uint32_t x) +{ + int ret; + +#if (__riscv_xlen >= 64) + __asm__ ("cpopw %0, %1\n" : "=r" (ret) : "r" (x)); +#else + __asm__ ("cpop %0, %1\n" : "=r" (ret) : "r" (x)); +#endif + return ret; +} + +#if (__riscv_xlen >= 64) +#define av_popcount64 av_popcount64_rvb +static av_always_inline av_const int av_popcount64_rvb(uint64_t x) +{ + int ret; + +#if (__riscv_xlen >= 128) + __asm__ ("cpopd %0, %1\n" : "=r" (ret) : "r" (x)); +#else + __asm__ ("cpop %0, %1\n" : "=r" (ret) : "r" (x)); +#endif + return ret; +} +#endif /* __riscv_xlen >= 64 */ +#endif /* __riscv_zbb */ + +#endif /* AVUTIL_RISCV_INTMATH_H */ From patchwork Mon Sep 12 15:53:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37864 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP 
id c28csp163409pzh; Mon, 12 Sep 2022 08:54:39 -0700 (PDT) X-Google-Smtp-Source: AA6agR4tlWJVdE7j0gZaMOye7fTIEUPELfASmVZ46gmepjK2luI5wNz9YuUeXITfkz8aGo+PP2CC X-Received: by 2002:a17:906:fe0b:b0:730:3646:d177 with SMTP id wy11-20020a170906fe0b00b007303646d177mr18362527ejb.688.1662998079203; Mon, 12 Sep 2022 08:54:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998079; cv=none; d=google.com; s=arc-20160816; b=LsJXWpikKUgfDc03Q1MSshePLcbnYgYbwYO2GuVyPvnaAb6Zp7Q/YzqLPQbaZh8FED 7d6biFtm3Rb1BMf+cvnzLiQvdeXs/K4cLdMj4n4InlcPNNR3IZZXlFFpTdxejutq5kgq 0b7IgXovZiSZpUEOKUBnHcdw5n4rJ0Orp132ohzbljhkuJaiWfofBKGezlr66P50hCny PpgBs6ruOPMpKeZwdiV0n9hFNHAdim5aroTAUvvXl/yAK6hAl9W1SV70l5Zv96w/7kkj oWLzndbFKmzUKolkuDB+97LD3IbEd/NSVBk/L0q/N2LuwFMl8i1njzqwZCAG8IY46zQn oFVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=nB36rfTD4/CpAJhWCquORxjIm1k9BpjjXZPxWxZ+TyQ=; b=ExnsZVeMHwkdX6j8lxvH2rTeQ7vbb3J60agST4i6bACc4Ny3b1IRGYszOMCl9VEMcN gr3BQKKrYVFxUOpQTHYBE4gnK2n3dk1jbIGihNPtU/YORb1ov0ssyeLSkbM51AIghxcg qYT5XNlfhIg/qGPzglZJbyuuiNsmUJACZb6Pblj/0B55212dNBeGX17fYUlZJHr5Xuuv JgyUB4AdltzEVqSaMeGzxbR+Ua3WZ0oq+51HbRXILLqQNyyeSsUPc6K2OEJt9IuhXqW3 vUB6ryHMN3zOY9KClhC3kb1kjtv7YI6GdkZ49p6p5czPBg1zehSTN2mYe6ABFmofZDh6 385w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id f17-20020a056402355100b004359f471717si8927545edd.0.2022.09.12.08.54.38; Mon, 12 Sep 2022 08:54:39 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BA18D68BA81; Mon, 12 Sep 2022 18:53:47 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F374168BB56 for ; Mon, 12 Sep 2022 18:53:39 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 10522C00B2 for ; Mon, 12 Sep 2022 18:53:35 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:21 +0300 Message-Id: <20220912155333.59843-6-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 06/18] configure: probe RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: bMoygAUkCcHT From: Rémi Denis-Courmont --- Makefile | 2 +- configure | 15 +++++++++++++++ ffbuild/arch.mak | 2 ++ 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 61f79e27ae..1fb742f390 
100644 --- a/Makefile +++ b/Makefile @@ -91,7 +91,7 @@ ffbuild/.config: $(CONFIGURABLE_COMPONENTS) SUBDIR_VARS := CLEANFILES FFLIBS HOSTPROGS TESTPROGS TOOLS \ HEADERS ARCH_HEADERS BUILT_HEADERS SKIPHEADERS \ ARMV5TE-OBJS ARMV6-OBJS ARMV8-OBJS VFP-OBJS NEON-OBJS \ - ALTIVEC-OBJS VSX-OBJS MMX-OBJS X86ASM-OBJS \ + ALTIVEC-OBJS VSX-OBJS RVV-OBJS MMX-OBJS X86ASM-OBJS \ MIPSFPU-OBJS MIPSDSPR2-OBJS MIPSDSP-OBJS MSA-OBJS \ MMI-OBJS LSX-OBJS LASX-OBJS OBJS SLIBOBJS SHLIBOBJS \ STLIBOBJS HOSTOBJS TESTOBJS diff --git a/configure b/configure index b7dc1d8656..c5f20cc323 100755 --- a/configure +++ b/configure @@ -462,6 +462,7 @@ Optimization options (experts only): --disable-mmi disable Loongson MMI optimizations --disable-lsx disable Loongson LSX optimizations --disable-lasx disable Loongson LASX optimizations + --disable-rvv disable RISC-V Vector optimizations --disable-fast-unaligned consider unaligned accesses slow Developer options (useful when working on FFmpeg itself): @@ -2126,6 +2127,10 @@ ARCH_EXT_LIST_PPC=" vsx " +ARCH_EXT_LIST_RISCV=" + rvv +" + ARCH_EXT_LIST_X86=" $ARCH_EXT_LIST_X86_SIMD cpunop @@ -2135,6 +2140,7 @@ ARCH_EXT_LIST_X86=" ARCH_EXT_LIST=" $ARCH_EXT_LIST_ARM $ARCH_EXT_LIST_PPC + $ARCH_EXT_LIST_RISCV $ARCH_EXT_LIST_X86 $ARCH_EXT_LIST_MIPS $ARCH_EXT_LIST_LOONGSON @@ -2642,6 +2648,8 @@ ppc4xx_deps="ppc" vsx_deps="altivec" power8_deps="vsx" +rvv_deps="riscv" + loongson2_deps="mips" loongson3_deps="mips" mmi_deps_any="loongson2 loongson3" @@ -6110,6 +6118,10 @@ elif enabled ppc; then check_cpp_condition power8 "altivec.h" "defined(_ARCH_PWR8)" fi +elif enabled riscv; then + + enabled rvv && check_inline_asm rvv '".option arch, +v\nvsetivli zero, 0, e8, m1, ta, ma"' + elif enabled x86; then check_builtin rdtsc intrin.h "__rdtsc()" @@ -7596,6 +7608,9 @@ if enabled loongarch; then echo "LSX enabled ${lsx-no}" echo "LASX enabled ${lasx-no}" fi +if enabled riscv; then + echo "RISC-V Vector enabled ${rvv-no}" +fi echo "debug symbols ${debug-no}" echo "strip 
symbols ${stripping-no}" echo "optimize for size ${small-no}" diff --git a/ffbuild/arch.mak b/ffbuild/arch.mak index 997e31e85e..39d76ee152 100644 --- a/ffbuild/arch.mak +++ b/ffbuild/arch.mak @@ -15,5 +15,7 @@ OBJS-$(HAVE_LASX) += $(LASX-OBJS) $(LASX-OBJS-yes) OBJS-$(HAVE_ALTIVEC) += $(ALTIVEC-OBJS) $(ALTIVEC-OBJS-yes) OBJS-$(HAVE_VSX) += $(VSX-OBJS) $(VSX-OBJS-yes) +OBJS-$(HAVE_RVV) += $(RVV-OBJS) $(RVV-OBJS-yes) + OBJS-$(HAVE_MMX) += $(MMX-OBJS) $(MMX-OBJS-yes) OBJS-$(HAVE_X86ASM) += $(X86ASM-OBJS) $(X86ASM-OBJS-yes) From patchwork Mon Sep 12 15:53:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37863 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp163272pzh; Mon, 12 Sep 2022 08:54:30 -0700 (PDT) X-Google-Smtp-Source: AA6agR7ZEfsjWJRWU/BkaEPE6XNTHaNoH6i9CxnYqZz+BNec7V0+Lm0rxdxWLNSfnX0gfkL8SnU4 X-Received: by 2002:a17:906:7304:b0:6ff:a76:5b09 with SMTP id di4-20020a170906730400b006ff0a765b09mr18761816ejc.193.1662998069897; Mon, 12 Sep 2022 08:54:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998069; cv=none; d=google.com; s=arc-20160816; b=vEToKG9V+PJD0DkroJ6cdvcTLnrSa6R2R63YS3+PQdytjkwMretORknLJj4KI6Buuh 0o6tQk2LItmbSEtK4F29lCCG+5HLHZTw0rfOs0G+l+IEboh0y/dIfjso+z4OenGL5ZaS tJ42e/vUTL3nCwFW/jbkrX7f4In2GnBmgYbJTGxXrt623IJGo1ixuhXW8lF4VhYQwHxk Wrj76S75+yfSbjetlMbK2q2oBg0t9UA92YdQTd5UQ+F3+C++BAc/8iRU1Exdftk7XMBR vyZ1RZuMXfiPcrnnbCUmAKxqGAe6n5jlVymB8UhjyBlGDrXIsDxfnvZd2w29U37BoSX4 kFMg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=eaEou0KQyOFm/1LpkQQhDLfhyGIV6wIp/FCAO/Z/+CM=; 
b=oJvy45tvRKgiiK5s0JtyfXMnQkVpQdgexT+m9zEHjTad/ybPsPIfc0WWlS4Ziy5De8 Rz/vYqjbSyORPcJZnsrpCss+Rg0H8/N6/XpooXkUkRWpvdSxTituBIav9QMywkPCHkj0 YWmKArV4b2NOpS8F2qo6pq/61KiQKRxdnRv4QLmL/UrmCDQ2Y+Tbzu5sZgtkOGs7W//Z QL/+M2WQC/lFVHHlKiWu1fQ0ybZOQOKRTcK+RQTAlpYHtp5kbWP//URX2pmDhBgPGf8H ScbpBfFN5VVUUU+nbjVbnX3FqD2a1m63veLXoiQiiaSz0QxgxZFhFYoZ8gcU5m473QyT YyEg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id x20-20020a05640226d400b00448ce617058si8070190edd.463.2022.09.12.08.54.29; Mon, 12 Sep 2022 08:54:29 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BF05268BB82; Mon, 12 Sep 2022 18:53:46 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F351D68BB3E for ; Mon, 12 Sep 2022 18:53:39 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 3A32FC00B3 for ; Mon, 12 Sep 2022 18:53:35 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:22 +0300 Message-Id: <20220912155333.59843-7-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 
Subject: [FFmpeg-devel] [PATCH 07/18] lavu/riscv: initial common header for assembler macros X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: QLTZuGet11wS From: Rémi Denis-Courmont --- libavutil/riscv/asm.S | 74 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 libavutil/riscv/asm.S diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S new file mode 100644 index 0000000000..7623c161cf --- /dev/null +++ b/libavutil/riscv/asm.S @@ -0,0 +1,74 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#if defined (__riscv_float_abi_soft) +#define NOHWF +#define NOHWD +#define HWF # +#define HWD # +#elif defined (__riscv_float_abi_single) +#define NOHWF # +#define NOHWD +#define HWF +#define HWD # +#else +#define NOHWF # +#define NOHWD # +#define HWF +#define HWD +#endif + + .macro func sym, ext= + .text + .align 2 + + .option push + .ifnb \ext + .option arch, +\ext + .endif + + .global \sym + .hidden \sym + .type \sym, %function + \sym: + + .macro endfunc + .size \sym, . - \sym + .option pop + .previous + .purgem endfunc + .endm + .endm + + .macro const sym, align=3, relocate=0 + .if \relocate + .pushsection .data.rel.ro + .else + .pushsection .rodata + .endif + .align \align + \sym: + + .macro endconst + .size \sym, . - \sym + .popsection + .purgem endconst + .endm + .endm From patchwork Mon Sep 12 15:53:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37866 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp163698pzh; Mon, 12 Sep 2022 08:54:55 -0700 (PDT) X-Google-Smtp-Source: AA6agR4uqu5dmS2xQ5tMycpbPAww1EZAK9V8IZOhdNIlsMqYh9WKl1FeVJbCehRBaWHzXhwJ64WU X-Received: by 2002:a05:6402:2692:b0:451:6515:1946 with SMTP id w18-20020a056402269200b0045165151946mr8615601edd.417.1662998095704; Mon, 12 Sep 2022 08:54:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998095; cv=none; d=google.com; s=arc-20160816; b=tJ1JKphUKbtkV0pjc66JsYc0kzskYI9EnkVXBsZ/B5IuDSd+zXVmhBIIiFLPadGEki SLdiNT8DF/mN9h5dUELmizp9kgVztE31VGtRAml5ppty8O6QsCmOpgKXrsEjJYSfCBSx oinLRR0spETecjOCaEMeQiRP4Zu3q/QL+LjT9m4Ul0QEvEWG39aSZqEvj6Hz/tXmArCJ 
mJ2aJXW9C6VQ9PkFOlR1+ZrLK/p9GhWddjBt43M8nDFErkksJJ2JeRiOzX3uHFUDYD3F YwNfsZfWnj07sHfoD6mX/MU5SMgQd+x3jmPN0Nzv0507ZO5FrkgS0jaotGr3YO8U9ezO ebGQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=x4Z7817OBUBQjLadX40/XLCrT1iCj6y3Gii2EFLeS6A=; b=C5Zq3yI4AcGNnazTAIfbj+7pu8HfdwusoBDoHVlLNq2PkJ4lVyUXlOQ0F3YQu5DVbk ZXN46LuqNJh30s/1vXvELa6S0UQy0SkdAJD09yNeLgU/Ab/QW9spHE0aMWboU7AltM2T Z05zvCB5MrXaKBePJk2ddvG6eOcKma+htoqWdobOjoZSCip1yJyomWscu2sJJB4ygbZ4 6pq1PG/48nxs+iwH3ugOddR0KNWGiA9LtY7wNOGn8Qn7OLUcovLpJ5wX07wiINmmZ6AH iNyYAbh480bgaDuoogSlzmAx6fApC2Xkhjtg5et+AVwSMCPd4u0MjDoJFGv8QslhRBTU d54Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id f17-20020a170906825100b0077fb99b8d09si246155ejx.304.2022.09.12.08.54.55; Mon, 12 Sep 2022 08:54:55 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E963568BB90; Mon, 12 Sep 2022 18:53:49 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 02AAA68BB3D for ; Mon, 12 Sep 2022 18:53:39 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 644ECC00B4 for ; Mon, 12 Sep 2022 18:53:35 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:23 +0300 Message-Id: <20220912155333.59843-8-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 08/18] lavu/riscv: add CPU flags for the RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: g2kx0p/hl2Au From: Rémi Denis-Courmont RVV defines a total of 12 different extensions, including: - 5 different instruction subsets: - Zve32x: 8-, 16- and 32-bit integers, - Zve32f: Zve32x plus single 
precision floats, - Zve64x: Zve32x plus 64-bit integers, - Zve64f: Zve32f plus Zve64x, - Zve64d: Zve64f plus double precision floats. - 6 different vector lengths: - Zvl32b (embedded only), - Zvl64b (embedded only), - Zvl128b, - Zvl256b, - Zvl512b, - Zvl1024b, - and the V extension proper: equivalent to Zve64d and Zvl128b. In total, there are 6 different possible sets of supported instructions (including the empty set), but for convenience we allocate one bit per element type: up-to-32-bit ints (ZVE32X), floats (ZVE32F), 64-bit ints (ZVE64X) and doubles (ZVE64D). Whenever the vector size is needed, it can be retrieved by reading the unprivileged read-only vlenb CSR. This should probably be a separate helper macro if needed at a later point. --- libavutil/cpu.c | 15 +++++++++++ libavutil/cpu.h | 6 +++++ libavutil/cpu_internal.h | 1 + libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 57 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 80 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..89d2fb6f56 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -178,6 +180,19 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV +#define AV_CPU_FLAG_ZVE32X_M (AV_CPU_FLAG_ZVE32X) +#define AV_CPU_FLAG_ZVE32F_M (AV_CPU_FLAG_ZVE32X_M | AV_CPU_FLAG_ZVE32F) +#define AV_CPU_FLAG_ZVE64X_M (AV_CPU_FLAG_ZVE32X_M | AV_CPU_FLAG_ZVE64X) +#define AV_CPU_FLAG_ZVE64F_M (AV_CPU_FLAG_ZVE32F_M | AV_CPU_FLAG_ZVE64X) +#define 
AV_CPU_FLAG_ZVE64D_M (AV_CPU_FLAG_ZVE64F_M | AV_CPU_FLAG_ZVE64D) +#define AV_CPU_FLAG_VECTORS AV_CPU_FLAG_ZVE64D_M + { "vectors", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VECTORS }, .unit = "flags" }, + { "zve32x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32X }, .unit = "flags" }, + { "zve32f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32F_M }, .unit = "flags" }, + { "zve64x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64X_M }, .unit = "flags" }, + { "zve64f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64F_M }, .unit = "flags" }, + { "zve64d", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64D_M }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..44836e50d6 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,12 @@ #define AV_CPU_FLAG_LSX (1 << 0) #define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V Vector extension +#define AV_CPU_FLAG_ZVE32X (1 << 0) /* 8-, 16-, 32-bit integers */ +#define AV_CPU_FLAG_ZVE32F (1 << 1) /* single precision scalars */ +#define AV_CPU_FLAG_ZVE64X (1 << 2) /* 64-bit integers */ +#define AV_CPU_FLAG_ZVE64D (1 << 3) /* double precision scalars */ + /** * Return the flags which specify extensions supported by the CPU. 
* The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..634f28bac4 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,6 +48,7 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..9e4cce5e8b --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,57 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "config.h" + +#if HAVE_GETAUXVAL +#include +#endif + +#define HWCAP_RV(letter) (1ul << ((letter) - 'A')) + +int ff_get_cpu_flags_riscv(void) +{ + int ret = 0; + + /* If RV-V is enabled statically at compile-time, check the details. 
*/ +#ifdef __riscv_vectors + ret |= AV_CPU_FLAG_ZVE32X; +#if __riscv_v_elen >= 64 + ret |= AV_CPU_FLAG_ZVE64X; +#endif +#if __riscv_v_elen_fp >= 32 + ret |= AV_CPU_FLAG_ZVE32F; +#if __riscv_v_elen_fp >= 64 + ret |= AV_CPU_FLAG_ZVE64D; +#endif +#endif +#endif + +#if HAVE_GETAUXVAL + const unsigned long hwcap = getauxval(AT_HWCAP); + + /* The V extension implies all subsets */ + if (hwcap & HWCAP_RV('V')) + ret |= AV_CPU_FLAG_ZVE32X | AV_CPU_FLAG_ZVE64X + | AV_CPU_FLAG_ZVE32F | AV_CPU_FLAG_ZVE64D; +#endif + + return ret; +} From patchwork Mon Sep 12 15:53:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37865 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp163558pzh; Mon, 12 Sep 2022 08:54:47 -0700 (PDT) X-Google-Smtp-Source: AA6agR6rOwJ24qv5nHrv7gbJ3Hf+BhOkmX09Cszfaq3NC8pagcvEVNNfSTUAP6JHTNxqC6MIDSon X-Received: by 2002:a17:907:b1b:b0:772:1dcc:a512 with SMTP id h27-20020a1709070b1b00b007721dcca512mr16492865ejl.247.1662998087445; Mon, 12 Sep 2022 08:54:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998087; cv=none; d=google.com; s=arc-20160816; b=KRx5QC9F1lkCAd2QbFfBXUEUuOndoQCwIUPVaTLDO142LJIe0LQMBJ1XNWVsUcjyaJ o80yzw8la3aLitKiGJei7PhIn8w/cdI4RINRrOhTVkb9L+gSTc6iBKIw0S4AaexgVf6h IWK+b2+FcykFg0Z4oaPU9psdVfU8ltroe8WMaOIQ0Cep0wopGFsQAUDWJXo6HU0sGpQx sL8hg8/O8j6YpPQbp3lQDiZM3SULQyT58g15ean39GunQRhYcu1lJeaL6zw2GHlm2UC0 TDEl0NOiTS6VX98Wvezwzwc6JWgC7/T9IYLWUL26SHvwrUSQrRjSmCqnRv3eXSqfF5hA Lp7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=uN7M5zdHFPiX2RopVgTytSwJNztn3reCBDQhsgul/NI=; 
b=R6DmbK8nMrzGCbf8Ll55Yx8iQAxy5pYTirliCwD8PUsvAOZ1wjNha724w/PYVsM4zm a/uW0NAxH81wrK+TzdfEVCf70+yOGGfZIV6OUSo1yygJWEREZttXhT10KaDubuJLn7Ds 0BJop9KmwiHK/Y6nLNStNuqfawmiPxLwlPQUg1pF978ClVWyFQa6HVHj5IjU4FYCfRBP 7kPLUKQNGfUPxTPThuCeeSpcjEuOJEj47VEY2fpoHXcGwW1uguX/CDhATaIT7BlNZ7LZ /ojWEvIJ4i1j//KfzPQO9rTNZ4AnUR2sLUD5FwxgzgjxZGuMHIt7IczvcTQjePa91a/M XU1g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id y7-20020a17090614c700b00741827c6304si6470545ejc.772.2022.09.12.08.54.47; Mon, 12 Sep 2022 08:54:47 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id C4D4968BB3E; Mon, 12 Sep 2022 18:53:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0720D68BB5B for ; Mon, 12 Sep 2022 18:53:40 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 8EB5AC00B5 for ; Mon, 12 Sep 2022 18:53:35 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:24 +0300 Message-Id: <20220912155333.59843-9-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: 
[FFmpeg-devel] [PATCH 09/18] checkasm: register the RISC-V V subsets X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: vobs+zM2UtUW From: Rémi Denis-Courmont --- tests/checkasm/checkasm.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index e56fd3850e..a5d0503811 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -226,6 +226,11 @@ static const struct { { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC }, { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, +#elif ARCH_RISCV + { "Zve32x", "zve32x", AV_CPU_FLAG_ZVE32X }, + { "Zve32f", "zve32f", AV_CPU_FLAG_ZVE32F }, + { "Zve64x", "zve64x", AV_CPU_FLAG_ZVE64X }, + { "Zve64d", "zve64d", AV_CPU_FLAG_ZVE64D }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA }, From patchwork Mon Sep 12 15:53:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37868 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp163886pzh; Mon, 12 Sep 2022 08:55:12 -0700 (PDT) X-Google-Smtp-Source: AA6agR4UAzfANfW+VFVC0hLhnzX7JrLaBxBQNn0nQ+s56ilaIyyetXTajZMmZz11EyvOm3daAKkx X-Received: by 2002:a17:907:97ce:b0:779:ed37:b5a1 with SMTP id js14-20020a17090797ce00b00779ed37b5a1mr10823385ejc.650.1662998112015; Mon, 12 Sep 2022 08:55:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662998112; cv=none; d=google.com; s=arc-20160816; b=mENKbE9nr8S7dxbA+2YjUgH4NihRv+dPp0p9nAo2kXaAB424sEfCbZHDT3HYOvUAu5 gue+PATxyXtEnIaYH8PqEoZfTgUl+BT/1qCQALx+kVkCUuGGdJ+fFtqAkKYLX5O+1teV 
q9MnDNZxoaJD7XA9eCa4pk24xhdtMnDlYxzF0Zzg/b93sp88kIXY1FhTeR4u52dRvkvd moP3ifZz+yoo/JvVsZViiPvqANOVDbUbgBYk1+KcdSVwIrK5M8JhZXaQW7ruceWAlByL 6yQh+f9QB8/uXf76UCA/8pm9MI91NNs8n+nFEnhXk1mvkWkyFnfirlop8HKztra1I/aj sfOA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=W+SE30qC+DQGZphRMXs9MClTj7bqrLeqI445UFRyJ5I=; b=FwvP2iZLBqgpSERdkpbEqvaNtalcxkty88iHu3D7ywBsiaA7aVwI/az5mri57N0iPu ap6Egvzl5Lg0ltwWCaczfrMNEUTtTXAveCZ6WYlyCVjeGjChyLfpYefLPMzVy4hkwzWH dHEHT84nZdziy0j7tgMhrj1JVOl4SjHN/D/hrYiaOXcEaY1OXmCRhrqUXA1KqzJvjJCj 09E8YJg4zsuPbFyBzyOt53bY3GnvNonl94bHbKpVJFZRWYa4Ga85elGQCS1EsigtB0EU UhLrnSaIexByFL4jbOvOHIyIa9yMvIVzfGaW2cdCAdZolzo61g2BVYvEj6YfH5SguxMN a7vA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id o15-20020a170906974f00b007317a6beb8fsi7360080ejy.502.2022.09.12.08.55.11; Mon, 12 Sep 2022 08:55:12 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 19D9F68BB3D; Mon, 12 Sep 2022 18:53:52 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 09B8B68BB5C for ; Mon, 12 Sep 2022 18:53:40 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id B81A0C00B6 for ; Mon, 12 Sep 2022 18:53:35 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 12 Sep 2022 18:53:25 +0300 Message-Id: <20220912155333.59843-10-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net> References: <2652141.mvXUDI8C0e@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 10/18] lavu/riscv: float vector-scalar multiplication with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: S9jefb183qMT From: Rémi Denis-Courmont This is based on existing code from the VLC git tree with two minor changes to account for the different function prototypes. 
--- libavutil/float_dsp.c | 2 ++ libavutil/float_dsp.h | 1 + libavutil/riscv/Makefile | 4 ++- libavutil/riscv/float_dsp_init.c | 44 +++++++++++++++++++++++++ libavutil/riscv/float_dsp_rvv.S | 56 ++++++++++++++++++++++++++++++++ 5 files changed, 106 insertions(+), 1 deletion(-) create mode 100644 libavutil/riscv/float_dsp_init.c create mode 100644 libavutil/riscv/float_dsp_rvv.S diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c index 8676c8b0f8..742dd679d2 100644 --- a/libavutil/float_dsp.c +++ b/libavutil/float_dsp.c @@ -156,6 +156,8 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact) ff_float_dsp_init_arm(fdsp); #elif ARCH_PPC ff_float_dsp_init_ppc(fdsp, bit_exact); +#elif ARCH_RISCV + ff_float_dsp_init_riscv(fdsp); #elif ARCH_X86 ff_float_dsp_init_x86(fdsp); #elif ARCH_MIPS diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h index 9c664592bd..7cad9fc622 100644 --- a/libavutil/float_dsp.h +++ b/libavutil/float_dsp.h @@ -205,6 +205,7 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len); void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp); void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp); void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict); +void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp); void ff_float_dsp_init_x86(AVFloatDSPContext *fdsp); void ff_float_dsp_init_mips(AVFloatDSPContext *fdsp); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile index 1f818043dc..89a8d0d990 100644 --- a/libavutil/riscv/Makefile +++ b/libavutil/riscv/Makefile @@ -1 +1,3 @@ -OBJS += riscv/cpu.o +OBJS += riscv/float_dsp_init.o \ + riscv/cpu.o +RVV-OBJS += riscv/float_dsp_rvv.o diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c new file mode 100644 index 0000000000..f1d3d52877 --- /dev/null +++ b/libavutil/riscv/float_dsp_init.c @@ -0,0 +1,44 @@ +/* + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "config.h" +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavutil/float_dsp.h" + +void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, + int len); + +void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul, + int len); + +av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_ZVE32F) { + fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + + if (flags & AV_CPU_FLAG_ZVE64D) + fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv; + } +#endif +} diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S new file mode 100644 index 0000000000..365e00190c --- /dev/null +++ b/libavutil/riscv/float_dsp_rvv.S @@ -0,0 +1,56 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_fmul_scalar_rvv, zve32f
+NOHWF   fmv.w.x fa0, a2
+NOHWF   mv      a2, a3
+
+1:      vsetvli t0, a2, e32, m8, ta, ma
+        slli    t1, t0, 2
+        vle32.v v16, (a1)
+        add     a1, a1, t1
+        vfmul.vf v16, v16, fa0
+        sub     a2, a2, t0
+        vse32.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a2, 1b
+
+        ret
+endfunc
+
+// (a0) = (a1) * fa0 [0..a2-1]
+func ff_vector_dmul_scalar_rvv, zve64d
+NOHWD   fmv.d.x fa0, a2
+NOHWD   mv      a2, a3
+
+1:      vsetvli t0, a2, e64, m8, ta, ma
+        slli    t1, t0, 3
+        vle64.v v16, (a1)
+        add     a1, a1, t1
+        vfmul.vf v16, v16, fa0
+        sub     a2, a2, t0
+        vse64.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a2, 1b
+
+        ret
+endfunc

From patchwork Mon Sep 12 15:53:26 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37867
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Sep 2022 18:53:26 +0300
Message-Id: <20220912155333.59843-11-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net>
References: <2652141.mvXUDI8C0e@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 11/18] lavu/riscv: float vector-vector multiplication with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  9 ++++++++-
 libavutil/riscv/float_dsp_rvv.S  | 34 ++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)
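
Reader note (not part of the submitted patch): the loop this patch adds in ff_vector_fmul_rvv follows the usual RVV strip-mining pattern, in which vsetvli reports how many elements the hardware will handle on each pass and the pointers and remaining length are advanced by that amount. The scalar C model below is only a sketch of that control flow. The names model_vsetvli and model_vector_fmul and the vlmax parameter are hypothetical, and the min() rule is a simplification of what vsetvli is allowed to return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for vsetvli: returns the number of elements (vl)
 * processed this pass. Real hardware derives the upper bound from VLEN,
 * SEW and LMUL; here it is passed in as the assumed parameter vlmax. */
static size_t model_vsetvli(size_t remaining, size_t vlmax)
{
    return remaining < vlmax ? remaining : vlmax;
}

/* Scalar model of the strip-mined loop: dst[i] = src0[i] * src1[i]. */
static void model_vector_fmul(float *dst, const float *src0,
                              const float *src1, size_t len, size_t vlmax)
{
    assert(vlmax > 0);

    while (len > 0) {
        size_t vl = model_vsetvli(len, vlmax);  /* vsetvli t0, a3, ... */

        for (size_t i = 0; i < vl; i++)         /* vle32.v, vfmul.vv, vse32.v */
            dst[i] = src0[i] * src1[i];

        src0 += vl;                             /* add a1, a1, t1 */
        src1 += vl;                             /* add a2, a2, t1 */
        dst  += vl;                             /* add a0, a0, t1 */
        len  -= vl;                             /* sub a3, a3, t0 */
    }
}
```

Because the last pass simply gets a smaller vl, no separate scalar tail loop is needed, which is why the assembly has a single loop body per function.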
diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index f1d3d52877..903da4eeda 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -23,9 +23,13 @@
 #include "libavutil/cpu.h"
 #include "libavutil/float_dsp.h"
 
+void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
+                        int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
+void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
+                        int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -35,10 +39,13 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
     int flags = av_get_cpu_flags();
 
     if (flags & AV_CPU_FLAG_ZVE32F) {
+        fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
-        if (flags & AV_CPU_FLAG_ZVE64D)
+        if (flags & AV_CPU_FLAG_ZVE64D) {
+            fdsp->vector_dmul = ff_vector_dmul_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+        }
     }
 #endif
 }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 365e00190c..65c3a77b01 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -19,6 +19,23 @@
 #include "config.h"
 #include "asm.S"
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_fmul_rvv, zve32f
+1:      vsetvli t0, a3, e32, m8, ta, ma
+        slli    t1, t0, 2
+        vle32.v v16, (a1)
+        add     a1, a1, t1
+        vle32.v v24, (a2)
+        add     a2, a2, t1
+        vfmul.vv v16, v16, v24
+        sub     a3, a3, t0
+        vse32.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x fa0, a2
@@ -37,6 +54,23 @@ NOHWF   mv      a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) [0..a3-1]
+func ff_vector_dmul_rvv, zve64d
+1:      vsetvli t0, a3, e64, m8, ta, ma
+        slli    t1, t0, 3
+        vle64.v v16, (a1)
+        add     a1, a1, t1
+        vle64.v v24, (a2)
+        add     a2, a2, t1
+        vfmul.vv v16, v16, v24
+        sub     a3, a3, t0
+        vse64.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x fa0, a2

From patchwork Mon Sep 12 15:53:27 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37869
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Sep 2022 18:53:27 +0300
Message-Id: <20220912155333.59843-12-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net>
References: <2652141.mvXUDI8C0e@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 12/18] lavu/riscv: float vector multiply-accumulate with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  6 +++++
 libavutil/riscv/float_dsp_rvv.S  | 38 ++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 903da4eeda..1381eadab6 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -25,11 +25,15 @@
 
 void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1,
                         int len);
+void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
+                                int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
+void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul,
+                                int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 
@@ -40,10 +44,12 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 
     if (flags & AV_CPU_FLAG_ZVE32F) {
         fdsp->vector_fmul = ff_vector_fmul_rvv;
+        fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
+            fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
         }
     }
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 65c3a77b01..5a7d92abd6 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -36,6 +36,25 @@ func ff_vector_fmul_rvv, zve32f
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_fmac_scalar_rvv, zve32f
+NOHWF   fmv.w.x fa0, a2
+NOHWF   mv      a2, a3
+
+1:      vsetvli t0, a2, e32, m8, ta, ma
+        slli    t1, t0, 2
+        vle32.v v24, (a1)
+        add     a1, a1, t1
+        vle32.v v16, (a0)
+        vfmacc.vf v16, fa0, v24
+        sub     a2, a2, t0
+        vse32.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_fmul_scalar_rvv, zve32f
 NOHWF   fmv.w.x fa0, a2
@@ -71,6 +90,25 @@ func ff_vector_dmul_rvv, zve64d
         ret
 endfunc
 
+// (a0) += (a1) * fa0 [0..a2-1]
+func ff_vector_dmac_scalar_rvv, zve64d
+NOHWD   fmv.d.x fa0, a2
+NOHWD   mv      a2, a3
+
+1:      vsetvli t0, a2, e64, m8, ta, ma
+        slli    t1, t0, 3
+        vle64.v v24, (a1)
+        add     a1, a1, t1
+        vle64.v v16, (a0)
+        vfmacc.vf v16, fa0, v24
+        sub     a2, a2, t0
+        vse64.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * fa0 [0..a2-1]
 func ff_vector_dmul_scalar_rvv, zve64d
 NOHWD   fmv.d.x fa0, a2

From patchwork Mon Sep 12 15:53:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37872
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Sep 2022 18:53:28 +0300
Message-Id: <20220912155333.59843-13-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net>
References: <2652141.mvXUDI8C0e@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 13/18] lavu/riscv: float vector multiplication-addition with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 1381eadab6..9bc1976d04 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -29,6 +29,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
+void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
+                            const float *src2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
@@ -46,6 +48,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 5a7d92abd6..efbf12179f 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -73,6 +73,25 @@ NOHWF   mv      a2, a3
         ret
 endfunc
 
+// (a0) = (a1) * (a2) + (a3) [0..a4-1]
+func ff_vector_fmul_add_rvv, zve32f
+1:      vsetvli t0, a4, e32, m8, ta, ma
+        slli    t1, t0, 2
+        vle32.v v8, (a1)
+        add     a1, a1, t1
+        vle32.v v16, (a2)
+        add     a2, a2, t1
+        vle32.v v24, (a3)
+        add     a3, a3, t1
+        vfmadd.vv v8, v16, v24
+        sub     a4, a4, t0
+        vse32.v v8, (a0)
+        add     a0, a0, t1
+        bnez    a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:      vsetvli t0, a3, e64, m8, ta, ma

From patchwork Mon Sep 12 15:53:29 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37874
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Sep 2022 18:53:29 +0300
Message-Id: <20220912155333.59843-14-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net>
References: <2652141.mvXUDI8C0e@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 14/18] lavu/riscv: float vector sum-and-difference with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  2 ++
 libavutil/riscv/float_dsp_rvv.S  | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 9bc1976d04..c2b72c3b25 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -31,6 +31,7 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                             const float *src2, int len);
+void ff_butterflies_float_rvv(float *v1, float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                         int len);
@@ -49,6 +50,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
+        fdsp->butterflies_float = ff_butterflies_float_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index efbf12179f..1c3b08b94f 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -92,6 +92,24 @@ func ff_vector_fmul_add_rvv, zve32f
         ret
 endfunc
 
+// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
+func ff_butterflies_float_rvv, zve32f
+1:      vsetvli t0, a2, e32, m8, ta, ma
+        slli    t1, t0, 2
+        vle32.v v16, (a0)
+        vle32.v v24, (a1)
+        vfadd.vv v0, v16, v24
+        vfsub.vv v8, v16, v24
+        sub     a2, a2, t0
+        vse32.v v0, (a0)
+        add     a0, a0, t1
+        vse32.v v8, (a1)
+        add     a1, a1, t1
+        bnez    a2, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:      vsetvli t0, a3, e64, m8, ta, ma

From patchwork Mon Sep 12 15:53:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37875
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Sep 2022 18:53:30 +0300
Message-Id: <20220912155333.59843-15-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net>
References: <2652141.mvXUDI8C0e@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 15/18] lavu/riscv: float reversed vector multiplication with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 22 ++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index c2b72c3b25..ae089d2fdb 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -31,6 +31,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                             const float *src2, int len);
+void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
+                                const float *src1, int len);
 void ff_butterflies_float_rvv(float *v1, float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
@@ -50,6 +52,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
+        fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 1c3b08b94f..b376392294 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -92,6 +92,28 @@ func ff_vector_fmul_add_rvv, zve32f
         ret
 endfunc
 
+// (a0) = (a1) * reverse(a2) [0..a3-1]
+func ff_vector_fmul_reverse_rvv, zve32f
+        add     t3, a3, -1
+        li      t2, -4 // byte stride
+        slli    t3, t3, 2
+        add     a2, a2, t3
+
+1:      vsetvli t0, a3, e32, m8, ta, ma
+        slli    t1, t0, 2
+        vle32.v v16, (a1)
+        add     a1, a1, t1
+        vlse32.v v24, (a2), t2
+        sub     a2, a2, t1
+        vfmul.vv v16, v16, v24
+        sub     a3, a3, t0
+        vse32.v v16, (a0)
+        add     a0, a0, t1
+        bnez    a3, 1b
+
+        ret
+endfunc
+
 // (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
 func ff_butterflies_float_rvv, zve32f
 1:      vsetvli t0, a2, e32, m8, ta, ma

From patchwork Mon Sep 12 15:53:31 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37873
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Sep 2022 18:53:31 +0300
Message-Id: <20220912155333.59843-16-remi@remlab.net>
In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net>
References: <2652141.mvXUDI8C0e@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 16/18] lavu/riscv: float vector windowed overlap/add with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 35 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index ae089d2fdb..cf8c995d7c 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -29,6 +29,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
 void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul,
                                 int len);
+void ff_vector_fmul_window_rvv(float *dst, const float *src0,
+                                const float *src1, const float *win, int len);
 void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
                              const float *src2, int len);
 void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
@@ -51,6 +53,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul = ff_vector_fmul_rvv;
         fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv;
         fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv;
+        fdsp->vector_fmul_window = ff_vector_fmul_window_rvv;
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
         fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index b376392294..65daaa2d27 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -73,6 +73,41 @@ NOHWF   mv           a2, a3
         ret
 endfunc
 
+func ff_vector_fmul_window_rvv, zve32f
+        // a0: dst, a1: src0, a2: src1, a3: window, a4: length
+        addi         t0, a4, -1
+        add          t1, t0, a4
+        slli         t0, t0, 2
+        slli         t1, t1, 2
+        add          a2, a2, t0
+        add          t0, a0, t1
+        add          t3, a3, t1
+        li           t1, -4 // byte stride
+
+1:      vsetvli      t2, a4, e32, m4, ta, ma
+        slli         t4, t2, 2
+        vle32.v      v16, (a1)
+        add          a1, a1, t4
+        vlse32.v     v20, (a2), t1
+        sub          a2, a2, t4
+        vle32.v      v24, (a3)
+        add          a3, a3, t4
+        vlse32.v     v28, (t3), t1
+        sub          t3, t3, t4
+        vfmul.vv     v0, v16, v28
+        sub          a4, a4, t2
+        vfmul.vv     v8, v16, v24
+        vfnmsac.vv   v0, v20, v24
+        vfmacc.vv    v8, v20, v28
+        vse32.v      v0, (a0)
+        add          a0, a0, t4
+        vsse32.v     v8, (t0), t1
+        sub          t0, t0, t4
+        bnez         a4, 1b
+
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) + (a3) [0..a4-1]
 func ff_vector_fmul_add_rvv, zve32f
 1:      vsetvli      t0, a4, e32, m8, ta, ma

From patchwork Mon Sep 12 15:53:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37871
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Sep 2022 18:53:32 +0300
Message-Id: <20220912155333.59843-17-remi@remlab.net>
In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net>
References: <2652141.mvXUDI8C0e@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 17/18] lavu/riscv: float vector dot product with RVV

From: Rémi Denis-Courmont

---
 libavutil/riscv/float_dsp_init.c |  2 ++
 libavutil/riscv/float_dsp_rvv.S  | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index cf8c995d7c..055cdc7520 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -36,6 +36,7 @@ void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1,
 void ff_vector_fmul_reverse_rvv(float *dst, const float *src0,
                                  const float *src1, int len);
 void ff_butterflies_float_rvv(float *v1, float *v2, int len);
+float ff_scalarproduct_float_rvv(const float *v1, const float *v2, int len);
 
 void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1,
                          int len);
@@ -57,6 +58,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
         fdsp->vector_fmul_add = ff_vector_fmul_add_rvv;
         fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv;
         fdsp->butterflies_float = ff_butterflies_float_rvv;
+        fdsp->scalarproduct_float = ff_scalarproduct_float_rvv;
 
         if (flags & AV_CPU_FLAG_ZVE64D) {
             fdsp->vector_dmul = ff_vector_dmul_rvv;
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 65daaa2d27..81bd0e510a 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -167,6 +167,27 @@ func ff_butterflies_float_rvv, zve32f
         ret
 endfunc
 
+// a0 = (a0).(a1) [0..a2-1]
+func ff_scalarproduct_float_rvv, zve32f
+        vsetvli      zero, zero, e32, m8, ta, ma
+        vmv.s.x      v8, zero
+
+1:      vsetvli      t0, a2, e32, m8, ta, ma
+        slli         t1, t0, 2
+        vle32.v      v16, (a0)
+        add          a0, a0, t1
+        vle32.v      v24, (a1)
+        add          a1, a1, t1
+        vfmul.vv     v16, v16, v24
+        sub          a2, a2, t0
+        vfredusum.vs v8, v16, v8
+        bnez         a2, 1b
+
+        vfmv.f.s     fa0, v8
+NOHWF   fmv.x.w      a0, fa0
+        ret
+endfunc
+
 // (a0) = (a1) * (a2) [0..a3-1]
 func ff_vector_dmul_rvv, zve64d
 1:      vsetvli      t0, a3, e64, m8, ta, ma

From patchwork Mon Sep 12 15:53:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 37870
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 12 Sep 2022 18:53:33 +0300
Message-Id: <20220912155333.59843-18-remi@remlab.net>
In-Reply-To: <2652141.mvXUDI8C0e@basile.remlab.net>
References: <2652141.mvXUDI8C0e@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 18/18] lavu/riscv: fixed vector sum-and-difference with RVV

From: Rémi Denis-Courmont

---
 libavutil/fixed_dsp.c            |  4 +++-
 libavutil/fixed_dsp.h            |  1 +
 libavutil/riscv/Makefile         |  4 +++-
 libavutil/riscv/fixed_dsp_init.c | 36 ++++++++++++++++++++++++++++++
 libavutil/riscv/fixed_dsp_rvv.S  | 38 ++++++++++++++++++++++++++++++++
 5 files changed, 81 insertions(+), 2 deletions(-)
 create mode 100644 libavutil/riscv/fixed_dsp_init.c
 create mode 100644 libavutil/riscv/fixed_dsp_rvv.S

diff --git a/libavutil/fixed_dsp.c b/libavutil/fixed_dsp.c
index 154f3bc2d3..bc847949dc 100644
--- a/libavutil/fixed_dsp.c
+++ b/libavutil/fixed_dsp.c
@@ -162,7 +162,9 @@ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int bit_exact)
     fdsp->butterflies_fixed = butterflies_fixed_c;
     fdsp->scalarproduct_fixed = scalarproduct_fixed_c;
 
-#if ARCH_X86
+#if ARCH_RISCV
+    ff_fixed_dsp_init_riscv(fdsp);
+#elif ARCH_X86
     ff_fixed_dsp_init_x86(fdsp);
 #endif
 
diff --git a/libavutil/fixed_dsp.h b/libavutil/fixed_dsp.h
index fec806ff2d..1217d3a53b 100644
--- a/libavutil/fixed_dsp.h
+++ b/libavutil/fixed_dsp.h
@@ -161,6 +161,7 @@ typedef struct AVFixedDSPContext {
  */
 AVFixedDSPContext * avpriv_alloc_fixed_dsp(int strict);
 
+void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp);
 void ff_fixed_dsp_init_x86(AVFixedDSPContext *fdsp);
 
 /**
diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile
index 89a8d0d990..1597154ba5 100644
--- a/libavutil/riscv/Makefile
+++ b/libavutil/riscv/Makefile
@@ -1,3 +1,5 @@
 OBJS += riscv/float_dsp_init.o \
+        riscv/fixed_dsp_init.o \
         riscv/cpu.o
-RVV-OBJS += riscv/float_dsp_rvv.o
+RVV-OBJS += riscv/float_dsp_rvv.o \
+            riscv/fixed_dsp_rvv.o
diff --git a/libavutil/riscv/fixed_dsp_init.c b/libavutil/riscv/fixed_dsp_init.c
new file mode 100644
index 0000000000..fc143fb419
--- /dev/null
+++ b/libavutil/riscv/fixed_dsp_init.c
@@ -0,0 +1,36 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/fixed_dsp.h"
+
+void ff_butterflies_fixed_rvv(int *v1, int *v2, int len);
+
+av_cold void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_ZVE32X)
+        fdsp->butterflies_fixed = ff_butterflies_fixed_rvv;
+#endif
+}
diff --git a/libavutil/riscv/fixed_dsp_rvv.S b/libavutil/riscv/fixed_dsp_rvv.S
new file mode 100644
index 0000000000..beb1b949f7
--- /dev/null
+++ b/libavutil/riscv/fixed_dsp_rvv.S
@@ -0,0 +1,38 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1]
+func ff_butterflies_fixed_rvv, zve32x
+1:      vsetvli      t0, a2, e32, m8, ta, ma
+        slli         t1, t0, 2
+        vle32.v      v16, (a0)
+        vle32.v      v24, (a1)
+        vadd.vv      v0, v16, v24
+        vsub.vv      v8, v16, v24
+        sub          a2, a2, t0
+        vse32.v      v0, (a0)
+        add          a0, a0, t1
+        vse32.v      v8, (a1)
+        add          a1, a1, t1
+        bnez         a2, 1b
+
+        ret
+endfunc
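
For review purposes, the kernels added in patches 15 to 18 can be checked against plain C models. The sketch below is not FFmpeg code: the *_ref function names and SKETCH_VLMAX are hypothetical, and the semantics are inferred from the register comments and load/store pattern in the assembly above. The butterflies model mimics the vsetvli strip-mining loop with a fixed chunk size; real hardware picks the chunk from VLEN, SEW and LMUL.

```c
#include <assert.h>

/* Hypothetical chunk size standing in for the vl that vsetvli would grant. */
#define SKETCH_VLMAX 8

/* Model of ff_vector_fmul_reverse_rvv: src1 is read back to front,
 * which the assembly does with a strided load and a -4 byte stride. */
static void fmul_reverse_ref(float *dst, const float *src0,
                             const float *src1, int len)
{
    assert(len >= 0);
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[len - 1 - i];
}

/* Model of ff_vector_fmul_window_rvv: dst and win hold 2 * len elements,
 * src0 and src1 hold len elements each. */
static void fmul_window_ref(float *dst, const float *src0, const float *src1,
                            const float *win, int len)
{
    assert(len >= 0);
    for (int i = 0, j = 2 * len - 1; i < len; i++, j--) {
        float s0 = src0[i], s1 = src1[len - 1 - i];
        float wi = win[i], wj = win[j];
        dst[i] = s0 * wj - s1 * wi;   /* vfmul.vv + vfnmsac.vv */
        dst[j] = s0 * wi + s1 * wj;   /* vfmul.vv + vfmacc.vv  */
    }
}

/* Model of ff_scalarproduct_float_rvv. The vector version reduces with an
 * unordered sum, so results may differ in the last bits for general input. */
static float scalarproduct_ref(const float *v1, const float *v2, int len)
{
    float sum = 0.0f;
    for (int i = 0; i < len; i++)
        sum += v1[i] * v2[i];
    return sum;
}

/* Strip-mined model of ff_butterflies_fixed_rvv. Inputs are assumed small
 * enough that the sum and difference do not overflow int. */
static void butterflies_fixed_ref(int *v1, int *v2, int len)
{
    while (len > 0) {
        int vl = len < SKETCH_VLMAX ? len : SKETCH_VLMAX; /* vsetvli */
        for (int i = 0; i < vl; i++) {
            int a = v1[i], b = v2[i]; /* both loads precede both stores */
            v1[i] = a + b;
            v2[i] = a - b;
        }
        v1 += vl;
        v2 += vl;
        len -= vl;
    }
}
```

Feeding the same buffers to these models and to the function pointers installed by ff_float_dsp_init_riscv() and ff_fixed_dsp_init_riscv() gives a quick cross-check; the float kernels should be compared with a small tolerance rather than bit-exactly.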