From patchwork Fri Sep 9 15:48:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37790 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp997593pzh; Fri, 9 Sep 2022 08:49:10 -0700 (PDT) X-Google-Smtp-Source: AA6agR43YkOn67acTloIEDlMIS5Ut8rafZ0gbtbMr8BE0uO5EP/1AsZBttwh0Mr1K2TywIGT0B0w X-Received: by 2002:a17:906:30c8:b0:73c:81a9:f8e1 with SMTP id b8-20020a17090630c800b0073c81a9f8e1mr10204400ejb.649.1662738550045; Fri, 09 Sep 2022 08:49:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738550; cv=none; d=google.com; s=arc-20160816; b=mjy96ud56XvzUYUvYjKymDlSHlw8vjRbvY4AH42eG9O19+oIKPtUlAnV1N1cQ8LmTP hMGXUKSowHHilnu5DX5aRKcmm776fM2TBpA43qFwVyHaCdMGhiuIKOSm+47MtFYnFu6l hc69Sg4TLQGEVjWkSWNwPUHo3gCp2ZUcCsb0J3dc1SXItqy+C35h1yo2wTvgj0ubQOPZ wJk1UFCKwF5Dvf7JDiZkGcXTXOQdTD9nK5kJMdehwvA/XMfUb9uUlrGui45iVYCuml/4 WfL7LR+rHexoHI1DvwHbdDlBbssQVGz9qNQYxSg6LKQv1AfPTL8uuyMeqSfgGpiV3lEt KiTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=6PlyFkHqfrV5ybdML8P46hrfUGGhmQk9yJv6encpIKM=; b=B4KbyTZNft/P4hBouv8st5D4JD0Eq65slZEU79j+eJDkAZoQvWejYpaBNH0rOv0M9T OC/PlzY2KDXnwQ6V+XwDku5UYcSmQ/fH1wcha1EvWXf6DsrkoIGsVOR5KQrWN9WcfaG7 R5IcTGN4GLrwQQI4T/Kjd7JWhZkasU7y8T6KEzrUE5P/fXzIE8Cr5aLgbNcLpjPo3h04 EfffK/+hxKaPsrSW4uz5Ctw32/n+PzJNCVHQwZYsC5ubCg7TJZtO3HgRzyO8lXL1yQpX N3ybYAIHi1m8yIG4v4GI75YOgxv5eCb18KMD00jHKA/b2nxpH5GXxqOA2klBm4Qgkh7K ZcNg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id c12-20020a056402120c00b0044e9be0186fsi621450edw.546.2022.09.09.08.49.09; Fri, 09 Sep 2022 08:49:10 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id AF26568BB39; Fri, 9 Sep 2022 18:49:06 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E141C68B980 for ; Fri, 9 Sep 2022 18:48:59 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 3F3C8C000E for ; Fri, 9 Sep 2022 18:48:59 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:42 +0300 Message-Id: <20220909154859.68954-1-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 01/18] doc: reference the RISC-V specification X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: rUQNKkWQjy7M From: Rémi Denis-Courmont --- doc/optimization.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/doc/optimization.txt b/doc/optimization.txt index 974e2f9af2..3ed29fe38c 100644 --- a/doc/optimization.txt +++ b/doc/optimization.txt @@ -267,6 +267,11 @@ CELL/SPU: http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/30B3520C93F437AB87257060006FFE5E/$file/Language_Extensions_for_CBEA_2.4.pdf http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/9F820A5FFA3ECE8C8725716A0062585F/$file/CBE_Handbook_v1.1_24APR2007_pub.pdf +RISC-V-specific: +---------------- +The RISC-V Instruction Set Manual, Volume 1, Unprivileged ISA: +https://riscv.org/technical/specifications/ + GCC asm links: -------------- official doc but quite ugly From patchwork Fri Sep 9 15:48:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37793 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp997836pzh; Fri, 9 Sep 2022 08:49:37 -0700 (PDT) X-Google-Smtp-Source: AA6agR5MD/sfmOSYO+9HZbnkBk5SCW5PH9u5R22XUq7RFHiqkW3pMMF0UY2s7BuZYnKWMs5si7L6 X-Received: by 2002:a05:6402:2289:b0:44e:f490:319a with SMTP id cw9-20020a056402228900b0044ef490319amr12259119edb.28.1662738567257; Fri, 09 Sep 2022 08:49:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738567; cv=none; d=google.com; s=arc-20160816; b=Ll2l9oKKX3Xag6n10gTPIV/RX4N4J71xn9wezuMvliI0GpldKPzxSe9fCmo++N9AFb PfQHA+hTxu0RaIrXTFDsEIc+/mAYpfAj4rrrMEXz9Fam/+781A0k3UwyXGOrsNaAmq3E jqHL8Y2I7FWPW3mfQhaVGh6eHFpssN4YflYuadAv+w7q+F+DweX4jbGIol3M1LB44Hqc htm45mUml2qMF+uiFqmWwkhwXuOcLZoiBzQ+xYM3pD2IZu9hCTgkaHwokUN+AXO0t/Z9 Wb0cDCpJpr3yD2HM+8swcFzhziJ6Q1ENKhXDjofc1lZU4369wWnFJ8h8cyIkg4kBns6o uPew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=tY9v+MnUl5BAcjQ+7Po7fOjiEFSK1GpAdNAxjQs6RDo=; b=bqDoC+DBnUXYC1hH5cEbBoGJbvZl2/uTRc9zD4aXilQYJ2cczydSgfW/J4K03AFYAM FImth//ScTHzdQrxXCpRSLNi0Io4UYk7WeqLNoDKIn+cn/CI1CxD4QH8I0JbeOqP8bjq O3yEANNbFF/CBcdqyQL9RtGfjPlBwp+7YMek3tWqCm8yRynLLS2uGq4Pix+Z3JmJ6eoV 9jlXz3nSQaW323NUpab/nyg3sx2PSVhCGyrX82kDsXZUOh31Vjiv0wZM0ezuUTS9rxXx b7ffCLDUgWxFLXv1iqRP02hibpph14lTlECyjXQSO6LT6TKd8hNwDcgjS7xJYrGAHYy4 FSuQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id ds10-20020a170907724a00b0073d6859e5c7si781050ejc.373.2022.09.09.08.49.26; Fri, 09 Sep 2022 08:49:27 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 7ED4668BB13; Fri, 9 Sep 2022 18:49:08 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 08F2868BB06 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 71EF4C0015 for ; Fri, 9 Sep 2022 18:48:59 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:43 +0300 Message-Id: <20220909154859.68954-2-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 02/18] lavu/riscv: AV_READ_TIME cycle counter X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: mcg+4IxsntHO From: Rémi Denis-Courmont This uses the architected RISC-V 64-bit cycle counter from the RISC-V unprivileged instruction set. In 64-bit and 128-bit, this is a straightforward CSR read. In 32-bit mode, the 64-bit value is exposed as two CSRs, which cannot be read atomically, so a loop is necessary to detect and fix up the race condition where the bottom half wraps exactly between the two reads. --- libavutil/riscv/timer.h | 53 +++++++++++++++++++++++++++++++++++++++++ libavutil/timer.h | 2 ++ 2 files changed, 55 insertions(+) create mode 100644 libavutil/riscv/timer.h diff --git a/libavutil/riscv/timer.h b/libavutil/riscv/timer.h new file mode 100644 index 0000000000..a34157a566 --- /dev/null +++ b/libavutil/riscv/timer.h @@ -0,0 +1,53 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_RISCV_TIMER_H +#define AVUTIL_RISCV_TIMER_H + +#include "config.h" + +#if HAVE_INLINE_ASM +#include + +static inline uint64_t rdcycle64(void) +{ +#if (__riscv_xlen >= 64) + uintptr_t cycles; + + __asm__ volatile ("rdcycle %0" : "=r"(cycles)); + +#else + uint64_t cycles; + uint32_t hi, lo, check; + + __asm__ volatile ( + "1: rdcycleh %0\n" + " rdcycle %1\n" + " rdcycleh %2\n" + " bne %0, %2, 1b\n" : "=r" (hi), "=r" (lo), "=r" (check)); + + cycles = (((uint64_t)hi) << 32) | lo; + +#endif + return cycles; +} + +#define AV_READ_TIME rdcycle64 + +#endif +#endif /* AVUTIL_RISCV_TIMER_H */ diff --git a/libavutil/timer.h b/libavutil/timer.h index 48e576739f..d3db5a27ef 100644 --- a/libavutil/timer.h +++ b/libavutil/timer.h @@ -57,6 +57,8 @@ # include "arm/timer.h" #elif ARCH_PPC # include "ppc/timer.h" +#elif ARCH_RISCV +# include "riscv/timer.h" #elif ARCH_X86 # include "x86/timer.h" #endif From patchwork Fri Sep 9 15:48:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37791 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp997676pzh; Fri, 9 Sep 2022 08:49:18 -0700 (PDT) X-Google-Smtp-Source: AA6agR4j0reMX+ZpzRUR+RLgdBC2l98aTWztR8PHYNEzO+ohgUkl9iFnjRRLe3ewvU1U1ovj2ZP3 X-Received: by 2002:a17:906:8a4e:b0:740:2450:d69a with SMTP id gx14-20020a1709068a4e00b007402450d69amr10280483ejc.523.1662738558491; Fri, 09 Sep 2022 08:49:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738558; cv=none; d=google.com; s=arc-20160816; b=gemKBykw0kGnAyCrcCD6tKp/3XPnN6YPba6v2kNnWfctw9B7aYHOmsQVByDCkkKmCl +3zttOyK18XCfPA/i+GspsDcTT6J+sL3WAHi47Bid+sPtuksfhiHySDm1xOsm2HEr0j1 j6g3a4RmZ9n0mDvioSVqhDcZRT1yMfOB5sLtg7x+lwOacWWrkX/8nVaDeO4vkBtX3ybx W5zz8Sgk3IpoocOL4yo+9j2WmfDY6aNE1s8hvASiHa04SfLk60UgJtUy9912Cty+Jdao 0apmluD4NK/lUqQC+BWFbgq8Ryn9sjxfFC0AoWK4bjV+fz5uqaiRAS/vtuLnW69AOs4a lTDw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=ONZBQHtoJvrfNV/QO568SnsxRZmyZnJQQeRezT+R7iE=; b=cJC7xUAobQ0+MPrXWhAOaTGH+vnB0mloMnHQRnUIJBzvzkwXzHeNlzp+PoRYpE5Joh Wz3PFY9NvoLfTLdNNdVrV8cKZy0gUBqMWo+Nh7pWSaTURiub4aw0srSyq0pQaoFqOrSx QYyR1mc/LiY8DbfJwPOZ2svvmdJ15yfAQ99nM2JWTR4Tli7RbtZLHIZCAUEb2TfH5lB9 5Kq1Lr9F/yDkHhEdpcMFp7zhqQiQJLCDOw+pmGZjOoXnnVYKArNfuBBpe4BUO2U4iVtH 0gzH6MXixkNaaPnDiVCKaehzBvlGZ9jPJRdL+DDpvDrzJDNH+QAB6sZqFwcQIzdXUP1J Zzwg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id hd20-20020a170907969400b00710487d3a4fsi826246ejc.67.2022.09.09.08.49.18; Fri, 09 Sep 2022 08:49:18 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 91B4068BB0A; Fri, 9 Sep 2022 18:49:07 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id EA4D168BAFE for ; Fri, 9 Sep 2022 18:48:59 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id A4C8BC00AF for ; Fri, 9 Sep 2022 18:48:59 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:44 +0300 Message-Id: <20220909154859.68954-3-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 03/18] configure/riscv: detect fast CLZ X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: NkWqi1pJKuof From: Rémi Denis-Courmont RISC-V defines the CLZ instruction as part of the ratified Zbb subset of the (not yet ratified) bit mapulation extension (B). We can detect it from the __riscv_zbb predefined constant. At least GCC 12 already supports this correctly. Note that the macro will be non-zero if supported, zero if enabled in the compiler flags (e.g. -march=rv64gzbb) but not known to the compiler, and undefined otherwise. --- configure | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/configure b/configure index 9e51abd0d3..b7dc1d8656 100755 --- a/configure +++ b/configure @@ -5334,6 +5334,12 @@ elif enabled ppc; then ;; esac +elif enabled riscv; then + + if test_cpp_condition stddef.h "__riscv_zbb"; then + enable fast_clz + fi + elif enabled sparc; then case $cpu in From patchwork Fri Sep 9 15:48:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37792 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp997815pzh; Fri, 9 Sep 2022 08:49:35 -0700 (PDT) X-Google-Smtp-Source: AA6agR7sAFKZlNdTTVg4DXbYgFVkcqCXNTBhgGni89Lmq9s7YCUYnrgJZqgTzj3h37h+7GMWMPDH X-Received: by 2002:a17:907:6293:b0:769:9dfc:10eb with SMTP id nd19-20020a170907629300b007699dfc10ebmr10559064ejc.191.1662738575313; Fri, 09 Sep 2022 08:49:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738575; cv=none; d=google.com; s=arc-20160816; b=V7grA1r6k5NZBgmxINdqGipapQ/FYRq5CieOmebf7Z3SknIfoxmTZrFEmSZgvvdkbb FIrL1N/R0hxyj8+h7wYgSSqa9tSSMJ/nTHliCVeuYRW7CUcWLUZg0ArErUrRQIJr06z7 vAbAeLSrvCW62W5akCDp076nMXXsVuFnFBWxn3Ly8iRbUe3SSZlntZ9q506wRDRgoyyx s9qmEnY/4FlVYC+l9NBkxt0ZfnlRhGLJ77PTbwIztrd4auOCXAcSDe58wGGjoKChtSK5 cBRY89c0X1u1hk3fcuJ9PyZHy2VRhk6bzrvHfQrrl2W52teMQgPV13WcZ9Ay8cTTkFPr 7DrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=jVp59uuj9jWqE1sqKr9zVxd+9IcgkUK8L4g/icvyTF4=; b=KQDMLdWAWY6uv3EXVD85RlQIMI+PCzr28kpeMc1l6Lhc6Gy3WI7NOnQHMZ7T2H+jjD UNR1jzmOpB6meUrzjBNMuk70dFspEUio4FHEh0bOUYYPt+bWYjZaFe5aIs5g7umc6D+6 z8iqP4/ByBInZyNyfCtQ+U9dRrw0nMGZzgXczNTjSPObMkicj1qICibxV1MmmUlSSFkb IK43JugLpF4kNKExSy4ZEBIxk659Zsc4qTaUx3hF7FZZV72gNCX6JdEAaboT8iAJ7Tm9 qmjlRUp7HKalYjTsU5VGNhza4A1WTYaoSOFZCAZQD9SZZp7QLsEEqlIC5mN+zMd0/y6x M0yQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id l6-20020a056402124600b00448b8836866si679208edw.586.2022.09.09.08.49.34; Fri, 09 Sep 2022 08:49:35 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6C49168BAFE; Fri, 9 Sep 2022 18:49:09 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2E63568BB06 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id D7589C00B0 for ; Fri, 9 Sep 2022 18:48:59 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:45 +0300 Message-Id: <20220909154859.68954-4-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 04/18] lavu/riscv: byte-swap operations X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: zWRXngJYNDT9 From: Rémi Denis-Courmont If the target supports the Basic bit-manipulation (Zbb) extension, then the REV8 instruction is available to reverse byte order. Note that this instruction only exists at the "XLEN" register size, so we need to right shift the result down to the data width. If Zbb is not supported, then this patchset does nothing. Support for run-time detection is left for the future. Currently, there are no bits in auxv/ELF HWCAP for Z-extensions, so there are no clean ways to do this. --- libavutil/bswap.h | 2 ++ libavutil/riscv/bswap.h | 74 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 76 insertions(+) create mode 100644 libavutil/riscv/bswap.h diff --git a/libavutil/bswap.h b/libavutil/bswap.h index 91cb79538d..4840ab433f 100644 --- a/libavutil/bswap.h +++ b/libavutil/bswap.h @@ -40,6 +40,8 @@ # include "arm/bswap.h" #elif ARCH_AVR32 # include "avr32/bswap.h" +#elif ARCH_RISCV +# include "riscv/bswap.h" #elif ARCH_SH4 # include "sh4/bswap.h" #elif ARCH_X86 diff --git a/libavutil/riscv/bswap.h b/libavutil/riscv/bswap.h new file mode 100644 index 0000000000..de1429c0f7 --- /dev/null +++ b/libavutil/riscv/bswap.h @@ -0,0 +1,74 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_RISCV_BSWAP_H +#define AVUTIL_RISCV_BSWAP_H + +#include +#include "config.h" +#include "libavutil/attributes.h" + +#if defined (__riscv_zbb) && (__riscv_zbb > 0) && HAVE_INLINE_ASM + +static av_always_inline av_const uintptr_t av_bswap_xlen(uintptr_t x) +{ + uintptr_t y; + + __asm__("rev8 %0, %1" : "=r" (y) : "r" (x)); + return y; +} + +#define av_bswap16 av_bswap16 + +static av_always_inline av_const uint_fast16_t av_bswap16(uint_fast16_t x) +{ + return av_bswap_xlen(x) >> (__riscv_xlen - 16); +} + +#if (__riscv_xlen == 32) +#define av_bswap32 av_bswap_xlen +#define av_bswap64 av_bswap64 + +static av_always_inline av_const uint64_t av_bswap64(uint64_t x) +{ + return (((uint64_t)av_bswap32(x)) << 32) | av_bswap32(x >> 32); +} + +#else +#define av_bswap32 av_bswap32 + +static av_always_inline av_const uint_fast32_t av_bswap32(uint_fast32_t x) +{ + return av_bswap_xlen(x) >> (__riscv_xlen - 32); +} + +#if (__riscv_xlen == 64) +#define av_bswap64 av_bswap_xlen + +#else +#define av_bswap64 av_bswap64 + +static av_always_inline av_const uint_fast64_t av_bswap64(uint_fast64_t x) +{ + return av_bswap_xlen(x) >> (__riscv_xlen - 64); +} + +#endif /* __riscv_xlen > 64 */ +#endif /* __riscv_xlen > 32 */ +#endif /* __riscv_zbb */ +#endif /* AVUTIL_RISCV_BSWAP_H */ From patchwork Fri Sep 9 15:48:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37794 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp997886pzh; Fri, 9 Sep 2022 08:49:43 -0700 (PDT) X-Google-Smtp-Source: AA6agR5pFgEqPwFIwA84T3vwh8y2hQmPXhHBNmjDa5Ci+/fjD3eI0gQdXDKl7RRVi/yP5r/N31CD X-Received: by 2002:a17:907:2ccb:b0:76f:908:bc56 with SMTP id hg11-20020a1709072ccb00b0076f0908bc56mr10625224ejc.763.1662738583704; Fri, 09 Sep 2022 08:49:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738583; cv=none; d=google.com; s=arc-20160816; b=m4a9RFob/Hl9hjq/1KsgPjVsB4V6xswCLaf1M0X5SYdIofmXRKMcZ+1QfiFeVIZNtG 2rfnfiwDUidXDRD8qzBQhfpFqJc3PjdXr6zBeI2eZ09n18EkZ1n8D/Lnyic6UzQzLzuN A6cVsvaCf1YR4oexkLTnsXx3OgISPssmDoCUjUeN535TK0iipr/izCZvzIQtA9Xa4JhH lnmyo2+wdn09KAVCp25+0J4X280LhADxacPWCrzg2TnDZ9qVXubXo7kW9Y0HjDR/IDgI GfcVQ2xEy/agC+j+d7AXAVnF32ZR4bzoB87hem9opCUxJKUpdcgDhazoCuyS5/n869dX lQuA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=ikcL3R1PzkMsXVWEU9BTjRlmSX53KYr8dVHwo7KLukk=; b=rbg4W+6Trl/y553q4qdFxIH9ezQxiFi6cduIA8Wt0iiaPc35ExowPevYFp7jAq/yf+ 7lPWi5hRYKc9Hg3igHRFdR+bLl5PP2L907L3rl0qur01Uy2kppWSiTIJc13sXG7c1bDH XvbFOvXE43DTmHLaI44XWUboSpvSUeFkR82TERKRH59KhvVj9AbeuKphfVYKFh+CmO8M zvPAGBD8st/G/eMeMyuKd2RSCYOp/8FyuZv5Zz9nUTdT+xtulIfm/ubXrl23P/aoIHDa QnqsmUp/DhAVeFsoyMiRXVidU1vxceDATSPJqlUDTImZ2XukmoVYMVdxpXoOgXhjXgD1 drzw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id y15-20020a056402270f00b0044ed166d7besi934463edd.320.2022.09.09.08.49.43; Fri, 09 Sep 2022 08:49:43 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3074D68BB48; Fri, 9 Sep 2022 18:49:10 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6132768BB06 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 16662C00B1 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:46 +0300 Message-Id: <20220909154859.68954-5-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 05/18] lavu/riscv: add optimisations X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: uNmjeVQ76GJF From: Rémi Denis-Courmont This provides some micro-optimisations for signed integer clipping, and support for bit weight with the Zbb extension. --- libavutil/intmath.h | 5 +- libavutil/riscv/intmath.h | 103 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 106 insertions(+), 2 deletions(-) create mode 100644 libavutil/riscv/intmath.h diff --git a/libavutil/intmath.h b/libavutil/intmath.h index 9573109e9d..c54d23b7bf 100644 --- a/libavutil/intmath.h +++ b/libavutil/intmath.h @@ -28,8 +28,9 @@ #if ARCH_ARM # include "arm/intmath.h" -#endif -#if ARCH_X86 +#elif ARCH_RISCV +# include "riscv/intmath.h" +#elif ARCH_X86 # include "x86/intmath.h" #endif diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h new file mode 100644 index 0000000000..78f7ba930a --- /dev/null +++ b/libavutil/riscv/intmath.h @@ -0,0 +1,103 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_RISCV_INTMATH_H +#define AVUTIL_RISCV_INTMATH_H + +#include + +#include "config.h" +#include "libavutil/attributes.h" + +/* + * The compiler is forced to sign-extend the result anyhow, so it is faster to + * compute it explicitly and use it. + */ +#define av_clip_int8 av_clip_int8_rvi +static av_always_inline av_const int8_t av_clip_int8_rvi(int a) +{ + union { uint8_t u; int8_t s; } u = { .u = a }; + + if (a != u.s) + a = ((a >> 31) ^ 0x7F); + return a; +} + +#define av_clip_int16 av_clip_int16_rvi +static av_always_inline av_const int16_t av_clip_int16_rvi(int a) +{ + union { uint8_t u; int8_t s; } u = { .u = a }; + + if (a != u.s) + a = ((a >> 31) ^ 0x7F); + return a; +} + +#define av_clipl_int32 av_clipl_int32_rvi +static av_always_inline av_const int32_t av_clipl_int32_rvi(int64_t a) +{ + union { uint32_t u; int32_t s; } u = { .u = a }; + + if (a != u.s) + a = ((a >> 63) ^ 0x7FFFFFFF); + return a; +} + +#define av_clip_intp2 av_clip_intp2_rvi +static av_always_inline av_const int av_clip_intp2_rvi(int a, int p) +{ + const int shift = 32 - p; + int b = (a << shift) >> shift; + + if (a != b) + b = (a >> 31) ^ ((1 << p) - 1); + return b; +} + +#if defined (__riscv_zbb) && (__riscv_zbb > 0) && HAVE_INLINE_ASM + +#define av_popcount av_popcount_rvb +static av_always_inline av_const int av_popcount_rvb(uint32_t x) +{ + int ret; + +#if (__riscv_xlen >= 64) + __asm__ ("cpopw %0, %1\n" : "=r" (ret) : "r" (x)); +#else + __asm__ ("cpop %0, %1\n" : "=r" (ret) : "r" (x)); +#endif + return ret; +} + +#if (__riscv_xlen >= 64) +#define av_popcount64 av_popcount64_rvb +static av_always_inline av_const int av_popcount64_rvb(uint64_t x) +{ + int ret; + +#if (__riscv_xlen >= 128) + __asm__ ("cpopd %0, %1\n" : "=r" (ret) : "r" (x)); +#else + __asm__ ("cpop %0, %1\n" : "=r" (ret) : "r" (x)); +#endif + return ret; +} +#endif /* __riscv_xlen >= 64 */ +#endif /* __riscv_zbb */ + +#endif /* AVUTIL_RISCV_INTMATH_H */ From patchwork Fri Sep 9 15:48:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37801 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998370pzh; Fri, 9 Sep 2022 08:50:44 -0700 (PDT) X-Google-Smtp-Source: AA6agR43btux2gtvaeRDIy1XoA4c2Mo7BevZTAPAi8qvJgRK+/cVP4nXMoz2tSaA5fJf//O12RUA X-Received: by 2002:a17:906:fe0a:b0:76f:e373:d84b with SMTP id wy10-20020a170906fe0a00b0076fe373d84bmr10348012ejb.297.1662738643785; Fri, 09 Sep 2022 08:50:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738643; cv=none; d=google.com; s=arc-20160816; b=nAQcWJ9IDCkqU/4jS+SgF3Aki+p1X8Te44moiA2XNjJBmmz2WEF7oAux879/Q7EkVC A+NXdJFVeoILVwA2Hl6qSODxUd1Qsw8gC1ty/BWLU7S+HUPY4z7+fzfKP8IqIzhBodxY v5vrJm+x5Wmmjhhx12LQ+3BsjrRoRTqFvRWVEcwsCtNIsMkxA3UfgTwG+U5jMObPj4iu VcnJy4rIQKhO5Wqcs6a58OvX1DY4cx6UntllQgTWzi1tNZJFEZVw2/0MZDVLVd30XeUR lU8uAqKuBESs1IWfiv0hShM3izv94PTf7y0HfXNZc9Gxn4BtpAB1iyDZv89TxYTHbw7T Ruvw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=HwNy8donc04RdinaQp7yc3YhFdPSoNTa/Kt6zJrcJv4=; b=y/io1YaTDhikXEjsPGOdBE4TH29AIJIfd2NFDoloS/+NQA7pUUjUWaz3wU2EkjGjfO VHyG3zQ+HdtxynkC9c/LtJecY689j9YgEyto82Tz6c/FYQK4moTLrtyP4Ut73P8gxpsP yyLFupgHlV9AINZ31u/fOzxZPJIgaGnaalxnBIYmaKXC/l9WyRoiiyh7ezQz0fPuHVJe 4s/BrRnOG7s5giytZAazCK1RbMoOuNQbpAVXbTCi903AbhJfkg6eXRcGFggrdsNaidPT l0QqkOby4j57q0GtXGmRgDVYywt88HO8l+xPeO4qNwLokXnwuz3/WS3o/tW05vPjYiha MUIg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id h3-20020aa7de03000000b00450e1ffe3edsi596366edv.382.2022.09.09.08.50.43; Fri, 09 Sep 2022 08:50:43 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6E72E68BB7E; Fri, 9 Sep 2022 18:49:17 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 223E968BB47 for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 48BE9C00B2 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:47 +0300 Message-Id: <20220909154859.68954-6-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 06/18] configure: probe RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: LllvtsAXb+5g From: Rémi Denis-Courmont --- configure | 15 +++++++++++++++ ffbuild/arch.mak | 2 ++ 2 files changed, 17 insertions(+) diff --git a/configure b/configure index b7dc1d8656..c5f20cc323 100755 --- a/configure +++ b/configure @@ -462,6 +462,7 @@ Optimization options (experts only): --disable-mmi disable Loongson MMI optimizations --disable-lsx disable Loongson LSX optimizations --disable-lasx disable Loongson LASX optimizations + --disable-rvv disable RISC-V Vector optimizations --disable-fast-unaligned consider unaligned accesses slow Developer options (useful when working on FFmpeg itself): @@ -2126,6 +2127,10 @@ ARCH_EXT_LIST_PPC=" vsx " +ARCH_EXT_LIST_RISCV=" + rvv +" + ARCH_EXT_LIST_X86=" $ARCH_EXT_LIST_X86_SIMD cpunop @@ -2135,6 +2140,7 @@ ARCH_EXT_LIST_X86=" ARCH_EXT_LIST=" $ARCH_EXT_LIST_ARM $ARCH_EXT_LIST_PPC + $ARCH_EXT_LIST_RISCV $ARCH_EXT_LIST_X86 $ARCH_EXT_LIST_MIPS $ARCH_EXT_LIST_LOONGSON @@ -2642,6 +2648,8 @@ ppc4xx_deps="ppc" vsx_deps="altivec" power8_deps="vsx" +rvv_deps="riscv" + loongson2_deps="mips" loongson3_deps="mips" mmi_deps_any="loongson2 loongson3" @@ -6110,6 +6118,10 @@ elif enabled ppc; then check_cpp_condition power8 "altivec.h" "defined(_ARCH_PWR8)" fi +elif enabled riscv; then + + enabled rvv && check_inline_asm rvv '".option arch, +v\nvsetivli zero, 0, e8, m1, ta, ma"' + elif enabled x86; then check_builtin rdtsc intrin.h "__rdtsc()" @@ -7596,6 +7608,9 @@ if enabled loongarch; then echo "LSX enabled ${lsx-no}" echo "LASX enabled ${lasx-no}" fi +if enabled riscv; then + echo "RISC-V Vector enabled ${riscv-no}" +fi echo "debug symbols ${debug-no}" echo "strip symbols ${stripping-no}" echo "optimize for size ${small-no}" diff --git a/ffbuild/arch.mak b/ffbuild/arch.mak index 997e31e85e..39d76ee152 100644 --- a/ffbuild/arch.mak +++ b/ffbuild/arch.mak @@ -15,5 +15,7 @@ OBJS-$(HAVE_LASX) += $(LASX-OBJS) $(LASX-OBJS-yes) OBJS-$(HAVE_ALTIVEC) += $(ALTIVEC-OBJS) $(ALTIVEC-OBJS-yes) OBJS-$(HAVE_VSX) += $(VSX-OBJS) $(VSX-OBJS-yes) +OBJS-$(HAVE_RVV) += $(RVV-OBJS) $(RVV-OBJS-yes) + OBJS-$(HAVE_MMX) += $(MMX-OBJS) $(MMX-OBJS-yes) OBJS-$(HAVE_X86ASM) += $(X86ASM-OBJS) $(X86ASM-OBJS-yes) From patchwork Fri Sep 9 15:48:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37803 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998477pzh; Fri, 9 Sep 2022 08:51:00 -0700 (PDT) X-Google-Smtp-Source: AA6agR6lM2qnqaS81c5G71ZnmjqbMFNI1oJChLCbcKDVeyPt5xZJUewcFDYP4IYKtelqbWAU2TsY X-Received: by 2002:a17:907:78b:b0:741:3d29:33d2 with SMTP id xd11-20020a170907078b00b007413d2933d2mr10580885ejb.103.1662738660635; Fri, 09 Sep 2022 08:51:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738660; cv=none; d=google.com; s=arc-20160816; b=exPnzKnuEnJBOdVXNA9w4KdTbCZx+pkRtppjvEn1qFp2AxkbEZPEt9mwQwLe0g5TE4 gmf+uo0kcbGxyzlVkNNp6HwK5nPPshBuYTn/dDII3T81hN2gr8xI3TgEb8J9lN9/i2JW XtodRkD0RFmJH6Rt++rCxcRUiaruwhtPRL6TcdwYl6Hef5s1F8Ukwla6OwIMgtV+2JB5 PVPlH/g6qwnDQvGo6eqva+v3wnHEadKw43DlxJxUtdvu4YxEp3LPiCkz3jKZMngzuHPf 5Elqs1OZcsOOpD9SO6j6vmNjqe8oPzAg97gvDaXlXxDfvAh9xN5TFdmor+04raunC12R 7Afg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=eaEou0KQyOFm/1LpkQQhDLfhyGIV6wIp/FCAO/Z/+CM=; b=Tm1XBia+RBMuJmsxHOkOO02PKw/LZjifnGpjPI0meMYu8U0AVhPsEtHbnIJ3pELLKU Qfr6VlD+SV9eoZPBzEDGkhHaTqWwjuTyWqeu9IYeGC8dNivirE93t9DhzO+7FK7j3F75 Iuco8bxW0xS13LHIAGlrxJbqjsDGELVa0s2xg6QmeZ1cwXsMsFPYPmgOHfh4vGdlytyp LJu9KP7u88ALWoDFppK/zfJA1wfNONn9w+UaW7WBrqEJohLaX1N8JahKY4EUNXPwXwVK EVHJeHKkwmPRWVUXsWSoR7Ykdt/a7Ho62sUdcBmBcDZ/bFhgPvaiO8OUeE/+pmfaarTT T+dg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id ju9-20020a17090798a900b0073d92f673f8si610709ejc.937.2022.09.09.08.50.59; Fri, 09 Sep 2022 08:51:00 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 53A2268BB8D; Fri, 9 Sep 2022 18:49:19 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3DE7868BB4C for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 719ADC00B3 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:48 +0300 Message-Id: <20220909154859.68954-7-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 07/18] lavu/riscv: initial common header for assembler macros X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: xV9On5OUXXk2 From: Rémi Denis-Courmont --- libavutil/riscv/asm.S | 74 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 libavutil/riscv/asm.S diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S new file mode 100644 index 0000000000..7623c161cf --- /dev/null +++ b/libavutil/riscv/asm.S @@ -0,0 +1,74 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" + +#if defined (__riscv_float_abi_soft) +#define NOHWF +#define NOHWD +#define HWF # +#define HWD # +#elif defined (__riscv_float_abi_single) +#define NOHWF # +#define NOHWD +#define HWF +#define HWD # +#else +#define NOHWF # +#define NOHWD # +#define HWF +#define HWD +#endif + + .macro func sym, ext= + .text + .align 2 + + .option push + .ifnb \ext + .option arch, +\ext + .endif + + .global \sym + .hidden \sym + .type \sym, %function + \sym: + + .macro endfunc + .size \sym, . - \sym + .option pop + .previous + .purgem endfunc + .endm + .endm + + .macro const sym, align=3, relocate=0 + .if \relocate + .pushsection .data.rel.ro + .else + .pushsection .rodata + .endif + .align \align + \sym: + + .macro endconst + .size \sym, . - \sym + .popsection + .purgem endconst + .endm + .endm From patchwork Fri Sep 9 15:48:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37806 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998695pzh; Fri, 9 Sep 2022 08:51:26 -0700 (PDT) X-Google-Smtp-Source: AA6agR6jVJSs4p36SXtmHFqlMHOPrWwWtKEcQ+k2lwULL+VmyAI/uMBLy8QnSWoP2VlD4NrQ9z8n X-Received: by 2002:a05:6402:547:b0:44e:8d81:cd9c with SMTP id i7-20020a056402054700b0044e8d81cd9cmr11929335edx.196.1662738686083; Fri, 09 Sep 2022 08:51:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738686; cv=none; d=google.com; s=arc-20160816; b=u5LXiE4+GfNze5ihO4Glsooi1AlFEAfbcK8sKr8ENnn4ZscT/7p7SY3mAGCLDEqtnB E8CaV3WaYLYWwF4sXQSi76mPEWdpJeTmwpZw3JTgqGZZmvkLsi/bH9vw4jprmrQpfZQ4 23vBFVHS7k67PhTp/znhCsecHY68gEod3s0pW9kh0GC+IacRVCTsTHLnZ3Utw3yOE66M Shr2OnAiXq+ya49suAvKH4kr70h0WscDyLe+jJQTOfZafqcQAkKcdVaD18zxw6jeiY7e m+mUa+fvtwlIHmQA5qDZ/rgFK55SMvakt8KA9H2mDwDRlE4c3ANNuM824O1eW73GBjbY kg4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=x4Z7817OBUBQjLadX40/XLCrT1iCj6y3Gii2EFLeS6A=; b=Vh/tWCagDDfUkpN6HU+fh8oelsq61HgXULgzCfnkSJa9gcSGSpFhbqJOzIsPJiYQkX ughrzx6CCLmtUor2GGrK7GnUwXsr9SU1bACGveNoyX0VgjrDn8PCEgTQOdRMV7LEcrNZ 1SdEfUcI+Ct+46mfiS+siAhM/m5vdtKfCBMzbWRld5Yyyb4RhGDGdmmZyVhOxKxcrEwu C4R/k8yN7f9Av6NOqARxdNw67pfV4P5+gTh27I92Jf4GgDzYSk0Kzmn57pPeMe6uwH5d O6AdKYOUdY4LDeDtlycv83K5ghrr+Sz9SZiOujDWLQMrqA0EvPsYOv2LdYyb4faOcCPd 39sQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id r6-20020a05640251c600b004477532706bsi817962edd.517.2022.09.09.08.51.25; Fri, 09 Sep 2022 08:51:26 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1AD8F68BB99; Fri, 9 Sep 2022 18:49:22 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4AC7968BB50 for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 9ADF7C00B4 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:49 +0300 Message-Id: <20220909154859.68954-8-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 08/18] lavu/riscv: add CPU flags for the RISC-V Vector extension X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: XKxKCNeKzXsm From: Rémi Denis-Courmont RVV defines a total of 12 different extensions, including: - 5 different instruction subsets: - Zve32x: 8-, 16- and 32-bit integers, - Zve32f: Zve32x plus single precision floats, - Zve64x: Zve32x plus 64-bit integers, - Zve64f: Zve32f plus Zve64x, - Zve64d: Zve64f plus double precision floats. - 6 different vector lengths: - Zvl32b (embedded only), - Zvl64b (embedded only), - Zvl128b, - Zvl256b, - Zvl512b, - Zvl1024b, - and the V extension proper: equivalent to Zve64f and Zvl128b. In total, there are 6 different possible sets of supported instructions (including the empty set), but for convenience we allocate one bit for each type sets: up-to-32-bit ints (ZVE32X), floats (ZV32F), 64-bit ints (ZV64X) and doubles (ZVE64D). Whence the vector size is needed, it can be retrieved by reading the unprivileged read-only vlenb CSR. This should probably be a separate helper macro if needed at a later point. --- libavutil/cpu.c | 15 +++++++++++ libavutil/cpu.h | 6 +++++ libavutil/cpu_internal.h | 1 + libavutil/riscv/Makefile | 1 + libavutil/riscv/cpu.c | 57 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 80 insertions(+) create mode 100644 libavutil/riscv/Makefile create mode 100644 libavutil/riscv/cpu.c diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 0035e927a5..89d2fb6f56 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -62,6 +62,8 @@ static int get_cpu_flags(void) return ff_get_cpu_flags_arm(); #elif ARCH_PPC return ff_get_cpu_flags_ppc(); +#elif ARCH_RISCV + return ff_get_cpu_flags_riscv(); #elif ARCH_X86 return ff_get_cpu_flags_x86(); #elif ARCH_LOONGARCH @@ -178,6 +180,19 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) #elif ARCH_LOONGARCH { "lsx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LSX }, .unit = "flags" }, { "lasx", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_LASX }, .unit = "flags" }, +#elif ARCH_RISCV +#define AV_CPU_FLAG_ZVE32X_M (AV_CPU_FLAG_ZVE32X) +#define AV_CPU_FLAG_ZVE32F_M (AV_CPU_FLAG_ZVE32X_M | AV_CPU_FLAG_ZVE32F) +#define AV_CPU_FLAG_ZVE64X_M (AV_CPU_FLAG_ZVE32X_M | AV_CPU_FLAG_ZVE64X) +#define AV_CPU_FLAG_ZVE64F_M (AV_CPU_FLAG_ZVE32F_M | AV_CPU_FLAG_ZVE64X) +#define AV_CPU_FLAG_ZVE64D_M (AV_CPU_FLAG_ZVE64F_M | AV_CPU_FLAG_ZVE64D) +#define AV_CPU_FLAG_VECTORS AV_CPU_FLAG_ZVE64D_M + { "vectors", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VECTORS }, .unit = "flags" }, + { "zve32x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32X }, .unit = "flags" }, + { "zve32f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE32F_M }, .unit = "flags" }, + { "zve64x", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64X_M }, .unit = "flags" }, + { "zve64f", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64F_M }, .unit = "flags" }, + { "zve64d", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ZVE64D_M }, .unit = "flags" }, #endif { NULL }, }; diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 9711e574c5..44836e50d6 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -78,6 +78,12 @@ #define AV_CPU_FLAG_LSX (1 << 0) #define AV_CPU_FLAG_LASX (1 << 1) +// RISC-V Vector extension +#define AV_CPU_FLAG_ZVE32X (1 << 0) /* 8-, 16-, 32-bit integers */ +#define AV_CPU_FLAG_ZVE32F (1 << 1) /* single precision scalars */ +#define AV_CPU_FLAG_ZVE64X (1 << 2) /* 64-bit integers */ +#define AV_CPU_FLAG_ZVE64D (1 << 3) /* double precision scalars */ + /** * Return the flags which specify extensions supported by the CPU. * The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 650d47fc96..634f28bac4 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -48,6 +48,7 @@ int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); +int ff_get_cpu_flags_riscv(void); int ff_get_cpu_flags_x86(void); int ff_get_cpu_flags_loongarch(void); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile new file mode 100644 index 0000000000..1f818043dc --- /dev/null +++ b/libavutil/riscv/Makefile @@ -0,0 +1 @@ +OBJS += riscv/cpu.o diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c new file mode 100644 index 0000000000..9e4cce5e8b --- /dev/null +++ b/libavutil/riscv/cpu.c @@ -0,0 +1,57 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "config.h" + +#if HAVE_GETAUXVAL +#include +#endif + +#define HWCAP_RV(letter) (1ul << ((letter) - 'A')) + +int ff_get_cpu_flags_riscv(void) +{ + int ret = 0; + + /* If RV-V is enabled statically at compile-time, check the details. */ +#ifdef __riscv_vectors + ret |= AV_CPU_FLAG_ZVE32X; +#if __riscv_v_elen >= 64 + ret |= AV_CPU_FLAG_ZVE64X; +#endif +#if __riscv_v_elen_fp >= 32 + ret |= AV_CPU_FLAG_ZVE32F; +#if __riscv_v_elen_fp >= 64 + ret |= AV_CPU_FLAG_ZVE64F; +#endif +#endif +#endif + +#if HAVE_GETAUXVAL + const unsigned long hwcap = getauxval(AT_HWCAP); + + /* The V extension implies all subsets */ + if (hwcap & HWCAP_RV('V')) + ret |= AV_CPU_FLAG_ZVE32X | AV_CPU_FLAG_ZVE64X + | AV_CPU_FLAG_ZVE32F | AV_CPU_FLAG_ZVE64D; +#endif + + return ret; +} From patchwork Fri Sep 9 15:48:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37802 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998421pzh; Fri, 9 Sep 2022 08:50:52 -0700 (PDT) X-Google-Smtp-Source: AA6agR4tUd13g5UNId/anLHYaPOgDu2sfii2XEgjAMWJ3GgavQ2peKi03hM0r6foWOS62ajII9jv X-Received: by 2002:a17:906:4fd0:b0:73d:be5b:291b with SMTP id i16-20020a1709064fd000b0073dbe5b291bmr10389831ejw.157.1662738652233; Fri, 09 Sep 2022 08:50:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738652; cv=none; d=google.com; s=arc-20160816; b=iULIOmZJB0t6j3MZAnlVrGkTH+TlQhyj/Sy4QzG9qsv5RC/m4A0PGVI1iQIrDlLOpS z5BKG2+OEZ0GX65wn5yh1NvMAnvRSlcTTrH+luXyFdMkz3YwGGDYSkEN/KD9z27IFaFL ZaqHsvJ84V5fO3gIDhuSBdqBjYRtpoeRBkbcN2rn5HzeQjmTqSt7oOrwk/NNDjv5dD2m D+tvCkb85jIoR8pZntmDL0b8DNZQLjvHpB/zHXTD11ApAofIlhaghmBJHfu6kBy8qSkc YXQo7FZEUwcLfm65iWssAOzbI7mX4jxy/1hx8Qt/SYVbxkrkt4uFeTsaU3iKUPxqN0fM aFyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=uN7M5zdHFPiX2RopVgTytSwJNztn3reCBDQhsgul/NI=; b=CvXj+bKKEC0x/94BxDTscnha6BO17mvK4dx/4Dw+JUtVcVZp5BebinuWK+jXUonon/ B30aTCT4+G+LJi4KQn4MZvK2V4BC6Nx9tDg7jNyBXwL0QsZMVaQCJwsEtMIAotGLAho9 U0ZvBeZAcFcMD/R4l900kb5ZXc6e1ub8IdiquQXgHfkkjS2vdo2TE+T379oePDHxjS2R AKK29uVseUZKKA9ZcfWZ5dl81fjC6iH724eOiV8qNd+Vcjz53sDjvMJJqxDx7iwQLOq6 R8AHmtrhlCigpEidOLRosRgsKDktTY0Uenc7+BcI6Xb2XwUc28t9n85D9L/SU/jNPTvG fONg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id nb3-20020a1709071c8300b0076fb816dae7si780841ejc.97.2022.09.09.08.50.51; Fri, 09 Sep 2022 08:50:52 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6652B68BB86; Fri, 9 Sep 2022 18:49:18 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3395768BB4B for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id C3EECC00B5 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:50 +0300 Message-Id: <20220909154859.68954-9-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 09/18] checkasm: register the RISC-V V subsets X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: gKOAS7q3zbI9 From: Rémi Denis-Courmont --- tests/checkasm/checkasm.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index e56fd3850e..a5d0503811 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -226,6 +226,11 @@ static const struct { { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC }, { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, +#elif ARCH_RISCV + { "Zve32x", "zve32x", AV_CPU_FLAG_ZVE32X }, + { "Zve32f", "zve32f", AV_CPU_FLAG_ZVE32F }, + { "Zve64x", "zve64x", AV_CPU_FLAG_ZVE64X }, + { "Zve64d", "zve64d", AV_CPU_FLAG_ZVE64D }, #elif ARCH_MIPS { "MMI", "mmi", AV_CPU_FLAG_MMI }, { "MSA", "msa", AV_CPU_FLAG_MSA }, From patchwork Fri Sep 9 15:48:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37804 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998547pzh; Fri, 9 Sep 2022 08:51:09 -0700 (PDT) X-Google-Smtp-Source: AA6agR5sMDudU6CAtzdfgblE6Hms5Y7DWLZ86Ej6hb5Ztz70AqErqRnJ4CDp7+11Q0FjQuZ6W5tz X-Received: by 2002:a05:6402:27d1:b0:44f:2c17:1a44 with SMTP id c17-20020a05640227d100b0044f2c171a44mr9692624ede.18.1662738669022; Fri, 09 Sep 2022 08:51:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738669; cv=none; d=google.com; s=arc-20160816; b=T5p8oN9gjbtRLO78SrxT3A0/qYh8H5TtzUjgLenRbYDjtUVRgpxLQZDGrTFPiyHj1W 5NprCslKovY7dZqKAyGp4vY01MhiYlUsBJwThWORJu1DsfLp6GJu+XpTnWhdGqHxdCcn bKhYShrlNwD6x569V2DGK2oB/P2QOhCqLd/0hf5w1cLsSSfwI6X5972WEP3SNuKBrZaw 2NoH7RWR697HUjNFpoR88sTB5817KaCknevAekKt+ErW+oJDj4SlBuDhOyioOzKh7QkU Pxt8+VOlaZunbgk0+nUPVegUn3BWyJfyOhgdnxPa8wKeQeuk1uPzxSCnF2KmSt0fvU6T iyyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=6fBkCwVcpXSNo/EKC3gq8ErlWS/0MwCqM3drryKU4tY=; b=Z7K5d0qkmH5EOJ4GrYNVKkGJsBC4+Rs+cUPffzB9aYEvZXj90iV91jBStnm9I8dvr5 eg4TSVp5fOWzLyhoed07pzkCz4/ZBrs7NRfTb0g7YnLhla5a1uizFfuZS3GwfsneHVai 5R1t/uxk6zKvBoS8dU48TNefT4RrmhOUJ262kbX4M/APhcQ614BgaHt1DO7agwlePIkx DvvB0MuKKYnQ/t4rTWNOmIDr6iippvPawsQBbBErlV72sAxUjW4Yrt6iixD7epJ1QHp8 f+Ep6a+tt0ki7JOZZ9hBpfPBRJWCF21m/gIlThimepB0IYh373JjPVk/kDef2aF4IN/C 0MgQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id q28-20020a50aa9c000000b0043ddc200046si690093edc.454.2022.09.09.08.51.08; Fri, 09 Sep 2022 08:51:09 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3EA3D68BB84; Fri, 9 Sep 2022 18:49:20 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 42C0D68BB47 for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id ECEEFC00B6 for ; Fri, 9 Sep 2022 18:49:00 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:51 +0300 Message-Id: <20220909154859.68954-10-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 10/18] lavu/riscv: float vector-scalar multiplication with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: zTnT2Rnt2v5s From: Rémi Denis-Courmont This is based on existing code from the VLC git tree with two minor changes to account for the different function prototypes. --- libavutil/float_dsp.c | 2 ++ libavutil/float_dsp.h | 1 + libavutil/riscv/Makefile | 4 ++- libavutil/riscv/float_dsp_init.c | 43 ++++++++++++++++++++++++ libavutil/riscv/float_dsp_rvv.S | 56 ++++++++++++++++++++++++++++++++ 5 files changed, 105 insertions(+), 1 deletion(-) create mode 100644 libavutil/riscv/float_dsp_init.c create mode 100644 libavutil/riscv/float_dsp_rvv.S diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c index 8676c8b0f8..742dd679d2 100644 --- a/libavutil/float_dsp.c +++ b/libavutil/float_dsp.c @@ -156,6 +156,8 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact) ff_float_dsp_init_arm(fdsp); #elif ARCH_PPC ff_float_dsp_init_ppc(fdsp, bit_exact); +#elif ARCH_RISCV + ff_float_dsp_init_riscv(fdsp); #elif ARCH_X86 ff_float_dsp_init_x86(fdsp); #elif ARCH_MIPS diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h index 9c664592bd..7cad9fc622 100644 --- a/libavutil/float_dsp.h +++ b/libavutil/float_dsp.h @@ -205,6 +205,7 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len); void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp); void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp); void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict); +void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp); void ff_float_dsp_init_x86(AVFloatDSPContext *fdsp); void ff_float_dsp_init_mips(AVFloatDSPContext *fdsp); diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile index 1f818043dc..89a8d0d990 100644 --- a/libavutil/riscv/Makefile +++ b/libavutil/riscv/Makefile @@ -1 +1,3 @@ -OBJS += riscv/cpu.o +OBJS += riscv/float_dsp_init.o \ + riscv/cpu.o +RVV-OBJS += riscv/float_dsp_rvv.o diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c new file mode 100644 index 0000000000..7c553e9173 --- /dev/null +++ b/libavutil/riscv/float_dsp_init.c @@ -0,0 +1,43 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavutil/float_dsp.h" + +void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, + int len); + +void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul, + int len); + +av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) +{ +#if HAVE_RVV + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_ZVE32F) { + fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + + if (flags & AV_CPU_FLAG_ZVE64D) + fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv; + } +#endif +} diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S new file mode 100644 index 0000000000..365e00190c --- /dev/null +++ b/libavutil/riscv/float_dsp_rvv.S @@ -0,0 +1,56 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "asm.S" + +// (a0) = (a1) * fa0 [0..a2-1] +func ff_vector_fmul_scalar_rvv, zve32f +NOHWF fmv.w.x fa0, a2 +NOHWF mv a2, a3 + +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + add a1, a1, t1 + vfmul.vf v16, v16, fa0 + sub a2, a2, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc + +// (a0) = (a1) * fa0 [0..a2-1] +func ff_vector_dmul_scalar_rvv, zve64d +NOHWD fmv.d.x fa0, a2 +NOHWD mv a2, a3 + +1: vsetvli t0, a2, e64, m8, ta, ma + slli t1, t0, 3 + vle64.v v16, (a1) + add a1, a1, t1 + vfmul.vf v16, v16, fa0 + sub a2, a2, t0 + vse64.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc From patchwork Fri Sep 9 15:48:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37805 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998599pzh; Fri, 9 Sep 2022 08:51:17 -0700 (PDT) X-Google-Smtp-Source: AA6agR7KSu0WGWltelud67sRgMyPtUgFKl+Bs0N5mrvGstK65IGCekOLyzXswSPtXyXPPvTiRE8U X-Received: by 2002:a17:907:7da6:b0:774:53b9:5ae5 with SMTP id oz38-20020a1709077da600b0077453b95ae5mr6525208ejc.759.1662738677520; Fri, 09 Sep 2022 08:51:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738677; cv=none; d=google.com; s=arc-20160816; b=q86GtS9JAGmdSL4qcw+JCST8zB+7+mHLjDTu+kXyeO0myakOugsX36VVzqK1WqPKPp TRFVkxj8tDedU1d4c8vHGB+g1kPaOy9uZ9WJCI4yKWlFb8dNlzgfNy6xbKcjcGTayohI ovkl3corV4WbA7tMtuYcSalTwniGVlyRdF4UdoaGxnPwdJx/ySHhRj7xOGBTEYAGVvGQ FbuE/QHtmRxZpf6YVQdQ9+4sydBbQrP0rYQnMMeZ2uTlujkM7Kgv/lyX+uN+6iN/h/82 eKwJCPcrC6ONF37rDVetzUqhOY0YB4RjYbmjUngw1VDf7RBKbncRTkeTHu99gt7pQXct Qa2A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=aMVZOvsU2Cx5Yu6lV+O23cdzJguLPDN5WnXVRGss1gM=; b=v+S8ekdXdh4Q4fPTJI2e5UBxVAI113AAilZZSz2KAjaZvJy0sQ8M9g23JDuzwRGSqR 5gCAIf0QS6vz9leofF7DwJOflAPAOvNmFc/LRMwLwee55qbXZtis7UbR+yFlSpT77CJ8 7Ja19qFHhJU64InVF4hQEyHxsJE3iSwks8DdwMUHBhfDKajQmbftPGE/J3tJcaZAzkjX IitEWfULQUgCu7m5NEGCNC2p1zx1LZC8mL+KS2s8+PbboI8g5Bd0VPDs2ksXKOOQmFmc zZ/vn5t3iWCtmRDFRec+kC8EIxlaLzfFrOVhWrfshnNNN3bro52JJV2fogKkViuRYOoR DrIw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id sh20-20020a1709076e9400b0076f65528ab0si723362ejc.733.2022.09.09.08.51.17; Fri, 09 Sep 2022 08:51:17 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3490268BB96; Fri, 9 Sep 2022 18:49:21 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4634168BB4E for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 22B81C00B7 for ; Fri, 9 Sep 2022 18:49:01 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:52 +0300 Message-Id: <20220909154859.68954-11-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 11/18] lavu/riscv: float vector-vector multiplication with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: q5N+zgeeg7O4 From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 9 ++++++++- libavutil/riscv/float_dsp_rvv.S | 34 ++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+), 1 deletion(-) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 7c553e9173..49a4c95a0b 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -22,9 +22,13 @@ #include "libavutil/cpu.h" #include "libavutil/float_dsp.h" +void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1, + int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); +void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, + int len); void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul, int len); @@ -34,10 +38,13 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) int flags = av_get_cpu_flags(); if (flags & AV_CPU_FLAG_ZVE32F) { + fdsp->vector_fmul = ff_vector_fmul_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; - if (flags & AV_CPU_FLAG_ZVE64D) + if (flags & AV_CPU_FLAG_ZVE64D) { + fdsp->vector_dmul = ff_vector_dmul_rvv; fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv; + } } #endif } diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 365e00190c..65c3a77b01 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -19,6 +19,23 @@ #include "config.h" #include "asm.S" +// (a0) = (a1) * (a2) [0..a3-1] +func ff_vector_fmul_rvv, zve32f +1: vsetvli t0, a3, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + add a1, a1, t1 + vle32.v v24, (a2) + add a2, a2, t1 + vfmul.vv v16, v16, v24 + sub a3, a3, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_fmul_scalar_rvv, zve32f NOHWF fmv.w.x fa0, a2 @@ -37,6 +54,23 @@ NOHWF mv a2, a3 ret endfunc +// (a0) = (a1) * (a2) [0..a3-1] +func ff_vector_dmul_rvv, zve64d +1: vsetvli t0, a3, e64, m8, ta, ma + slli t1, t0, 3 + vle64.v v16, (a1) + add a1, a1, t1 + vle64.v v24, (a2) + add a2, a2, t1 + vfmul.vv v16, v16, v24 + sub a3, a3, t0 + vse64.v v16, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_dmul_scalar_rvv, zve64d NOHWD fmv.d.x fa0, a2 From patchwork Fri Sep 9 15:48:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37807 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998748pzh; Fri, 9 Sep 2022 08:51:35 -0700 (PDT) X-Google-Smtp-Source: AA6agR4WVBkmEXfclaLmlEIfYWMXmNK84TINbsw+M5Bi3mlKDUuuhra2MixnY2+yFA/mc1sxfWpz X-Received: by 2002:a05:6402:1a4f:b0:44e:f731:f7d5 with SMTP id bf15-20020a0564021a4f00b0044ef731f7d5mr11831220edb.357.1662738694887; Fri, 09 Sep 2022 08:51:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738694; cv=none; d=google.com; s=arc-20160816; b=hsWKHktWdHIhW+gPzq9zgHGZV1dWnR8KwYSfYkDwuBjR9/ykVbW8Lu2Dm1d2b4kV+/ 8FWEFTfrASQlcsgMbWChihw2YklTLRE60cpkOprwAMf3ZjcLlp1KCq5akCgQ0dRfWdav 9I9Cyzs/Phqjd1iU2+36oa5F7IcTVtPChu3WaFmTDzK4oHbafd0TnxK1O+2mgQa1s8cO Kl0vy0/nm/cDV0bJ7y5zaKwxH6LMUGkij8eg3jriwNmNFrASwaFp1aQFXpKiA4KEKBUm s9tmso/h3JcOqJxlKvwD5vHwO2H92ORvGSVL05GrQACNwpnqmlNmoQ2Xcpv/LzGuKa4R Oj9w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=jWABRTLogK9lpFxfzusEOrTC2Ei/RwvEISAqs5Qx+nA=; b=Z/uwvNG3BIO7SfVucjC7SNo4zKY4AzF9ePW1hlDxvMyqnXet4Tkv3sLnkwnSXMJrCh v60smch2zYnpkdmrd9uPYsiACy4olMHyucS7ncF+QE34mdhPITBtmkOdfi4XEUlX4Ysr yYqtY6jDDnx/ksLeRzKJzsKmhkv7vyV/iLtstoU2qlHzfAJSDYEspBo7W3YVyxlmbAUp oy13wThjMNPY6DmCL2EzWIyVEeuNI4ohR8Ixdvw/IN32XWeNwGlSbdxeO5hJPlUDVtbw 9lZOPUbkQkipwg/Fd61n6mHfCFZGCU4JYOZic/WxPUNzsCkmDbFOMdKd58dDkgD8fXPV /l9Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id i18-20020a17090671d200b0076ed46e4445si622393ejk.810.2022.09.09.08.51.34; Fri, 09 Sep 2022 08:51:34 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 19AA068BBA2; Fri, 9 Sep 2022 18:49:23 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6177E68BB4B for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 4BD8BC00B8 for ; Fri, 9 Sep 2022 18:49:01 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:53 +0300 Message-Id: <20220909154859.68954-12-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 12/18] lavu/riscv: float vector multiply-accumulate with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: QbisStHOFzel From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 6 +++++ libavutil/riscv/float_dsp_rvv.S | 38 ++++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 49a4c95a0b..b63da72acd 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -24,11 +24,15 @@ void ff_vector_fmul_rvv(float *dst, const float *src0, const float *src1, int len); +void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul, + int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); +void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul, + int len); void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul, int len); @@ -39,10 +43,12 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) if (flags & AV_CPU_FLAG_ZVE32F) { fdsp->vector_fmul = ff_vector_fmul_rvv; + fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; + fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv; fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv; } } diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 65c3a77b01..5a7d92abd6 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -36,6 +36,25 @@ func ff_vector_fmul_rvv, zve32f ret endfunc +// (a0) += (a1) * fa0 [0..a2-1] +func ff_vector_fmac_scalar_rvv, zve32f +NOHWF fmv.w.x fa0, a2 +NOHWF mv a2, a3 + +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v24, (a1) + add a1, a1, t1 + vle32.v v16, (a0) + vfmacc.vf v16, fa0, v24 + sub a2, a2, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_fmul_scalar_rvv, zve32f NOHWF fmv.w.x fa0, a2 @@ -71,6 +90,25 @@ func ff_vector_dmul_rvv, zve64d ret endfunc +// (a0) += (a1) * fa0 [0..a2-1] +func ff_vector_dmac_scalar_rvv, zve64d +NOHWD fmv.d.x fa0, a2 +NOHWD mv a2, a3 + +1: vsetvli t0, a2, e64, m8, ta, ma + slli t1, t0, 3 + vle64.v v24, (a1) + add a1, a1, t1 + vle64.v v16, (a0) + vfmacc.vf v16, fa0, v24 + sub a2, a2, t0 + vse64.v v16, (a0) + add a0, a0, t1 + bnez a2, 1b + + ret +endfunc + // (a0) = (a1) * fa0 [0..a2-1] func ff_vector_dmul_scalar_rvv, zve64d NOHWD fmv.d.x fa0, a2 From patchwork Fri Sep 9 15:48:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37795 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp997976pzh; Fri, 9 Sep 2022 08:49:52 -0700 (PDT) X-Google-Smtp-Source: AA6agR41Dk62p3u3EvqD19dtwRm8oBrTTQ9fxNwYToVvur+80SA+bZ3DSkvxd3ogvS6+u9XXsBg6 X-Received: by 2002:a17:907:6e14:b0:730:a229:f747 with SMTP id sd20-20020a1709076e1400b00730a229f747mr10781870ejc.202.1662738592697; Fri, 09 Sep 2022 08:49:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738592; cv=none; d=google.com; s=arc-20160816; b=XlEUu04TFc2RdZMgPvMjuM5y1QmrjnXUghHzUOkLaXuHliIL9xv6WG/Ygp7+k+yZP1 RcZXG0J6EQ3ojBa2DFLzjarzldcXrGwkQCPu9xUixlzygWTKA3VfA49mlMfAFtLhJeQH mNTNxRWcEjYY+0XwgVxJZxYWAihVvBkn9wInsm4lGd1o2jNGIwMBGY8ROSQke7bmlBEj y2pLaBoZeDTMF8Qo/fVwApjiDrMqwBmDj8FkR3CIdvdjzdjCHwMy2dMQ9B8CapactSFz R4V9LJo/ckZJEf7FqKyQQNIcguumyRW8/9lp5tTy56uozzDZ0/ThGek6qn6ax0ivUlrY +rjA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=cINSXvzTe0pN+aG5zm6iXBGyDN5gUuI+UzdJZtKxXzk=; b=TmM6sJyIOU0ydN5f40KSSQDaMoTEghgOxVdZx56CLks6P1aUQyRRuDDVHrkjIjSwYz Mgr+T8yrLs3q0sG8Ui5f9kTSQUx5aYVDlcyj/EwHUAVudBQYrAAyFy/C5IWywgJtCV5X TP2mHzij79kuFnEtD7ieGsnTjMjLZvtRLtIPlEFggOP/zruu/FfdNgQ4+hqpyAd0HRRk vnk46jn4iS8gR0HthaCKH28uIu5CQb2AnBf4vR3hbKwntD0BD967UfUa0fREAcS6rJSy ErLWaI2GhHUat9wu2GQ1Zg0MZSURvSQTsjsa+K56l+F3D2HpV5f22oLPmlIOibFU3irK yxDw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id r16-20020a05640251d000b0044f2a308edesi748914edd.498.2022.09.09.08.49.52; Fri, 09 Sep 2022 08:49:52 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1D8D368BB56; Fri, 9 Sep 2022 18:49:11 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 69B3C68BB1F for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 75DB4C00B9 for ; Fri, 9 Sep 2022 18:49:01 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:54 +0300 Message-Id: <20220909154859.68954-13-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 13/18] lavu/riscv: float vector multiplication-addition with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: SpRic7p/B8kn From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 19 +++++++++++++++++++ 2 files changed, 22 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index b63da72acd..9b31ed2ed1 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -28,6 +28,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); +void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, + const float *src2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -45,6 +47,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul = ff_vector_fmul_rvv; fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 5a7d92abd6..efbf12179f 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -73,6 +73,25 @@ NOHWF mv a2, a3 ret endfunc +// (a0) = (a1) * (a2) + (a3) [0..a4-1] +func ff_vector_fmul_add_rvv, zve32f +1: vsetvli t0, a4, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v8, (a1) + add a1, a1, t1 + vle32.v v16, (a2) + add a2, a2, t1 + vle32.v v24, (a3) + add a3, a3, t1 + vfmadd.vv v8, v16, v24 + sub a4, a4, t0 + vse32.v v8, (a0) + add a0, a0, t1 + bnez a4, 1b + + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv, zve64d 1: vsetvli t0, a3, e64, m8, ta, ma From patchwork Fri Sep 9 15:48:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37797 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998096pzh; Fri, 9 Sep 2022 08:50:09 -0700 (PDT) X-Google-Smtp-Source: AA6agR4XGrGGhOdO4uCUpRTR7CoF97D8SlcyeyMzSTfLZ0XyO4pR3GEShSev3EQimE8+EC0deeIR X-Received: by 2002:a17:906:ee8e:b0:730:3646:d178 with SMTP id wt14-20020a170906ee8e00b007303646d178mr10505927ejb.426.1662738608986; Fri, 09 Sep 2022 08:50:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738608; cv=none; d=google.com; s=arc-20160816; b=dmQN18h7bV9N/jAkPZTAWCc3rdch0ACi52JH0h442LId6V/NjMRw3ZFXAUNWJR3cfo Mw1ImJ9BdlO7IekcEEl7FJXu/DHO6euqS/iQZSyavmBcijCuCk7uc8ywnXQ0AiXTm4ZO SND/vR+YRoMMP4M/SmOidlmRFGC1uCtOYRyDYlxvYiCrzpeAphoBV1mtL198vFzM47LB 9itsspQQ5XhObMy7yz+z8kwXI3O+3WiuvzVfM2OIOKpZiFZGjwQslDFhgdyZDQ7wmqay 9R9p7kj7uuDvqxYpbqyOJv4j9zlGNoVIL7Z//rvm7teO5xfnEYXxyxHBGEgkfOsxfFLe 0qaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=m7ejFF85fWpDqXHxmq2y6FVRR5pFd45LKdT6GOwSesc=; b=eZrvnNP8pGYogxosGXW7HiyKMfY+plkm1VhRDC/TEBX0eI61TdtEq+HNp9/aGgYp2r A9q3SQSkieqiH0c1wB8EkFDprHKpsTn073RNeuaTcAwqt2let2m6LoGml2N5lKuHoMt6 F7qvLGRv49fnckAhLFWLyJF6sPSkQ2SNtAkmVQtZtDPfzEPCfwvKvUhULh7oNNRZfbrG jXY+wn+VbgYhP/mDPdqYucc9scxhct8S3NU4ZtYkkTNEq+EmIRCyqSaiVP17pjWcFVBy cYuJ/jzurgz5LXeMAPXL57/qsoProUhVBLEp8sZvQRO1LjZ40EdziS1wlU5F8SznKznG A9Xg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id sb30-20020a1709076d9e00b0073d8e26e78bsi651778ejc.960.2022.09.09.08.50.08; Fri, 09 Sep 2022 08:50:08 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id CDEE468BB6A; Fri, 9 Sep 2022 18:49:12 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 9478768BB1F for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 9F8ECC00BA for ; Fri, 9 Sep 2022 18:49:01 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:55 +0300 Message-Id: <20220909154859.68954-14-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 14/18] lavu/riscv: float vector sum-and-difference with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: MUZnrZtTHia+ From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 18 ++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 9b31ed2ed1..4980214821 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -30,6 +30,7 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); +void ff_butterflies_float_rvv(float *v1, float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -48,6 +49,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; + fdsp->butterflies_float = ff_butterflies_float_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index efbf12179f..1c3b08b94f 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -92,6 +92,24 @@ func ff_vector_fmul_add_rvv, zve32f ret endfunc +// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] +func ff_butterflies_float_rvv, zve32f +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + vle32.v v24, (a1) + vfadd.vv v0, v16, v24 + vfsub.vv v8, v16, v24 + sub a2, a2, t0 + vse32.v v0, (a0) + add a0, a0, t1 + vse32.v v8, (a1) + add a1, a1, t1 + bnez a2, 1b + + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv, zve64d 1: vsetvli t0, a3, e64, m8, ta, ma From patchwork Fri Sep 9 15:48:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37798 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998186pzh; Fri, 9 Sep 2022 08:50:17 -0700 (PDT) X-Google-Smtp-Source: AA6agR5+3G+YThEAd2yz3tdB+7k6dOS5lrLnBBC9jnua7bTazPDMU2INLWu/qHCjX+HXLMjtQLSJ X-Received: by 2002:a17:907:e91:b0:741:a0a2:cbb8 with SMTP id ho17-20020a1709070e9100b00741a0a2cbb8mr10302791ejc.637.1662738617492; Fri, 09 Sep 2022 08:50:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738617; cv=none; d=google.com; s=arc-20160816; b=aMsceWj/n34AL9YvoX47k7DYxR2bkiQxbdO0N4pa0cV2sP1Z4+JUkz3ehU2oE3WGmN r8AQtKiEwgAvHjCQV9r3O99FCk9JGlF07WnjnLDgA3Cthd6zrEF5XBooSK9GerPpNful GFrzfY7gLAH/T8Fvqesp/Rcmp51k6w2CZSzOGVwkdjBRgiXds3sk6EJ76Rl82x1Lvg5I s3HdCbTOMMosZCfgvHTZbQ/5pV14Ix37sYmMIEmyDeBRsQ4/2S+lOy0WUk1gLP6jbMaW CIakJX43hzsGLcgpWV+knolCPlGrjkeVcFgFMknxNM6yz5aZeLO+iIu7gXzos0bcFicV I21Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=iFrUEysRNZaeFQPW4phUK07bqtQpR0v2uKgfuEvHscg=; b=vzCX4MNkNL+73uBrPKX87mU8thKM5YwOJGfjnuvaKi+NjH348fsJY+dXhAAa6dx3eG QIAYxpBrzIODut0e9rZI8LjAsVtsjZRvLAjynVGdWRx5afM5BDb3DNFMHXydbC4HND4l SrzLT7PFj3Op5d0KaL+Of+FlLYIl/X85u3tCPz0KOFuEaYHIOdl7uTcgzXa5Y8SP5vLg sLZ+wJtAAxtGpBu3bEHfED/HCPd/vLUruQjsMo6VLTMchmqSwpSSnZsC0DrosZzx6wxL +5LmlRSxjkfKj8DWEisTlg3uKqD3XkxcD3E4zSf3J3/cfi4fsTTg43TYlFUAnMIyX5hP cx/g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id hq3-20020a1709073f0300b0076f591c4692si929878ejc.330.2022.09.09.08.50.17; Fri, 09 Sep 2022 08:50:17 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E840C68BB28; Fri, 9 Sep 2022 18:49:13 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 9B72D68BB26 for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id C8695C00BB for ; Fri, 9 Sep 2022 18:49:01 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:56 +0300 Message-Id: <20220909154859.68954-15-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 15/18] lavu/riscv: float reversed vector multiplication with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 37fr4tNL/Adl From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 22 ++++++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 4980214821..e6a5efbf68 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -30,6 +30,8 @@ void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); +void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, + const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, @@ -49,6 +51,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; + fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 1c3b08b94f..b376392294 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -92,6 +92,28 @@ func ff_vector_fmul_add_rvv, zve32f ret endfunc +// (a0) = (a1) * reverse(a2) [0..a3-1] +func ff_vector_fmul_reverse_rvv, zve32f + add t3, a3, -1 + li t2, -4 // byte stride + slli t3, t3, 2 + add a2, a2, t3 + +1: vsetvli t0, a3, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a1) + add a1, a1, t1 + vlse32.v v24, (a2), t2 + sub a2, a2, t1 + vfmul.vv v16, v16, v24 + sub a3, a3, t0 + vse32.v v16, (a0) + add a0, a0, t1 + bnez a3, 1b + + ret +endfunc + // (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] func ff_butterflies_float_rvv, zve32f 1: vsetvli t0, a2, e32, m8, ta, ma From patchwork Fri Sep 9 15:48:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37796 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998027pzh; Fri, 9 Sep 2022 08:50:01 -0700 (PDT) X-Google-Smtp-Source: AA6agR4izJ7/WXT9EJqbF53A9ooZeV9Dl6DiLPIAtyKBXGsH6psZfE4Bt4m+d5z43ZeNKxanXbAq X-Received: by 2002:a17:907:96a3:b0:740:a894:f with SMTP id hd35-20020a17090796a300b00740a894000fmr10320622ejc.460.1662738600934; Fri, 09 Sep 2022 08:50:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738600; cv=none; d=google.com; s=arc-20160816; b=pcKp0oMSxlqb3WzhWlupbetzgIr1CYcFQPwBerWgI94n4tZcY8mij39S6/iFGyG+Bi jFz8CUbLpFGx6O6LVP303x27U8uYaW/yI+q3sfWaYJCKw5lEQUvHgHxruWI82hgPvx9Y UGjeuzhiO0yTPv+zRR/1cEojtHBa7ySEr/2nKNFXYFw9Lbter3ok84I5VaPwImKHmlq8 CIZzxLzHnzN5VB7KXqMVBDB+1Mh5pivAiMYhJFbU9dph+H8hpJ0rog+gtwNZLTPVEGkZ /4TczjIL+EVpvJZtB7Bj9vTOGwNpjP04yJXl1pRiZx59tRSUIDsX+peEZdRCOH9TOXWI 0VIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=QxFu0j1Z3wLEGmBqFOrUCfZyJFYwBMOQoylv88D+eBU=; b=RcgjWA+7iozWGjOPAo63pNrzk+CquBr2VMZqYQPox3Gpyj8U8X0WIg1tt+HrONpqBJ XpRrkDSUxmAq6oZu2wSMjDI0IUOhLcIxdqOmWFcn2MenrfaNfcRQRjkOQKDwpDejZZsT 25/YOqQytpRvoEe/xerzAH0xwsYC1Dc9lRXr9yCUZyG2xVCuhtB7IuhSoDzOOVeKBfQo FIzVMIyAkRhOH29eeh2pPNNgRM/p2U6s3W/h+3LW4aGugD7LCzzQcaNd52bZH+L3SC2K 5OPAC0Qud1yFmIaDAs+utAetz9bVayIEeRNlnmPwU2ZGZJ2RnLaRql8z4KGzzNw5VRMS GmdQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id q18-20020a170906361200b007788260dfdbsi542460ejb.862.2022.09.09.08.50.00; Fri, 09 Sep 2022 08:50:00 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F3B3968BB5B; Fri, 9 Sep 2022 18:49:11 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8A54768BB25 for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id F1090C00BC for ; Fri, 9 Sep 2022 18:49:01 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:57 +0300 Message-Id: <20220909154859.68954-16-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 16/18] lavu/riscv: float vector windowed overlap/add with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: yJy3tWjbO0O8 From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 3 +++ libavutil/riscv/float_dsp_rvv.S | 35 ++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index e6a5efbf68..99cc8afd31 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -28,6 +28,8 @@ void ff_vector_fmac_scalar_rvv(float *dst, const float *src, float mul, int len); void ff_vector_fmul_scalar_rvv(float *dst, const float *src, float mul, int len); +void ff_vector_fmul_window_rvv(float *dst, const float *src0, + const float *src1, const float *win, int len); void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, const float *src2, int len); void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, @@ -50,6 +52,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul = ff_vector_fmul_rvv; fdsp->vector_fmac_scalar = ff_vector_fmac_scalar_rvv; fdsp->vector_fmul_scalar = ff_vector_fmul_scalar_rvv; + fdsp->vector_fmul_window = ff_vector_fmul_window_rvv; fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index b376392294..65daaa2d27 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -73,6 +73,41 @@ NOHWF mv a2, a3 ret endfunc +func ff_vector_fmul_window_rvv, zve32f + // a0: dst, a1: src0, a2: src1, a3: window, a4: length + addi t0, a4, -1 + add t1, t0, a4 + slli t0, t0, 2 + slli t1, t1, 2 + add a2, a2, t0 + add t0, a0, t1 + add t3, a3, t1 + li t1, -4 // byte stride + +1: vsetvli t2, a4, e32, m4, ta, ma + slli t4, t2, 2 + vle32.v v16, (a1) + add a1, a1, t4 + vlse32.v v20, (a2), t1 + sub a2, a2, t4 + vle32.v v24, (a3) + add a3, a3, t4 + vlse32.v v28, (t3), t1 + sub t3, t3, t4 + vfmul.vv v0, v16, v28 + sub a4, a4, t2 + vfmul.vv v8, v16, v24 + vfnmsac.vv v0, v20, v24 + vfmacc.vv v8, v20, v28 + vse32.v v0, (a0) + add a0, a0, t4 + vsse32.v v8, (t0), t1 + sub t0, t0, t4 + bnez a4, 1b + + ret +endfunc + // (a0) = (a1) * (a2) + (a3) [0..a4-1] func ff_vector_fmul_add_rvv, zve32f 1: vsetvli t0, a4, e32, m8, ta, ma From patchwork Fri Sep 9 15:48:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37799 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998257pzh; Fri, 9 Sep 2022 08:50:27 -0700 (PDT) X-Google-Smtp-Source: AA6agR76a9V+xWss94F6WxXT6u6AEdPLntvGI5+hnnwOu4YoI/2YSQw4MZJDooVrYmMrvs9ocZeL X-Received: by 2002:a17:907:2bf7:b0:730:996d:5e8d with SMTP id gv55-20020a1709072bf700b00730996d5e8dmr10529254ejc.382.1662738626868; Fri, 09 Sep 2022 08:50:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738626; cv=none; d=google.com; s=arc-20160816; b=eE2kvbjbpcASeEPtDR4ypiknT1sUtT1xRvw3/GRxOhm7hWq7s/+TSTpAVjKwy+nz/Y fmExryIq51gY7FAWa6BkXsOgq3oTSp7qhsB/7lBCXeF183zQD8eBQX1NFDuyf22mxXbJ ihLO9YzyDybAPlPEyhps4agDwvAxRk4L9aA4zMVIk5ubRuX6nToI7Zxwym6oL5CNbIzO tHSByN9cwDfPc8KOGSgoxroPZFwDNFgxvHfluRhlZw6qTm3004ZzJDWsuGZta8r4NYkp ++O40RYQ7nyFne8YdphxcwteuQY6L4q03oL/rmrdayHZKIx/KVkqaJUpnLMHuLT2ru/K Vatw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=r8N5UDG+/3PEx6wF7Hwv630JUndvMGux23gMCZI74KE=; b=SxDck2Tm4mWen5DlggmiqkhbdER8mi9UxxAg6mH4zzfHhqe6YyMb+q59eqwt1Hf1Kf IyZlc94gz2uqVuGm1bI3RJF7Hk4ZDB7dYFELjljQgOWtpC2twwSqV4KzaxSVYGO+pUHV 3hTSXAwf8rgGldTw9OfI9ceQnpbufkEr2UzbW0YnM36SjsZsDIlQ9TsrmpqIccvIjQIy SXXDAVf+Om682IY40jaA0xB7fFMcmk3klxPHBQEhuS26KKpqquXsFx7CAWCzbX8GeXtH UiZLDKkBC4kL+EWmeyggMGCJxm25Cu5WAN9FdcnFH/RkDlaT48tkoVga8zi5gQeLZf3b GbeA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id sb37-20020a1709076da500b00730fdb36019si628763ejc.21.2022.09.09.08.50.26; Fri, 09 Sep 2022 08:50:26 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F2B0268BB6F; Fri, 9 Sep 2022 18:49:14 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A5FC268BB27 for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 282EEC00BD for ; Fri, 9 Sep 2022 18:49:02 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:58 +0300 Message-Id: <20220909154859.68954-17-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 17/18] lavu/riscv: float vector dot product with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: RSdmYYqSbrSk From: Rémi Denis-Courmont --- libavutil/riscv/float_dsp_init.c | 2 ++ libavutil/riscv/float_dsp_rvv.S | 21 +++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c index 99cc8afd31..9c5e06bae9 100644 --- a/libavutil/riscv/float_dsp_init.c +++ b/libavutil/riscv/float_dsp_init.c @@ -35,6 +35,7 @@ void ff_vector_fmul_add_rvv(float *dst, const float *src0, const float *src1, void ff_vector_fmul_reverse_rvv(float *dst, const float *src0, const float *src1, int len); void ff_butterflies_float_rvv(float *v1, float *v2, int len); +float ff_scalarproduct_float_rvv(const float *v1, const float *v2, int len); void ff_vector_dmul_rvv(double *dst, const double *src0, const double *src1, int len); @@ -56,6 +57,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp) fdsp->vector_fmul_add = ff_vector_fmul_add_rvv; fdsp->vector_fmul_reverse = ff_vector_fmul_reverse_rvv; fdsp->butterflies_float = ff_butterflies_float_rvv; + fdsp->scalarproduct_float = ff_scalarproduct_float_rvv; if (flags & AV_CPU_FLAG_ZVE64D) { fdsp->vector_dmul = ff_vector_dmul_rvv; diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S index 65daaa2d27..81bd0e510a 100644 --- a/libavutil/riscv/float_dsp_rvv.S +++ b/libavutil/riscv/float_dsp_rvv.S @@ -167,6 +167,27 @@ func ff_butterflies_float_rvv, zve32f ret endfunc +// a0 = (a0).(a1) [0..a2-1] +func ff_scalarproduct_float_rvv, zve32f + vsetvli zero, zero, e32, m8, ta, ma + vmv.s.x v8, zero + +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + add a0, a0, t1 + vle32.v v24, (a1) + add a1, a1, t1 + vfmul.vv v16, v16, v24 + sub a2, a2, t0 + vfredusum.vs v8, v16, v8 + bnez a2, 1b + + vfmv.f.s fa0, v8 +NOHWF fmv.x.w a0, fa0 + ret +endfunc + // (a0) = (a1) * (a2) [0..a3-1] func ff_vector_dmul_rvv, zve64d 1: vsetvli t0, a3, e64, m8, ta, ma From patchwork Fri Sep 9 15:48:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 37800 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp998309pzh; Fri, 9 Sep 2022 08:50:35 -0700 (PDT) X-Google-Smtp-Source: AA6agR4kCn6ogZUk3hpD7cpd+MpVs63YsT5eRSD5NFXq8RXqMtB4OCJaQt5LGR6iXL8pqlZHIuGG X-Received: by 2002:a05:6402:51d1:b0:44b:ea34:6c0a with SMTP id r17-20020a05640251d100b0044bea346c0amr11882698edd.369.1662738635375; Fri, 09 Sep 2022 08:50:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662738635; cv=none; d=google.com; s=arc-20160816; b=HX9z4YVL5xnGImfxSLWuY5qbx4rjLfFTv+ziozmhRIOviygQD7Vcf3ggyF39VkXfoN CAbLJHD9CcAo2laGPDknJWfLSmTQTU3plpFiRC3nwvzUQXNwEiFnMipXGUo7BjuTK6U4 7Dfk8EIIi/5ON9NRQ8oWXVK2FAVJsruxRXXqcjSZ4wDgfV3BnRrnJbqrIK7JJHwAbdf8 xmHfMa70nmcRPeJwn8UnWZRq5E70LqPDTNnexZPQzPP6Koys+SCo16oyZVtFtWmjqOtD YV8hTpgdccS5EXgaiKjvLUR3O5FAOPszsgkzttpcS/dhWCHi8eMM3tjKOdEjsyUsgdsJ bT9A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=6tG66ps41/uZ49zpWDkWCXLGdJpkQYXVkow0uUhhCBs=; b=n9tjqvk6cyLpjSoxNqcVlb1DjWBbcIyQwR0VyKPZ3YwyfZufudHdnPeIEIZ/aWeaPN wEgmg1/UzJYbvYxFRtCBTxXtL/mojgMjrDzZQOHab/V/ylw5GRdMM8a6JJtFdZtwGIia njrv5tQBA+3WJIknQIER9brWszS0jQNlCBPGamH7etnRlrAxE5tj0U7TLSSHCASYZC4j MjphnS929uqYsLyP5PwaR4D+g7YDpXRppGOwWGQaZGrZkWMh+QZgp/BoGpmHzKUCReBg Glv2P1egc/IO0vSDGJCwQKcKC1JFjBgU0N2WIRVKStTVal8Vvb5MUb+O7+/hM0iY00v1 wSiA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id s10-20020a056402520a00b0044f441d2372si867184edd.88.2022.09.09.08.50.34; Fri, 09 Sep 2022 08:50:35 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5889768BB79; Fri, 9 Sep 2022 18:49:16 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id AA90268BB29 for ; Fri, 9 Sep 2022 18:49:05 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 51014C00BE for ; Fri, 9 Sep 2022 18:49:02 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Fri, 9 Sep 2022 18:48:59 +0300 Message-Id: <20220909154859.68954-18-remi@remlab.net> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 18/18] lavu/riscv: fixed vector sum-and-difference with RVV X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: y6UXsmnYsX0r From: Rémi Denis-Courmont --- libavutil/fixed_dsp.c | 4 +++- libavutil/fixed_dsp.h | 1 + libavutil/riscv/Makefile | 4 +++- libavutil/riscv/fixed_dsp_init.c | 33 +++++++++++++++++++++++++++ libavutil/riscv/fixed_dsp_rvv.S | 38 ++++++++++++++++++++++++++++++++ 5 files changed, 78 insertions(+), 2 deletions(-) create mode 100644 libavutil/riscv/fixed_dsp_init.c create mode 100644 libavutil/riscv/fixed_dsp_rvv.S diff --git a/libavutil/fixed_dsp.c b/libavutil/fixed_dsp.c index 154f3bc2d3..bc847949dc 100644 --- a/libavutil/fixed_dsp.c +++ b/libavutil/fixed_dsp.c @@ -162,7 +162,9 @@ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int bit_exact) fdsp->butterflies_fixed = butterflies_fixed_c; fdsp->scalarproduct_fixed = scalarproduct_fixed_c; -#if ARCH_X86 +#if ARCH_RISCV + ff_fixed_dsp_init_riscv(fdsp); +#elif ARCH_X86 ff_fixed_dsp_init_x86(fdsp); #endif diff --git a/libavutil/fixed_dsp.h b/libavutil/fixed_dsp.h index fec806ff2d..1217d3a53b 100644 --- a/libavutil/fixed_dsp.h +++ b/libavutil/fixed_dsp.h @@ -161,6 +161,7 @@ typedef struct AVFixedDSPContext { */ AVFixedDSPContext * avpriv_alloc_fixed_dsp(int strict); +void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp); void ff_fixed_dsp_init_x86(AVFixedDSPContext *fdsp); /** diff --git a/libavutil/riscv/Makefile b/libavutil/riscv/Makefile index 89a8d0d990..1597154ba5 100644 --- a/libavutil/riscv/Makefile +++ b/libavutil/riscv/Makefile @@ -1,3 +1,5 @@ OBJS += riscv/float_dsp_init.o \ + riscv/fixed_dsp_init.o \ riscv/cpu.o -RVV-OBJS += riscv/float_dsp_rvv.o +RVV-OBJS += riscv/float_dsp_rvv.o \ + riscv/fixed_dsp_rvv.o diff --git a/libavutil/riscv/fixed_dsp_init.c b/libavutil/riscv/fixed_dsp_init.c new file mode 100644 index 0000000000..08d4c4d9a7 --- /dev/null +++ b/libavutil/riscv/fixed_dsp_init.c @@ -0,0 +1,33 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavutil/fixed_dsp.h" + +void ff_butterflies_fixed_rvv(int *v1, int *v2, int len); + +av_cold void ff_fixed_dsp_init_riscv(AVFixedDSPContext *fdsp) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_ZVE32X) + fdsp->butterflies_fixed = ff_butterflies_fixed_rvv; +} diff --git a/libavutil/riscv/fixed_dsp_rvv.S b/libavutil/riscv/fixed_dsp_rvv.S new file mode 100644 index 0000000000..beb1b949f7 --- /dev/null +++ b/libavutil/riscv/fixed_dsp_rvv.S @@ -0,0 +1,38 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config.h" +#include "asm.S" + +// (a0) = (a0) + (a1), (a1) = (a0) - (a1) [0..a2-1] +func ff_butterflies_fixed_rvv, zve32x +1: vsetvli t0, a2, e32, m8, ta, ma + slli t1, t0, 2 + vle32.v v16, (a0) + vle32.v v24, (a1) + vadd.vv v0, v16, v24 + vsub.vv v8, v16, v24 + sub a2, a2, t0 + vse32.v v0, (a0) + add a0, a0, t1 + vse32.v v8, (a1) + add a1, a1, t1 + bnez a2, 1b + + ret +endfunc