From patchwork Sat Oct 1 12:32:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38492 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp252118pzh; Sat, 1 Oct 2022 05:33:06 -0700 (PDT) X-Google-Smtp-Source: AMsMyM75/DgxHd+ufHo3R88gwqbjXjwib18byCigrfsMM6k//Hg7elJMRacYbOvLPD0kTOdzjjf9 X-Received: by 2002:a05:6402:5249:b0:451:67ff:f02 with SMTP id t9-20020a056402524900b0045167ff0f02mr11748034edd.227.1664627586568; Sat, 01 Oct 2022 05:33:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664627586; cv=none; d=google.com; s=arc-20160816; b=HjHQirHs8Y6zHjvsQzwdKU+2Sk2WsQtzYTxygcwvb5BSiDCfr3PSs6HqdKcVSkn0Hz +OoN4NpLFVdP8ZlUHkiz76S/bfgv6WlJoVXA9rjPGgDK3MTg3a1HsBkCLUhkQKNVu9AL oRRQ179LpA9vDInXrgZmJR5b8t6iIGYD6UOyxvIP7mCQtd0OCZnPKVr9SXa4WX4cIa9K BKIV1ZHDz5yY5YGQVN6ciP0NZxYTtz3PgdmKxUfmU2abzZLMU0cya+kFZEWgD2N8FiOR v7Ym+Y3jmo0iTKMWvPHrwLxAlaHk9257cdkPywbI0GLdnSiPpj6rtvxnooIYMHL+gCKF 5LTg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=HnxAtNZfQJGpDHUHi37b1tqOOQ+3KbwS3IetMO85GH0=; b=jWXQAn0Yxm/oOZ42K34vrfadtBAlwW8AO+OTMgO29fnkt6KCx91mkZRYbpa8MF1UAt GE5y5oEuD0sVbXC+JfME/RjZPbOqEQANaWRZ8aPUA9LRSWTRAcJlZmdsEajdC6rj09Nf 9JOUfFB5prXFqs2FEwZ453J65dzmlxUdvsyo5rCeZVgCoqcHqU/rFwTWXUHraHIl5zuX 033HuDOR9HkUOlJXIkaeS/W0mNxVkBInWbMYOM4nFQO+CutZFvJTapjetjdaEI0tyf/z 0xiUZ1h/4vvnROvqn5kuHC7gwBLmWm4Rk3QnZUQvVnc5JC1DcRbY/gB5wwowQ+ERSDoc kNVw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org 
Return-Path:
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id e5-20020a170906844500b0077b2ad71224si3765065ejy.136.2022.10.01.05.33.06; Sat, 01 Oct 2022 05:33:06 -0700 (PDT)
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org
Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 15A0368BB36; Sat, 1 Oct 2022 15:32:49 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 9765E68BB06 for ; Sat, 1 Oct 2022 15:32:40 +0300 (EEST)
Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 643DDC00AF for ; Sat, 1 Oct 2022 15:32:40 +0300 (EEST)
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 1 Oct 2022 15:32:39 +0300
Message-Id: <20221001123239.33042-3-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <5606030.DvuYhMxLoT@basile.remlab.net>
References: <5606030.DvuYhMxLoT@basile.remlab.net>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 3/3] lavc/opusdsp: RISC-V V (256-bit vectors) postfilter
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: X8956edpS25P

From: Rémi Denis-Courmont

This adds a variant of the postfilter for use with 256-bit vectors (or larger).
Since the function requires 160-bit logical vectors, we can cut the group multiplier down to just one. The different vector type is passed via register. Unfortunately, there is no VSETIVL instruction, so the constant vector size (5) also needs to be passed via a register.
---
 libavcodec/riscv/opusdsp_init.c | 17 ++++++++++++++---
 libavcodec/riscv/opusdsp_rvv.S  | 10 +++++++++-
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/libavcodec/riscv/opusdsp_init.c b/libavcodec/riscv/opusdsp_init.c
index 18d3892329..433b71e710 100644
--- a/libavcodec/riscv/opusdsp_init.c
+++ b/libavcodec/riscv/opusdsp_init.c
@@ -25,14 +25,25 @@
 #include "libavutil/riscv/cpu.h"
 #include "libavcodec/opusdsp.h"
 
-void ff_opus_postfilter_rvv(float *data, int period, float *gains, int len);
+void ff_opus_postfilter_rvv_32(float *data, int period, float *gains, int len);
+void ff_opus_postfilter_rvv_16(float *data, int period, float *gains, int len);
 
 av_cold void ff_opus_dsp_init_riscv(OpusDSP *d)
 {
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if ((flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16)
-        d->postfilter = ff_opus_postfilter_rvv;
+    if (flags & AV_CPU_FLAG_RVV_I32)
+        switch (ff_get_rv_vlenb()) {
+        default:
+            d->postfilter = ff_opus_postfilter_rvv_32;
+            break;
+        case 16:
+            d->postfilter = ff_opus_postfilter_rvv_16;
+            break;
+        case 8:
+        case 4:
+            break;
+        }
 #endif
 }
diff --git a/libavcodec/riscv/opusdsp_rvv.S b/libavcodec/riscv/opusdsp_rvv.S
index f42a9c36c5..cfe332227e 100644
--- a/libavcodec/riscv/opusdsp_rvv.S
+++ b/libavcodec/riscv/opusdsp_rvv.S
@@ -21,7 +21,15 @@
 #include "config.h"
 #include "libavutil/riscv/asm.S"
 
-func ff_opus_postfilter_rvv, zve32f
+func ff_opus_postfilter_rvv_16, zve32f
+        lvtypei a5, e32, m2, ta, ma
+        j       1f
+endfunc
+
+func ff_opus_postfilter_rvv_32, zve32f
+        lvtypei a5, e32, m1, ta, ma
+1:
+        li      a4, 5
         addi    a1, a1, 2
         slli    a1, a1, 2
         sh2add  a3, a3, a0
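
The group-multiplier reasoning in the commit message can be checked with a little arithmetic. The sketch below is not part of the patch: `min_lmul_e32` and `patch_lmul` are hypothetical helpers written only for illustration. It assumes the postfilter operates on 5 elements of 32 bits (the "160-bit logical vectors" and the constant 5 mentioned above) and that `vlenb` is the vector register size in bytes, as returned by `ff_get_rv_vlenb()`.

```c
#include <assert.h>

/* Hypothetical helper (not an FFmpeg function): smallest integer group
 * multiplier (1, 2, 4 or 8) for which one register group made of
 * `vlenb`-byte vector registers holds `elems` 32-bit elements.
 * Returns 0 if even a group of 8 registers is too small. */
static int min_lmul_e32(unsigned vlenb, unsigned elems)
{
    assert(vlenb > 0 && elems > 0);

    unsigned need = elems * 4; /* bytes required: 5 elements -> 20 bytes = 160 bits */

    for (int lmul = 1; lmul <= 8; lmul *= 2)
        if (vlenb * (unsigned)lmul >= need)
            return lmul;
    return 0;
}

/* Hypothetical paraphrase of the switch in ff_opus_dsp_init_riscv():
 * returns the group multiplier of the variant the patch installs for a
 * given vlenb, or 0 when the patch installs no RVV postfilter. */
static int patch_lmul(unsigned vlenb)
{
    switch (vlenb) {
    case 4:
    case 8:
        return 0; /* no RVV postfilter installed */
    case 16:
        return 2; /* ff_opus_postfilter_rvv_16 uses m2 */
    default:
        return 1; /* ff_opus_postfilter_rvv_32 uses m1 */
    }
}
```

With 16-byte (128-bit) registers, one register holds only four 32-bit elements, so the helper yields a multiplier of 2, matching `m2` in `ff_opus_postfilter_rvv_16`. With 32-byte (256-bit) registers or larger, a single register suffices, matching `m1` in `ff_opus_postfilter_rvv_32`. For 4- and 8-byte registers the arithmetic would allow larger groups, but the patch installs no RVV postfilter in those cases.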