From patchwork Sat Nov 11 18:18:21 2023
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 44615
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 11 Nov 2023 20:18:21 +0200
Message-ID: <20231111181821.60210-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH] lavc/opusdsp: R-V V deemphasis function

Considering the marginality of the measured performance gains (3-4%), I
suppose that we should not merge this.
Furthermore, those measurements are not expected to improve with larger
vector sizes, since the code uses only 32 bits per vector no matter what.

deemphasis_c:       7703.2
deemphasis_rvv_f32: 7452.0
---
 libavcodec/riscv/opusdsp_init.c | 10 +++++---
 libavcodec/riscv/opusdsp_rvv.S  | 43 +++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/libavcodec/riscv/opusdsp_init.c b/libavcodec/riscv/opusdsp_init.c
index 88d8e77f0e..8d363aaf37 100644
--- a/libavcodec/riscv/opusdsp_init.c
+++ b/libavcodec/riscv/opusdsp_init.c
@@ -26,14 +26,18 @@
 #include "libavcodec/opusdsp.h"
 
 void ff_opus_postfilter_rvv(float *data, int period, float *g, int len);
+float ff_opus_deemphasis_rvv(float *y, float *x, float coeff, int len);
 
 av_cold void ff_opus_dsp_init_riscv(OpusDSP *d)
 {
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if ((flags & AV_CPU_FLAG_RVV_F32) && (flags & AV_CPU_FLAG_RVB_ADDR) &&
-        (flags & AV_CPU_FLAG_RVB_BASIC))
-        d->postfilter = ff_opus_postfilter_rvv;
+    if (flags & AV_CPU_FLAG_RVV_F32) {
+        if ((flags & AV_CPU_FLAG_RVB_ADDR) && (flags & AV_CPU_FLAG_RVB_BASIC))
+            d->postfilter = ff_opus_postfilter_rvv;
+        if (ff_get_rv_vlenb() >= 8)
+            d->deemphasis = ff_opus_deemphasis_rvv;
+    }
 #endif
 }
diff --git a/libavcodec/riscv/opusdsp_rvv.S b/libavcodec/riscv/opusdsp_rvv.S
index 79ae86c30e..839edfa4b0 100644
--- a/libavcodec/riscv/opusdsp_rvv.S
+++ b/libavcodec/riscv/opusdsp_rvv.S
@@ -64,3 +64,46 @@ func ff_opus_postfilter_rvv, zve32f
 
         ret
 endfunc
+
+// FIXME: Zvl64b
+func ff_opus_deemphasis_rvv, zve32f
+        li      t0, 0x3f599a00 // 0.85f
+        li      t1, 8
+NOHWF   fmv.w.x fa0, a2
+NOHWF   mv      a2, a3
+        vsetivli zero, 1, e32, mf2, ta, ma
+        vmv.s.x v8, t0
+        fmv.w.x ft0, t0
+        blt     a2, t1, 2f
+1:
+        vlseg8e32.v v0, (a1)
+        addi    a2, a2, -8
+        vfmacc.vf v0, fa0, v8
+        addi    a1, a1, 8 * 4
+        vfmacc.vf v1, ft0, v0
+        vfmacc.vf v2, ft0, v1
+        vfmacc.vf v3, ft0, v2
+        vfmacc.vf v4, ft0, v3
+        vfmacc.vf v5, ft0, v4
+        vfmacc.vf v6, ft0, v5
+        vfmacc.vf v7, ft0, v6
+        vfmv.f.s fa0, v7
+        vsseg8e32.v v0, (a0)
+        addi    a0, a0, 8 * 4
+        bge     a2, t1, 1b
+2:
+        beqz    a2, 4f
+3:
+        flw     fa1, (a1)
+        addi    a2, a2, -1
+        fmadd.s fa0, ft0, fa0, fa1
+        addi    a1, a1, 4
+        fsw     fa0, (a0)
+        addi    a0, a0, 4
+        bnez    a2, 3b
+4:
+        ret
+
+NOHWF   fmv.x.w a0, fa0
+        ret
+endfunc
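
For readers following the assembly, the recurrence it computes can be
sketched in C. This is only an illustration under stated assumptions, not
the FFmpeg implementation: the names deemphasis_scalar, deemphasis_blocked
and EMPH_COEFF are hypothetical, and EMPH_COEFF is simply the float that
the bit pattern 0x3f599a00 from the patch encodes. The blocked variant
mirrors the structure of the RVV loop (eight samples per iteration, each
depending on the previous one, then a scalar tail), which is presumably
why wider vectors would not be expected to help.

```c
#include <assert.h>

/* Assumption: IEEE-754 single-precision value of the pattern 0x3f599a00. */
#define EMPH_COEFF 0.850006103515625f

/* One sample per step: y[i] = x[i] + EMPH_COEFF * previous output.
 * Mirrors the scalar tail loop (label 3) of the assembly. */
static float deemphasis_scalar(float *y, const float *x, float coeff, int len)
{
    assert(len >= 0);
    for (int i = 0; i < len; i++)
        coeff = y[i] = x[i] + coeff * EMPH_COEFF;
    return coeff;
}

/* Eight samples per iteration, then the scalar tail.
 * Mirrors the main loop (label 1): load 8 inputs, chain 8 multiply-adds,
 * carry the last output into the next iteration, store 8 outputs. */
static float deemphasis_blocked(float *y, const float *x, float coeff, int len)
{
    assert(len >= 0);
    while (len >= 8) {
        float v[8];

        for (int k = 0; k < 8; k++)
            v[k] = x[k];
        v[0] += coeff * EMPH_COEFF;          /* depends on the carried state */
        for (int k = 1; k < 8; k++)
            v[k] += EMPH_COEFF * v[k - 1];   /* strictly serial dependency */
        coeff = v[7];
        for (int k = 0; k < 8; k++)
            y[k] = v[k];

        x   += 8;
        y   += 8;
        len -= 8;
    }
    return deemphasis_scalar(y, x, coeff, len);
}
```

Both functions should return the last output sample, which the caller
would pass back as coeff for the next block. The assembly uses fused
multiply-adds, so its results may differ from this sketch in the last bit.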