From patchwork Sun Oct 29 20:25:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44430
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 29 Oct 2023 22:25:57 +0200
Message-ID: <20231029202559.95350-1-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH 1/3] lavc/sbrdsp: R-V V sum64x5

sum64x5_c:       385.0
sum64x5_rvv_f32: 116.0
---
 libavcodec/riscv/Makefile      |  4 +--
 libavcodec/riscv/sbrdsp_init.c | 37 +++++++++++++++++++++++++
 libavcodec/riscv/sbrdsp_rvv.S  | 50 ++++++++++++++++++++++++++++++++++
 libavcodec/sbrdsp.h            |  1 +
 libavcodec/sbrdsp_template.c   |  2 ++
 5 files changed, 92 insertions(+), 2 deletions(-)
 create mode 100644 libavcodec/riscv/sbrdsp_init.c
 create mode 100644 libavcodec/riscv/sbrdsp_rvv.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 06815d3170..2c9af16782 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,5 +1,5 @@
-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o
-RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o
+OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o riscv/sbrdsp_init.o
+RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o riscv/sbrdsp_rvv.o
 OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_init.o \
                          riscv/ac3dsp_rvb.o
 OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_init.o
diff --git a/libavcodec/riscv/sbrdsp_init.c b/libavcodec/riscv/sbrdsp_init.c
new file mode 100644
index 0000000000..837f24e1e0
--- /dev/null
+++ b/libavcodec/riscv/sbrdsp_init.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright © 2023 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavcodec/sbrdsp.h"
+
+void ff_sbr_sum64x5_rvv(float *z);
+
+av_cold void ff_sbrdsp_init_riscv(SBRDSPContext *c)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if ((flags & AV_CPU_FLAG_RVV_F32) && (flags & AV_CPU_FLAG_RVB_ADDR)) {
+        c->sum64x5 = ff_sbr_sum64x5_rvv;
+    }
+#endif
+}
diff --git a/libavcodec/riscv/sbrdsp_rvv.S b/libavcodec/riscv/sbrdsp_rvv.S
new file mode 100644
index 0000000000..e1d548b41b
--- /dev/null
+++ b/libavcodec/riscv/sbrdsp_rvv.S
@@ -0,0 +1,50 @@
+/*
+ * Copyright © 2023 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_sbr_sum64x5_rvv, zve32f
+        li      a5, 64
+        addi    a1, a0, 64 * 4
+        addi    a2, a0, 128 * 4
+        addi    a3, a0, 192 * 4
+        addi    a4, a0, 256 * 4
+1:
+        vsetvli t0, a5, e32, m8, ta, ma
+        sub     a5, a5, t0
+        vle32.v v0, (a0)
+        vle32.v v8, (a1)
+        sh2add  a1, t0, a1
+        vle32.v v16, (a2)
+        vfadd.vv v0, v0, v8
+        sh2add  a2, t0, a2
+        vle32.v v24, (a3)
+        vfadd.vv v0, v0, v16
+        sh2add  a3, t0, a3
+        vle32.v v8, (a4)
+        vfadd.vv v0, v0, v24
+        sh2add  a4, t0, a4
+        vfadd.vv v0, v0, v8
+        vse32.v v0, (a0)
+        sh2add  a0, t0, a0
+        bnez    a5, 1b
+
+        ret
+endfunc
diff --git a/libavcodec/sbrdsp.h b/libavcodec/sbrdsp.h
index 8513c423af..49782202a7 100644
--- a/libavcodec/sbrdsp.h
+++ b/libavcodec/sbrdsp.h
@@ -48,6 +48,7 @@ extern const INTFLOAT AAC_RENAME(ff_sbr_noise_table)[][2];
 void AAC_RENAME(ff_sbrdsp_init)(SBRDSPContext *s);
 void ff_sbrdsp_init_arm(SBRDSPContext *s);
 void ff_sbrdsp_init_aarch64(SBRDSPContext *s);
+void ff_sbrdsp_init_riscv(SBRDSPContext *s);
 void ff_sbrdsp_init_x86(SBRDSPContext *s);
 void ff_sbrdsp_init_mips(SBRDSPContext *s);
 
diff --git a/libavcodec/sbrdsp_template.c b/libavcodec/sbrdsp_template.c
index 89e389d9a0..79cd2156d9 100644
--- a/libavcodec/sbrdsp_template.c
+++ b/libavcodec/sbrdsp_template.c
@@ -98,6 +98,8 @@ av_cold void AAC_RENAME(ff_sbrdsp_init)(SBRDSPContext *s)
     ff_sbrdsp_init_arm(s);
 #elif ARCH_AARCH64
     ff_sbrdsp_init_aarch64(s);
+#elif ARCH_RISCV
+    ff_sbrdsp_init_riscv(s);
 #elif ARCH_X86
     ff_sbrdsp_init_x86(s);
 #elif ARCH_MIPS

From patchwork Sun Oct 29 20:25:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44431
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 29 Oct 2023 22:25:58 +0200
Message-ID: <20231029202559.95350-2-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231029202559.95350-1-remi@remlab.net>
References: <20231029202559.95350-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 2/3] lavc/sbrdsp: R-V V sum_square

sum_square_c:       803.5
sum_square_rvv_f32: 283.2
---
 libavcodec/riscv/sbrdsp_init.c |  2 ++
 libavcodec/riscv/sbrdsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/libavcodec/riscv/sbrdsp_init.c b/libavcodec/riscv/sbrdsp_init.c
index 837f24e1e0..e0e62278b0 100644
--- a/libavcodec/riscv/sbrdsp_init.c
+++ b/libavcodec/riscv/sbrdsp_init.c
@@ -24,6 +24,7 @@
 #include "libavcodec/sbrdsp.h"
 
 void ff_sbr_sum64x5_rvv(float *z);
+float ff_sbr_sum_square_rvv(float (*x)[2], int n);
 
 av_cold void ff_sbrdsp_init_riscv(SBRDSPContext *c)
 {
@@ -32,6 +33,7 @@ av_cold void ff_sbrdsp_init_riscv(SBRDSPContext *c)
 
     if ((flags & AV_CPU_FLAG_RVV_F32) && (flags & AV_CPU_FLAG_RVB_ADDR)) {
         c->sum64x5 = ff_sbr_sum64x5_rvv;
+        c->sum_square = ff_sbr_sum_square_rvv;
     }
 #endif
 }
diff --git a/libavcodec/riscv/sbrdsp_rvv.S b/libavcodec/riscv/sbrdsp_rvv.S
index e1d548b41b..4684630953 100644
--- a/libavcodec/riscv/sbrdsp_rvv.S
+++ b/libavcodec/riscv/sbrdsp_rvv.S
@@ -48,3 +48,22 @@ func ff_sbr_sum64x5_rvv, zve32f
 
         ret
 endfunc
+
+func ff_sbr_sum_square_rvv, zve32f
+        vsetvli t0, zero, e32, m8, ta, ma
+        slli    a1, a1, 1
+        vmv.v.x v8, zero
+        vmv.s.x v0, zero
+1:
+        vsetvli t0, a1, e32, m8, tu, ma
+        vle32.v v16, (a0)
+        sub     a1, a1, t0
+        vfmacc.vv v8, v16, v16
+        sh2add  a0, t0, a0
+        bnez    a1, 1b
+
+        vfredusum.vs v0, v8, v0
+        vfmv.f.s fa0, v0
+NOHWF   fmv.x.w a0, fa0
+        ret
+endfunc

From patchwork Sun Oct 29 20:25:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44432
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 29 Oct 2023 22:25:59 +0200
Message-ID: <20231029202559.95350-3-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231029202559.95350-1-remi@remlab.net>
References: <20231029202559.95350-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 3/3] lavc/sbrdsp: R-V V neg_odd_64

With 128-bit vectors, this is mostly pointless but also harmless.
Performance gains should be more noticeable with larger vector sizes.

neg_odd_64_c:       76.2
neg_odd_64_rvv_i64: 74.7
---
 libavcodec/riscv/sbrdsp_init.c |  5 +++++
 libavcodec/riscv/sbrdsp_rvv.S  | 17 +++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/libavcodec/riscv/sbrdsp_init.c b/libavcodec/riscv/sbrdsp_init.c
index e0e62278b0..1b85b2cae9 100644
--- a/libavcodec/riscv/sbrdsp_init.c
+++ b/libavcodec/riscv/sbrdsp_init.c
@@ -25,6 +25,7 @@
 
 void ff_sbr_sum64x5_rvv(float *z);
 float ff_sbr_sum_square_rvv(float (*x)[2], int n);
+void ff_sbr_neg_odd_64_rvv(float *x);
 
 av_cold void ff_sbrdsp_init_riscv(SBRDSPContext *c)
 {
@@ -35,5 +36,9 @@ av_cold void ff_sbrdsp_init_riscv(SBRDSPContext *c)
         c->sum64x5 = ff_sbr_sum64x5_rvv;
         c->sum_square = ff_sbr_sum_square_rvv;
     }
+#if __riscv_xlen >= 64
+    if ((flags & AV_CPU_FLAG_RVV_I64) && (flags & AV_CPU_FLAG_RVB_ADDR))
+        c->neg_odd_64 = ff_sbr_neg_odd_64_rvv;
+#endif
 #endif
 }
diff --git a/libavcodec/riscv/sbrdsp_rvv.S b/libavcodec/riscv/sbrdsp_rvv.S
index 4684630953..b510190b15 100644
--- a/libavcodec/riscv/sbrdsp_rvv.S
+++ b/libavcodec/riscv/sbrdsp_rvv.S
@@ -67,3 +67,20 @@ func ff_sbr_sum_square_rvv, zve32f
 NOHWF   fmv.x.w a0, fa0
         ret
 endfunc
+
+#if __riscv_xlen >= 64
+func ff_sbr_neg_odd_64_rvv, zve64x
+        li      a1, 32
+        li      t1, 1 << 63
+1:
+        vsetvli t0, a1, e64, m8, ta, ma
+        vle64.v v8, (a0)
+        sub     a1, a1, t0
+        vxor.vx v8, v8, t1
+        vse64.v v8, (a0)
+        sh3add  a0, t0, a0
+        bnez    t0, 1b
+
+        ret
+endfunc
+#endif
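
Reviewer note: a plain-C model of what the three routines in this series compute may help when reading the assembly. This is only a sketch, not the code in libavcodec/sbrdsp_template.c. The ref_* names are hypothetical, and the neg_odd_64 model assumes IEEE-754 binary32 floats, which is what XORing the top bit of each 64-bit pair relies on when the target is little-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model; the ref_* names are illustrative only. */

/* z holds 320 floats: fold five 64-element blocks into the first one. */
static void ref_sum64x5(float *z)
{
    for (int k = 0; k < 64; k++)
        z[k] = z[k] + z[k + 64] + z[k + 128] + z[k + 192] + z[k + 256];
}

/* Sum of squares over n complex pairs, i.e. over 2 * n floats. */
static float ref_sum_square(float (*x)[2], int n)
{
    float sum = 0.0f;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        sum += x[i][0] * x[i][0] + x[i][1] * x[i][1];
    return sum;
}

/* Flip the sign bit of every odd-indexed float among 64. */
static void ref_neg_odd_64(float *x)
{
    for (int i = 1; i < 64; i += 2) {
        uint32_t bits;

        memcpy(&bits, &x[i], sizeof(bits));
        bits ^= UINT32_C(1) << 31;      /* toggle the IEEE-754 sign bit */
        memcpy(&x[i], &bits, sizeof(bits));
    }
}
```

The vector sum_square accumulates per lane with vfmacc.vv and only reduces at the end with vfredusum.vs, so its summation order differs from this scalar loop and results may differ in the last bits.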