From patchwork Thu Jun 15 10:36:40 2023
X-Patchwork-Id: 42098
From: Peiting Shen
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 15 Jun 2023 10:36:40 +0000
Message-Id: <20230615103645.25778-2-shenpeiting@eswincomputing.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230615103645.25778-1-shenpeiting@eswincomputing.com>
References: <20230615103645.25778-1-shenpeiting@eswincomputing.com>
Subject: [FFmpeg-devel] [PATCH 1/6] lavc/ac3dsp: RISC-V V ac3_exponent_min
Cc: Shen Peiting

From: Shen Peiting

Find the scalar minimum, optimized using RVV instructions.

Benchmarks on Spike (cycles):

*exp=1280*4; num_reuse_blocks=5; nb_coefs=16
ac3_exponent_min_c:   1993
ac3_exponent_min_rvv:  258

*exp=1280*4; num_reuse_blocks=19; nb_coefs=255
ac3_exponent_min_c:   99010
ac3_exponent_min_rvv:  3843

The speedup grows with the number of reuse blocks and the number of
coefficients: about 7.7x in the first case and about 25.8x in the second.
Co-Authored by: Yang Xiaojun
Co-Authored by: Huang Xing
Co-Authored by: Zeng Fanchen
Signed-off-by: Shen Peiting
---
 libavcodec/ac3dsp.c            |  2 ++
 libavcodec/ac3dsp.h            |  1 +
 libavcodec/riscv/Makefile      |  2 ++
 libavcodec/riscv/ac3dsp_init.c | 37 +++++++++++++++++++++++++++
 libavcodec/riscv/ac3dsp_rvv.S  | 46 ++++++++++++++++++++++++++++++++++
 5 files changed, 88 insertions(+)
 create mode 100644 libavcodec/riscv/ac3dsp_init.c
 create mode 100644 libavcodec/riscv/ac3dsp_rvv.S

diff --git a/libavcodec/ac3dsp.c b/libavcodec/ac3dsp.c
index 22cb5f242e..302b786b15 100644
--- a/libavcodec/ac3dsp.c
+++ b/libavcodec/ac3dsp.c
@@ -395,5 +395,7 @@ av_cold void ff_ac3dsp_init(AC3DSPContext *c)
     ff_ac3dsp_init_x86(c);
 #elif ARCH_MIPS
     ff_ac3dsp_init_mips(c);
+#elif ARCH_RISCV
+    ff_ac3dsp_init_riscv(c);
 #endif
 }
diff --git a/libavcodec/ac3dsp.h b/libavcodec/ac3dsp.h
index 33e51e202e..a01bff3d11 100644
--- a/libavcodec/ac3dsp.h
+++ b/libavcodec/ac3dsp.h
@@ -109,6 +109,7 @@ void ff_ac3dsp_init (AC3DSPContext *c);
 void ff_ac3dsp_init_arm(AC3DSPContext *c);
 void ff_ac3dsp_init_x86(AC3DSPContext *c);
 void ff_ac3dsp_init_mips(AC3DSPContext *c);
+void ff_ac3dsp_init_riscv(AC3DSPContext *c);
 
 void ff_ac3dsp_downmix(AC3DSPContext *c, float **samples, float **matrix,
                        int out_ch, int in_ch, int len);
diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index ee17a521fd..a627924cac 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -1,5 +1,7 @@
 OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_init.o
 RVV-OBJS-$(CONFIG_AAC_DECODER) += riscv/aacpsdsp_rvv.o
+OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_init.o
+RVV-OBJS-$(CONFIG_AC3DSP) += riscv/ac3dsp_rvv.o
 OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_init.o
 RVV-OBJS-$(CONFIG_ALAC_DECODER) += riscv/alacdsp_rvv.o
 OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \
diff --git a/libavcodec/riscv/ac3dsp_init.c b/libavcodec/riscv/ac3dsp_init.c
new file mode 100644
index 0000000000..bb67d86998
--- /dev/null
+++ b/libavcodec/riscv/ac3dsp_init.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright 2023 Beijing ESWIN Computing Technology Co., Ltd.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+#include <stdint.h>
+
+#include "libavutil/attributes.h"
+#include "libavcodec/ac3dsp.h"
+#include "libavutil/cpu.h"
+#include "config.h"
+
+void ff_ac3_exponent_min_rvv(uint8_t *exp, int num_reuse_blocks, int nb_coefs);
+
+av_cold void ff_ac3dsp_init_riscv(AC3DSPContext *c)
+{
+    int flags = av_get_cpu_flags();
+#if HAVE_RVV
+    if (flags & AV_CPU_FLAG_RVV_I32)
+        c->ac3_exponent_min = ff_ac3_exponent_min_rvv;
+#endif
+}
+
diff --git a/libavcodec/riscv/ac3dsp_rvv.S b/libavcodec/riscv/ac3dsp_rvv.S
new file mode 100644
index 0000000000..879123f4a7
--- /dev/null
+++ b/libavcodec/riscv/ac3dsp_rvv.S
@@ -0,0 +1,46 @@
+/*
+ * Copyright 2023 Beijing ESWIN Computing Technology Co., Ltd.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_ac3_exponent_min_rvv, zve32x
+        beq     a1, x0, 3f
+        li      t0, 256
+        addi    a1, a1, 1
+1:
+        mv      t2, a0
+        mv      t3, a1
+        lb      t4, (t2)
+2:
+        vsetvli t1, t3, e8, m8
+        vlse8.v v0, (t2), t0
+        vmv.s.x v8, t4
+        sub     t3, t3, t1
+        vredminu.vs v8, v0, v8
+        vmv.x.s t4, v8
+        bnez    t3, 2b
+        vsetivli t1, 1, e8
+        vse8.v  v8, (a0)
+        addi    a0, a0, 1
+        addi    a2, a2, -1
+        bnez    a2, 1b
+3:
+        ret
+endfunc
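
For readers comparing against the assembly above, here is a minimal scalar sketch in C of the result the routine appears to compute, as read from the instruction sequence: for each coefficient, take the unsigned minimum over num_reuse_blocks + 1 bytes spaced 256 bytes apart, and store it back into the first block. The function name exponent_min_ref and the EXP_STRIDE macro are hypothetical and are not part of the patch. The stride of 256 is taken from the `li t0, 256` above. This is an illustration of the intended semantics, not the project's reference implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte distance between consecutive blocks; assumed from `li t0, 256`. */
#define EXP_STRIDE 256

/* Hypothetical scalar equivalent of the vector routine (illustrative name). */
static void exponent_min_ref(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    assert(num_reuse_blocks >= 0 && nb_coefs >= 0);

    /* Mirrors the early exit `beq a1, x0, 3f`: nothing to reuse. */
    if (num_reuse_blocks == 0)
        return;

    for (int i = 0; i < nb_coefs; i++) {
        /* Start from the exponent in the first block. */
        uint8_t min_exp = exp[i];

        /* Compare against the same coefficient in each following block. */
        for (int blk = 1; blk <= num_reuse_blocks; blk++) {
            uint8_t e = exp[i + (size_t)blk * EXP_STRIDE];
            if (e < min_exp)
                min_exp = e;
        }

        /* Only the first block is written, as with `vse8.v v8, (a0)`. */
        exp[i] = min_exp;
    }
}
```

Called as exponent_min_ref(buf, 2, 2) on a buffer of at least 3 * 256 bytes, it overwrites buf[0] and buf[1] with the minimum over the three blocks and leaves the later blocks unchanged.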