From patchwork Mon Sep 26 14:52:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 38339 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:3b1c:b0:96:9ee8:5cfd with SMTP id c28csp2304030pzh; Mon, 26 Sep 2022 07:53:22 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4W9RAFehsHlsG8TNPgXAqLNHRil7gHA6m2xCkJsufQXrYdzdbIojWjEqqIoltejqjo/FWO X-Received: by 2002:a17:906:8a46:b0:781:71fc:d23f with SMTP id gx6-20020a1709068a4600b0078171fcd23fmr18841901ejc.500.1664204001834; Mon, 26 Sep 2022 07:53:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664204001; cv=none; d=google.com; s=arc-20160816; b=Y3o2yKocgbZ3XDuzo8/Llswlbpu3gRcA8ikOIK4yWEIoCdsAKU6N+dFr8UQvm3jAGW adjz3TvkoNtKdFfzUFnxTCj7+qPcD+9N54c3uIn9MM7lfb0nGjm3nsnbrYKk2gQgudEM gZbWjfsVnEKp9yIc69lbbiXxQo/nAOXXL50stt0xovkvYR14hmSbCFGARG9C2MZhuImx Etp264Cf9/KxC4dGCu2jQep2mzs0cMWW80pIO6zgx1ejEX/XfGhCAfBx5Gmc25xgHNT7 tPgPqLXzvIlr6dt9YxAnZnChlw20KJAdd6Kc7Fp2yblKQBnw1wX3Ad2iO4wscIs5vHxr 2nrg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:delivered-to; bh=Hd++iMz3VwjtbZCtNQUPsT5V+P6VHH6YU5sCfBL1v18=; b=jXn9foOVWKLpYvRgu5e/l37bNg6bQsAQyVsgdpFHxVBd/Ig5MqSGGj4e8SjxXjtwSG jLHVqe4YyIQvgvG+jYc/9bc3muWCMTtilwje2tCVaQkB01ggzho9Cl9cA9bg4KrCunb7 xdwNzoycrqb6wlvEnHPvhLc9qsnvYiPPVChZbgNcJMqzDXBLmysAJ7VQEHP8k0ypLNww Nrq0jRIeR0fkHPTUUzIT3yYbDiCC2XcYIXJWCWZaNqNd+NQfHFSPyPjcQyx1+/1HLqLu hcRlXqtjO6Y5Gruy85rckxi2IC7coBJKUgrbNNzFGoWPaEKpY4eVl5Mg5XL62zi10ssS nb7w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id v8-20020a50a448000000b0044f2bdfc098si13515259edb.532.2022.09.26.07.53.21; Mon, 26 Sep 2022 07:53:21 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3AECC68BB39; Mon, 26 Sep 2022 17:53:00 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id DE82268B32D for ; Mon, 26 Sep 2022 17:52:51 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 93417C00AF for ; Mon, 26 Sep 2022 17:52:51 +0300 (EEST) From: remi@remlab.net To: ffmpeg-devel@ffmpeg.org Date: Mon, 26 Sep 2022 17:52:23 +0300 Message-Id: <20220926145251.56351-3-remi@remlab.net> X-Mailer: git-send-email 2.37.2 In-Reply-To: <5862173.lOV4Wx5bFT@basile.remlab.net> References: <5862173.lOV4Wx5bFT@basile.remlab.net> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 03/31] lavc/audiodsp: RISC-V F vector_clipf X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: qtuXLgSJwqgN From: Rémi Denis-Courmont RV64G supports MIN & MAX instructions natively only on floating point registers, not general purpose ones. The later would require the Zbb extension. Due to that, it is actually faster to perform the clipping "properly" in FPU. Benchmarks on SiFive U74-MC (courtesy of Shanghai StarFive Tech): audiodsp.vector_clipf_c: 29551.5 audiodsp.vector_clipf_rvf: 17871.0 Also tried unrolling with 2 or 8 elements but it gets worse either way. --- libavcodec/audiodsp.c | 2 ++ libavcodec/audiodsp.h | 1 + libavcodec/riscv/Makefile | 2 ++ libavcodec/riscv/audiodsp_init.c | 33 +++++++++++++++++++++ libavcodec/riscv/audiodsp_rvf.S | 49 ++++++++++++++++++++++++++++++++ 5 files changed, 87 insertions(+) create mode 100644 libavcodec/riscv/Makefile create mode 100644 libavcodec/riscv/audiodsp_init.c create mode 100644 libavcodec/riscv/audiodsp_rvf.S diff --git a/libavcodec/audiodsp.c b/libavcodec/audiodsp.c index ff43e87dce..eba6e809fd 100644 --- a/libavcodec/audiodsp.c +++ b/libavcodec/audiodsp.c @@ -113,6 +113,8 @@ av_cold void ff_audiodsp_init(AudioDSPContext *c) ff_audiodsp_init_arm(c); #elif ARCH_PPC ff_audiodsp_init_ppc(c); +#elif ARCH_RISCV + ff_audiodsp_init_riscv(c); #elif ARCH_X86 ff_audiodsp_init_x86(c); #endif diff --git a/libavcodec/audiodsp.h b/libavcodec/audiodsp.h index aa6fa7898b..485b512839 100644 --- a/libavcodec/audiodsp.h +++ b/libavcodec/audiodsp.h @@ -55,6 +55,7 @@ typedef struct AudioDSPContext { void ff_audiodsp_init(AudioDSPContext *c); void ff_audiodsp_init_arm(AudioDSPContext *c); void ff_audiodsp_init_ppc(AudioDSPContext *c); +void ff_audiodsp_init_riscv(AudioDSPContext *c); void ff_audiodsp_init_x86(AudioDSPContext *c); #endif /* AVCODEC_AUDIODSP_H */ diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile new file mode 100644 index 0000000000..414a9e9bd8 --- /dev/null +++ b/libavcodec/riscv/Makefile @@ -0,0 +1,2 @@ +OBJS-$(CONFIG_AUDIODSP) += riscv/audiodsp_init.o \ + riscv/audiodsp_rvf.o diff --git a/libavcodec/riscv/audiodsp_init.c b/libavcodec/riscv/audiodsp_init.c new file mode 100644 index 0000000000..c5842815d6 --- /dev/null +++ b/libavcodec/riscv/audiodsp_init.c @@ -0,0 +1,33 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/attributes.h" +#include "libavutil/cpu.h" +#include "libavcodec/audiodsp.h" + +void ff_vector_clipf_rvf(float *dst, const float *src, int len, float min, float max); + +av_cold void ff_audiodsp_init_riscv(AudioDSPContext *c) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_RVF) + c->vector_clipf = ff_vector_clipf_rvf; +} diff --git a/libavcodec/riscv/audiodsp_rvf.S b/libavcodec/riscv/audiodsp_rvf.S new file mode 100644 index 0000000000..2ec8a11691 --- /dev/null +++ b/libavcodec/riscv/audiodsp_rvf.S @@ -0,0 +1,49 @@ +/* + * Copyright © 2022 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/riscv/asm.S" + +func ff_vector_clipf_rvf, f +NOHWF fmv.w.x fa0, a3 +NOHWF fmv.w.x fa1, a4 +1: + flw ft0, (a1) + flw ft1, 4(a1) + fmax.s ft0, ft0, fa0 + flw ft2, 8(a1) + fmax.s ft1, ft1, fa0 + flw ft3, 12(a1) + fmax.s ft2, ft2, fa0 + addi a2, a2, -4 + fmax.s ft3, ft3, fa0 + addi a1, a1, 16 + fmin.s ft0, ft0, fa1 + fmin.s ft1, ft1, fa1 + fsw ft0, (a0) + fmin.s ft2, ft2, fa1 + fsw ft1, 4(a0) + fmin.s ft3, ft3, fa1 + fsw ft2, 8(a0) + fsw ft3, 12(a0) + addi a0, a0, 16 + bnez a2, 1b + + ret +endfunc