From patchwork Wed May 29 14:59:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 49350
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 29 May 2024 17:59:55 +0300
Message-ID: <20240529145955.32189-4-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240529145955.32189-1-remi@remlab.net>
References: <20240529145955.32189-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 4/4] lavc/float_dsp: R-V V scalarproduct_double
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

C908:
scalarproduct_double_c:       39.2
scalarproduct_double_rvv_f64: 10.5

X60:
scalarproduct_double_c:       35.0
scalarproduct_double_rvv_f64:  5.2
---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 585f237225..155496fa6b 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -46,6 +46,8 @@ void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
+double ff_scalarproduct_double_rvv(const double *v1, const double *v2,
+                                   size_t len);
 
 av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 {
@@ -68,6 +70,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
             fdsp->vector_dmul = ff_vector_dmul_rvv;
             fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+            fdsp->scalarproduct_double = ff_scalarproduct_double_rvv;
         }
     }
 #endif
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 7cfc890bc2..4379534af7 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -237,3 +237,24 @@
 NOHWD   mv           a2, a3
         ret
 endfunc
+
+func ff_scalarproduct_double_rvv, zve64f
+        vsetvli      t0, zero, e64, m8, ta, ma
+        vmv.v.x      v8, zero
+        vmv.s.x      v0, zero
+1:
+        vsetvli      t0, a2, e64, m8, tu, ma
+        vle64.v      v16, (a0)
+        sub          a2, a2, t0
+        vle64.v      v24, (a1)
+        sh3add       a0, t0, a0
+        vfmacc.vv    v8, v16, v24
+        sh3add       a1, t0, a1
+        bnez         a2, 1b
+
+        vsetvli      t0, zero, e64, m8, ta, ma
+        vfredusum.vs v0, v8, v0
+        vfmv.f.s     fa0, v0
+NOHWD   fmv.x.w      a0, fa0
+        ret
+endfunc
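
For readers less familiar with RVV, the structure of the new routine can be sketched in plain C. This is only an illustration, not code from the patch: the name scalarproduct_double_ref and the fixed LANES constant are hypothetical stand-ins (the real vector length is whatever vsetvli returns at run time), and a C compiler may or may not fuse the multiply-add the way vfmacc.vv does, so rounding can differ slightly from the assembly.

```c
#include <stddef.h>

/* Hypothetical lane count standing in for the hardware vector length
 * (VLMAX at e64/m8); the assembly obtains the real value from vsetvli. */
#define LANES 4

/* Illustrative reference only: per-lane accumulation over strips of the
 * input, followed by one horizontal sum at the end. */
static double scalarproduct_double_ref(const double *v1, const double *v2,
                                       size_t len)
{
    double acc[LANES] = { 0.0 };   /* plays the role of v8 (vmv.v.x v8, zero) */
    double sum = 0.0;              /* plays the role of v0[0] */

    while (len > 0) {
        /* Strip length, like "vsetvli t0, a2, e64, m8, tu, ma". */
        size_t n = len < LANES ? len : LANES;

        /* Like vfmacc.vv: only the first n lanes are updated. Lanes at
         * index n and above keep their partial sums, which is why the
         * loop in the patch uses the tail-undisturbed (tu) policy. */
        for (size_t i = 0; i < n; i++)
            acc[i] += v1[i] * v2[i];

        v1  += n;
        v2  += n;
        len -= n;
    }

    /* Like vfredusum.vs: sum all lanes into a single scalar. The vector
     * instruction does not fix the summation order; this sketch simply
     * adds the lanes left to right. */
    for (size_t i = 0; i < LANES; i++)
        sum += acc[i];

    return sum;
}
```

Keeping one partial sum per lane and reducing only once after the loop keeps the reduction out of the hot path, at the cost of a summation order that differs from the scalar C version.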