From patchwork Sat May 25 15:38:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 49258
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 May 2024 18:38:40 +0300
Message-ID: <20240525153840.78147-5-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240525153840.78147-1-remi@remlab.net>
References: <20240525153840.78147-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 5/5] lavc/vp8dsp: factor R-V V EPEL functions for all lengths
List-Id: FFmpeg development discussions and patches

---
 libavcodec/riscv/vp8dsp_rvv.S | 56 ++++++++++++++++++++---------------
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index a4fcd158a5..002e7f3174 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -32,16 +32,6 @@
 .endif
 .endm
 
-.macro vsetvlstatic16 len
-.if \len <= 4
-        vsetivli        zero, \len, e16, mf2, ta, ma
-.elseif \len <= 8
-        vsetivli        zero, \len, e16, m1, ta, ma
-.elseif \len <= 16
-        vsetivli        zero, \len, e16, m2, ta, ma
-.endif
-.endm
-
 .macro vp8_idct_dc_add
         vlse32.v        v0, (a0), a2
         lh              a5, 0(a1)
@@ -181,13 +171,8 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst
 
-.macro epel len size type
-func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-.ifc \type,v
-        addi            t0, a6, -1
-.else
-        addi            t0, a5, -1
-.endif
+.macro epel_common size, type
+func ff_put_vp8_epel_\type\()\size\().rvv, zve32x
         lla             t2, subpel_filters
         sh1add          t0, t0, t0
         sh1add          t0, t0, t2
@@ -198,7 +183,6 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
         lb              t5, 5(t0)
         lb              t0, (t0)
 .endif
-        vsetvlstatic8   \len
 1:
         addi            a4, a4, -1
 .ifc \type,v
@@ -236,11 +220,11 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
         vwmaccsu.vx     v16, t1, v22
         vwmaccsu.vx     v16, t4, v28
         vwadd.wx        v16, v16, t6
-        vsetvlstatic16  \len
+        vsetvl          zero, zero, a6 # e16
         vwadd.vv        v24, v16, v20
         vnsra.wi        v24, v24, 7
         vmax.vx         v24, v24, zero
-        vsetvlstatic8   \len
+        vsetvl          zero, zero, a5 # e8
         vnclipu.wi      v30, v24, 0
         add             a2, a2, a3
         vse8.v          v30, (a0)
@@ -251,9 +235,33 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
 endfunc
 .endm
 
+.macro epel len, size, type
+func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
+.ifc \type,v
+        addi            t0, a6, -1
+.else
+        addi            t0, a5, -1
+.endif
+.if \len <= 4
+        li              a5, 0306 # e8, mf4, ta, ma
+        li              a6, 0317 # e16, mf2, ta, ma
+.elseif \len <= 8
+        li              a5, 0307 # e8, mf2, ta, ma
+        li              a6, 0310 # e16, m1, ta, ma
+.else # if len <= 16
+        li              a5, 0300 # e8, m1, ta, ma
+        li              a6, 0311 # e16, m2, ta, ma
+.endif
+        vsetvlstatic8   \len
+        j               ff_put_vp8_epel_\type\()\size\().rvv
+endfunc
+.endm
+
+.irp type,h,v
+.irp size,4,6
+epel_common \size, \type
 .irp len,16,8,4
-epel \len 6 h
-epel \len 4 h
-epel \len 6 v
-epel \len 4 v
+epel \len, \size, \type
+.endr
+.endr
 .endr
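
Note for reviewers checking the octal immediates loaded into a5 and a6: the following is a small sketch, not part of the patch. It assumes the vtype layout from the RISC-V Vector 1.0 specification (vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7). The helper name vtype() and the PATCH_TABLE list are illustrative only; the constants in the table are copied from the patch and compared against the computed encodings.

```python
# Field encodings assumed from the RVV 1.0 vtype CSR layout.
VSEW = {8: 0b000, 16: 0b001, 32: 0b010, 64: 0b011}
VLMUL = {
    "mf8": 0b101, "mf4": 0b110, "mf2": 0b111,
    "m1": 0b000, "m2": 0b001, "m4": 0b010, "m8": 0b011,
}


def vtype(sew, lmul, ta=True, ma=True):
    """Pack SEW, LMUL and the tail/mask-agnostic bits into a vtype value."""
    return (int(ma) << 7) | (int(ta) << 6) | (VSEW[sew] << 3) | VLMUL[lmul]


# (max len, LMUL at e8, LMUL at e16, a5 constant, a6 constant) as in the patch.
PATCH_TABLE = [
    (4, "mf4", "mf2", 0o306, 0o317),
    (8, "mf2", "m1", 0o307, 0o310),
    (16, "m1", "m2", 0o300, 0o311),
]

for max_len, lmul8, lmul16, a5, a6 in PATCH_TABLE:
    # Each immediate should equal the packed encoding named in its comment.
    assert vtype(8, lmul8) == a5
    assert vtype(16, lmul16) == a6
    print(f"len <= {max_len}: a5 = 0{a5:o} (e8, {lmul8}), a6 = 0{a6:o} (e16, {lmul16})")
```

Because the shared body calls vsetvl with rs1 = zero and rd = zero, the current vl is kept and only vtype changes, so each per-length entry point only needs to set vl once (via vsetvlstatic8) and pass the two vtype values in registers.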