From patchwork Sat May 25 15:38:36 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 49255
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 May 2024 18:38:36 +0300
Message-ID: <20240525153840.78147-1-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
Subject: [FFmpeg-devel] [PATCH 1/5] lavc/vp8dsp: avoid one multiplication on RISC-V

Use shifts rather than multiply, and save one instruction.
---
 libavcodec/riscv/vp8dsp_init.c | 26 ++++++++++++++------------
 libavcodec/riscv/vp8dsp_rvv.S  |  7 +++----
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 31e8227fa4..2413fbf449 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -84,19 +84,21 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv;
 
-        c->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_rvv;
-        c->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_rvv;
-        c->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_rvv;
-        c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
-        c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
-        c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
+        if (flags & AV_CPU_FLAG_RVB_ADDR) {
+            c->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_rvv;
+            c->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_rvv;
+            c->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_rvv;
+            c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
+            c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
+            c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
 
-        c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv;
-        c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv;
-        c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv;
-        c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
-        c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
-        c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
+            c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv;
+            c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv;
+            c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv;
+            c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
+            c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
+            c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
+        }
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 9c84b1503e..cb9b0b8b5f 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -162,15 +162,14 @@ const subpel_filters
 endconst
 
 .macro epel_filter size type
-        lla             t2, subpel_filters
 .ifc \type,v
         addi            t0, a6, -1
 .else
         addi            t0, a5, -1
 .endif
-        li              t1, 6
-        mul             t0, t0, t1
-        add             t0, t0, t2
+        lla             t2, subpel_filters
+        sh1add          t0, t0, t0
+        sh1add          t0, t0, t2
 .irp n,1,2,3,4
         lb              t\n, \n(t0)
 .endr

From patchwork Sat May 25 15:38:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 49254
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 May 2024 18:38:37 +0300
Message-ID: <20240525153840.78147-2-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240525153840.78147-1-remi@remlab.net>
References: <20240525153840.78147-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 2/5] lavc/vp8dsp: expand single use R-V macros

---
 libavcodec/riscv/vp8dsp_rvv.S | 24 ++++++------------------
 1 file changed, 6 insertions(+), 18 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index cb9b0b8b5f..bb0c7bf02a 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -161,7 +161,8 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst
 
-.macro epel_filter size type
+.macro epel len size type
+func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
 .ifc \type,v
         addi            t0, a6, -1
 .else
@@ -177,9 +178,9 @@ endconst
         lb              t5, 5(t0)
         lb              t0, (t0)
 .endif
-.endm
-
-.macro epel_load dst len size type
+        vsetvlstatic8   \len
+1:
+        addi            a4, a4, -1
 .ifc \type,v
         mv              a5, a3
 .else
@@ -212,21 +213,8 @@ endconst
         vnsra.wi        v24, v24, 7
         vmax.vx         v24, v24, zero
         vsetvlstatic8   \len
-        vnclipu.wi      \dst, v24, 0
-.endm
-
-.macro epel_load_inc dst len size type
-        epel_load       \dst \len \size \type
+        vnclipu.wi      v30, v24, 0
         add             a2, a2, a3
-.endm
-
-.macro epel len size type
-func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-        epel_filter     \size \type
-        vsetvlstatic8   \len
-1:
-        addi            a4, a4, -1
-        epel_load_inc   v30 \len \size \type
         vse8.v          v30, (a0)
         add             a0, a0, a1
         bnez            a4, 1b

From patchwork Sat May 25 15:38:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 49257
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 May 2024 18:38:38 +0300
Message-ID: <20240525153840.78147-3-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240525153840.78147-1-remi@remlab.net>
References: <20240525153840.78147-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 3/5] lavc/vp8dsp: factor R-V V bilin functions

For a given type, only the first VSETVLI instruction varies depending on
the size.
---
 libavcodec/riscv/vp8dsp_rvv.S | 37 +++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index bb0c7bf02a..545c2e9728 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -108,9 +108,10 @@ endfunc
         vnsra.wi        \dst, v24, 3
 .endm
 
-.macro put_vp8_bilin_h_v len type mn
-func ff_put_vp8_bilin\len\()_\type\()_rvv, zve32x
-        vsetvlstatic8   \len
+.macro put_vp8_bilin_h_v type mn
+func ff_put_vp8_bilin4_\type\()_rvv, zve32x
+        vsetvlstatic8   4
+.Lbilin_\type:
         li              t1, 8
         li              t4, 4
         sub             t1, t1, \mn
@@ -126,9 +127,12 @@ func ff_put_vp8_bilin\len\()_\type\()_rvv, zve32x
 endfunc
 .endm
 
-.macro put_vp8_bilin_hv len
-func ff_put_vp8_bilin\len\()_hv_rvv, zve32x
-        vsetvlstatic8   \len
+put_vp8_bilin_h_v h a5
+put_vp8_bilin_h_v v a6
+
+func ff_put_vp8_bilin4_hv_rvv, zve32x
+        vsetvlstatic8   4
+.Lbilin_hv:
         li              t3, 8
         sub             t1, t3, a5
         sub             t2, t3, a6
@@ -149,7 +153,23 @@ func ff_put_vp8_bilin\len\()_hv_rvv, zve32x
 
         ret
 endfunc
-.endm
+
+.irp len,16,8
+func ff_put_vp8_bilin\len\()_h_rvv, zve32x
+        vsetvlstatic8   \len
+        j               .Lbilin_h
+endfunc
+
+func ff_put_vp8_bilin\len\()_v_rvv, zve32x
+        vsetvlstatic8   \len
+        j               .Lbilin_v
+endfunc
+
+func ff_put_vp8_bilin\len\()_hv_rvv, zve32x
+        vsetvlstatic8   \len
+        j               .Lbilin_hv
+endfunc
+.endr
 
 const subpel_filters
         .byte 0, -6, 123, 12, -1, 0
@@ -224,9 +244,6 @@ endfunc
 .endm
 
 .irp len,16,8,4
-put_vp8_bilin_h_v \len h a5
-put_vp8_bilin_h_v \len v a6
-put_vp8_bilin_hv \len
 epel \len 6 h
 epel \len 4 h
 epel \len 6 v

From patchwork Sat May 25 15:38:39 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 49256
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 May 2024 18:38:39 +0300
Message-ID: <20240525153840.78147-4-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240525153840.78147-1-remi@remlab.net>
References: <20240525153840.78147-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 4/5] lavc/vp8dsp: save one R-V GPR

This saves one instruction and frees up A5, which will be repurposed in
later changes. Unfortunately, we need to add quite a lot of alternative
code for this.
---
 libavcodec/riscv/vp8dsp_rvv.S | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 545c2e9728..a4fcd158a5 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -202,23 +202,31 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
 1:
         addi            a4, a4, -1
 .ifc \type,v
-        mv              a5, a3
+        sub             t6, a2, a3
+        add             a7, a2, a3
 .else
-        li              a5, 1
+        addi            t6, a2, -1
+        addi            a7, a2, 1
 .endif
-        sub             t6, a2, a5
-        add             a7, a2, a5
-
         vle8.v          v24, (a2)
         vle8.v          v22, (t6)
         vle8.v          v26, (a7)
-        add             a7, a7, a5
+.ifc \type,v
+        add             a7, a7, a3
+.else
+        addi            a7, a7, 1
+.endif
         vle8.v          v28, (a7)
         vwmulu.vx       v16, v24, t2
         vwmulu.vx       v20, v26, t3
 .ifc \size,6
-        sub             t6, t6, a5
-        add             a7, a7, a5
+.ifc \type,v
+        sub             t6, t6, a3
+        add             a7, a7, a3
+.else
+        addi            t6, t6, -1
+        addi            a7, a7, 1
+.endif
         vle8.v          v24, (t6)
         vle8.v          v26, (a7)
         vwmaccu.vx      v16, t0, v24

From patchwork Sat May 25 15:38:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 49258
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 May 2024 18:38:40 +0300
Message-ID: <20240525153840.78147-5-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240525153840.78147-1-remi@remlab.net>
References: <20240525153840.78147-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 5/5] lavc/vp8dsp: factor R-V V EPEL functions for all lengths

---
 libavcodec/riscv/vp8dsp_rvv.S | 56 ++++++++++++++++++++---------------
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index a4fcd158a5..002e7f3174 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -32,16 +32,6 @@
 .endif
 .endm
 
-.macro vsetvlstatic16 len
-.if \len <= 4
-        vsetivli        zero, \len, e16, mf2, ta, ma
-.elseif \len <= 8
-        vsetivli        zero, \len, e16, m1, ta, ma
-.elseif \len <= 16
-        vsetivli        zero, \len, e16, m2, ta, ma
-.endif
-.endm
-
 .macro vp8_idct_dc_add
         vlse32.v        v0, (a0), a2
         lh              a5, 0(a1)
@@ -181,13 +171,8 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst
 
-.macro epel len size type
-func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-.ifc \type,v
-        addi            t0, a6, -1
-.else
-        addi            t0, a5, -1
-.endif
+.macro epel_common size, type
+func ff_put_vp8_epel_\type\()\size\().rvv, zve32x
         lla             t2, subpel_filters
         sh1add          t0, t0, t0
         sh1add          t0, t0, t2
@@ -198,7 +183,6 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
         lb              t5, 5(t0)
         lb              t0, (t0)
 .endif
-        vsetvlstatic8   \len
 1:
         addi            a4, a4, -1
 .ifc \type,v
@@ -236,11 +220,11 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
         vwmaccsu.vx     v16, t1, v22
         vwmaccsu.vx     v16, t4, v28
         vwadd.wx        v16, v16, t6
-        vsetvlstatic16  \len
+        vsetvl          zero, zero, a6 # e16
         vwadd.vv        v24, v16, v20
         vnsra.wi        v24, v24, 7
         vmax.vx         v24, v24, zero
-        vsetvlstatic8   \len
+        vsetvl          zero, zero, a5 # e8
         vnclipu.wi      v30, v24, 0
         add             a2, a2, a3
         vse8.v          v30, (a0)
@@ -251,9 +235,33 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
 endfunc
 .endm
 
+.macro epel len, size, type
+func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
+.ifc \type,v
+        addi            t0, a6, -1
+.else
+        addi            t0, a5, -1
+.endif
+.if \len <= 4
+        li              a5, 0306 # e8, mf4, ta, ma
+        li              a6, 0317 # e16, mf2, ta, ma
+.elseif \len <= 8
+        li              a5, 0307 # e8, mf2, ta, ma
+        li              a6, 0310 # e16, m1, ta, ma
+.else # if len <= 16
+        li              a5, 0300 # e8, m1, ta, ma
+        li              a6, 0311 # e16, m2, ta, ma
+.endif
+        vsetvlstatic8   \len
+        j               ff_put_vp8_epel_\type\()\size\().rvv
+endfunc
+.endm
+
+.irp type,h,v
+.irp size,4,6
+epel_common \size, \type
 .irp len,16,8,4
-epel \len 6 h
-epel \len 4 h
-epel \len 6 v
-epel \len 4 v
+epel \len, \size, \type
+.endr
+.endr
 .endr