From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 25 May 2024 18:38:36 +0300
Message-ID: <20240525153840.78147-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 1/5] lavc/vp8dsp: avoid one multiplication on RISC-V

Use shifts rather than multiply, and save one
instruction.
---
 libavcodec/riscv/vp8dsp_init.c | 26 ++++++++++++++------------
 libavcodec/riscv/vp8dsp_rvv.S  |  7 +++----
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 31e8227fa4..2413fbf449 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -84,19 +84,21 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv;
 
-        c->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_rvv;
-        c->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_rvv;
-        c->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_rvv;
-        c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
-        c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
-        c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
+        if (flags & AV_CPU_FLAG_RVB_ADDR) {
+            c->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_rvv;
+            c->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_rvv;
+            c->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_rvv;
+            c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
+            c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
+            c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
 
-        c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv;
-        c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv;
-        c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv;
-        c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
-        c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
-        c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
+            c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv;
+            c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv;
+            c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv;
+            c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
+            c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
+            c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
+        }
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 9c84b1503e..cb9b0b8b5f 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -162,15 +162,14 @@ const subpel_filters
 endconst
 
 .macro epel_filter size type
-        lla             t2, subpel_filters
 .ifc \type,v
         addi            t0, a6, -1
 .else
         addi            t0, a5, -1
 .endif
-        li              t1, 6
-        mul             t0, t0, t1
-        add             t0, t0, t2
+        lla             t2, subpel_filters
+        sh1add          t0, t0, t0
+        sh1add          t0, t0, t2
 .irp n,1,2,3,4
         lb              t\n, \n(t0)
 .endr
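
For reviewers who want to check the address arithmetic, here is a minimal C sketch of the two instruction sequences in the epel_filter macro. It is not part of the patch. The helper names (sh1add, offset_mul, offset_sh1add, sequences_agree) are hypothetical, and the 6-byte row stride of subpel_filters is inferred only from the multiplier in the removed code. The sketch models sh1add as rd = (rs1 << 1) + rs2, so two of them compute 6 * index + base:

```c
#include <stdint.h>

/* Model of the Zba sh1add instruction: rd = (rs1 << 1) + rs2. */
static uintptr_t sh1add(uintptr_t rs1, uintptr_t rs2)
{
    return (rs1 << 1) + rs2;
}

/* Removed sequence: li t1, 6 ; mul t0, t0, t1 ; add t0, t0, t2 */
static uintptr_t offset_mul(uintptr_t idx, uintptr_t base)
{
    return idx * 6 + base;
}

/* Added sequence: sh1add t0, t0, t0 ; sh1add t0, t0, t2 */
static uintptr_t offset_sh1add(uintptr_t idx, uintptr_t base)
{
    uintptr_t t0 = sh1add(idx, idx); /* 2 * idx + idx = 3 * idx */
    return sh1add(t0, base);         /* 2 * (3 * idx) + base = 6 * idx + base */
}

/* Returns 1 if both sequences agree for every index below count. */
static int sequences_agree(uintptr_t base, uintptr_t count)
{
    for (uintptr_t idx = 0; idx < count; idx++)
        if (offset_mul(idx, base) != offset_sh1add(idx, base))
            return 0;
    return 1;
}
```

Calling sequences_agree(base, 7) with any base covers the indices produced by the addi ..., -1 step when the fractional offset runs from 1 to 7. Since sh1add belongs to the Zba extension, the patch places the function pointer assignments behind AV_CPU_FLAG_RVB_ADDR.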