From patchwork Wed Aug 14 17:18:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 51025
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 14 Aug 2024 20:18:56 +0300
Message-ID: <20240814171856.6360-2-remi@remlab.net>
X-Mailer: git-send-email 2.45.2
Subject: [FFmpeg-devel] [PATCH 2/2] lavc/mpegvideoencdsp: R-V V add_8x8basis

T-Head C908:
add_8x8basis_c:       440.6
add_8x8basis_rvv_i32:  70.3

SpacemiT X60:
add_8x8basis_c:       436.3
add_8x8basis_rvv_i32:  40.5
---
 libavcodec/riscv/mpegvideoencdsp_init.c |  5 ++++-
 libavcodec/riscv/mpegvideoencdsp_rvv.S  | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/mpegvideoencdsp_init.c b/libavcodec/riscv/mpegvideoencdsp_init.c
index 4c156c1cf2..1ac808af16 100644
--- a/libavcodec/riscv/mpegvideoencdsp_init.c
+++ b/libavcodec/riscv/mpegvideoencdsp_init.c
@@ -25,6 +25,7 @@
 
 int ff_try_8x8basis_rvv(const int16_t rem[64], const int16_t weight[64],
                         const int16_t basis[16], int scale);
+void ff_add_8x8basis_rvv(int16_t rem[64], const int16_t basis[16], int scale);
 int ff_pix_sum_rvv(const uint8_t *pix, int line_size);
 int ff_pix_norm1_rvv(const uint8_t *pix, int line_size);
 
@@ -35,8 +36,10 @@ av_cold void ff_mpegvideoencdsp_init_riscv(MpegvideoEncDSPContext *c,
     int flags = av_get_cpu_flags();
 
     if (flags & AV_CPU_FLAG_RVV_I32) {
-        if (flags & AV_CPU_FLAG_RVB)
+        if (flags & AV_CPU_FLAG_RVB) {
             c->try_8x8basis = ff_try_8x8basis_rvv;
+            c->add_8x8basis = ff_add_8x8basis_rvv;
+        }
 
         if (flags & AV_CPU_FLAG_RVV_I64) {
             if ((flags & AV_CPU_FLAG_RVB) && ff_rv_vlen_least(128))
diff --git a/libavcodec/riscv/mpegvideoencdsp_rvv.S b/libavcodec/riscv/mpegvideoencdsp_rvv.S
index 9408de47c8..7c50526934 100644
--- a/libavcodec/riscv/mpegvideoencdsp_rvv.S
+++ b/libavcodec/riscv/mpegvideoencdsp_rvv.S
@@ -55,6 +55,25 @@ func ff_try_8x8basis_rvv, zve32x, b
         ret
 endfunc
 
+func ff_add_8x8basis_rvv, zve32x, b
+        li      t1, 64
+        csrwi   vxrm, 0
+1:
+        vsetvli t0, t1, e16, m4, ta, ma
+        vle16.v v4, (a1)
+        sub     t1, t1, t0
+        vwmul.vx v16, v4, a2
+        sh1add  a1, t0, a1
+        vle16.v v8, (a0)
+        vnclip.wi v4, v16, BASIS_SHIFT - RECON_SHIFT
+        vadd.vv v4, v8, v4
+        vse16.v v4, (a0)
+        sh1add  a0, t0, a0
+        bnez    t1, 1b
+
+        ret
+endfunc
+
 func ff_pix_sum_rvv, zve64x, b
         lpad    0
         vsetivli t0, 16, e16, m1, ta, ma
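
For readers comparing the vector loop in ff_add_8x8basis_rvv against scalar code, here is a rough C sketch of what the assembly appears to compute per element. It is not the FFmpeg C reference. The names add_8x8basis_sketch and clip_int16 are hypothetical, and the BASIS_SHIFT and RECON_SHIFT values are assumed here rather than taken from this patch. The sketch mirrors three details of the assembly: vwmul.vx uses only the low 16 bits of the scalar, vnclip.wi with vxrm = 0 rounds to nearest with ties up and then saturates to 16 bits, and vadd.vv wraps.

```c
#include <stdint.h>

/* Assumed values; the patch only references these macros by name. */
#define BASIS_SHIFT 16
#define RECON_SHIFT 6

/* Hypothetical helper: saturate a 32-bit value to the int16_t range,
 * as the narrowing clip instruction does. */
static int16_t clip_int16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical scalar model of the vector loop (64 elements). */
static void add_8x8basis_sketch(int16_t rem[64], const int16_t basis[64],
                                int scale)
{
    const int shift = BASIS_SHIFT - RECON_SHIFT;

    for (int i = 0; i < 64; i++) {
        /* Widening 16x16 -> 32 bit multiply; the scalar operand is
         * truncated to 16 bits, like vwmul.vx at SEW=16. */
        int32_t prod = (int32_t)basis[i] * (int16_t)scale;

        /* Round to nearest, ties up (vxrm = 0), then shift.
         * Assumes >> on a negative int32_t is an arithmetic shift. */
        int32_t rounded = (prod + (1 << (shift - 1))) >> shift;

        /* Saturate to 16 bits, then add with 16-bit wraparound. */
        rem[i] = (int16_t)(uint16_t)(rem[i] + clip_int16(rounded));
    }
}
```

With the assumed shift of 10, a product of 1536 (1.5 after scaling) rounds to 2 and a product of -1536 rounds to -1, which is the ties-up behaviour selected by csrwi vxrm, 0.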