From patchwork Sat May 4 15:03:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48497
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:04 +0800
X-OQ-MSGID: <20240504150313.2472910-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
Subject: [FFmpeg-devel] [PATCH 01/10] lavc/vp9dsp: R-V V ipred vert
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: sunyuechi

From: sunyuechi

C908:
vp9_vert_8x8_8bpp_c: 22.0
vp9_vert_8x8_8bpp_rvv_i64: 18.5
vp9_vert_16x16_8bpp_c: 71.2
vp9_vert_16x16_8bpp_rvv_i32: 50.7
vp9_vert_32x32_8bpp_c: 300.2
vp9_vert_32x32_8bpp_rvv_i32: 136.7
---
 libavcodec/riscv/vp9_intra_rvv.S | 35 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  6 ++++++
 libavcodec/riscv/vp9dsp_init.c   |  3 +++
 3 files changed, 44 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index db9774c263..b5f0f9d3c3 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -113,3 +113,38 @@ func_dc dc_left 8 left 3 0 zve64x
 func_dc dc_top 32 top 5 1 zve32x
 func_dc dc_top 16 top 4 1 zve32x
 func_dc dc_top 8 top 3 0 zve64x
+
+func ff_v_32x32_rvv, zve32x
+        vsetivli        zero, 8, e8, mf2, ta, ma
+        vle32.v         v8, (a3)
+
+        .rept 31
+        vse32.v         v8, (a0)
+        add             a0, a0, a1
+        .endr
+        vse32.v         v8, (a0)
+
+        ret
+endfunc
+
+func ff_v_16x16_rvv, zve32x
+        vsetivli        zero, 4, e8, mf4, ta, ma
+        vle32.v         v8, (a3)
+
+        .rept 15
+        vse32.v         v8, (a0)
+        add             a0, a0, a1
+        .endr
+        vse32.v         v8, (a0)
+
+        ret
+endfunc
+
+func ff_v_8x8_rvv, zve64x
+        ld              t0, (a3)
+        vsetivli        zero, 8, e64, m4, ta, ma
+        vmv.v.x         v8, t0
+        vsse64.v        v8, (a0), a1
+
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index 25047ed507..113397ce86 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -60,6 +60,12 @@ void ff_dc_129_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                          const uint8_t *a);
 void ff_dc_129_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                        const uint8_t *a);
+void ff_v_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_v_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_v_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                  const uint8_t *a);
 
 #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_idx)                         \
 void ff_put_8tap_##type##_##SIZE##h_rvv(uint8_t *dst, ptrdiff_t dststride,    \
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 69ab39004c..9c550d40b5 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -36,6 +36,7 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_8X8][DC_128_PRED] = ff_dc_128_8x8_rvv;
         dsp->intra_pred[TX_8X8][DC_129_PRED] = ff_dc_129_8x8_rvv;
         dsp->intra_pred[TX_8X8][TOP_DC_PRED] = ff_dc_top_8x8_rvv;
+        dsp->intra_pred[TX_8X8][VERT_PRED] = ff_v_8x8_rvv;
     }
 
     if (bpp == 8 && flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
@@ -51,6 +52,8 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_16X16][DC_129_PRED] = ff_dc_129_16x16_rvv;
         dsp->intra_pred[TX_32X32][TOP_DC_PRED] = ff_dc_top_32x32_rvv;
         dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
+        dsp->intra_pred[TX_32X32][VERT_PRED] = ff_v_32x32_rvv;
+        dsp->intra_pred[TX_16X16][VERT_PRED] = ff_v_16x16_rvv;
     }
 #endif
 }