From patchwork Sat May 4 15:03:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48497
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:04 +0800
X-OQ-MSGID: <20240504150313.2472910-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
Subject: [FFmpeg-devel] [PATCH 01/10] lavc/vp9dsp: R-V V ipred vert
Cc: sunyuechi

From: sunyuechi

C908:
vp9_vert_8x8_8bpp_c: 22.0
vp9_vert_8x8_8bpp_rvv_i64: 18.5
vp9_vert_16x16_8bpp_c: 71.2
vp9_vert_16x16_8bpp_rvv_i32: 50.7
vp9_vert_32x32_8bpp_c: 300.2
vp9_vert_32x32_8bpp_rvv_i32: 136.7
---
 libavcodec/riscv/vp9_intra_rvv.S | 35 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  6 ++++++
 libavcodec/riscv/vp9dsp_init.c   |  3 +++
 3 files changed, 44 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index db9774c263..b5f0f9d3c3 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -113,3 +113,38 @@ func_dc dc_left 8 left 3 0 zve64x
 func_dc dc_top 32 top 5 1 zve32x
 func_dc dc_top 16 top 4 1 zve32x
 func_dc dc_top 8 top 3 0 zve64x
+
+func ff_v_32x32_rvv, zve32x
+        vsetivli        zero, 8, e8, mf2, ta, ma
+        vle32.v         v8, (a3)
+
+        .rept 31
+        vse32.v         v8, (a0)
+        add             a0, a0, a1
+        .endr
+        vse32.v         v8, (a0)
+
+        ret
+endfunc
+
+func ff_v_16x16_rvv, zve32x
+        vsetivli        zero, 4, e8, mf4, ta, ma
+        vle32.v         v8, (a3)
+
+        .rept 15
+        vse32.v         v8, (a0)
+        add             a0, a0, a1
+        .endr
+        vse32.v         v8, (a0)
+
+        ret
+endfunc
+
+func ff_v_8x8_rvv, zve64x
+        ld              t0, (a3)
+        vsetivli        zero, 8, e64, m4, ta, ma
+        vmv.v.x         v8, t0
+        vsse64.v        v8, (a0), a1
+
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index 25047ed507..113397ce86 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -60,6 +60,12 @@ void ff_dc_129_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                          const uint8_t *a);
 void ff_dc_129_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                        const uint8_t *a);
+void ff_v_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_v_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_v_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                  const uint8_t *a);

 #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_idx)                         \
 void ff_put_8tap_##type##_##SIZE##h_rvv(uint8_t *dst, ptrdiff_t dststride,    \
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 69ab39004c..9c550d40b5 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -36,6 +36,7 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_8X8][DC_128_PRED] = ff_dc_128_8x8_rvv;
         dsp->intra_pred[TX_8X8][DC_129_PRED] = ff_dc_129_8x8_rvv;
         dsp->intra_pred[TX_8X8][TOP_DC_PRED] = ff_dc_top_8x8_rvv;
+        dsp->intra_pred[TX_8X8][VERT_PRED] = ff_v_8x8_rvv;
     }

     if (bpp == 8 && flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
@@ -51,6 +52,8 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_16X16][DC_129_PRED] = ff_dc_129_16x16_rvv;
         dsp->intra_pred[TX_32X32][TOP_DC_PRED] = ff_dc_top_32x32_rvv;
         dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
+        dsp->intra_pred[TX_32X32][VERT_PRED] = ff_v_32x32_rvv;
+        dsp->intra_pred[TX_16X16][VERT_PRED] = ff_v_16x16_rvv;
     }
 #endif
 }

From patchwork Sat May 4 15:03:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48496
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:05 +0800
X-OQ-MSGID: <20240504150313.2472910-2-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 02/10] lavc/vp9dsp: R-V V ipred hor
Cc: sunyuechi

From: sunyuechi

C908:
vp9_hor_8x8_8bpp_c: 74.7
vp9_hor_8x8_8bpp_rvv_i32: 35.7
vp9_hor_16x16_8bpp_c: 175.5
vp9_hor_16x16_8bpp_rvv_i32: 80.2
vp9_hor_32x32_8bpp_c: 510.2
vp9_hor_32x32_8bpp_rvv_i32: 264.0
---
 libavcodec/riscv/vp9_intra_rvv.S | 56 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  6 ++++
 libavcodec/riscv/vp9dsp_init.c   |  3 ++
 3 files changed, 65 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index b5f0f9d3c3..1b270215fb 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -148,3 +148,59 @@ func ff_v_8x8_rvv, zve64x

         ret
 endfunc
+
+func ff_h_32x32_rvv, zve32x
+        li              t0, 32
+        addi            a2, a2, 31
+        vsetvli         zero, t0, e8, m2, ta, ma
+
+        .rept 2
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        lbu             t1, (a2)
+        addi            a2, a2, -1
+        vmv.v.x         v\n, t1
+        .endr
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        vse8.v          v\n, (a0)
+        add             a0, a0, a1
+        .endr
+        .endr
+
+        ret
+endfunc
+
+func ff_h_16x16_rvv, zve32x
+        addi            a2, a2, 15
+        vsetivli        zero, 16, e8, m1, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
+        lbu             t1, (a2)
+        addi            a2, a2, -1
+        vmv.v.x         v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
+        vse8.v          v\n, (a0)
+        add             a0, a0, a1
+        .endr
+        vse8.v          v23, (a0)
+
+        ret
+endfunc
+
+func ff_h_8x8_rvv, zve32x
+        addi            a2, a2, 7
+        vsetivli        zero, 8, e8, mf2, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15
+        lbu             t1, (a2)
+        addi            a2, a2, -1
+        vmv.v.x         v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14
+        vse8.v          v\n, (a0)
+        add             a0, a0, a1
+        .endr
+        vse8.v          v15, (a0)
+
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index 113397ce86..d4c7652286 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -66,6 +66,12 @@ void ff_v_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                     const uint8_t *a);
 void ff_v_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                   const uint8_t *a);
+void ff_h_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                  const uint8_t *a);

 #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_idx)                         \
 void ff_put_8tap_##type##_##SIZE##h_rvv(uint8_t *dst, ptrdiff_t dststride,    \
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 9c550d40b5..16aeeb260a 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -54,6 +54,9 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
         dsp->intra_pred[TX_32X32][VERT_PRED] = ff_v_32x32_rvv;
         dsp->intra_pred[TX_16X16][VERT_PRED] = ff_v_16x16_rvv;
+        dsp->intra_pred[TX_32X32][HOR_PRED] = ff_h_32x32_rvv;
+        dsp->intra_pred[TX_16X16][HOR_PRED] = ff_h_16x16_rvv;
+        dsp->intra_pred[TX_8X8][HOR_PRED] = ff_h_8x8_rvv;
     }
 #endif
 }

From patchwork Sat May 4 15:03:06 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48498
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:06 +0800
X-OQ-MSGID: <20240504150313.2472910-3-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 03/10] lavc/vp9dsp: R-V V ipred tm
Cc: sunyuechi

From: sunyuechi

C908:
vp9_tm_4x4_8bpp_c: 116.5
vp9_tm_4x4_8bpp_rvv_i32: 43.5
vp9_tm_8x8_8bpp_c: 416.2
vp9_tm_8x8_8bpp_rvv_i32: 86.0
vp9_tm_16x16_8bpp_c: 1665.5
vp9_tm_16x16_8bpp_rvv_i32: 187.2
vp9_tm_32x32_8bpp_c: 6974.2
vp9_tm_32x32_8bpp_rvv_i32: 625.7
---
 libavcodec/riscv/vp9_intra_rvv.S | 143 +++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |   8 ++
 libavcodec/riscv/vp9dsp_init.c   |   4 +
 3 files changed, 155 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index 1b270215fb..c6ddd541e4 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -204,3 +204,146 @@ func ff_h_8x8_rvv, zve32x

         ret
 endfunc
+
+.macro tm_sum dst, top, offset
+        lbu             t3, \offset(a2)
+        sub             t3, t3, a4
+        vadd.vx         \dst, \top, t3
+.endm
+
+func ff_tm_32x32_rvv, zve32x
+        lbu             a4, -1(a3)
+        li              t5, 32
+
+        .macro tm_sum32 n1,n2,n3,n4,n5,n6,n7,n8
+
+        vsetvli         zero, t5, e16, m4, ta, ma
+        vle8.v          v8, (a3)
+        vzext.vf2       v28, v8
+
+        tm_sum          v0, v28, \n1
+        tm_sum          v4, v28, \n2
+        tm_sum          v8, v28, \n3
+        tm_sum          v12, v28, \n4
+        tm_sum          v16, v28, \n5
+        tm_sum          v20, v28, \n6
+        tm_sum          v24, v28, \n7
+        tm_sum          v28, v28, \n8
+
+        .irp n 0, 4, 8, 12, 16, 20, 24, 28
+        vmax.vx         v\n, v\n, zero
+        .endr
+
+        vsetvli         zero, zero, e8, m2, ta, ma
+        .irp n 0, 4, 8, 12, 16, 20, 24, 28
+        vnclipu.wi      v\n, v\n, 0
+        vse8.v          v\n, (a0)
+        add             a0, a0, a1
+        .endr
+
+        .endm
+
+        tm_sum32        31, 30, 29, 28, 27, 26, 25, 24
+        tm_sum32        23, 22, 21, 20, 19, 18, 17, 16
+        tm_sum32        15, 14, 13, 12, 11, 10, 9, 8
+        tm_sum32        7, 6, 5, 4, 3, 2, 1, 0
+
+        ret
+endfunc
+
+func ff_tm_16x16_rvv, zve32x
+        vsetivli        zero, 16, e16, m2, ta, ma
+        vle8.v          v8, (a3)
+        vzext.vf2       v30, v8
+        lbu             a4, -1(a3)
+
+        tm_sum          v0, v30, 15
+        tm_sum          v2, v30, 14
+        tm_sum          v4, v30, 13
+        tm_sum          v6, v30, 12
+        tm_sum          v8, v30, 11
+        tm_sum          v10, v30, 10
+        tm_sum          v12, v30, 9
+        tm_sum          v14, v30, 8
+        tm_sum          v16, v30, 7
+        tm_sum          v18, v30, 6
+        tm_sum          v20, v30, 5
+        tm_sum          v22, v30, 4
+        tm_sum          v24, v30, 3
+        tm_sum          v26, v30, 2
+        tm_sum          v28, v30, 1
+        tm_sum          v30, v30, 0
+
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        vmax.vx         v\n, v\n, zero
+        .endr
+
+        vsetvli         zero, zero, e8, m1, ta, ma
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
+        vnclipu.wi      v\n, v\n, 0
+        vse8.v          v\n, (a0)
+        add             a0, a0, a1
+        .endr
+        vnclipu.wi      v30, v30, 0
+        vse8.v          v30, (a0)
+
+        ret
+endfunc
+
+func ff_tm_8x8_rvv, zve32x
+        vsetivli        zero, 8, e16, m1, ta, ma
+        vle8.v          v8, (a3)
+        vzext.vf2       v28, v8
+        lbu             a4, -1(a3)
+ + tm_sum v16, v28, 7 + tm_sum v17, v28, 6 + tm_sum v18, v28, 5 + tm_sum v19, v28, 4 + tm_sum v20, v28, 3 + tm_sum v21, v28, 2 + tm_sum v22, v28, 1 + tm_sum v23, v28, 0 + + .irp n 16, 17, 18, 19, 20, 21, 22, 23 + vmax.vx v\n, v\n, zero + .endr + + vsetvli zero, zero, e8, mf2, ta, ma + .irp n 16, 17, 18, 19, 20, 21, 22 + vnclipu.wi v\n, v\n, 0 + vse8.v v\n, (a0) + add a0, a0, a1 + .endr + vnclipu.wi v24, v23, 0 + vse8.v v24, (a0) + + ret +endfunc + +func ff_tm_4x4_rvv, zve32x + vsetivli zero, 4, e16, mf2, ta, ma + vle8.v v8, (a3) + vzext.vf2 v28, v8 + lbu a4, -1(a3) + + tm_sum v16, v28, 3 + tm_sum v17, v28, 2 + tm_sum v18, v28, 1 + tm_sum v19, v28, 0 + + .irp n 16, 17, 18, 19 + vmax.vx v\n, v\n, zero + .endr + + vsetvli zero, zero, e8, mf4, ta, ma + .irp n 16, 17, 18 + vnclipu.wi v\n, v\n, 0 + vse8.v v\n, (a0) + add a0, a0, a1 + .endr + vnclipu.wi v24, v19, 0 + vse8.v v24, (a0) + + ret +endfunc diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h index d4c7652286..36e1a8795b 100644 --- a/libavcodec/riscv/vp9dsp.h +++ b/libavcodec/riscv/vp9dsp.h @@ -72,6 +72,14 @@ void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l, const uint8_t *a); void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l, const uint8_t *a); +void ff_tm_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l, + const uint8_t *a); +void ff_tm_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l, + const uint8_t *a); +void ff_tm_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l, + const uint8_t *a); +void ff_tm_4x4_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l, + const uint8_t *a); #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_idx) \ void ff_put_8tap_##type##_##SIZE##h_rvv(uint8_t *dst, ptrdiff_t dststride, \ diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c index 16aeeb260a..f08c8f6a42 100644 --- a/libavcodec/riscv/vp9dsp_init.c +++ b/libavcodec/riscv/vp9dsp_init.c @@ -57,6 +57,10 @@ static av_cold void 
vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_32X32][HOR_PRED] = ff_h_32x32_rvv;
         dsp->intra_pred[TX_16X16][HOR_PRED] = ff_h_16x16_rvv;
         dsp->intra_pred[TX_8X8][HOR_PRED] = ff_h_8x8_rvv;
+        dsp->intra_pred[TX_32X32][TM_VP8_PRED] = ff_tm_32x32_rvv;
+        dsp->intra_pred[TX_16X16][TM_VP8_PRED] = ff_tm_16x16_rvv;
+        dsp->intra_pred[TX_8X8][TM_VP8_PRED] = ff_tm_8x8_rvv;
+        dsp->intra_pred[TX_4X4][TM_VP8_PRED] = ff_tm_4x4_rvv;
     }
 #endif
 }

From patchwork Sat May 4 15:03:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48501
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:07 +0800
X-OQ-MSGID: <20240504150313.2472910-4-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 04/10]
 lavc/vp9dsp: R-V mc copy_avg
Cc: sunyuechi

From: sunyuechi

C908:
vp9_avg4_8bpp_c: 1.2
vp9_avg4_8bpp_rvv_i64: 1.0
vp9_avg8_8bpp_c: 3.7
vp9_avg8_8bpp_rvv_i64: 1.5
vp9_avg16_8bpp_c: 14.7
vp9_avg16_8bpp_rvv_i64: 3.5
vp9_avg32_8bpp_c: 57.7
vp9_avg32_8bpp_rvv_i64: 10.0
vp9_avg64_8bpp_c: 229.0
vp9_avg64_8bpp_rvv_i64: 31.7
vp9_put4_8bpp_c: 0.7
vp9_put4_8bpp_rvi: 0.2
vp9_put8_8bpp_c: 2.5
vp9_put8_8bpp_rvi: 0.5
vp9_put16_8bpp_c: 16.5
vp9_put16_8bpp_rvv_i64: 1.7
vp9_put32_8bpp_c: 37.2
vp9_put32_8bpp_rvv_i64: 5.7
vp9_put64_8bpp_c: 91.2
vp9_put64_8bpp_rvv_i64: 19.7
---
 libavcodec/riscv/Makefile      |  4 ++-
 libavcodec/riscv/vp9_mc_rvi.S  | 43 +++++++++++++++++++++++
 libavcodec/riscv/vp9_mc_rvv.S  | 64 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp_init.c | 47 +++++++++++++++++++++++++
 4 files changed, 157 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/riscv/vp9_mc_rvi.S
 create mode 100644 libavcodec/riscv/vp9_mc_rvv.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 050c08ee61..43b5c21cf4 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -63,6 +63,8 @@ RVV-OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_rvv.o
 OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_init.o
 RVV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvv.o
 OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9dsp_init.o
-RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o
+RV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_mc_rvi.o
+RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o \
+                                  riscv/vp9_mc_rvv.o
 OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o
 RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o
diff --git a/libavcodec/riscv/vp9_mc_rvi.S b/libavcodec/riscv/vp9_mc_rvi.S
new file mode 100644
index 0000000000..03d8dbbbae
--- /dev/null
+++ b/libavcodec/riscv/vp9_mc_rvi.S
@@ -0,0 +1,43 @@
+/*
+ * Copyright (c) 2024 Institue of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_copy8_rvi
+1:
+        addi            a4, a4, -1
+        ld              t4, (a2)
+        sd              t4, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+        ret
+endfunc
+
+func ff_copy4_rvi
+1:
+        addi            a4, a4, -1
+        lw              t4, (a2)
+        sw              t4, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
new file mode 100644
index 0000000000..ba9ec3431f
--- /dev/null
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -0,0 +1,64 @@
+/*
+ * Copyright (c) 2024 Institue of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+.macro copy_avg len type
+.ifc \type,avg
+        csrwi           vxrm, 0
+.endif
+.ifc \len,64
+        li              t5, 64
+        vsetvli         t0, t5, e8, m4, ta, ma
+.elseif \len == 32
+        li              t5, 32
+        vsetvli         t0, t5, e8, m2, ta, ma
+.elseif \len == 16
+        vsetivli        t0, 16, e8, m1, ta, ma
+.elseif \len == 8
+        vsetivli        t0, 8, e8, mf2, ta, ma
+.elseif \len == 4
+        vsetivli        t0, 4, e8, mf4, ta, ma
+.endif
+1:
+        addi            a4, a4, -1
+        vle8.v          v8, (a2)
+.ifc \type,avg
+        vle8.v          v16, (a0)
+        vaaddu.vv       v8, v8, v16
+.endif
+        vse8.v          v8, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+        ret
+.endm
+
+.irp len 64, 32, 16
+func ff_copy\len\()_rvv, zve32x
+        copy_avg        \len copy
+endfunc
+.endr
+
+.irp len 64, 32, 16, 8, 4
+func ff_avg\len\()_rvv, zve32x
+        copy_avg        \len avg
+endfunc
+.endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index f08c8f6a42..da33e15e97 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -65,7 +65,54 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
 #endif
 }
 
+static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
+{
+#if HAVE_RV
+    int flags = av_get_cpu_flags();
+
+    if (bpp == 8 && flags & AV_CPU_FLAG_RVI) {
+        dsp->mc[3][FILTER_8TAP_SMOOTH][0][0][0] = ff_copy8_rvi;
+        dsp->mc[3][FILTER_8TAP_REGULAR][0][0][0] = ff_copy8_rvi;
+        dsp->mc[3][FILTER_8TAP_SHARP][0][0][0] = ff_copy8_rvi;
+        dsp->mc[3][FILTER_BILINEAR][0][0][0] = ff_copy8_rvi;
+        dsp->mc[4][FILTER_8TAP_SMOOTH][0][0][0] = ff_copy4_rvi;
+        dsp->mc[4][FILTER_8TAP_REGULAR][0][0][0] = ff_copy4_rvi;
+        dsp->mc[4][FILTER_8TAP_SHARP][0][0][0] = ff_copy4_rvi;
+        dsp->mc[4][FILTER_BILINEAR][0][0][0] = ff_copy4_rvi;
+    }
+
+#if HAVE_RVV
+    if (bpp == 8 && flags & AV_CPU_FLAG_RVV_I64 && ff_get_rv_vlenb() >= 16) {
+
+#define init_fpel(idx1, idx2, sz, type)                                      \
+    dsp->mc[idx1][FILTER_8TAP_SMOOTH ][idx2][0][0] = ff_##type##sz##_rvv;    \
+    dsp->mc[idx1][FILTER_8TAP_REGULAR][idx2][0][0] = ff_##type##sz##_rvv;    \
+    dsp->mc[idx1][FILTER_8TAP_SHARP  ][idx2][0][0] = ff_##type##sz##_rvv;    \
+    dsp->mc[idx1][FILTER_BILINEAR    ][idx2][0][0] = ff_##type##sz##_rvv
+
+#define init_copy_avg(idx, sz) \
+    init_fpel(idx, 0, sz, copy); \
+    init_fpel(idx, 1, sz, avg)
+
+#define init_avg(idx, sz) \
+    init_fpel(idx, 1, sz, avg)
+
+    init_copy_avg(0, 64);
+    init_copy_avg(1, 32);
+    init_copy_avg(2, 16);
+    init_avg(3, 8);
+    init_avg(4, 4);
+
+#undef init_copy_avg
+#undef init_avg
+#undef init_fpel
+    }
+#endif
+#endif
+}
+
 av_cold void ff_vp9dsp_init_riscv(VP9DSPContext *dsp, int bpp, int bitexact)
 {
     vp9dsp_intrapred_init_rvv(dsp, bpp);
+    vp9dsp_mc_init_rvv(dsp, bpp);
 }

From patchwork Sat May 4 15:03:08 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48499
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:08 +0800
X-OQ-MSGID: <20240504150313.2472910-5-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 05/10] lavc/vp9dsp: R-V V mc bilin h
Cc: sunyuechi

From: sunyuechi

C908:
vp9_avg_bilin_4h_8bpp_c: 5.5
vp9_avg_bilin_4h_8bpp_rvv_i64: 2.5
vp9_avg_bilin_8h_8bpp_c: 19.7
vp9_avg_bilin_8h_8bpp_rvv_i64: 5.0
vp9_avg_bilin_16h_8bpp_c: 78.2
vp9_avg_bilin_16h_8bpp_rvv_i64: 10.0
vp9_avg_bilin_32h_8bpp_c: 325.2
vp9_avg_bilin_32h_8bpp_rvv_i64: 28.5
vp9_avg_bilin_64h_8bpp_c: 1266.2
vp9_avg_bilin_64h_8bpp_rvv_i64: 115.0
vp9_put_bilin_4h_8bpp_c: 4.5
vp9_put_bilin_4h_8bpp_rvv_i64: 2.2
vp9_put_bilin_8h_8bpp_c: 16.7
vp9_put_bilin_8h_8bpp_rvv_i64: 4.2
vp9_put_bilin_16h_8bpp_c: 65.2
vp9_put_bilin_16h_8bpp_rvv_i64: 8.7
vp9_put_bilin_32h_8bpp_c: 273.5
vp9_put_bilin_32h_8bpp_rvv_i64: 26.7
vp9_put_bilin_64h_8bpp_c: 1041.0
vp9_put_bilin_64h_8bpp_rvv_i64: 87.2
---
 libavcodec/riscv/vp9_mc_rvv.S  | 73 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp_init.c | 17 ++++++++
 2 files changed, 90 insertions(+)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index ba9ec3431f..a97807633e 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -51,6 +51,72 @@
         ret
 .endm
 
+.macro bilin_h_load dst len type
+.ifc \len,4
+        vsetivli        zero, 5, e8, mf2, ta, ma
+.elseif \len == 8
+        vsetivli        zero, 9, e8, m1, ta, ma
+.elseif \len == 16
+        vsetivli        zero, 17, e8, m2, ta, ma
+.elseif \len == 32
+        li              t0, 33
+        vsetvli         zero, t0, e8, m4, ta, ma
+.elseif \len == 64
+        li              t0, 65
+        vsetvli         zero, t0, e8, m8, ta, ma
+.endif
+
+        vle8.v          v8, (a2)
+        vslide1down.vx  v0, v8, t5
+
+.ifc \len,4
+        vsetivli        zero, 4, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetivli        zero, 8, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetivli        zero, 16, e8, m1, ta, ma
+.elseif \len == 32
+        li              t0, 32
+        vsetvli         zero, t0, e8, m2, ta, ma
+.elseif \len == 64
+        li              t0, 64
+        vsetvli         zero, t0, e8, m4, ta, ma
+.endif
+
+        vwmulu.vx       v16, v0, a5
+        vwmaccsu.vx     v16, t1, v8
+        vwadd.wx        v16, v16, t4
+        vnsra.wi        v16, v16, 4
+        vadd.vv         \dst, v16, v8
+
+.ifc \type,put
+        vadd.vv         \dst, v16, v8
+.elseif \type == avg
+        vadd.vv         v16, v16, v8
+        vle8.v          \dst, (a0)
+        vaaddu.vv       \dst, \dst, v16
+.endif
+
+.endm
+
+.macro bilin_h len type
+.ifc \type,avg
+        csrwi           vxrm, 0
+.endif
+        li              t4, 8
+        li              t5, 1
+        neg             t1, a5
+1:
+        addi            a4, a4, -1
+        bilin_h_load    v0, \len, \type
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+.endm
+
 .irp len 64, 32, 16
 func ff_copy\len\()_rvv, zve32x
         copy_avg        \len copy
@@ -61,4 +127,11 @@ endfunc
 func ff_avg\len\()_rvv, zve32x
         copy_avg        \len avg
 endfunc
+
+func ff_put_bilin_\len\()h_rvv, zve32x
+        bilin_h         \len put
+endfunc
+func ff_avg_bilin_\len\()h_rvv, zve32x
+        bilin_h         \len avg
+endfunc
 .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index da33e15e97..248501f5d2 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -106,6 +106,23 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
 #undef init_copy_avg
 #undef init_avg
 #undef init_fpel
+
+#define init_subpel1(idx1, idx2, idxh, idxv, sz, dir, type) \
+    dsp->mc[idx1][FILTER_BILINEAR    ][idx2][idxh][idxv] = \
+        ff_##type##_bilin_##sz##dir##_rvv;
+
+#define init_subpel2(idx, idxh, idxv, dir, type) \
+    init_subpel1(0, idx, idxh, idxv, 64, dir, type); \
+    init_subpel1(1, idx, idxh, idxv, 32, dir, type); \
+    init_subpel1(2, idx, idxh, idxv, 16, dir, type); \
+    init_subpel1(3, idx, idxh, idxv, 8, dir, type); \
+    init_subpel1(4, idx, idxh, idxv, 4, dir, type)
+
+#define init_subpel3(idx, type) \
+    init_subpel2(idx, 1, 0, h, type)
+
+    init_subpel3(0, put);
+    init_subpel3(1, avg);
     }
 #endif
 #endif

From patchwork Sat May 4 15:03:09 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48500
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:09 +0800
X-OQ-MSGID: <20240504150313.2472910-6-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 06/10] lavc/vp9dsp: R-V V mc tap h
Cc: sunyuechi

From: sunyuechi

C908:
vp9_avg_8tap_smooth_4h_8bpp_c: 12.7
vp9_avg_8tap_smooth_4h_8bpp_rvv_i64: 5.0
vp9_avg_8tap_smooth_8h_8bpp_c: 48.5
vp9_avg_8tap_smooth_8h_8bpp_rvv_i64: 9.2
vp9_avg_8tap_smooth_16h_8bpp_c: 191.7
vp9_avg_8tap_smooth_16h_8bpp_rvv_i64: 21.0
vp9_avg_8tap_smooth_32h_8bpp_c: 780.0
vp9_avg_8tap_smooth_32h_8bpp_rvv_i64: 66.5
vp9_avg_8tap_smooth_64h_8bpp_c: 3123.7
vp9_avg_8tap_smooth_64h_8bpp_rvv_i64: 264.2
vp9_put_8tap_smooth_4h_8bpp_c: 11.0
vp9_put_8tap_smooth_4h_8bpp_rvv_i64: 4.2
vp9_put_8tap_smooth_8h_8bpp_c: 42.0
vp9_put_8tap_smooth_8h_8bpp_rvv_i64: 8.2
vp9_put_8tap_smooth_16h_8bpp_c: 165.5
vp9_put_8tap_smooth_16h_8bpp_rvv_i64: 19.7
vp9_put_8tap_smooth_32h_8bpp_c: 659.0
vp9_put_8tap_smooth_32h_8bpp_rvv_i64: 64.0
vp9_put_8tap_smooth_64h_8bpp_c: 2682.0
vp9_put_8tap_smooth_64h_8bpp_rvv_i64: 272.2
---
 libavcodec/riscv/vp9_mc_rvv.S  | 233 +++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp_init.c |   8 +-
 2 files changed, 240 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index a97807633e..289c377a42 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -123,6 +123,231 @@ func ff_copy\len\()_rvv, zve32x
 endfunc
 .endr
 
+const subpel_filters_regular
+        .byte 0, 0, 0, 128, 0, 0, 0, 0
+        .byte 0, 1, -5, 126, 8, -3, 1, 0
+        .byte -1, 3, -10, 122, 18, -6, 2, 0
+        .byte -1, 4, -13, 118, 27, -9, 3, -1
+        .byte -1, 4, -16, 112, 37, -11, 4, -1
+        .byte -1, 5, -18, 105, 48, -14, 4, -1
+        .byte -1, 5, -19, 97, 58, -16, 5, -1
+        .byte -1, 6, -19, 88, 68, -18, 5, -1
+        .byte -1, 6, -19, 78, 78, -19, 6, -1
+        .byte -1, 5, -18, 68, 88, -19, 6, -1
+        .byte -1, 5, -16, 58, 97, -19, 5, -1
+        .byte -1, 4, -14, 48, 105, -18, 5, -1
+        .byte -1, 4, -11, 37, 112, -16, 4, -1
+        .byte -1, 3, -9, 27, 118, -13, 4, -1
+        .byte 0, 2, -6, 18, 122, -10, 3, -1
+        .byte 0, 1, -3, 8, 126, -5, 1, 0
+subpel_filters_sharp:
+        .byte 0, 0, 0, 128, 0, 0, 0, 0
+        .byte -1, 3, -7, 127, 8, -3, 1, 0
+        .byte -2, 5, -13, 125, 17, -6, 3, -1
+        .byte -3, 7, -17, 121, 27, -10, 5, -2
+        .byte -4, 9, -20, 115, 37, -13, 6, -2
+        .byte -4, 10, -23, 108, 48, -16, 8, -3
+        .byte -4, 10, -24, 100, 59, -19, 9, -3
+        .byte -4, 11, -24, 90, 70, -21, 10, -4
+        .byte -4, 11, -23, 80, 80, -23, 11, -4
+        .byte -4, 10, -21, 70, 90, -24, 11, -4
+        .byte -3, 9, -19, 59, 100, -24, 10, -4
+        .byte -3, 8, -16, 48, 108, -23, 10, -4
+        .byte -2, 6, -13, 37, 115, -20, 9, -4
+        .byte -2, 5, -10, 27, 121, -17, 7, -3
+        .byte -1, 3, -6, 17, 125, -13, 5, -2
+        .byte 0, 1, -3, 8, 127, -7, 3, -1
+subpel_filters_smooth:
+        .byte 0, 0, 0, 128, 0, 0, 0, 0
+        .byte -3, -1, 32, 64, 38, 1, -3, 0
+        .byte -2, -2, 29, 63, 41, 2, -3, 0
+        .byte -2, -2, 26, 63, 43, 4, -4, 0
+        .byte -2, -3, 24, 62, 46, 5, -4, 0
+        .byte -2, -3, 21, 60, 49, 7, -4, 0
+        .byte -1, -4, 18, 59, 51, 9, -4, 0
+        .byte -1, -4, 16, 57, 53, 12, -4, -1
+        .byte -1, -4, 14, 55, 55, 14, -4, -1
+        .byte -1, -4, 12, 53, 57, 16, -4, -1
+        .byte 0, -4, 9, 51, 59, 18, -4, -1
+        .byte 0, -4, 7, 49, 60, 21, -3, -2
+        .byte 0, -4, 5, 46, 62, 24, -3, -2
+        .byte 0, -4, 4, 43, 63, 26, -2, -2
+        .byte 0, -3, 2, 41, 63, 29, -2, -2
+        .byte 0, -3, 1, 38, 64, 32, -1, -3
+endconst
+
+.macro epel_filter name type regtype
+        lla             \regtype\()2, subpel_filters_\name
+        li              \regtype\()1, 8
+        mul             \regtype\()0, a5, \regtype\()1
+        add             \regtype\()0, \regtype\()0, \regtype\()2
+        .irp n 1,2,3,4,5,6
+        lb              \regtype\n, \n(\regtype\()0)
+        .endr
+.ifc \regtype,t
+        lb              a7, 7(\regtype\()0)
+.elseif \regtype == s
+        lb              s7, 7(\regtype\()0)
+.endif
+        lb              \regtype\()0, 0(\regtype\()0)
+.endm
+
+.macro epel_load dst len do name type from_mem regtype
+        li              a5, 64
+.ifc \from_mem, 1
+        vle8.v          v22, (a2)
+        addi            a2, a2, -1
+        vle8.v          v20, (a2)
+        addi            a2, a2, 2
+        vle8.v          v24, (a2)
+        addi            a2, a2, 1
+        vle8.v          v26, (a2)
+        addi            a2, a2, 1
+        vle8.v          v28, (a2)
+        addi            a2, a2, 1
+        vle8.v          v30, (a2)
+
+.ifc \name,smooth
+        vwmulu.vx       v16, v24, \regtype\()4
+        vwmaccu.vx      v16, \regtype\()2, v20
+        vwmaccu.vx      v16, \regtype\()5, v26
+        vwmaccsu.vx     v16, \regtype\()6, v28
+.else
+        vwmulu.vx       v16, v28, \regtype\()6
+        vwmaccsu.vx     v16, \regtype\()2, v20
+        vwmaccsu.vx     v16, \regtype\()5, v26
+.endif
+
+.ifc \regtype,t
+        vwmaccsu.vx     v16, a7, v30
+.elseif \regtype == s
+        vwmaccsu.vx     v16, s7, v30
+.endif
+
+        addi            a2, a2, -6
+        vle8.v          v28, (a2)
+        addi            a2, a2, -1
+        vle8.v          v26, (a2)
+        addi            a2, a2, 3
+
+.ifc \name,smooth
+        vwmaccsu.vx     v16, \regtype\()1, v28
+.else
+        vwmaccu.vx      v16, \regtype\()1, v28
+        vwmulu.vx       v28, v24, \regtype\()4
+.endif
+        vwmaccsu.vx     v16, \regtype\()0, v26
+        vwmulu.vx       v20, v22, \regtype\()3
+.else
+.ifc \name,smooth
+        vwmulu.vx       v16, v8, \regtype\()4
+        vwmaccu.vx      v16, \regtype\()2, v4
+        vwmaccu.vx      v16, \regtype\()5, v10
+        vwmaccsu.vx     v16, \regtype\()6, v12
+        vwmaccsu.vx     v16, \regtype\()1, v2
+.else
+        vwmulu.vx       v16, v2, \regtype\()1
+        vwmaccu.vx      v16, \regtype\()6, v12
+        vwmaccsu.vx     v16, \regtype\()5, v10
+        vwmaccsu.vx     v16, \regtype\()2, v4
+        vwmulu.vx       v28, v8, \regtype\()4
+.endif
+        vwmaccsu.vx     v16, \regtype\()0, v0
+        vwmulu.vx       v20, v6, \regtype\()3
+
+.ifc \regtype,t
+        vwmaccsu.vx     v16, a7, v14
+.elseif \regtype == s
+        vwmaccsu.vx     v16, s7, v14
+.endif
+
+.endif
+        vwadd.wx        v16, v16, a5
+.ifc \len,4
+        vsetvli         zero, zero, e16, mf2, ta, ma
+.elseif \len == 8
+        vsetvli         zero, zero, e16, m1, ta, ma
+.elseif \len == 16
+        vsetvli         zero, zero, e16, m2, ta, ma
+.else
+        vsetvli         zero, zero, e16, m4, ta, ma
+.endif
+
+.ifc \name,smooth
+        vwadd.vv        v24, v16, v20
+.else
+        vwadd.vv        v24, v16, v28
+        vwadd.wv        v24, v24, v20
+.endif
+        vnsra.wi        v24, v24, 7
+        vmax.vx         v24, v24, zero
+.ifc \len,4
+        vsetvli         zero, zero, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetvli         zero, zero, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetvli         zero, zero, e8, m1, ta, ma
+.else
+        vsetvli         zero, zero, e8, m2, ta, ma
+.endif
+
+.ifc \do,put
+        vnclipu.wi      \dst, v24, 0
+.elseif \do == avg
+        vle8.v          \dst, (a0)
+        vnclipu.wi      v24, v24, 0
+        vaaddu.vv       \dst, \dst, v24
+.endif
+
+.endm
+
+.macro epel_load_inc dst len do name type from_mem regtype
+        epel_load       \dst \len \do \name \type \from_mem \regtype
+        add             a2, a2, a3
+.endm
+
+.macro epel len do name type
+        epel_filter     \name \type t
+
+.ifc \len,4
+        vsetivli        zero, 4, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetivli        zero, 8, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetivli        zero, 16, e8, m1, ta, ma
+.else
+        li              a5, 32
+        vsetvli         zero, a5, e8, m2, ta, ma
+.endif
+.ifc \do,avg
+        csrwi           vxrm, 0
+.endif
+
+1:
+        addi            a4, a4, -1
+        epel_load       v30 \len \do \name \type 1 t
+        vse8.v          v30, (a0)
+.ifc \len,64
+        addi            a0, a0, 32
+        addi            a2, a2, 32
+        epel_load       v30 \len \do \name \type 1 t
+        vse8.v          v30, (a0)
+        addi            a0, a0, -32
+        addi            a2, a2, -32
+.endif
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+.endm
+
+.macro gen_epel len do name type
+func ff_\do\()_8tap_\name\()_\len\()\type\()_rvv, zve32x
+        epel            \len \do \name \type
+endfunc
+.endm
+
 .irp len 64, 32, 16, 8, 4
 func ff_avg\len\()_rvv, zve32x
         copy_avg        \len avg
@@ -134,4 +359,12 @@ endfunc
 func ff_avg_bilin_\len\()h_rvv, zve32x
         bilin_h         \len avg
 endfunc
+
+.irp name regular sharp smooth
+        .irp do put avg
+                .irp type h
+                        gen_epel \len \do \name \type
+                .endr
+        .endr
+.endr
 .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 248501f5d2..97f02e601d 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -109,7 +109,13 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
 
 #define init_subpel1(idx1, idx2, idxh, idxv, sz, dir, type) \
     dsp->mc[idx1][FILTER_BILINEAR    ][idx2][idxh][idxv] = \
-        ff_##type##_bilin_##sz##dir##_rvv;
+        ff_##type##_bilin_##sz##dir##_rvv; \
+    dsp->mc[idx1][FILTER_8TAP_SMOOTH ][idx2][idxh][idxv] = \
+        ff_##type##_8tap_smooth_##sz##dir##_rvv; \
+    dsp->mc[idx1][FILTER_8TAP_REGULAR][idx2][idxh][idxv] = \
+        ff_##type##_8tap_regular_##sz##dir##_rvv; \
+    dsp->mc[idx1][FILTER_8TAP_SHARP  ][idx2][idxh][idxv] = \
+        ff_##type##_8tap_sharp_##sz##dir##_rvv;
 
 #define init_subpel2(idx, idxh, idxv, dir, type) \
     init_subpel1(0, idx, idxh, idxv, 64, dir, type); \

From patchwork Sat May 4 15:03:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48502
f4EOlJtVF1Qn9CxGostFwMHttICXxRP9DiNKAuL66zQd8cbnxp0CRCPNmC/ak3BGSwwBbMOrSmJA u2WAFnC7YSXzngbx1ofaGuTSL2muA6cS9mV+qCb7chxGqDJU13D/I7JVtP62qR0IIuR6cUxISA8h 2+uaxCOAsq+1DmwWCrUtBydraMJa7corFjv8jANmq3sLJo5xO0INH0NDeMi/GmSuIan0szx5MHFN TtNvN1eHfuMsClMKeB5fzdEMz+QVwf+dp+ye82i4nEQ5tOKTzxWX22xLIqvlv39jxgtGiEmBRaNN sqcaAz6F4X5i7Q3mD5HDlfwfOvn9RdBxJbMRxWAttirFvJsVc3yVTGcv+5iSaM9e1gaCxkKMVf99 +V0SSZ3PtzEse3NqNdNgNIpW4n0+2BvdHg6wINMpLRWsLEQaBaNxbCWUkWnL1F9ewegCXnCKQCbT nq3S8RXR4LdzxGj6jkwtFhWYvmp94reEjmoAb5pbHNiXZTTpciSH8lt+DQyunsgZeoTqSe6reK1e NF2ALuUPumMMZblMeyo7TAzOlScN9rQdIOT5YBHdgG0VZzU4KD5aL5p1lL4KqVdAGwFZu+gKYxYI cZZAFHlg8BCTIXqO5IfUhg12EFFx+FWccDGil00oOSPVhR8uNEIvyld1j0WTD7HXwiFgPgh3q2 X-QQ-XMRINFO: MSVp+SPm3vtS1Vd6Y4Mggwc= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Sat, 4 May 2024 23:03:10 +0800 X-OQ-MSGID: <20240504150313.2472910-7-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com> References: <20240504150313.2472910-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 07/10] lavc/vp9dsp: R-V V mc bilin v X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: vqtMyiywdPNd From: sunyuechi C908: vp9_avg_bilin_4v_8bpp_c: 5.5 vp9_avg_bilin_4v_8bpp_rvv_i64: 2.2 vp9_avg_bilin_8v_8bpp_c: 20.7 vp9_avg_bilin_8v_8bpp_rvv_i64: 4.2 vp9_avg_bilin_16v_8bpp_c: 82.2 vp9_avg_bilin_16v_8bpp_rvv_i64: 9.0 vp9_avg_bilin_32v_8bpp_c: 342.5 vp9_avg_bilin_32v_8bpp_rvv_i64: 27.0 vp9_avg_bilin_64v_8bpp_c: 1319.2 vp9_avg_bilin_64v_8bpp_rvv_i64: 93.2 vp9_put_bilin_4v_8bpp_c: 4.7 vp9_put_bilin_4v_8bpp_rvv_i64: 1.7 vp9_put_bilin_8v_8bpp_c: 17.7 vp9_put_bilin_8v_8bpp_rvv_i64: 3.2 vp9_put_bilin_16v_8bpp_c: 69.2 
vp9_put_bilin_16v_8bpp_rvv_i64: 7.5
vp9_put_bilin_32v_8bpp_c: 274.2
vp9_put_bilin_32v_8bpp_rvv_i64: 23.2
vp9_put_bilin_64v_8bpp_c: 1109.5
vp9_put_bilin_64v_8bpp_rvv_i64: 82.2
---
 libavcodec/riscv/vp9_mc_rvv.S | 49 +++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 289c377a42..58b00889ce 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -117,6 +117,49 @@
         ret
 .endm
 
+.macro bilin_v len type
+.ifc \type,avg
+        csrwi           vxrm, 0
+.endif
+.ifc \len,4
+        vsetivli        zero, 4, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetivli        zero, 8, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetivli        zero, 16, e8, m1, ta, ma
+.elseif \len == 32
+        li              t0, 32
+        vsetvli         zero, t0, e8, m2, ta, ma
+.elseif \len == 64
+        li              t0, 64
+        vsetvli         zero, t0, e8, m4, ta, ma
+.endif
+        li              t4, 8
+        neg             t1, a6
+1:
+        add             t2, a2, a3
+        addi            a4, a4, -1
+        vle8.v          v0, (a2)
+        vle8.v          v8, (t2)
+.ifc \type,avg
+        vle8.v          v16, (a0)
+.endif
+        vwmulu.vx       v24, v8, a6
+        vwmaccsu.vx     v24, t1, v0
+        vwadd.wx        v24, v24, t4
+        vnsra.wi        v24, v24, 4
+        vadd.vv         v0, v24, v0
+.ifc \type,avg
+        vaaddu.vv       v0, v0, v16
+.endif
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+.endm
+
 .irp len 64, 32, 16
 func ff_copy\len\()_rvv, zve32x
         copy_avg \len copy
@@ -359,6 +402,12 @@ endfunc
 func ff_avg_bilin_\len\()h_rvv, zve32x
         bilin_h \len avg
 endfunc
+func ff_put_bilin_\len\()v_rvv, zve32x
+        bilin_v \len put
+endfunc
+func ff_avg_bilin_\len\()v_rvv, zve32x
+        bilin_v \len avg
+endfunc
 
 .irp name regular sharp smooth
         .irp do put avg

From patchwork Sat May 4 15:03:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48503
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:11 +0800
X-OQ-MSGID: <20240504150313.2472910-8-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 08/10] lavc/vp9dsp: R-V V mc tap v
Cc: sunyuechi

From: sunyuechi

C908:
vp9_avg_8tap_smooth_4v_8bpp_c: 13.7
vp9_avg_8tap_smooth_4v_8bpp_rvv_i64: 5.0
vp9_avg_8tap_smooth_8v_8bpp_c: 49.7
vp9_avg_8tap_smooth_8v_8bpp_rvv_i64: 9.2
vp9_avg_8tap_smooth_16v_8bpp_c: 191.5
vp9_avg_8tap_smooth_16v_8bpp_rvv_i64: 21.2
vp9_avg_8tap_smooth_32v_8bpp_c: 770.5
vp9_avg_8tap_smooth_32v_8bpp_rvv_i64: 66.0
vp9_avg_8tap_smooth_64v_8bpp_c: 3068.0
vp9_avg_8tap_smooth_64v_8bpp_rvv_i64: 262.5
vp9_put_8tap_smooth_4v_8bpp_c: 12.0
vp9_put_8tap_smooth_4v_8bpp_rvv_i64: 4.5
vp9_put_8tap_smooth_8v_8bpp_c: 43.7
vp9_put_8tap_smooth_8v_8bpp_rvv_i64: 8.5
vp9_put_8tap_smooth_16v_8bpp_c: 168.7
vp9_put_8tap_smooth_16v_8bpp_rvv_i64: 20.0
vp9_put_8tap_smooth_32v_8bpp_c: 681.5
vp9_put_8tap_smooth_32v_8bpp_rvv_i64: 63.7
vp9_put_8tap_smooth_64v_8bpp_c: 2692.7
vp9_put_8tap_smooth_64v_8bpp_rvv_i64: 253.5
---
 libavcodec/riscv/vp9_mc_rvv.S  | 32 +++++++++++++++++++++++++++++++-
 libavcodec/riscv/vp9dsp_init.c |  3 ++-
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 58b00889ce..151d7702ec 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -222,7 +222,11 @@ endconst
 .macro epel_filter name type regtype
         lla             \regtype\()2, subpel_filters_\name
         li              \regtype\()1, 8
+.ifc \type,v
+        mul             \regtype\()0, a6, \regtype\()1
+.elseif \type == h
         mul             \regtype\()0, a5, \regtype\()1
+.endif
         add             \regtype\()0, \regtype\()0, \regtype\()2
         .irp n 1,2,3,4,5,6
         lb              \regtype\n, \n(\regtype\()0)
@@ -239,6 +243,19 @@ endconst
         li              a5, 64
 .ifc \from_mem, 1
         vle8.v          v22, (a2)
+.ifc \type,v
+        sub             a2, a2, a3
+        vle8.v          v20, (a2)
+        add             a2, a2, a3
+        add             a2, a2, a3
+        vle8.v          v24, (a2)
+        add             a2, a2, a3
+        vle8.v          v26, (a2)
+        add             a2, a2, a3
+        vle8.v          v28, (a2)
+        add             a2, a2, a3
+        vle8.v          v30, (a2)
+.elseif \type == h
         addi            a2, a2, -1
         vle8.v          v20, (a2)
         addi            a2, a2, 2
@@ -249,6 +266,7 @@ endconst
         vle8.v          v28, (a2)
         addi            a2, a2, 1
         vle8.v          v30, (a2)
+.endif
 
 .ifc \name,smooth
         vwmulu.vx       v16, v24, \regtype\()4
@@ -267,11 +285,23 @@ endconst
         vwmaccsu.vx     v16, s7, v30
 .endif
 
+.ifc \type,v
+        .rept 6
+        sub             a2, a2, a3
+        .endr
+        vle8.v          v28, (a2)
+        sub             a2, a2, a3
+        vle8.v          v26, (a2)
+        .rept 3
+        add             a2, a2, a3
+        .endr
+.elseif \type == h
         addi            a2, a2, -6
         vle8.v          v28, (a2)
         addi            a2, a2, -1
         vle8.v          v26, (a2)
         addi            a2, a2, 3
+.endif
 
 .ifc \name,smooth
         vwmaccsu.vx     v16, \regtype\()1, v28
@@ -411,7 +441,7 @@ endfunc
 
 .irp name regular sharp smooth
         .irp do put avg
-                .irp type h
+                .irp type h v
                         gen_epel \len \do \name \type
                 .endr
         .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 97f02e601d..ff7d445f6a 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -125,7 +125,8 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
     init_subpel1(4, idx, idxh, idxv, 4, dir, type)
 
 #define init_subpel3(idx, type) \
-    init_subpel2(idx, 1, 0, h, type)
+    init_subpel2(idx, 1, 0, h, type); \
+    init_subpel2(idx, 0, 1, v, type)
 
     init_subpel3(0, put);
     init_subpel3(1, avg);

From patchwork Sat May 4 15:03:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48504
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:12 +0800
X-OQ-MSGID: <20240504150313.2472910-9-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 09/10] lavc/vp9dsp: R-V V mc bilin hv
Cc: sunyuechi

From: sunyuechi

C908:
vp9_avg_bilin_4hv_8bpp_c: 10.7
vp9_avg_bilin_4hv_8bpp_rvv_i64: 4.5
vp9_avg_bilin_8hv_8bpp_c: 38.7
vp9_avg_bilin_8hv_8bpp_rvv_i64: 8.2
vp9_avg_bilin_16hv_8bpp_c: 147.2
vp9_avg_bilin_16hv_8bpp_rvv_i64: 32.2
vp9_avg_bilin_32hv_8bpp_c: 590.7
vp9_avg_bilin_32hv_8bpp_rvv_i64: 47.5
vp9_avg_bilin_64hv_8bpp_c: 2323.7
vp9_avg_bilin_64hv_8bpp_rvv_i64: 153.5
vp9_put_bilin_4hv_8bpp_c: 10.0
vp9_put_bilin_4hv_8bpp_rvv_i64: 3.7
vp9_put_bilin_8hv_8bpp_c: 35.2
vp9_put_bilin_8hv_8bpp_rvv_i64: 7.2
vp9_put_bilin_16hv_8bpp_c: 133.7
vp9_put_bilin_16hv_8bpp_rvv_i64: 14.2
vp9_put_bilin_32hv_8bpp_c: 521.7
vp9_put_bilin_32hv_8bpp_rvv_i64: 43.0
vp9_put_bilin_64hv_8bpp_c: 2098.0
vp9_put_bilin_64hv_8bpp_rvv_i64: 144.5
---
 libavcodec/riscv/vp9_mc_rvv.S | 37 +++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 151d7702ec..c8a42c7159 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -160,6 +160,37 @@
         ret
 .endm
 
+.macro bilin_hv len type
+.ifc \type,avg
+        csrwi           vxrm, 0
+.endif
+        neg             t1, a5
+        neg             t2, a6
+        li              t4, 8
+        li              t5, 1
+        bilin_h_load    v24, \len, put
+        add             a2, a2, a3
+1:
+        addi            a4, a4, -1
+        bilin_h_load    v4, \len, put
+        vwmulu.vx       v16, v4, a6
+        vwmaccsu.vx     v16, t2, v24
+        vwadd.wx        v16, v16, t4
+        vnsra.wi        v16, v16, 4
+        vadd.vv         v0, v16, v24
+.ifc \type,avg
+        vle8.v          v16, (a0)
+        vaaddu.vv       v0, v0, v16
+.endif
+        vse8.v          v0, (a0)
+        vmv.v.v         v24, v4
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+.endm
+
 .irp len 64, 32, 16
 func ff_copy\len\()_rvv, zve32x
         copy_avg \len copy
@@ -438,6 +469,12 @@ endfunc
 func ff_avg_bilin_\len\()v_rvv, zve32x
         bilin_v \len avg
 endfunc
+func ff_put_bilin_\len\()hv_rvv, zve32x
+        bilin_hv \len put
+endfunc
+func ff_avg_bilin_\len\()hv_rvv, zve32x
+        bilin_hv \len avg
+endfunc
 
 .irp name regular sharp smooth
         .irp do put avg

From patchwork Sat May 4 15:03:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48505
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:13 +0800
X-OQ-MSGID: <20240504150313.2472910-10-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 10/10] lavc/vp9dsp: R-V V mc tap hv
Cc: sunyuechi

From: sunyuechi

C908:
vp9_avg_8tap_smooth_4hv_8bpp_c: 32.2
vp9_avg_8tap_smooth_4hv_8bpp_rvv_i64: 15.2
vp9_avg_8tap_smooth_8hv_8bpp_c: 98.5
vp9_avg_8tap_smooth_8hv_8bpp_rvv_i64: 23.5
vp9_avg_8tap_smooth_16hv_8bpp_c: 355.5
vp9_avg_8tap_smooth_16hv_8bpp_rvv_i64: 46.2
vp9_avg_8tap_smooth_32hv_8bpp_c: 1270.7
vp9_avg_8tap_smooth_32hv_8bpp_rvv_i64: 133.2
vp9_avg_8tap_smooth_64hv_8bpp_c: 4936.5
vp9_avg_8tap_smooth_64hv_8bpp_rvv_i64: 521.7
vp9_put_8tap_smooth_4hv_8bpp_c: 30.2
vp9_put_8tap_smooth_4hv_8bpp_rvv_i64: 14.2
vp9_put_8tap_smooth_8hv_8bpp_c: 91.5
vp9_put_8tap_smooth_8hv_8bpp_rvv_i64: 22.7
vp9_put_8tap_smooth_16hv_8bpp_c: 330.0
vp9_put_8tap_smooth_16hv_8bpp_rvv_i64: 45.0
vp9_put_8tap_smooth_32hv_8bpp_c: 1296.5
vp9_put_8tap_smooth_32hv_8bpp_rvv_i64: 131.0
vp9_put_8tap_smooth_64hv_8bpp_c: 4497.7
vp9_put_8tap_smooth_64hv_8bpp_rvv_i64: 513.2
---
 libavcodec/riscv/vp9_mc_rvv.S  | 79 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp_init.c |  3 +-
 2 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index c8a42c7159..6ad7ea2433 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -446,12 +446,90 @@ endconst
         ret
 .endm
 
+.macro epel_hv_once len name do
+        sub             a2, a2, a3
+        sub             a2, a2, a3
+        sub             a2, a2, a3
+        .irp n 0 2 4 6 8 10 12 14
+        epel_load_inc   v\n \len put \name h 1 t
+        .endr
+        addi            a4, a4, -1
+1:
+        addi            a4, a4, -1
+        epel_load       v30 \len \do \name v 0 s
+        vse8.v          v30, (a0)
+        vmv.v.v         v0, v2
+        vmv.v.v         v2, v4
+        vmv.v.v         v4, v6
+        vmv.v.v         v6, v8
+        vmv.v.v         v8, v10
+        vmv.v.v         v10, v12
+        vmv.v.v         v12, v14
+        epel_load       v14 \len put \name h 1 t
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+        epel_load       v30 \len \do \name v 0 s
+        vse8.v          v30, (a0)
+.endm
+
+.macro epel_hv do name len
+        addi            sp, sp, -64
+        .irp n 0,1,2,3,4,5,6,7
+        sd              s\n, \n\()<<3(sp)
+        .endr
+.ifc \len,64
+        addi            sp, sp, -48
+        .irp n 0,1,2,3,4,5
+        sd              a\n, \n\()<<3(sp)
+        .endr
+.endif
+.ifc \do,avg
+        csrwi           vxrm, 0
+.endif
+        epel_filter     \name h t
+        epel_filter     \name v s
+.ifc \len,4
+        vsetivli        zero, 4, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetivli        zero, 8, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetivli        zero, 16, e8, m1, ta, ma
+.else
+        li              a6, 32
+        vsetvli         zero, a6, e8, m2, ta, ma
+.endif
+        epel_hv_once    \len \name \do
+.ifc \len,64
+        .irp n 0,1,2,3,4,5
+        ld              a\n, \n\()<<3(sp)
+        .endr
+        addi            sp, sp, 48
+        addi            a0, a0, 32
+        addi            a2, a2, 32
+        epel_filter     \name h t
+        epel_hv_once    \len \name \do
+.endif
+        .irp n 0,1,2,3,4,5,6,7
+        ld              s\n, \n\()<<3(sp)
+        .endr
+        addi            sp, sp, 64
+
+        ret
+.endm
+
 .macro gen_epel len do name type
 func ff_\do\()_8tap_\name\()_\len\()\type\()_rvv, zve32x
         epel \len \do \name \type
 endfunc
 .endm
 
+.macro gen_epelhv len name do
+func ff_\do\()_8tap_\name\()_\len\()hv_rvv, zve32x
+        epel_hv \do \name \len
+endfunc
+.endm
+
 .irp len 64, 32, 16, 8, 4
 func ff_avg\len\()_rvv, zve32x
         copy_avg \len avg
@@ -481,6 +559,7 @@ endfunc
                 .irp type h v
                         gen_epel \len \do \name \type
                 .endr
+                gen_epelhv \len \name \do
         .endr
 .endr
 .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index ff7d445f6a..0c75ef38dc 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -126,7 +126,8 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
 
 #define init_subpel3(idx, type) \
     init_subpel2(idx, 1, 0, h, type); \
-    init_subpel2(idx, 0, 1, v, type)
+    init_subpel2(idx, 0, 1, v, type); \
+    init_subpel2(idx, 1, 1, hv, type)
 
     init_subpel3(0, put);
     init_subpel3(1, avg);
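
Note on the init_subpel3 change: with this patch the (idxh, idxv) pairs (1,0), (0,1) and (1,1) are registered for the h, v and hv variants respectively. A minimal C sketch of that mapping follows. It models only the last two indices of dsp->mc; the names mc_fn, init_slice, select_mc and check_mapping are hypothetical, the (0,0) slot is a placeholder, and the rule that each index is 1 when the matching fractional motion-vector component is non-zero is an assumption, not something stated in the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for one [idxh][idxv] slice of dsp->mc;
 * the real table has three more leading indices. */
typedef const char *(*mc_fn)(void);

const char *mc_full_pel(void) { return "full-pel"; }
const char *mc_h(void)        { return "h"; }
const char *mc_v(void)        { return "v"; }
const char *mc_hv(void)       { return "hv"; }

/* Same (idxh, idxv) -> direction mapping as init_subpel3 after this patch:
 * (1,0) is h, (0,1) is v, (1,1) is hv. Slot (0,0) is a placeholder here. */
void init_slice(mc_fn slice[2][2])
{
    slice[0][0] = mc_full_pel;
    slice[1][0] = mc_h;
    slice[0][1] = mc_v;
    slice[1][1] = mc_hv;
}

/* Assumption: an index is 1 when the corresponding fractional
 * motion-vector component (mx or my) is non-zero. */
mc_fn select_mc(mc_fn slice[2][2], int mx, int my)
{
    return slice[mx != 0][my != 0];
}

/* Checks the four combinations against the mapping above. */
void check_mapping(void)
{
    mc_fn slice[2][2];

    init_slice(slice);
    assert(select_mc(slice, 0, 0) == mc_full_pel);
    assert(select_mc(slice, 3, 0) == mc_h);
    assert(select_mc(slice, 0, 5) == mc_v);
    assert(select_mc(slice, 2, 7) == mc_hv);
}
```

Calling check_mapping() from a test harness exercises all four slots; in the real table the same two trailing indices sit behind the block-size, filter and put/avg indices that init_subpel1 and init_subpel2 fill in.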