From patchwork Sat May 4 15:03:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48496
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: sunyuechi
Date: Sat, 4 May 2024 23:03:05 +0800
Subject: [FFmpeg-devel] [PATCH 02/10] lavc/vp9dsp: R-V V ipred hor
X-OQ-MSGID: <20240504150313.2472910-2-uk7b@foxmail.com>
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0

From: sunyuechi

C908:
vp9_hor_8x8_8bpp_c: 74.7
vp9_hor_8x8_8bpp_rvv_i32: 35.7
vp9_hor_16x16_8bpp_c: 175.5
vp9_hor_16x16_8bpp_rvv_i32: 80.2
vp9_hor_32x32_8bpp_c: 510.2
vp9_hor_32x32_8bpp_rvv_i32: 264.0
---
 libavcodec/riscv/vp9_intra_rvv.S | 56 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  6 ++++
 libavcodec/riscv/vp9dsp_init.c   |  3 ++
 3 files changed, 65 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index b5f0f9d3c3..1b270215fb 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -148,3 +148,59 @@ func ff_v_8x8_rvv, zve64x
 
         ret
 endfunc
+
+func ff_h_32x32_rvv, zve32x
+        li           t0, 32
+        addi         a2, a2, 31
+        vsetvli      zero, t0, e8, m2, ta, ma
+
+        .rept 2
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        lbu          t1, (a2)
+        addi         a2, a2, -1
+        vmv.v.x      v\n, t1
+        .endr
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        vse8.v       v\n, (a0)
+        add          a0, a0, a1
+        .endr
+        .endr
+
+        ret
+endfunc
+
+func ff_h_16x16_rvv, zve32x
+        addi         a2, a2, 15
+        vsetivli     zero, 16, e8, m1, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
+        lbu          t1, (a2)
+        addi         a2, a2, -1
+        vmv.v.x      v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
+        vse8.v       v\n, (a0)
+        add          a0, a0, a1
+        .endr
+        vse8.v       v23, (a0)
+
+        ret
+endfunc
+
+func ff_h_8x8_rvv, zve32x
+        addi         a2, a2, 7
+        vsetivli     zero, 8, e8, mf2, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15
+        lbu          t1, (a2)
+        addi         a2, a2, -1
+        vmv.v.x      v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14
+        vse8.v       v\n, (a0)
+        add          a0, a0, a1
+        .endr
+        vse8.v       v15, (a0)
+
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index 113397ce86..d4c7652286 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -66,6 +66,12 @@ void ff_v_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                     const uint8_t *a);
 void ff_v_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                   const uint8_t *a);
+void ff_h_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                  const uint8_t *a);
 
 #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_idx)                         \
 void ff_put_8tap_##type##_##SIZE##h_rvv(uint8_t *dst, ptrdiff_t dststride,    \
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 9c550d40b5..16aeeb260a 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -54,6 +54,9 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
         dsp->intra_pred[TX_32X32][VERT_PRED] = ff_v_32x32_rvv;
         dsp->intra_pred[TX_16X16][VERT_PRED] = ff_v_16x16_rvv;
+        dsp->intra_pred[TX_32X32][HOR_PRED] = ff_h_32x32_rvv;
+        dsp->intra_pred[TX_16X16][HOR_PRED] = ff_h_16x16_rvv;
+        dsp->intra_pred[TX_8X8][HOR_PRED] = ff_h_8x8_rvv;
     }
 #endif
 }
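
For anyone checking the assembly against scalar behaviour, here is a minimal C sketch of what the three routines appear to compute, read off the patch itself. Each routine advances `l` by size - 1 and then loads bytes at descending addresses, so row y of the block is filled with `l[size - 1 - y]`. The name `hor_pred_ref` and the generic `size` parameter are hypothetical and exist only for illustration; this is not the FFmpeg C reference, and the unused `a` argument is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the ff_h_NxN_rvv routines in this patch.
 * Row y of the size x size block is filled with l[size - 1 - y], which
 * mirrors the descending lbu loads in the assembly. */
static void hor_pred_ref(uint8_t *dst, ptrdiff_t stride,
                         const uint8_t *l, int size)
{
    assert(size > 0);
    for (int y = 0; y < size; y++) {
        /* Broadcast one left-edge pixel across the whole row,
         * like vmv.v.x followed by vse8.v. */
        memset(dst, l[size - 1 - y], (size_t)size);
        dst += stride;
    }
}
```

Calling `hor_pred_ref(dst, stride, l, 8)` should produce the same block as `ff_h_8x8_rvv(dst, stride, l, a)` under that reading, which makes it usable as a quick cross-check outside checkasm.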