From patchwork Mon May 13 16:59:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48858
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: sunyuechi
Date: Tue, 14 May 2024 00:59:20 +0800
In-Reply-To: <20240513165926.1467967-1-uk7b@foxmail.com>
References: <20240513165926.1467967-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
Subject: [FFmpeg-devel] [PATCH v3 3/9] lavc/vp9dsp: R-V V ipred hor

From: sunyuechi

C908:
vp9_hor_8x8_8bpp_c: 74.7
vp9_hor_8x8_8bpp_rvv_i32: 35.7
vp9_hor_16x16_8bpp_c: 175.5
vp9_hor_16x16_8bpp_rvv_i32: 80.2
vp9_hor_32x32_8bpp_c: 510.2
vp9_hor_32x32_8bpp_rvv_i32: 264.0
---
 libavcodec/riscv/vp9_intra_rvv.S | 56 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  6 ++++
 libavcodec/riscv/vp9dsp_init.c   |  3 ++
 3 files changed, 65 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index 40e38ba83e..ca156d65cd 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -117,3 +117,59 @@ func_dc dc_left 8 left 3 0 zve64x
 func_dc dc_top 32 top 5 1 zve32x
 func_dc dc_top 16 top 4 1 zve32x
 func_dc dc_top 8 top 3 0 zve64x
+
+func ff_h_32x32_rvv, zve32x
+        li t0, 32
+        addi a2, a2, 31
+        vsetvli zero, t0, e8, m2, ta, ma
+
+        .rept 2
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        lbu t1, (a2)
+        addi a2, a2, -1
+        vmv.v.x v\n, t1
+        .endr
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        vse8.v v\n, (a0)
+        add a0, a0, a1
+        .endr
+        .endr
+
+        ret
+endfunc
+
+func ff_h_16x16_rvv, zve32x
+        addi a2, a2, 15
+        vsetivli zero, 16, e8, m1, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
+        lbu t1, (a2)
+        addi a2, a2, -1
+        vmv.v.x v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
+        vse8.v v\n, (a0)
+        add a0, a0, a1
+        .endr
+        vse8.v v23, (a0)
+
+        ret
+endfunc
+
+func ff_h_8x8_rvv, zve32x
+        addi a2, a2, 7
+        vsetivli zero, 8, e8, mf2, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15
+        lbu t1, (a2)
+        addi a2, a2, -1
+        vmv.v.x v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14
+        vse8.v v\n, (a0)
+        add a0, a0, a1
+        .endr
+        vse8.v v15, (a0)
+
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index b8ff282f8a..0ad961c7e0 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -66,6 +66,12 @@ void ff_v_16x16_rvi(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                     const uint8_t *a);
 void ff_v_8x8_rvi(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                   const uint8_t *a);
+void ff_h_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                  const uint8_t *a);
 
 #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_idx) \
 void ff_put_8tap_##type##_##SIZE##h_rvv(uint8_t *dst, ptrdiff_t dststride, \
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index dace51cf06..eab3e9cb0a 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -86,6 +86,9 @@ static av_cold void vp9dsp_intrapred_init_riscv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_16X16][DC_129_PRED] = ff_dc_129_16x16_rvv;
         dsp->intra_pred[TX_32X32][TOP_DC_PRED] = ff_dc_top_32x32_rvv;
         dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
+        dsp->intra_pred[TX_32X32][HOR_PRED] = ff_h_32x32_rvv;
+        dsp->intra_pred[TX_16X16][HOR_PRED] = ff_h_16x16_rvv;
+        dsp->intra_pred[TX_8X8][HOR_PRED] = ff_h_8x8_rvv;
     }
 #endif
 #endif
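
For review purposes, the behavior of the three ff_h_*_rvv functions can be summarized by a scalar model. The sketch below is inferred only from the assembly in this patch: each routine starts at l[size - 1], walks the left-edge pointer downwards, and fills one destination row per byte. The names hor_pred_ref and hor_pred_check are hypothetical helpers made up for this note, not FFmpeg functions, and the unused `a` (top edge) argument is dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the RVV routines in this patch (not FFmpeg's
 * own C code). Row y of the block is filled with left[size - 1 - y], which
 * mirrors the "addi a2, a2, size-1" followed by the "lbu / addi a2, a2, -1"
 * sequence in the assembly. */
void hor_pred_ref(uint8_t *dst, ptrdiff_t stride,
                  const uint8_t *left, int size)
{
    for (int y = 0; y < size; y++)
        memset(dst + y * stride, left[size - 1 - y], (size_t)size);
}

/* Hypothetical self-check: returns 1 if every pixel of row y equals
 * left[size - 1 - y] for a size x size block, 0 otherwise. */
int hor_pred_check(int size)
{
    uint8_t left[32];
    uint8_t dst[32 * 32];

    assert(size > 0 && size <= 32);
    for (int i = 0; i < size; i++)
        left[i] = (uint8_t)(i * 7 + 1);   /* arbitrary, distinct test bytes */
    memset(dst, 0, sizeof(dst));

    hor_pred_ref(dst, size, left, size);

    for (int y = 0; y < size; y++)
        for (int x = 0; x < size; x++)
            if (dst[y * size + x] != left[size - 1 - y])
                return 0;
    return 1;
}
```

Calling hor_pred_check(8), hor_pred_check(16) and hor_pred_check(32) exercises the three block sizes registered for HOR_PRED; comparing the same buffers against the RVV output would need the checkasm harness, which is not reproduced here.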