From patchwork Mon May 6 03:38:05 2024
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48565
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: sunyuechi
Date: Mon, 6 May 2024 11:38:05 +0800
X-OQ-MSGID: <20240506033809.3790245-5-uk7b@foxmail.com>
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
Subject: [FFmpeg-devel] [PATCH v3 5/9] lavc/vp8dsp: R-V V put_epel v

From: sunyuechi

C908:
vp8_put_epel4_v4_c: 11.0
vp8_put_epel4_v4_rvv_i32: 5.0
vp8_put_epel4_v6_c: 16.5
vp8_put_epel4_v6_rvv_i32: 6.2
vp8_put_epel8_v4_c: 43.7
vp8_put_epel8_v4_rvv_i32: 11.2
vp8_put_epel8_v6_c: 68.7
vp8_put_epel8_v6_rvv_i32: 13.2
vp8_put_epel16_v4_c: 92.5
vp8_put_epel16_v4_rvv_i32: 13.7
vp8_put_epel16_v6_c: 135.7
vp8_put_epel16_v6_rvv_i32: 16.5
---
 libavcodec/riscv/vp8dsp_init.c |  7 +++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 34 +++++++++++++++++++++++-----------
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index a4b7d49932..dc3e087f01 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -90,6 +90,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
         c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
         c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
+
+        c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv;
+        c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv;
+        c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv;
+        c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
+        c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
+        c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 30955a7b95..bf268e4d8d 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -161,9 +161,13 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst
 
-.macro epel_filter size
+.macro epel_filter size type
         lla             t2, subpel_filters
+.ifc \type,v
+        addi            t0, a6, -1
+.elseif \type == h
         addi            t0, a5, -1
+.endif
         li              t1, 6
         mul             t0, t0, t1
         add             t0, t0, t2
@@ -176,19 +180,25 @@ endconst
 .endif
 .endm
 
-.macro epel_load dst len size
-        addi            t6, a2, -1
-        addi            a7, a2, 1
+.macro epel_load dst len size type
+.ifc \type,v
+        mv              a5, a3
+.else
+        li              a5, 1
+.endif
+        sub             t6, a2, a5
+        add             a7, a2, a5
+
         vle8.v          v24, (a2)
         vle8.v          v22, (t6)
         vle8.v          v26, (a7)
-        addi            a7, a7, 1
+        add             a7, a7, a5
         vle8.v          v28, (a7)
         vwmulu.vx       v16, v24, t2
         vwmulu.vx       v20, v26, t3
 .ifc \size,6
-        addi            t6, t6, -1
-        addi            a7, a7, 1
+        sub             t6, t6, a5
+        add             a7, a7, a5
         vle8.v          v24, (t6)
         vle8.v          v26, (a7)
         vwmaccu.vx      v16, t0, v24
@@ -206,18 +216,18 @@ endconst
         vnclipu.wi      \dst, v24, 0
 .endm
 
-.macro epel_load_inc dst len size
-        epel_load       \dst \len \size
+.macro epel_load_inc dst len size type
+        epel_load       \dst \len \size \type
         add             a2, a2, a3
 .endm
 
 .macro epel len size type
 func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-        epel_filter     \size
+        epel_filter     \size \type
         vsetvlstatic8   \len
 1:
         addi            a4, a4, -1
-        epel_load_inc   v30 \len \size
+        epel_load_inc   v30 \len \size \type
         vse8.v          v30, (a0)
         add             a0, a0, a1
         bnez            a4, 1b
@@ -232,4 +242,6 @@ put_vp8_bilin_h_v \len v a6
 put_vp8_bilin_hv \len
 epel \len 6 h
 epel \len 4 h
+epel \len 6 v
+epel \len 4 v
 .endr