From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: sunyuechi
Date: Mon, 6 May 2024 00:45:32 +0800
X-OQ-MSGID: <20240505164536.872683-6-uk7b@foxmail.com>
In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com>
References: <20240505164536.872683-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 06/10] lavc/vp8dsp: R-V V put_epel v

From: sunyuechi

C908:
vp8_put_epel4_v4_c: 11.0
vp8_put_epel4_v4_rvv_i32: 5.0
vp8_put_epel4_v6_c: 16.5
vp8_put_epel4_v6_rvv_i32: 6.2
vp8_put_epel8_v4_c: 43.7
vp8_put_epel8_v4_rvv_i32: 11.2
vp8_put_epel8_v6_c: 68.7
vp8_put_epel8_v6_rvv_i32: 13.2
vp8_put_epel16_v4_c: 92.5
vp8_put_epel16_v4_rvv_i32: 13.7
vp8_put_epel16_v6_c: 135.7
vp8_put_epel16_v6_rvv_i32: 16.5
---
 libavcodec/riscv/vp8dsp_init.c |  7 +++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 34 +++++++++++++++++++++++-----------
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index a4b7d49932..dc3e087f01 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -90,6 +90,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
         c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
         c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
+
+        c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv;
+        c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv;
+        c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv;
+        c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
+        c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
+        c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index f5c4c1d85d..ca5581f845 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -182,9 +182,13 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst
 
-.macro epel_filter size
+.macro epel_filter size type
         lla             t2, subpel_filters
+.ifc \type,v
+        addi            t0, a6, -1
+.elseif \type == h
         addi            t0, a5, -1
+.endif
         li              t1, 6
         mul             t0, t0, t1
         add             t0, t0, t2
@@ -197,19 +201,25 @@ endconst
 .endif
 .endm
 
-.macro epel_load dst len size
-        addi            t6, a2, -1
-        addi            a7, a2, 1
+.macro epel_load dst len size type
+.ifc \type,v
+        mv              a5, a3
+.else
+        li              a5, 1
+.endif
+        sub             t6, a2, a5
+        add             a7, a2, a5
+
         vle8.v          v24, (a2)
         vle8.v          v22, (t6)
         vle8.v          v26, (a7)
-        addi            a7, a7, 1
+        add             a7, a7, a5
         vle8.v          v28, (a7)
         vwmulu.vx       v16, v24, t2
         vwmulu.vx       v20, v26, t3
 .ifc \size,6
-        addi            t6, t6, -1
-        addi            a7, a7, 1
+        sub             t6, t6, a5
+        add             a7, a7, a5
         vle8.v          v24, (t6)
         vle8.v          v26, (a7)
         vwmaccu.vx      v16, t0, v24
@@ -227,18 +237,18 @@ endconst
         vnclipu.wi      \dst, v24, 0
 .endm
 
-.macro epel_load_inc dst len size
-        epel_load       \dst \len \size
+.macro epel_load_inc dst len size type
+        epel_load       \dst \len \size \type
         add             a2, a2, a3
 .endm
 
 .macro epel len size type
 func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-        epel_filter     \size
+        epel_filter     \size \type
         vsetvlstatic8   \len
 1:
         addi            a4, a4, -1
-        epel_load_inc   v30 \len \size
+        epel_load_inc   v30 \len \size \type
         vse8.v          v30, (a0)
         add             a0, a0, a1
         bnez            a4, 1b
@@ -253,4 +263,6 @@ put_vp8_bilin_v \len
 put_vp8_bilin_hv \len
 epel \len 6 h
 epel \len 4 h
+epel \len 6 v
+epel \len 4 v
 .endr
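
Reviewer note, not part of the patch: a scalar C sketch of what the new vertical routines appear to compute may help when reading the assembly. The vertical variant steps between taps by the source stride (the `mv a5, a3` path), where the horizontal one steps by 1 (`li a5, 1`). Everything named here is hypothetical: `put_epel_v_ref`, `clip_u8` and the parameter list are made up for illustration. The rounding, `(sum + 64) >> 7` followed by a clamp to 0..255, is assumed to match the C reference in libavcodec/vp8dsp.c, and the only coefficients used are the signed row visible in the hunk above (0, -1, 12, 123, -6, 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate sum to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical scalar reference for a vertical 4- or 6-tap filter.
 * f[] holds six signed coefficients, as in one row of subpel_filters.
 * src must have 1 (4-tap) or 2 (6-tap) valid rows above it and
 * 2 (4-tap) or 3 (6-tap) valid rows below the last filtered row.
 */
static void put_epel_v_ref(uint8_t *dst, ptrdiff_t dst_stride,
                           const uint8_t *src, ptrdiff_t src_stride,
                           int w, int h, int taps, const int8_t f[6])
{
    assert(taps == 4 || taps == 6);

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            /* The four inner taps are common to both filter sizes. */
            int sum = f[1] * src[x - src_stride]
                    + f[2] * src[x]
                    + f[3] * src[x + src_stride]
                    + f[4] * src[x + 2 * src_stride];

            /* The 6-tap filter adds one more row on each side. */
            if (taps == 6)
                sum += f[0] * src[x - 2 * src_stride]
                     + f[5] * src[x + 3 * src_stride];

            /* Assumed rounding: add half, shift by 7, clamp. */
            dst[x] = clip_u8((sum + 64) >> 7);
        }
        dst += dst_stride;
        src += src_stride;
    }
}
```

The coefficients in a row sum to 128, so a flat input is reproduced unchanged, which makes a quick sanity check against the RVV output.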