From patchwork Sat May 4 15:03:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48503
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 4 May 2024 23:03:11 +0800
X-OQ-MSGID: <20240504150313.2472910-8-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240504150313.2472910-1-uk7b@foxmail.com>
References: <20240504150313.2472910-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 08/10] lavc/vp9dsp: R-V V mc tap v
Cc: sunyuechi

From: sunyuechi

C908:
vp9_avg_8tap_smooth_4v_8bpp_c: 13.7
vp9_avg_8tap_smooth_4v_8bpp_rvv_i64: 5.0
vp9_avg_8tap_smooth_8v_8bpp_c: 49.7
vp9_avg_8tap_smooth_8v_8bpp_rvv_i64: 9.2
vp9_avg_8tap_smooth_16v_8bpp_c: 191.5
vp9_avg_8tap_smooth_16v_8bpp_rvv_i64: 21.2
vp9_avg_8tap_smooth_32v_8bpp_c: 770.5
vp9_avg_8tap_smooth_32v_8bpp_rvv_i64: 66.0
vp9_avg_8tap_smooth_64v_8bpp_c: 3068.0
vp9_avg_8tap_smooth_64v_8bpp_rvv_i64: 262.5
vp9_put_8tap_smooth_4v_8bpp_c: 12.0
vp9_put_8tap_smooth_4v_8bpp_rvv_i64: 4.5
vp9_put_8tap_smooth_8v_8bpp_c: 43.7
vp9_put_8tap_smooth_8v_8bpp_rvv_i64: 8.5
vp9_put_8tap_smooth_16v_8bpp_c: 168.7
vp9_put_8tap_smooth_16v_8bpp_rvv_i64: 20.0
vp9_put_8tap_smooth_32v_8bpp_c: 681.5
vp9_put_8tap_smooth_32v_8bpp_rvv_i64: 63.7
vp9_put_8tap_smooth_64v_8bpp_c: 2692.7
vp9_put_8tap_smooth_64v_8bpp_rvv_i64: 253.5
---
 libavcodec/riscv/vp9_mc_rvv.S  | 32 +++++++++++++++++++++++++++++++-
 libavcodec/riscv/vp9dsp_init.c |  3 ++-
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 58b00889ce..151d7702ec 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -222,7 +222,11 @@ endconst
 .macro epel_filter name type regtype
         lla             \regtype\()2, subpel_filters_\name
         li              \regtype\()1, 8
+.ifc \type,v
+        mul             \regtype\()0, a6, \regtype\()1
+.elseif \type == h
         mul             \regtype\()0, a5, \regtype\()1
+.endif
         add             \regtype\()0, \regtype\()0, \regtype\()2
         .irp n 1,2,3,4,5,6
         lb              \regtype\n, \n(\regtype\()0)
@@ -239,6 +243,19 @@ endconst
         li              a5, 64
 .ifc \from_mem, 1
         vle8.v          v22, (a2)
+.ifc \type,v
+        sub             a2, a2, a3
+        vle8.v          v20, (a2)
+        add             a2, a2, a3
+        add             a2, a2, a3
+        vle8.v          v24, (a2)
+        add             a2, a2, a3
+        vle8.v          v26, (a2)
+        add             a2, a2, a3
+        vle8.v          v28, (a2)
+        add             a2, a2, a3
+        vle8.v          v30, (a2)
+.elseif \type == h
         addi            a2, a2, -1
         vle8.v          v20, (a2)
         addi            a2, a2, 2
@@ -249,6 +266,7 @@ endconst
         vle8.v          v28, (a2)
         addi            a2, a2, 1
         vle8.v          v30, (a2)
+.endif
 
 .ifc \name,smooth
         vwmulu.vx       v16, v24, \regtype\()4
@@ -267,11 +285,23 @@ endconst
         vwmaccsu.vx     v16, s7, v30
 .endif
 
+.ifc \type,v
+        .rept 6
+        sub             a2, a2, a3
+        .endr
+        vle8.v          v28, (a2)
+        sub             a2, a2, a3
+        vle8.v          v26, (a2)
+        .rept 3
+        add             a2, a2, a3
+        .endr
+.elseif \type == h
         addi            a2, a2, -6
         vle8.v          v28, (a2)
         addi            a2, a2, -1
         vle8.v          v26, (a2)
         addi            a2, a2, 3
+.endif
 
 .ifc \name,smooth
         vwmaccsu.vx     v16, \regtype\()1, v28
@@ -411,7 +441,7 @@ endfunc
 
         .irp name regular sharp smooth
         .irp do put avg
-        .irp type h
+        .irp type h v
         gen_epel \len \do \name \type
         .endr
         .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 97f02e601d..ff7d445f6a 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -125,7 +125,8 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
     init_subpel1(4, idx, idxh, idxv, 4, dir, type)
 
 #define init_subpel3(idx, type) \
-    init_subpel2(idx, 1, 0, h, type)
+    init_subpel2(idx, 1, 0, h, type); \
+    init_subpel2(idx, 0, 1, v, type)
 
     init_subpel3(0, put);
     init_subpel3(1, avg);