From patchwork Tue May 7 16:54:06 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48639
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 8 May 2024 00:54:06 +0800
X-OQ-MSGID: <20240507165412.1306563-3-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com>
References: <20240507165412.1306563-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v4 3/9] lavc/vp8dsp: R-V V put_bilin_hv
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_bilin4_hv_c: 561.0
vp8_put_bilin4_hv_rvv_i32: 232.7
vp8_put_bilin8_hv_c: 2162.7
vp8_put_bilin8_hv_rvv_i32: 506.7
vp8_put_bilin16_hv_c: 4769.7
vp8_put_bilin16_hv_rvv_i32: 556.7
---
 libavcodec/riscv/vp8dsp_init.c | 13 +++++++++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 26 ++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index afffa6de2f..9627105fc8 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -67,6 +67,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv;
+
+        c->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index ec8ff917b9..4f232c7707 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -116,7 +116,33 @@ func ff_put_vp8_bilin\len\()_\type\()_rvv, zve32x
 endfunc
 .endm
 
+.macro put_vp8_bilin_hv len
+func ff_put_vp8_bilin\len\()_hv_rvv, zve32x
+        vsetvlstatic8   \len
+        li              t3, 8
+        sub             t1, t3, a5
+        sub             t2, t3, a6
+        li              t4, 4
+        bilin_load      v4, \len, h, a5
+        add             a2, a2, a3
+1:
+        addi            a4, a4, -1
+        vwmulu.vx       v20, v4, t2
+        bilin_load      v4, \len, h, a5
+        vwmaccu.vx      v20, a6, v4
+        vwaddu.wx       v24, v20, t4
+        vnsra.wi        v0, v24, 3
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+.endm
+
 .irp len 16,8,4
 put_vp8_bilin_h_v \len h a5
 put_vp8_bilin_h_v \len v a6
+put_vp8_bilin_hv \len
 .endr
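
Note for readers following the assembly: the hv loop in this patch can be
modelled in scalar C roughly as below. This is a minimal sketch and not code
from the patch. The name bilin_hv_ref and its size parameter are hypothetical,
and it assumes that "bilin_load ..., h, a5" (defined elsewhere in vp8dsp_rvv.S,
not shown in this diff) applies the rounded horizontal filter to one row, with
a5 holding mx and a6 holding my. Like the assembly, it keeps only the previous
filtered row instead of a full temporary block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the put_vp8_bilin*_hv loop (illustration only).
 * size is the block width (4, 8 or 16), h the number of output rows. */
static void bilin_hv_ref(uint8_t *dst, ptrdiff_t dstride,
                         const uint8_t *src, ptrdiff_t sstride,
                         int size, int h, int mx, int my)
{
    uint8_t prev[16], cur[16];

    assert(size > 0 && size <= 16);

    /* Assumed behaviour of the first bilin_load: horizontally filter row 0. */
    for (int x = 0; x < size; x++)
        prev[x] = ((8 - mx) * src[x] + mx * src[x + 1] + 4) >> 3;
    src += sstride;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < size; x++) {
            /* Horizontally filter the next row (second bilin_load). */
            cur[x] = ((8 - mx) * src[x] + mx * src[x + 1] + 4) >> 3;
            /* Vertical blend: vwmulu by (8 - my), vwmaccu by my,
             * add the rounding constant 4, narrowing shift right by 3. */
            dst[x] = ((8 - my) * prev[x] + my * cur[x] + 4) >> 3;
        }
        /* The filtered row just computed becomes the previous row. */
        memcpy(prev, cur, (size_t)size);
        dst += dstride;
        src += sstride;
    }
}
```

Each output row reads size + 1 source pixels from two consecutive rows, so the
source buffer must hold h + 1 rows of at least size + 1 bytes.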