From patchwork Fri Feb 23 14:46:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 46498
From: flow gg
Date: Fri, 23 Feb 2024 22:46:15 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 3/3] lavc/vp8dsp: R-V V put_bilin_hv
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

From e1a01b1e0a365935868d7825d53c7cc64e2c1787 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Fri, 23 Feb 2024 22:35:23 +0800
Subject: [PATCH 3/3] lavc/vp8dsp: R-V V put_bilin_hv

C908:
vp8_put_bilin4_hv_c: 567.7
vp8_put_bilin4_hv_rvv_i32: 255.7
vp8_put_bilin8_hv_c: 2169.5
vp8_put_bilin8_hv_rvv_i32: 528.7
vp8_put_bilin16_hv_c: 4777.5
vp8_put_bilin16_hv_rvv_i32: 587.7
---
 libavcodec/riscv/vp8dsp_init.c | 13 +++++++++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 35 ++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 10e1498d01..02dbda979e 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -65,6 +65,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv;
+
+        c->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv;
     }
 #endif
 }
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index a58c197ba1..9d4ffed255 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -188,3 +188,38 @@ func ff_put_vp8_bilin4_v_rvv, zve32x
         vsetivli        zero, 4, e8, mf4, ta, ma
         put_vp8_bilin_v
 endfunc
+
+.macro put_vp8_bilin_hv len
+        li              t3, 8
+        sub             t1, t3, a5
+        sub             t2, t3, a6
+        li              t4, 4
+        li              t5, 1
+        bilin_h_load    v4, \len
+        add             a2, a2, a3
+1:
+        addi            a4, a4, -1
+        vwmulu.vx       v20, v4, t2
+        bilin_h_load    v4, \len
+        vwmaccu.vx      v20, a6, v4
+        vwaddu.wx       v24, v20, t4
+        vnsra.wi        v0, v24, 3
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+.endm
+
+func ff_put_vp8_bilin16_hv_rvv, zve32x
+        put_vp8_bilin_hv 16
+endfunc
+
+func ff_put_vp8_bilin8_hv_rvv, zve32x
+        put_vp8_bilin_hv 8
+endfunc
+
+func ff_put_vp8_bilin4_hv_rvv, zve32x
+        put_vp8_bilin_hv 4
+endfunc
-- 
2.43.2
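
For readers following the loop in put_vp8_bilin_hv, here is a rough scalar model in C of what the macro appears to compute. It is a sketch under assumptions, not the reference implementation: the names put_bilin_hv_model and bilin_h_row are hypothetical, the argument order (dst, dstride, src, sstride, h, mx, my in a0..a6) is inferred from how the registers are used above, and bilin_h_load (defined outside this patch) is assumed to return one horizontally filtered row computed as ((8 - mx) * s[x] + mx * s[x + 1] + 4) >> 3. Like the assembly, the model keeps only the previous filtered row instead of a temporary buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Horizontal pass for one row. This is the role the sketch ASSUMES
 * bilin_h_load plays; the real macro is not part of this patch. */
static void bilin_h_row(uint8_t *out, const uint8_t *src, int len, int mx)
{
    for (int x = 0; x < len; x++)
        out[x] = (uint8_t)(((8 - mx) * src[x] + mx * src[x + 1] + 4) >> 3);
}

/* Hypothetical scalar model of the hv loop; len is 4, 8 or 16. */
static void put_bilin_hv_model(uint8_t *dst, ptrdiff_t dstride,
                               const uint8_t *src, ptrdiff_t sstride,
                               int len, int h, int mx, int my)
{
    uint8_t prev[16], cur[16];

    assert(len >= 1 && len <= 16);

    /* Filter the first row once, before the loop (first bilin_h_load). */
    bilin_h_row(prev, src, len, mx);
    src += sstride;

    for (int y = 0; y < h; y++) {
        /* Filter the next row (second bilin_h_load, inside the loop). */
        bilin_h_row(cur, src, len, mx);
        for (int x = 0; x < len; x++) {
            /* vwmulu.vx, vwmaccu.vx, vwaddu.wx (+4), vnsra.wi (>> 3) */
            dst[x] = (uint8_t)(((8 - my) * prev[x] + my * cur[x] + 4) >> 3);
            prev[x] = cur[x]; /* current row becomes the previous row */
        }
        src += sstride;
        dst += dstride;
    }
}
```

Note that the model reads h + 1 source rows and len + 1 pixels per row, so a caller must make that much input available.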