From patchwork Mon May 6 03:38:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48564
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: sunyuechi
Date: Mon, 6 May 2024 11:38:03 +0800
Subject: [FFmpeg-devel] [PATCH v3 3/9] lavc/vp8dsp: R-V V put_bilin_hv
X-OQ-MSGID: <20240506033809.3790245-3-uk7b@foxmail.com>
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0

From: sunyuechi

C908:
vp8_put_bilin4_hv_c: 561.0
vp8_put_bilin4_hv_rvv_i32: 232.7
vp8_put_bilin8_hv_c: 2162.7
vp8_put_bilin8_hv_rvv_i32: 506.7
vp8_put_bilin16_hv_c: 4769.7
vp8_put_bilin16_hv_rvv_i32: 556.7
---
 libavcodec/riscv/vp8dsp_init.c | 13 +++++++++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 26 ++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index afffa6de2f..9627105fc8 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -67,6 +67,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv;
+
+        c->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 9bf969d794..d30e4cab07 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -116,7 +116,33 @@ func ff_put_vp8_bilin\len\()_\type\()_rvv, zve32x
 endfunc
 .endm
 
+.macro put_vp8_bilin_hv len
+func ff_put_vp8_bilin\len\()_hv_rvv, zve32x
+        vsetvlstatic8   \len
+        li              t3, 8
+        sub             t1, t3, a5
+        sub             t2, t3, a6
+        li              t4, 4
+        bilin_load      v4, \len, h, a5
+        add             a2, a2, a3
+1:
+        addi            a4, a4, -1
+        vwmulu.vx       v20, v4, t2
+        bilin_load      v4, \len, h, a5
+        vwmaccu.vx      v20, a6, v4
+        vwaddu.wx       v24, v20, t4
+        vnsra.wi        v0, v24, 3
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+.endm
+
 .irp len 16,8,4
 put_vp8_bilin_h_v \len h a5
 put_vp8_bilin_h_v \len v a6
+put_vp8_bilin_hv \len
 .endr
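
Reading aid, not part of the patch: the loop in put_vp8_bilin_hv can be pictured with a scalar model. The following is a minimal C sketch under stated assumptions, not the FFmpeg C reference. The names bilin_h_row, bilin_hv_ref and MAX_W are hypothetical, and the sketch assumes that `bilin_load ..., h, a5` applies the 2-tap horizontal filter with weights 8 - mx and mx, adds 4 and shifts right by 3; the definition of bilin_load lies outside this diff.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_W 16 /* hypothetical bound: the patch instantiates len = 16, 8, 4 */

/* Horizontal 2-tap filter for one row.
 * ASSUMPTION: this mirrors what `bilin_load v4, \len, h, a5` produces. */
static void bilin_h_row(uint8_t *out, const uint8_t *src, int w, int mx)
{
    for (int x = 0; x < w; x++)
        out[x] = (uint8_t)(((8 - mx) * src[x] + mx * src[x + 1] + 4) >> 3);
}

/* Scalar model of the hv loop: each output row blends the previous
 * horizontally filtered row with the next one, using weights 8 - my and my. */
static void bilin_hv_ref(uint8_t *dst, ptrdiff_t dstride,
                         const uint8_t *src, ptrdiff_t sstride,
                         int w, int h, int mx, int my)
{
    uint8_t prev[MAX_W], next[MAX_W];

    bilin_h_row(prev, src, w, mx);         /* bilin_load before label 1 */
    for (int y = 0; y < h; y++) {
        src += sstride;
        bilin_h_row(next, src, w, mx);     /* bilin_load inside the loop */
        for (int x = 0; x < w; x++) {
            /* vwmulu.vx + vwmaccu.vx, then + 4 (vwaddu.wx) and >> 3 (vnsra.wi) */
            dst[x] = (uint8_t)(((8 - my) * prev[x] + my * next[x] + 4) >> 3);
            prev[x] = next[x];             /* v4 carries into the next iteration */
        }
        dst += dstride;
    }
}
```

The model reads h + 1 source rows and w + 1 pixels per row, which matches the access pattern of the loop above. Carrying the filtered row from one iteration to the next means each source row is filtered horizontally only once.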