From patchwork Sun May 5 16:45:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48542
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 00:45:27 +0800
X-OQ-MSGID: <20240505164536.872683-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
Subject: [FFmpeg-devel] [PATCH 01/10] lavc/vp8dsp: R-V put_vp8_pixels
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_pixels4_c: 78.0
vp8_put_pixels4_rvi: 33.7
vp8_put_pixels8_c: 278.0
vp8_put_pixels8_rvi: 55.0
vp8_put_pixels16_c: 999.0
vp8_put_pixels16_rvi: 86.7
---
 libavcodec/riscv/Makefile      |  1 +
 libavcodec/riscv/vp8dsp.h      | 75 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp8dsp_init.c | 22 ++++++++++
 libavcodec/riscv/vp8dsp_rvi.S  | 61 +++++++++++++++++++++++++++
 libavcodec/vp8dsp.c            |  2 +
 libavcodec/vp8dsp.h            |  1 +
 6 files changed, 162 insertions(+)
 create mode 100644 libavcodec/riscv/vp8dsp.h
 create mode 100644 libavcodec/riscv/vp8dsp_rvi.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 050c08ee61..526cb5c97c 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -61,6 +61,7 @@ RVV-OBJS-$(CONFIG_UTVIDEO_DECODER) += riscv/utvideodsp_rvv.o
 OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_init.o
 RVV-OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_rvv.o
 OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_init.o
+RV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvi.o
 RVV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvv.o
 OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9dsp_init.o
 RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o
diff --git a/libavcodec/riscv/vp8dsp.h b/libavcodec/riscv/vp8dsp.h
new file mode 100644
index 0000000000..971c5c0a96
--- /dev/null
+++ b/libavcodec/riscv/vp8dsp.h
@@ -0,0 +1,75 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_RISCV_VP8DSP_H
+#define AVCODEC_RISCV_VP8DSP_H
+
+#include "libavcodec/vp8dsp.h"
+
+#define VP8_LF_Y(hv, inner, opt)                                             \
+    void ff_vp8_##hv##_loop_filter16##inner##_##opt(uint8_t *dst,            \
+                                                    ptrdiff_t stride,        \
+                                                    int flim_E, int flim_I,  \
+                                                    int hev_thresh)
+
+#define VP8_LF_UV(hv, inner, opt)                                            \
+    void ff_vp8_##hv##_loop_filter8uv##inner##_##opt(uint8_t *dstU,          \
+                                                     uint8_t *dstV,          \
+                                                     ptrdiff_t stride,       \
+                                                     int flim_E, int flim_I, \
+                                                     int hev_thresh)
+
+#define VP8_LF_SIMPLE(hv, opt)                                               \
+    void ff_vp8_##hv##_loop_filter16_simple_##opt(uint8_t *dst,              \
+                                                  ptrdiff_t stride,          \
+                                                  int flim)
+
+#define VP8_LF_HV(inner, opt)                                                \
+    VP8_LF_Y(h, inner, opt);                                                 \
+    VP8_LF_Y(v, inner, opt);                                                 \
+    VP8_LF_UV(h, inner, opt);                                                \
+    VP8_LF_UV(v, inner, opt)
+
+#define VP8_LF(opt)                                                          \
+    VP8_LF_HV(, opt);                                                        \
+    VP8_LF_HV(_inner, opt);                                                  \
+    VP8_LF_SIMPLE(h, opt);                                                   \
+    VP8_LF_SIMPLE(v, opt)
+
+#define VP8_MC(n, opt)                                                       \
+    void ff_put_vp8_##n##_##opt(uint8_t *dst, ptrdiff_t dststride,           \
+                                const uint8_t *src, ptrdiff_t srcstride,\
+                                int h, int x, int y)
+
+#define VP8_EPEL(w, opt)                                                     \
+    VP8_MC(pixels ## w, opt);                                                \
+    VP8_MC(epel ## w ## _h4, opt);                                           \
+    VP8_MC(epel ## w ## _h6, opt);                                           \
+    VP8_MC(epel ## w ## _v4, opt);                                           \
+    VP8_MC(epel ## w ## _h4v4, opt);                                         \
+    VP8_MC(epel ## w ## _h6v4, opt);                                         \
+    VP8_MC(epel ## w ## _v6, opt);                                           \
+    VP8_MC(epel ## w ## _h4v6, opt);                                         \
+    VP8_MC(epel ## w ## _h6v6, opt)
+
+#define VP8_BILIN(w, opt)                                                    \
+    VP8_MC(bilin ## w ## _h, opt);                                           \
+    VP8_MC(bilin ## w ## _v, opt);                                           \
+    VP8_MC(bilin ## w ## _hv, opt)
+
+#endif /* AVCODEC_RISCV_VP8DSP_H */
diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index af57aabb71..fa3feeacf7 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -24,11 +24,33 @@
 #include "libavutil/cpu.h"
 #include "libavutil/riscv/cpu.h"
 #include "libavcodec/vp8dsp.h"
+#include "vp8dsp.h"
 
 void ff_vp8_idct_dc_add_rvv(uint8_t *dst, int16_t block[16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add4y_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add4uv_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride);
 
+VP8_EPEL(16, rvi);
+VP8_EPEL(8, rvi);
+VP8_EPEL(4, rvi);
+
+av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
+{
+#if HAVE_RV
+    int flags = av_get_cpu_flags();
+    if (flags & AV_CPU_FLAG_RVI) {
+#if __riscv_xlen >= 64
+        c->put_vp8_epel_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
+        c->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
+        c->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
+        c->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
+#endif
+        c->put_vp8_epel_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
+        c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
+    }
+#endif
+}
+
 av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
 {
 #if HAVE_RVV
diff --git a/libavcodec/riscv/vp8dsp_rvi.S b/libavcodec/riscv/vp8dsp_rvi.S
new file mode 100644
index 0000000000..50ba4f293f
--- /dev/null
+++ b/libavcodec/riscv/vp8dsp_rvi.S
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2024 Institue of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+#if __riscv_xlen >= 64
+func ff_put_vp8_pixels16_rvi
+1:
+        addi            a4, a4, -1
+        ld              t0, (a2)
+        ld              t1, 8(a2)
+        sd              t0, (a0)
+        sd              t1, 8(a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+
+func ff_put_vp8_pixels8_rvi
+1:
+        addi            a4, a4, -1
+        ld              t0, (a2)
+        sd              t0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+#endif
+
+func ff_put_vp8_pixels4_rvi
+1:
+        addi            a4, a4, -1
+        lw              t0, (a2)
+        sw              t0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
diff --git a/libavcodec/vp8dsp.c b/libavcodec/vp8dsp.c
index df7bd12424..f7c9c9899c 100644
--- a/libavcodec/vp8dsp.c
+++ b/libavcodec/vp8dsp.c
@@ -1402,6 +1402,8 @@ dsp->put_vp8_epel_pixels_tab[2][2][2] = put_vp8_epel4_h6v6_c;
     ff_vp78dsp_init_arm(dsp);
 #elif ARCH_PPC
     ff_vp78dsp_init_ppc(dsp);
+#elif ARCH_RISCV
+    ff_vp78dsp_init_riscv(dsp);
 #elif ARCH_X86
     ff_vp78dsp_init_x86(dsp);
 #endif
diff --git a/libavcodec/vp8dsp.h b/libavcodec/vp8dsp.h
index 30dc2c6cc1..3bf12b6b45 100644
--- a/libavcodec/vp8dsp.h
+++ b/libavcodec/vp8dsp.h
@@ -87,6 +87,7 @@ void ff_vp78dsp_init(VP8DSPContext *c);
 void ff_vp78dsp_init_aarch64(VP8DSPContext *c);
 void ff_vp78dsp_init_arm(VP8DSPContext *c);
 void ff_vp78dsp_init_ppc(VP8DSPContext *c);
+void ff_vp78dsp_init_riscv(VP8DSPContext *c);
 void ff_vp78dsp_init_x86(VP8DSPContext *c);
 
 void ff_vp8dsp_init(VP8DSPContext *c);

From patchwork Sun May 5 16:45:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48545
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 00:45:28 +0800
X-OQ-MSGID: <20240505164536.872683-2-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com>
References: <20240505164536.872683-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 02/10] lavc/vp8dsp: R-V V put_bilin_h
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_bilin4_h_c: 373.5
vp8_put_bilin4_h_rvv_i32: 158.7
vp8_put_bilin8_h_c: 1437.7
vp8_put_bilin8_h_rvv_i32: 318.7
vp8_put_bilin16_h_c: 2845.7
vp8_put_bilin16_h_rvv_i32: 374.7
---
 libavcodec/riscv/vp8dsp_init.c | 14 +++++++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 45 ++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index fa3feeacf7..778d5ceb29 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -34,6 +34,10 @@ VP8_EPEL(16, rvi);
 VP8_EPEL(8, rvi);
 VP8_EPEL(4, rvi);
 
+VP8_BILIN(16, rvv);
+VP8_BILIN(8, rvv);
+VP8_BILIN(4, rvv);
+
 av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
 {
 #if HAVE_RV
@@ -48,6 +52,16 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_epel_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
         c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
     }
+#if HAVE_RVV
+    if (flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
+        c->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilin16_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilin16_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilin8_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_rvv;
+    }
+#endif
 #endif
 }
 
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 8a0773f964..760d9d3871 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -20,6 +20,18 @@
 
 #include "libavutil/riscv/asm.S"
 
+.macro vsetvlstatic8 len
+.if \len <= 4
+        vsetivli        zero, \len, e8, mf4, ta, ma
+.elseif \len <= 8
+        vsetivli        zero, \len, e8, mf2, ta, ma
+.elseif \len <= 16
+        vsetivli        zero, \len, e8, m1, ta, ma
+.elseif \len <= 31
+        vsetivli        zero, \len, e8, m2, ta, ma
+.endif
+.endm
+
 .macro vp8_idct_dc_add
         vlse32.v        v0, (a0), a2
         lh              a5, 0(a1)
@@ -71,3 +83,36 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x
 
         ret
 endfunc
+
+.macro bilin_h_load dst len
+        vsetvlstatic8   \len + 1
+        vle8.v          \dst, (a2)
+        vslide1down.vx  v2, \dst, t5
+        vsetvlstatic8   \len
+        vwmulu.vx       v28, \dst, t1
+        vwmaccu.vx      v28, a5, v2
+        vwaddu.wx       v24, v28, t4
+        vnsra.wi        \dst, v24, 3
+.endm
+
+.macro put_vp8_bilin_h len
+func ff_put_vp8_bilin\len\()_h_rvv, zve32x
+        li              t1, 8
+        li              t4, 4
+        li              t5, 1
+        sub             t1, t1, a5
+1:
+        addi            a4, a4, -1
+        bilin_h_load    v0, \len
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+.endm
+
+.irp len 16,8,4
+put_vp8_bilin_h \len
+.endr

From patchwork Sun May 5 16:45:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48547
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 00:45:29 +0800
X-OQ-MSGID: <20240505164536.872683-3-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com>
References: <20240505164536.872683-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 03/10] lavc/vp8dsp: R-V V put_bilin_v
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_bilin4_v_c: 383.5
vp8_put_bilin4_v_rvv_i32: 139.7
vp8_put_bilin8_v_c: 1455.7
vp8_put_bilin8_v_rvv_i32: 299.7
vp8_put_bilin16_v_c: 2863.7
vp8_put_bilin16_v_rvv_i32: 347.7
---
 libavcodec/riscv/vp8dsp_init.c |  7 +++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 25 +++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 778d5ceb29..afffa6de2f 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -60,6 +60,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_rvv;
         c->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_rvv;
         c->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_rvv;
+
+        c->put_vp8_bilinear_pixels_tab[0][1][0] = ff_put_vp8_bilin16_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][0] = ff_put_vp8_bilin16_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][0] = ff_put_vp8_bilin8_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 760d9d3871..2a2d40d77d 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -113,6 +113,31 @@ func ff_put_vp8_bilin\len\()_h_rvv, zve32x
 endfunc
 .endm
 
+.macro put_vp8_bilin_v len
+func ff_put_vp8_bilin\len\()_v_rvv, zve32x
+        vsetvlstatic8   \len
+        li              t1, 8
+        li              t4, 4
+        sub             t1, t1, a6
+1:
+        add             t2, a2, a3
+        addi            a4, a4, -1
+        vle8.v          v0, (a2)
+        vle8.v          v2, (t2)
+        vwmulu.vx       v28, v0, t1
+        vwmaccu.vx      v28, a6, v2
+        vwaddu.wx       v24, v28, t4
+        vnsra.wi        v0, v24, 3
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+.endm
+
 .irp len 16,8,4
 put_vp8_bilin_h \len
+put_vp8_bilin_v \len
 .endr

From patchwork Sun May 5 16:45:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48543
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 00:45:30 +0800
X-OQ-MSGID: <20240505164536.872683-4-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com>
References: <20240505164536.872683-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 04/10] lavc/vp8dsp: R-V V put_bilin_hv
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_bilin4_hv_c: 567.7
vp8_put_bilin4_hv_rvv_i32: 255.7
vp8_put_bilin8_hv_c: 2169.5
vp8_put_bilin8_hv_rvv_i32: 528.7
vp8_put_bilin16_hv_c: 4777.5
vp8_put_bilin16_hv_rvv_i32: 587.7
---
 libavcodec/riscv/vp8dsp_init.c | 13 +++++++++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 26 ++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index afffa6de2f..9627105fc8 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -67,6 +67,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv;
+
+        c->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 2a2d40d77d..f8105010c9 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -137,7 +137,33 @@ func ff_put_vp8_bilin\len\()_v_rvv,
zve32x endfunc .endm +.macro put_vp8_bilin_hv len +func ff_put_vp8_bilin\len\()_hv_rvv, zve32x + li t3, 8 + sub t1, t3, a5 + sub t2, t3, a6 + li t4, 4 + li t5, 1 + bilin_h_load v4, \len + add a2, a2, a3 +1: + addi a4, a4, -1 + vwmulu.vx v20, v4, t2 + bilin_h_load v4, \len + vwmaccu.vx v20, a6, v4 + vwaddu.wx v24, v20, t4 + vnsra.wi v0, v24, 3 + vse8.v v0, (a0) + add a2, a2, a3 + add a0, a0, a1 + bnez a4, 1b + + ret +endfunc +.endm + .irp len 16,8,4 put_vp8_bilin_h \len put_vp8_bilin_v \len +put_vp8_bilin_hv \len .endr From patchwork Sun May 5 16:45:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48546 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:e68f:b0:1af:836d:81b3 with SMTP id mz15csp962821pzb; Sun, 5 May 2024 09:46:38 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCV3M/bBw+0A96YoeTIwpI5Q/OK1AHXgZP2kGhX0gHLtEltIlU/mJQR0EC+5rnOmpABOqKGeAu61JzqnA27kYWFAWgTIc2KKHFx1og== X-Google-Smtp-Source: AGHT+IGZhIJB7wQuKF6R2HJzFFxHB3p6LP+yHr1lNi2wAu5WMvZHdSi+VH5pt+BSsWJL8vFNB4oT X-Received: by 2002:a19:e009:0:b0:51f:4f9c:8591 with SMTP id x9-20020a19e009000000b0051f4f9c8591mr4408535lfg.6.1714927598591; Sun, 05 May 2024 09:46:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1714927598; cv=none; d=google.com; s=arc-20160816; b=TLo+7nN2GH8MgMgj/FydwS77f2HFB0n++nc7pSUcusCSvjiyHPhAtuDbS3OdZAkZ18 NY797IzYI6hJRbw7ubSnom2OgDlAReKq1VPmbn76M9NjVj3fNtS45HqOKTN0+rGCOGQR IGyBf+EqP5VrYqBGRTKmg6DKGlh0W/hXGmzs4f8OgjRurpF1UnqoSv5oJwJP9+BSWUcb JjQjZYGGRa7f3kGR3HcZ8kP0bLACfwirK0uy/2eaxJ4E9/kiNfqyAZB+l3wQcfifOQxo g4aH35nVPcjZ3JcBEbYEee+cnvW3WMIk+k9qU1n3t2fqm5PBQ7EUsRYn2UisC/UF1baB UOCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date 
:to:from:message-id:dkim-signature:delivered-to; bh=6Uxqf2QoCAE9/VLb7ParmREnrzvqTL3vGofhDjsbsqU=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=EcFlgVO2OKMTvfU2KHhMksgcooxouzNpxBXGf/Wqt98b2AwHMakqmBS4pHYUm7v7kw RjEv2xwmt2cnBAarMSOPXAKoNrL3VmqZ4nyAzjQvMta5/O3wb9Cjgi0/sbzGvE1ElYFe rLeYziTAI6u4md8sWZWF81BbNuOzeJXFhfVqkzgrU5xE9voMgXY6J4l82tE/3W36THrM yoB3vvCL3NiqYNYkNKQE3ab47zT/yMQjLhEVyFqyUom0UzCIrWUkW6bSzj3oXDiqlbcw fPFOPApmrjU9/XbuPLm11ftbILrDul+ernUZnSZ0LnPTpUhLzRIKPxgHPnVTEUcEYHLt sbsg==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b="KJSH/Etp"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id f27-20020a19381b000000b0051d31f7d44esi2364681lfa.314.2024.05.05.09.46.38; Sun, 05 May 2024 09:46:38 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b="KJSH/Etp"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3A52168D5F2; Sun, 5 May 2024 19:46:01 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out203-205-251-84.mail.qq.com (unknown [203.205.251.84]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 257DE68D59D for ; Sun, 5 May 
2024 19:45:50 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1714927541; bh=iZOfBIg6gtmo1mk1kr3B1/fY8HMJHY3+API5G2NAfog=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=KJSH/EtpPdJX8NOcjfYaYgvCw+N75xNZlbJe6D7XMiYE3amRkgLxOAeBRThWszfb3 /fDGOpag180MD+ZcN0E91AesqJBg6Iu9htEOIqbIXxf34OE+WtVK5+/GowVhbLXKHK 5N6P+oQ6NfB7zHxXx/ahurSVS/LuvD3IIkbtcgMU= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrsza10-0.qq.com (NewEsmtp) with SMTP id B65A42EA; Mon, 06 May 2024 00:45:37 +0800 X-QQ-mid: xmsmtpt1714927541tj16pv6ik Message-ID: X-QQ-XMAILINFO: NyTsQ4JOu2J24jr+YPFhK0JP6nvQIy41aUw1Ajh/zrchsRmQG8InqWabcgH4/E cgZGEAeHpJSrNZnT8RrpPZqrm9PQEGiabSVqmXR6xzuzWv9I7Rl+3KGrr6Hd+Ht9hmBj5vtMK+7/ Vf4DAKWMoYkyhyBije5DoTswmpmttlr2OvvzF5sdmI0DX92eDBN7CKFSRd59UcJXmRzdvyJiaLCP MxXkGHtcSNqnrSnGcUNyACIir9IVdH4cOnOImz5T1urrns62ardC/Q5yainQnt7RybvxDXeGyYVT mxJfR2GxwlPYBk11u94iBK0Kih37LpdjOUNY6oSxOFdC8Qtcmha8c+2xzxNk6QqGaF1Ti69Oeti4 /uUu6lS/OtGUvXruzEcAMw2+4DeY5Hj8fZkOnDZimV0TCJdH4lFuGCK7Tx6unXi9KvTCpJGgnUGm zHFEFIHPrql7RrgIP9tk3Q+fwVx5f5Cks4BNFH4RIb5n7WYwDsOTTYBmf/4LRClJmxX1lyyDF9kq KYH6nnwDiGM9Fcpggubx3PI0RWeMByVgv9JNsbJ3YXptJUJNPkmko8wGZNDR6YTA8dGiEI3XLXoB 0/+qxHB0Ynr1OJwe9xI1tN99GVDuDtSu4NdjQrq1c/cdMy+Q0HoPIw/UTGCs7oAwaF1lZII7IOEg 2ul1O8vztyi2BaajswvHvNmvD3s5zFoeMrNeQRD2uz9LqGzmuqdjDWkkAt4sAqS4QTquTOLo50y0 smgz9AMqFMwbVA6vVOloWKLDjxaF9fCT7+adyXOk+uWOk57e4Lp10NBTGvBgmTbCcXS3pbytzvQs MB4eP2c4hL++smNi0dbVLjTIWRBSJ5IPtnA2DgHTxx7aBgleng6IJMic5lrt5SvI746M4m5EDfvh 5ZqQF8kAKl/hiZ42Q4kAe+rVsOpw59Lw== X-QQ-XMRINFO: M/715EihBoGSf6IYSX1iLFg= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Mon, 6 May 2024 00:45:31 +0800 X-OQ-MSGID: <20240505164536.872683-5-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com> References: <20240505164536.872683-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 05/10] lavc/vp8dsp: R-V V put_epel h X-BeenThere: 
ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: IiH1Mhkl4Vpd From: sunyuechi C908: vp8_put_epel4_h4_c: 10.7 vp8_put_epel4_h4_rvv_i32: 5.0 vp8_put_epel4_h6_c: 15.0 vp8_put_epel4_h6_rvv_i32: 6.2 vp8_put_epel8_h4_c: 43.2 vp8_put_epel8_h4_rvv_i32: 11.2 vp8_put_epel8_h6_c: 57.5 vp8_put_epel8_h6_rvv_i32: 13.5 vp8_put_epel16_h4_c: 92.5 vp8_put_epel16_h4_rvv_i32: 13.7 vp8_put_epel16_h6_c: 139.0 vp8_put_epel16_h6_rvv_i32: 16.5 --- libavcodec/riscv/vp8dsp_init.c | 10 ++++ libavcodec/riscv/vp8dsp_rvv.S | 87 ++++++++++++++++++++++++++++++++++ 2 files changed, 97 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 9627105fc8..a4b7d49932 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -33,6 +33,9 @@ void ff_vp8_idct_dc_add4uv_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t str VP8_EPEL(16, rvi); VP8_EPEL(8, rvi); VP8_EPEL(4, rvi); +VP8_EPEL(16, rvv); +VP8_EPEL(8, rvv); +VP8_EPEL(4, rvv); VP8_BILIN(16, rvv); VP8_BILIN(8, rvv); @@ -80,6 +83,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv; c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv; c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv; + + c->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_rvv; + c->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_rvv; + c->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_rvv; + c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv; + c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv; + c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv; } #endif #endif diff --git 
a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index f8105010c9..f5c4c1d85d 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -32,6 +32,16 @@ .endif .endm +.macro vsetvlstatic16 len +.if \len <= 4 + vsetivli zero, \len, e16, mf2, ta, ma +.elseif \len <= 8 + vsetivli zero, \len, e16, m1, ta, ma +.elseif \len <= 16 + vsetivli zero, \len, e16, m2, ta, ma +.endif +.endm + .macro vp8_idct_dc_add vlse32.v v0, (a0), a2 lh a5, 0(a1) @@ -162,8 +172,85 @@ func ff_put_vp8_bilin\len\()_hv_rvv, zve32x endfunc .endm +const subpel_filters + .byte 0, -6, 123, 12, -1, 0 + .byte 2, -11, 108, 36, -8, 1 + .byte 0, -9, 93, 50, -6, 0 + .byte 3, -16, 77, 77, -16, 3 + .byte 0, -6, 50, 93, -9, 0 + .byte 1, -8, 36, 108, -11, 2 + .byte 0, -1, 12, 123, -6, 0 +endconst + +.macro epel_filter size + lla t2, subpel_filters + addi t0, a5, -1 + li t1, 6 + mul t0, t0, t1 + add t0, t0, t2 + .irp n 1,2,3,4 + lb t\n, \n(t0) + .endr +.ifc \size,6 + lb t5, 5(t0) + lb t0, (t0) +.endif +.endm + +.macro epel_load dst len size + addi t6, a2, -1 + addi a7, a2, 1 + vle8.v v24, (a2) + vle8.v v22, (t6) + vle8.v v26, (a7) + addi a7, a7, 1 + vle8.v v28, (a7) + vwmulu.vx v16, v24, t2 + vwmulu.vx v20, v26, t3 +.ifc \size,6 + addi t6, t6, -1 + addi a7, a7, 1 + vle8.v v24, (t6) + vle8.v v26, (a7) + vwmaccu.vx v16, t0, v24 + vwmaccu.vx v16, t5, v26 +.endif + li t6, 64 + vwmaccsu.vx v16, t1, v22 + vwmaccsu.vx v16, t4, v28 + vwadd.wx v16, v16, t6 + vsetvlstatic16 \len + vwadd.vv v24, v16, v20 + vnsra.wi v24, v24, 7 + vmax.vx v24, v24, zero + vsetvlstatic8 \len + vnclipu.wi \dst, v24, 0 +.endm + +.macro epel_load_inc dst len size + epel_load \dst \len \size + add a2, a2, a3 +.endm + +.macro epel len size type +func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x + epel_filter \size + vsetvlstatic8 \len +1: + addi a4, a4, -1 + epel_load_inc v30 \len \size + vse8.v v30, (a0) + add a0, a0, a1 + bnez a4, 1b + + ret +endfunc +.endm + .irp len 16,8,4 put_vp8_bilin_h \len 
 put_vp8_bilin_v \len
 put_vp8_bilin_hv \len
+epel \len 6 h
+epel \len 4 h
 .endr

From patchwork Sun May 5 16:45:32 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48544
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 00:45:32 +0800
X-OQ-MSGID: <20240505164536.872683-6-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com>
References: <20240505164536.872683-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 06/10] lavc/vp8dsp: R-V V put_epel v
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_epel4_v4_c: 11.0
vp8_put_epel4_v4_rvv_i32: 5.0
vp8_put_epel4_v6_c: 16.5
vp8_put_epel4_v6_rvv_i32: 6.2
vp8_put_epel8_v4_c: 43.7
vp8_put_epel8_v4_rvv_i32: 11.2
vp8_put_epel8_v6_c: 68.7
vp8_put_epel8_v6_rvv_i32: 13.2
vp8_put_epel16_v4_c: 92.5
vp8_put_epel16_v4_rvv_i32: 13.7
vp8_put_epel16_v6_c: 135.7
vp8_put_epel16_v6_rvv_i32: 16.5
---
 libavcodec/riscv/vp8dsp_init.c |  7 +++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 34 +++++++++++++++++++++++-----------
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index a4b7d49932..dc3e087f01 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -90,6 +90,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
         c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
         c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
+
+        c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv;
+        c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv;
+        c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv;
+        c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
+        c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
+        c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index f5c4c1d85d..ca5581f845 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -182,9 +182,13 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst

-.macro epel_filter size
+.macro epel_filter size type
         lla             t2, subpel_filters
+.ifc \type,v
+        addi            t0, a6, -1
+.elseif \type == h
         addi            t0, a5, -1
+.endif
         li              t1, 6
         mul             t0, t0, t1
         add             t0, t0, t2
@@ -197,19 +201,25 @@ endconst
 .endif
 .endm

-.macro epel_load dst len size
-        addi            t6, a2, -1
-        addi            a7, a2, 1
+.macro epel_load dst len size type
+.ifc \type,v
+        mv              a5, a3
+.else
+        li              a5, 1
+.endif
+        sub             t6, a2, a5
+        add             a7, a2, a5
+
         vle8.v          v24, (a2)
         vle8.v          v22, (t6)
         vle8.v          v26, (a7)
-        addi            a7, a7, 1
+        add             a7, a7, a5
         vle8.v          v28, (a7)
         vwmulu.vx       v16, v24, t2
         vwmulu.vx       v20, v26, t3
 .ifc \size,6
-        addi            t6, t6, -1
-        addi            a7, a7, 1
+        sub             t6, t6, a5
+        add             a7, a7, a5
         vle8.v          v24, (t6)
         vle8.v          v26, (a7)
         vwmaccu.vx      v16, t0, v24
@@ -227,18 +237,18 @@ endconst
         vnclipu.wi      \dst, v24, 0
 .endm

-.macro epel_load_inc dst len size
-        epel_load       \dst \len \size
+.macro epel_load_inc dst len size type
+        epel_load       \dst \len \size \type
         add             a2, a2, a3
 .endm

 .macro epel len size type
 func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-        epel_filter     \size
+        epel_filter     \size \type
         vsetvlstatic8   \len
 1:
         addi            a4, a4, -1
-        epel_load_inc   v30 \len \size
+        epel_load_inc   v30 \len \size \type
         vse8.v          v30, (a0)
         add             a0, a0, a1
         bnez            a4, 1b
@@ -253,4 +263,6 @@ put_vp8_bilin_v \len
 put_vp8_bilin_hv \len
 epel \len 6 h
 epel \len 4 h
+epel \len 6 v
+epel \len 4 v
 .endr

From patchwork Sun May 5 16:45:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48548
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 00:45:33 +0800
X-OQ-MSGID: <20240505164536.872683-7-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com>
References: <20240505164536.872683-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH 07/10] lavc/vp8dsp: R-V V put_epel hv
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_epel4_h4v4_c: 20.0
vp8_put_epel4_h4v4_rvv_i32: 11.0
vp8_put_epel4_h4v6_c: 25.2
vp8_put_epel4_h4v6_rvv_i32: 13.5
vp8_put_epel4_h6v4_c: 22.2
vp8_put_epel4_h6v4_rvv_i32: 14.5
vp8_put_epel4_h6v6_c: 29.0
vp8_put_epel4_h6v6_rvv_i32: 15.7
vp8_put_epel8_h4v4_c: 73.0
vp8_put_epel8_h4v4_rvv_i32: 22.2
vp8_put_epel8_h4v6_c: 90.5
vp8_put_epel8_h4v6_rvv_i32: 26.7
vp8_put_epel8_h6v4_c: 85.0
vp8_put_epel8_h6v4_rvv_i32: 27.2
vp8_put_epel8_h6v6_c: 104.7
vp8_put_epel8_h6v6_rvv_i32: 29.5
vp8_put_epel16_h4v4_c: 145.5
vp8_put_epel16_h4v4_rvv_i32: 26.5
vp8_put_epel16_h4v6_c: 190.7
vp8_put_epel16_h4v6_rvv_i32: 47.5
vp8_put_epel16_h6v4_c: 173.7
vp8_put_epel16_h6v4_rvv_i32: 33.2
vp8_put_epel16_h6v6_c: 222.2
vp8_put_epel16_h6v6_rvv_i32: 35.5
---
 libavcodec/riscv/vp8dsp_init.c |  14 ++++
 libavcodec/riscv/vp8dsp_rvv.S  | 118 +++++++++++++++++++++++++++------
 2 files changed, 111 insertions(+), 21 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index dc3e087f01..6ebb2e11e0 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -1,3 +1,4 @@
+
 /*
  * Copyright (c) 2024 Institue of Software Chinese Academy of Sciences (ISCAS).
  *
@@ -97,6 +98,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
         c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
         c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
+
+        c->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_rvv;
+        c->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_rvv;
+        c->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_rvv;
+        c->put_vp8_epel_pixels_tab[0][2][1] = ff_put_vp8_epel16_h4v6_rvv;
+        c->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_rvv;
+        c->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_rvv;
+        c->put_vp8_epel_pixels_tab[0][1][1] = ff_put_vp8_epel16_h4v4_rvv;
+        c->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_rvv;
+        c->put_vp8_epel_pixels_tab[2][1][1] = ff_put_vp8_epel4_h4v4_rvv;
+        c->put_vp8_epel_pixels_tab[0][1][2] = ff_put_vp8_epel16_h6v4_rvv;
+        c->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_rvv;
+        c->put_vp8_epel_pixels_tab[2][1][2] = ff_put_vp8_epel4_h6v4_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index ca5581f845..2d5e2260b7 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -1,3 +1,4 @@
+
 /*
  * Copyright (c) 2024 Institue of Software Chinese Academy of Sciences (ISCAS).
  *
@@ -182,26 +183,26 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst

-.macro epel_filter size type
-        lla             t2, subpel_filters
+.macro epel_filter size type regtype
+        lla             \regtype\()2, subpel_filters
 .ifc \type,v
-        addi            t0, a6, -1
+        addi            \regtype\()0, a6, -1
 .elseif \type == h
-        addi            t0, a5, -1
+        addi            \regtype\()0, a5, -1
 .endif
-        li              t1, 6
-        mul             t0, t0, t1
-        add             t0, t0, t2
+        li              \regtype\()1, 6
+        mul             \regtype\()0, \regtype\()0, \regtype\()1
+        add             \regtype\()0, \regtype\()0, \regtype\()2
         .irp n 1,2,3,4
-        lb              t\n, \n(t0)
+        lb              \regtype\n, \n(\regtype\()0)
         .endr
 .ifc \size,6
-        lb              t5, 5(t0)
-        lb              t0, (t0)
+        lb              \regtype\()5, 5(\regtype\()0)
+        lb              \regtype\()0, (\regtype\()0)
 .endif
 .endm

-.macro epel_load dst len size type
+.macro epel_load dst len size type from_mem regtype
 .ifc \type,v
         mv              a5, a3
 .else
@@ -210,24 +211,35 @@ endconst
         sub             t6, a2, a5
         add             a7, a2, a5

+.if \from_mem
         vle8.v          v24, (a2)
         vle8.v          v22, (t6)
         vle8.v          v26, (a7)
         add             a7, a7, a5
         vle8.v          v28, (a7)
-        vwmulu.vx       v16, v24, t2
-        vwmulu.vx       v20, v26, t3
+        vwmulu.vx       v16, v24, \regtype\()2
+        vwmulu.vx       v20, v26, \regtype\()3
 .ifc \size,6
         sub             t6, t6, a5
         add             a7, a7, a5
         vle8.v          v24, (t6)
         vle8.v          v26, (a7)
-        vwmaccu.vx      v16, t0, v24
-        vwmaccu.vx      v16, t5, v26
+        vwmaccu.vx      v16, \regtype\()0, v24
+        vwmaccu.vx      v16, \regtype\()5, v26
+.endif
+        vwmaccsu.vx     v16, \regtype\()1, v22
+        vwmaccsu.vx     v16, \regtype\()4, v28
+.else
+        vwmulu.vx       v16, v4, \regtype\()2
+        vwmulu.vx       v20, v6, \regtype\()3
+        .ifc \size,6
+        vwmaccu.vx      v16, \regtype\()0, v0
+        vwmaccu.vx      v16, \regtype\()5, v10
+        .endif
+        vwmaccsu.vx     v16, \regtype\()1, v2
+        vwmaccsu.vx     v16, \regtype\()4, v8
 .endif
         li              t6, 64
-        vwmaccsu.vx     v16, t1, v22
-        vwmaccsu.vx     v16, t4, v28
         vwadd.wx        v16, v16, t6
         vsetvlstatic16  \len
         vwadd.vv        v24, v16, v20
@@ -237,21 +249,81 @@ endconst
         vnclipu.wi      \dst, v24, 0
 .endm

-.macro epel_load_inc dst len size type
-        epel_load       \dst \len \size \type
+.macro epel_load_inc dst len size type from_mem regtype
+        epel_load       \dst \len \size \type \from_mem \regtype
         add             a2, a2, a3
 .endm

 .macro epel len size type
 func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-        epel_filter     \size \type
+        epel_filter     \size \type t
+        vsetvlstatic8   \len
+1:
+        addi            a4, a4, -1
+        epel_load_inc   v30 \len \size \type 1 t
+        vse8.v          v30, (a0)
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+.endm
+
+.macro epel_hv len hsize vsize
+func ff_put_vp8_epel\len\()_h\hsize\()v\vsize\()_rvv, zve32x
+        addi            sp, sp, -48
+        .irp n 0,1,2,3,4,5
+#if __riscv_xlen >= 64
+        sd              s\n, \n\()<<3(sp)
+#else
+        sw              s\n, \n\()<<3(sp)
+#endif
+        .endr
+        sub             a2, a2, a3
+        epel_filter     \hsize h t
+        epel_filter     \vsize v s
         vsetvlstatic8   \len
+.if \hsize == 6 || \vsize == 6
+        sub             a2, a2, a3
+        epel_load_inc   v0 \len \hsize h 1 t
+.endif
+        epel_load_inc   v2 \len \hsize h 1 t
+        epel_load_inc   v4 \len \hsize h 1 t
+        epel_load_inc   v6 \len \hsize h 1 t
+        epel_load_inc   v8 \len \hsize h 1 t
+.if \hsize == 6 || \vsize == 6
+        epel_load_inc   v10 \len \hsize h 1 t
+.endif
+        addi            a4, a4, -1
 1:
         addi            a4, a4, -1
-        epel_load_inc   v30 \len \size \type
+        epel_load       v30 \len \vsize v 0 s
         vse8.v          v30, (a0)
+.if \hsize == 6 || \vsize == 6
+        vmv.v.v         v0, v2
+.endif
+        vmv.v.v         v2, v4
+        vmv.v.v         v4, v6
+        vmv.v.v         v6, v8
+.if \hsize == 6 || \vsize == 6
+        vmv.v.v         v8, v10
+        epel_load_inc   v10 \len \hsize h 1 t
+.else
+        epel_load_inc   v8 \len 4 h 1 t
+.endif
         add             a0, a0, a1
         bnez            a4, 1b
+        epel_load       v30 \len \vsize v 0 s
+        vse8.v          v30, (a0)
+
+        .irp n 0,1,2,3,4,5
+#if __riscv_xlen >= 64
+        ld              s\n, \n\()<<3(sp)
+#else
+        lw              s\n, \n\()<<3(sp)
+#endif
+        .endr
+        addi            sp, sp, 48

         ret
 endfunc
@@ -265,4 +337,8 @@ epel \len 6 h
 epel \len 4 h
 epel \len 6 v
 epel \len 4 v
+epel_hv \len 6 6
+epel_hv \len 4 4
+epel_hv \len 6 4
+epel_hv \len 4 6
 .endr

From patchwork Sun May 5 16:45:34 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48550
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a05:6a20:e68f:b0:1af:836d:81b3 with SMTP id mz15csp963026pzb; Sun, 5 May 2024 09:47:13 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCUYVX/XmN+7+/lxjXjwer4/jQXvX2SFF/tUyYBkjX7ZL5QFD5gzdpbjLbpqOL0EXwbXrbIgw3PI+gIu7rbSnAnFzTuM3cd9xcgqOQ== X-Google-Smtp-Source: AGHT+IG/163CS6tebz2gdIKRKfnUjnqWzh/qKuB53197RBN29HzDleJrU1qyAg7pZddPQ4bo4pTf X-Received: by 2002:a17:907:868e:b0:a59:bfab:b24f with SMTP id qa14-20020a170907868e00b00a59bfabb24fmr1525382ejc.3.1714927633617; Sun, 05 May 2024 09:47:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1714927633; cv=none; d=google.com; s=arc-20160816; b=nMi3RKeQXPOsBwJglQwFgZFVFoDSaO6Y7SjieivWjUHHy2vxuoykAhINy3NnotRCbb NnEVaWvlBLkKOwQdiT0KsnFEapo9h/wwZDl9zZFI7o1KNeCHRjXjaUc5MYphrzO86xFc shuO7WHw6+racDJDF/S4mEGoQIeQJWjutlVHDxDf3kSxB1GUX0r1Yu3HCuK8qPj0AvFZ SYncYnGol3M8oWAC/+T9xXcAd85sRyskMQ1QlZdQ89mJaTekJTg6ryWQEyPzkU6YkIwc ReFEfA2sSooMxiHJRAXIwIIXwLhWskJGiURHZx0FAiKqhmIwHunGE4aVIhPXg+46mLxT HanQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date :to:from:message-id:dkim-signature:delivered-to; bh=FlS8ZlHm+iEvM2V0/cYmeJRoMnzESlUbdCYPW979/AI=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=szx8Uj/G+4nLWUTe9E7Zvw/irXp4WUoy8IRdZU4JrKg1yd6qJb7h403CwHoW+zyRe0 E8Pqo0SgY50mT5fGOwkq3h5+p8ShCARpcq84sJIhJpEhWlRyme5PixbKqNH6KfdJeEVu sbikhVyFNiPSvL3bDNCnW9MZ1K0WyRoiyWRxshfnm/ceVAf9VVW8k4BlX5WBlWmmRGyq vbctV+bBYuZuNPuX+QJAp1Dyn7ot8DZGMc2dQ5GDM/LAOQTVhapRaMrdXoBDanBgHvvT 6vqlJxtYQ5iTHaxleL6P/zY5YNQjKa8mY26JZkjwaXjrD68DPkKqQ55zsgH99cTpFSyP tOiA==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b="xCWL/7D+"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as 
permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id x20-20020a170906711400b00a59a221dca9si2065037ejj.522.2024.05.05.09.47.13; Sun, 05 May 2024 09:47:13 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b="xCWL/7D+"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A82FE68D412; Sun, 5 May 2024 19:46:05 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out203-205-251-59.mail.qq.com (out203-205-251-59.mail.qq.com [203.205.251.59]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 16B7E68D5B6 for ; Sun, 5 May 2024 19:45:53 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1714927544; bh=F6jF1id8DDLcrsk11DC4Mn+xizI1R0fa1XAWtbSfGrU=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=xCWL/7D+viWFFJjZTJKx8daPkkE0t2PyOFLv1MnuSaETcTnBCkDoqnNjPM1uW4n7X pPHowPxBsWmNw3qCYvgxpo38eJMgQBHJBcTAddQei5mTDvbwFPLJ3RvpaWs1GWzLw9 DovSM3twyc9hfbG2jBGuBeLGs++yR1KDqx6mvz8o= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrsza10-0.qq.com (NewEsmtp) with SMTP id B65A42EA; Mon, 06 May 2024 00:45:37 +0800 X-QQ-mid: xmsmtpt1714927543tol549bmk Message-ID: X-QQ-XMAILINFO: NVviswOLIcto/wzg5HEBo/Iahpry03MZn7igUyoI8bEqzRUlCT6x8qRXRbtEHi 
Gg6I0bEYTTD7FojI4JYdtn+ybvWmSetxN9Ir8rlbaH9AfJ65CisJt2PclNoAEaOLVZCDFjIgYGFG dSJ7tqjGNnNs30AfOlBCXG5ZQx9JJP5dXGPhUwNbLrHv8uk7clbGx4RB4gIViT2LO43Hvq7zNJ8a IOSb3eSNPI8G0K2ydmCgroQOrIsveWHKjSEiGsaQPfUi4YmJhPbh6qNL35Od+EdsNLjC16oBClC4 TSZKA5X2mJTBmN+dd9TscVSApkPwvyDJYGgjefGVmvFcAMwuPkXN0LbqeodD0IFkV/7q3fTqxYkK rlv8BTI7bqZ73+Cczm+V22/We7of1k62aE+il+TcUfrs4kfiSAMtoe1W3iJx27Sc4z0OySOHDH5R lCYAwRhw09XGD7rC57TD6aWXxuZiVvWfaaoF3Iylwc6VklznQsHq/goLk30azCi8F6973iLJWPSh 9DrURBkGmxkmSJF12SHaVbM1Bh34eg/ws+UcUe0KF/sAJ1wwVP4RnOxKeH09IeWb6tmoDW+GnTKn iS46UO7oFwYN3j1eNp5c9yJz9+/oPXIduxENhL1H+Xx/ILLSzGeV02C8UumI52tNV5ws6qSkzU85 H8g46Kt7UfNgi4aHejB/mOVg8A4kyygPvW2UFzKCEtiyb8hJxWiGNjvAJWZNYeD4Mxo7I1F4bn6i 6/ObBkV6my8F3wKyaBMLMim0fXVKuKmZAS+vbjFp2y10r1YXl0QH7DTqnnmxiKFz4G4oIc2tk5aq HAkTf0A/QjKiSIUQW9AWVd89dq1ESGz2i0LsnzwRHpTEodJTHcLZEFAtFuWvmhygzaznSRalj3Km vzWlxHaFziXHbnY5dkE+YCvBeoalUC5DUV1L8dSUHLmGnmMYwvEVfas8UBIoyy9w== X-QQ-XMRINFO: Mp0Kj//9VHAxr69bL5MkOOs= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Mon, 6 May 2024 00:45:34 +0800 X-OQ-MSGID: <20240505164536.872683-8-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com> References: <20240505164536.872683-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 08/10] lavc/vp8dsp: R-V V loop_filter_simple X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: jzI4hbTrMTiz From: sunyuechi C908: vp8_loop_filter_simple_h_c: 416.0 vp8_loop_filter_simple_h_rvv_i32: 187.5 vp8_loop_filter_simple_v_c: 429.7 vp8_loop_filter_simple_v_rvv_i32: 104.0 --- libavcodec/riscv/vp8dsp_init.c | 5 ++ libavcodec/riscv/vp8dsp_rvv.S | 85 ++++++++++++++++++++++++++++++++++ 2 files changed, 90 
insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 6ebb2e11e0..6037c86e19 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -42,6 +42,8 @@ VP8_BILIN(16, rvv); VP8_BILIN(8, rvv); VP8_BILIN(4, rvv); +VP8_LF(rvv); + av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) { #if HAVE_RV @@ -127,6 +129,9 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c) if (flags & AV_CPU_FLAG_RVB_ADDR) { c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv; } + + c->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter16_simple_rvv; + c->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter16_simple_rvv; } #endif } diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 2d5e2260b7..bef5f0ebdc 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -95,6 +95,91 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x ret endfunc +.macro filter_fmin len a f1 p0f2 q0f1 + vsetvlstatic16 \len + vsext.vf2 \q0f1, \a + vmin.vx \p0f2, \q0f1, a7 + vmin.vx \q0f1, \q0f1, t3 + vadd.vi \p0f2, \p0f2, 3 + vadd.vi \q0f1, \q0f1, 4 + vsra.vi \p0f2, \p0f2, 3 + vsra.vi \f1, \q0f1, 3 + vadd.vv \p0f2, \p0f2, v8 + vsub.vv \q0f1, v16, \f1 + vmax.vx \p0f2, \p0f2, zero + vmax.vx \q0f1, \q0f1, zero +.endm + +.macro filter len type normal inner dst stride fE fI thresh +.ifc \type,v + slli a6, \stride, 1 + sub t2, \dst, a6 + add t4, \dst, \stride + sub t1, \dst, \stride + vle8.v v1, (t2) + vle8.v v11, (t4) + vle8.v v17, (t1) + vle8.v v22, (\dst) +.else + addi t1, \dst, -1 + addi a6, \dst, -2 + addi t4, \dst, 1 + vlse8.v v1, (a6), \stride + vlse8.v v11, (t4), \stride + vlse8.v v17, (t1), \stride + vlse8.v v22, (\dst), \stride +.endif + vwsubu.vv v12, v1, v11 // p1-q1 + vwsubu.vv v24, v22, v17 // q0-p0 + vnclip.wi v23, v12, 0 + vsetvlstatic16 \len + // vp8_simple_limit(dst + i, stride, flim) + li a7, 2 + vneg.v v18, v12 + vmax.vv v18, v18, v12 + vneg.v v8, v24 + vmax.vv v8, v8, v24 + vsrl.vi v18, v18, 1 + vmacc.vx 
v18, a7, v8 + vmsleu.vx v0, v18, \fE + + li t5, 3 + li a7, 124 + li t3, 123 + vsext.vf2 v4, v23 + vzext.vf2 v8, v17 // p0 + vzext.vf2 v16, v22 // q0 + vmul.vx v30, v24, t5 + vadd.vv v12, v30, v4 + vsetvlstatic8 \len + vnclip.wi v11, v12, 0 + filter_fmin \len v11 v24 v4 v6 + vsetvlstatic8 \len + vnclipu.wi v4, v4, 0 + vnclipu.wi v6, v6, 0 + +.ifc \type,v + vse8.v v4, (t1), v0.t + vse8.v v6, (\dst), v0.t +.else + vsse8.v v4, (t1), \stride, v0.t + vsse8.v v6, (\dst), \stride, v0.t +.endif + +.endm + +func ff_vp8_v_loop_filter16_simple_rvv, zve32x + vsetvlstatic8 16 + filter 16 v 0 0 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_h_loop_filter16_simple_rvv, zve32x + vsetvlstatic8 16 + filter 16 h 0 0 a0 a1 a2 a3 a4 + ret +endfunc + .macro bilin_h_load dst len vsetvlstatic8 \len + 1 vle8.v \dst, (a2) From patchwork Sun May 5 16:45:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48549 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:e68f:b0:1af:836d:81b3 with SMTP id mz15csp962967pzb; Sun, 5 May 2024 09:47:04 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCW5Wv1WMVBwVPk5XfE/Ow/mDnpusBtILnEFrwskPxLSbgws9qKAU8CqLLyvupUCc0Am+3r++4129zbppNa6RPipgjYSZ/JXgZlHfQ== X-Google-Smtp-Source: AGHT+IGu+s4E1zEMQKovD+Le0/EtHb7+ZTJzkQACrOiyy33emOyu17TOsg0eOwIezJyH4CpqOrEL X-Received: by 2002:a17:906:48c4:b0:a55:9dec:355f with SMTP id d4-20020a17090648c400b00a559dec355fmr4270819ejt.70.1714927623986; Sun, 05 May 2024 09:47:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1714927623; cv=none; d=google.com; s=arc-20160816; b=v3Cpr7YeuVvlxvv4CcWXXoThnbim8H285iL+LuT62VPt81lIU7oOG56Wb4I2CqVpYc Cbgj98JQ2RkHety14a54rMO05nDuoxIA69ReVeYKrflKX0nch3PGv1+V0H45owZYZ84S zS8lfuE6/H5YMc0NjR1B2HR5AKYEc8jQunlZvpREYxvGlobni/ScoQaSI6m/J3AcgRub cQXXOYAC1ZZTq/xafuVkw45S6Alz2M8/hLIbjEZOn+Xa5R11RDudVXtkEN23kYY0/XKF VTUbCvUDCbT9MD/OT3+OPMNRDgkJrfukFolUZrtYazRRmhXOui7mWDl+suNMcqrjHViz z/3g== 
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date :to:from:message-id:dkim-signature:delivered-to; bh=xehpv/iHxiy8pw2jma82f134zUDeye0czOwzyKy6BmQ=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=RDktZONo3BDW2F4dU+FuVuxDdSbszCo6etPxlSeEJqc8g+HZxd0hQEl5e13/gfGgfM 82Y6yrJNS/m+fKF/QlLMY2bWDhtfLf9sCQp6h+MDVIGKH4OpQg5dswLBVio4+VfmlJnE TNURY7g3z/MvKyR/dv/RhARKg0EZXeixM3P9LuAbwm3Sg13+ysZ+CF4AG15kHc5+H7UQ ovnxBXngCme7Ugifn+6a3td4SnVpICnwqSurSSrZCXZ3X5JScVyP8h9fX6sjZDO49S0g u05H7tumVDa8TwbmLNaiCICLKy0nDS3i6a1Ur6H1zUAb1krrpJj7z1vL1B7wQEud54M4 fhFw==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=wM3+sHSm; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id le12-20020a170907170c00b00a59bedc2204si1203680ejc.1022.2024.05.05.09.47.03; Sun, 05 May 2024 09:47:03 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=wM3+sHSm; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 9E3BC68D5D3; Sun, 5 May 2024 19:46:04 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out203-205-221-221.mail.qq.com (out203-205-221-221.mail.qq.com [203.205.221.221]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 0DED968D5D3 for ; Sun, 5 May 2024 19:45:53 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1714927545; bh=8qcvgCiZDfzxbIlm0YqRVNIfHA4CYMN8+cLPMSBHeA4=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=wM3+sHSmHpVYHp1CLV7LEEu4NAD7tcRTFBm/M09KYz7M01pAqf9ETCtl3wclk116f zwvR6Bu3zQ8mNTO6sr/ogLJ7kcsUmCYYwSmiZAmJX3ZoQtTaf2Y7anTLX7ZdtwtnB6 +N2NzIShOvJLYi3DUxZxQOkrWM6zvR9dMpQl3drw= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrsza10-0.qq.com (NewEsmtp) with SMTP id B65A42EA; Mon, 06 May 2024 00:45:37 +0800 X-QQ-mid: xmsmtpt1714927544t5faks814 Message-ID: X-QQ-XMAILINFO: MWxZUaxBV+u3L+OGm+gy9GZP6AkHRxFQnnVhOk7WIf65bWUtBaSzwCvYkpHM39 XLmmZqHybtzdbB3AkCoqctjGA0/ZqYP0omr819ppCKjuagBSdaA2NkvwImK8752sbAN7EXc7wfwC buKbpVfbLI0g6GWRVgbDLocqjvINdVu0fp/cMYldRIb3MMSCwU2BPAvfb9IudsaEnR/aS+ZoIKsY Q0i7lADNk1P/IpnfG+Xoe99bIRMetwKot6XthLnOaNc4pEXwe7U9w2MTLAF2RTq9btYyKbJPCcUu 
UoJK5fMMNeCE0Pr81G1V1J3qvtjfO2nDadaKoXG7N1ejlOn0HFM5Y/HkfEFNSSE5jdjEWWEL44iQ b/pmzwm7P5vr6Ot1nvlK3dqyAK+MKl483Pdm9wtgJFRlKI2MGbAbbGq539/lh4iS21xieYdjMy50 KIF6M6hEZxJlnaouBFPEy6+tLBICrJn53z3dZkReXBsbwkt6ucg7S/bnvBKFhWgDMWlWINl4OuFf brNDmG9b6uvE7HZqBRRCydFlm33A9SqtFsxDxPWxvwWu6ZYW7rgPdbjJ6K0iziP1aUeLVZdlUMiO nWmYHIjZasX4IviuvcI0oF4Pj3wPErsKTjRxSJ7DzEzEIlKhs8r/lEGp+1DWjEMIZpCjyeZtUDBm y8Rq8dDEvsYtB6nhEDpjcDeClx9xmvQPTl7BUJ5XdYJTUuttWU2KtWHeFib0lcbz8KuxP7DlQNJs UmQwXsPl2IT06hFWw11hykMeDIHY+tQQFP8KDik5V3ku64uFKBQj4CfS4EK1xIWlKe8jXdLK9hk4 0GDtYCm/YCQcTmF08W0I63OIVCfeZC+XWtcMpPuzk+2Qy2Fquwb/0/hBix6p920CQNlgtS6gTVDk +1pIzREmrgD6KFZf1H9LaDfuyqSo51kUUDHEAwGdBeON2noo+DuzRTIS9OMwS0TqVpz/s7ZZru X-QQ-XMRINFO: OWPUhxQsoeAVDbp3OJHYyFg= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Mon, 6 May 2024 00:45:35 +0800 X-OQ-MSGID: <20240505164536.872683-9-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com> References: <20240505164536.872683-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 09/10] lavc/vp8dsp: R-V V loop_filter_inner X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: MnjlCMwtcXzW From: sunyuechi C908: vp8_loop_filter8uv_inner_v_c: 738.2 vp8_loop_filter8uv_inner_v_rvv_i32: 455.2 vp8_loop_filter16y_inner_h_c: 685.0 vp8_loop_filter16y_inner_h_rvv_i32: 497.0 vp8_loop_filter16y_inner_v_c: 743.7 vp8_loop_filter16y_inner_v_rvv_i32: 295.7 --- libavcodec/riscv/vp8dsp_init.c | 4 ++ libavcodec/riscv/vp8dsp_rvv.S | 104 +++++++++++++++++++++++++++++++++ 2 files changed, 108 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 6037c86e19..4f38abba93 100644 --- 
a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -130,6 +130,10 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c) c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv; } + c->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_rvv; + c->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_rvv; + c->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_rvv; + c->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter16_simple_rvv; c->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter16_simple_rvv; } diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index bef5f0ebdc..d7e8b6ae58 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -95,6 +95,13 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x ret endfunc +.macro filter_abs dst diff fI + vneg.v v8, \diff + vmax.vv \dst, v8, \diff + vmsleu.vx v8, \dst, \fI + vmand.mm v27, v27, v8 +.endm + .macro filter_fmin len a f1 p0f2 q0f1 vsetvlstatic16 \len vsext.vf2 \q0f1, \a @@ -120,6 +127,16 @@ endfunc vle8.v v11, (t4) vle8.v v17, (t1) vle8.v v22, (\dst) + .if \normal + sub t3, t2, a6 + sub t0, t1, a6 + add t6, \dst, a6 + add a7, t4, a6 + vle8.v v2, (t3) + vle8.v v15, (t0) + vle8.v v10, (t6) + vle8.v v14, (a7) + .endif .else addi t1, \dst, -1 addi a6, \dst, -2 @@ -128,9 +145,27 @@ endfunc vlse8.v v11, (t4), \stride vlse8.v v17, (t1), \stride vlse8.v v22, (\dst), \stride + .if \normal + addi t5, \dst, -4 + addi t0, \dst, -3 + addi t6, \dst, 2 + addi a7, \dst, 3 + vlse8.v v2, (t5), \stride + vlse8.v v15, (t0), \stride + vlse8.v v10, (t6), \stride + vlse8.v v14, (a7), \stride + .endif .endif vwsubu.vv v12, v1, v11 // p1-q1 vwsubu.vv v24, v22, v17 // q0-p0 +.if \normal + vwsubu.vv v30, v1, v17 + vwsubu.vv v20, v11, v22 + vwsubu.vv v28, v1, v15 + vwsubu.vv v4, v2, v15 + vwsubu.vv v6, v10, v11 + vwsubu.vv v2, v14, v10 +.endif vnclip.wi v23, v12, 0 vsetvlstatic16 \len // vp8_simple_limit(dst + i, stride, flim) @@ -142,6 +177,25 @@ endfunc vsrl.vi v18, v18, 1 
vmacc.vx v18, a7, v8 vmsleu.vx v0, v18, \fE +.if \normal + vneg.v v18, v30 + vmax.vv v30, v18, v30 + vmsleu.vx v27, v30, \fI + filter_abs v18 v28 \fI + filter_abs v18 v4 \fI + filter_abs v18 v6 \fI + filter_abs v18 v2 \fI + filter_abs v20 v20 \fI + vmand.mm v27, v0, v27 // vp8_simple_limit && normal + + vmsgtu.vx v20, v20, \thresh // hev + vmsgtu.vx v3, v30, \thresh + vmor.mm v3, v3, v20 // v3 = hev: > thresh + vzext.vf2 v18, v1 // v18 = p1 + vmand.mm v0, v27, v3 // v0 = normal && hev + vzext.vf2 v20, v11 // v20 = q1 + vmnot.m v3, v3 // v3 = !hev +.endif li t5, 3 li a7, 124 @@ -166,6 +220,37 @@ endfunc vsse8.v v6, (\dst), \stride, v0.t .endif +.if \normal + vmand.mm v0, v27, v3 // vp8_normal_limit && !hev + + .if \inner + vnclip.wi v30, v30, 0 + filter_fmin \len v30 v24 v4 v6 + vadd.vi v24, v24, 1 + vsra.vi v24, v24, 1 // (f1 + 1) >> 1; + vadd.vv v8, v18, v24 + vsub.vv v10, v20, v24 + .endif + + vmax.vx v8, v8, zero + vmax.vx v10, v10, zero + vsetvlstatic8 \len + vnclipu.wi v4, v4, 0 + vnclipu.wi v5, v6, 0 + vnclipu.wi v6, v8, 0 + vnclipu.wi v7, v10, 0 + .ifc \type,v + vse8.v v4, (t1), v0.t + vse8.v v5, (\dst), v0.t + vse8.v v6, (t2), v0.t + vse8.v v7, (t4), v0.t + .else + vsse8.v v4, (t1), \stride, v0.t + vsse8.v v5, (\dst), \stride, v0.t + vsse8.v v6, (a6), \stride, v0.t + vsse8.v v7, (t4), \stride, v0.t + .endif +.endif .endm func ff_vp8_v_loop_filter16_simple_rvv, zve32x @@ -180,6 +265,25 @@ func ff_vp8_h_loop_filter16_simple_rvv, zve32x ret endfunc +func ff_vp8_h_loop_filter16_inner_rvv, zve32x + vsetvlstatic8 16 + filter 16 h 1 1 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_v_loop_filter16_inner_rvv, zve32x + vsetvlstatic8 16 + filter 16 v 1 1 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_v_loop_filter8uv_inner_rvv, zve32x + vsetvlstatic8 8 + filter 8 v 1 1 a0 a2 a3 a4 a5 + filter 8 v 1 1 a1 a2 a3 a4 a5 + ret +endfunc + .macro bilin_h_load dst len vsetvlstatic8 \len + 1 vle8.v \dst, (a2) From patchwork Sun May 5 16:45:36 2024 Content-Type: text/plain;
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48551 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:e68f:b0:1af:836d:81b3 with SMTP id mz15csp963083pzb; Sun, 5 May 2024 09:47:22 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCXffxhAwMQrmsilWSAkp0j4AF/2KbrNtG6Ah1r0Rp46yK+WdARk5quW4GpHjJCtJgazYQ4/SUN2nj5VkY1Fp4TimEskpwhL8YW3Tw== X-Google-Smtp-Source: AGHT+IH/pd4PA76K4MPyrNwbpzn+nXlIiJ5dJZ2/1quCOoni1aYl/PKt3+aFVqSBseth+NM9BURu X-Received: by 2002:a17:906:f845:b0:a58:9e89:7d91 with SMTP id ks5-20020a170906f84500b00a589e897d91mr5199625ejb.42.1714927642223; Sun, 05 May 2024 09:47:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1714927642; cv=none; d=google.com; s=arc-20160816; b=VR5HQeUW7/t42iDmkgg7QCwVV0h5CWsBwY6IBqMA7vZ730c572+CrAiN2F/Z3/9xNq ktWfR+sV4mU5aJdIBRSyquf22QzoUuhjo6ipkGCnMFwBsQxsP8yiX4yrQFTSRxQ5zLpe 9XtdKXUY3P9R3mROMgu4XxrXEuO8tzaZT9QI5ZY+n2Z6ojYnSPQnNm8//H47AtKpcEoH grECF2InxxrbIcx09whGtTwFE+WqpUj4XU3YBE2x7l+Gd64ikV0/BPWndpMHg4+qPWGY zgJIMstItWRvOLSHILlpFB7z/PI+kJIJRYL6upWH/B8jLy90GQB6aGdf9o6uQ3jERM7U KpzQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date :to:from:message-id:dkim-signature:delivered-to; bh=5sw3rqocwLFTdKO2iq6ZxoqARAuJCMCKdXgp98hUEYw=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=MtVvdhD4spWx877ZtP3Vy3doc8Po9IiMF6BhSN+SJek+Tgvu6tp9kok5AYo5ncCPlc fNrTg1XLYOMqQaREJNHMj6Syw+ukYMn1bSXClcxLd02Vo9j+08kph8VZf9BAaEtbIjNo 86tO2CqgsUndHQU4i4VTBSA67S+fYiRA579jaab+hpi0PAGKQmUCfr9OygCbKCisRYjO qZWK2Ak+tqxBJZbCqDhbEnf+shxLBResFz0kzfLbX1aaQRwxVfsc4a9wqU8t+/o0tYvr 24CRaRf/Gil5ZuDEluR6JLz7TxiE1Dmj/rumHdgXWWEkYSI02Kuv/aZ0YtUp5As9xR4K vIig==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did 
not verify) header.i=@foxmail.com header.s=s201512 header.b=NAw3JEYH; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id sh34-20020a1709076ea200b00a59a975a8adsi1893512ejc.761.2024.05.05.09.47.21; Sun, 05 May 2024 09:47:22 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=NAw3JEYH; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 093BA68D647; Sun, 5 May 2024 19:46:07 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out203-205-221-209.mail.qq.com (out203-205-221-209.mail.qq.com [203.205.221.209]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id EF67368D5EE for ; Sun, 5 May 2024 19:45:54 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1714927545; bh=J5fgv8bzv3WNdHdeUCKFgcJXslmRF8mItl//+anGV2w=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=NAw3JEYH5h63dOf2fVMgsvifOOIBbK7KCILKIVmDo0aCR/S2FCZRKjBotKoPyuu4h pTjjTMaSyMJhG3P0SWoSt/wfNLPqvcYyQGn63WUZZW8WAppv4i2z2/HhbdA5nDGCen /U0qmksZ6B34LDmaRWyeSfn2V4ZsPVjloyi+4X9w= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrsza10-0.qq.com (NewEsmtp) with SMTP id B65A42EA; Mon, 06 May 2024 00:45:37 +0800 X-QQ-mid: 
xmsmtpt1714927544tr3ayn5b9 Message-ID: X-QQ-XMAILINFO: NDgMZBR9sMmaL0Gzn/9bn1SEGVEPNgEQzFjZ5HAxFSOomc5XjMhIHz6o6m7pJ4 xN67TgLeLafIrUd0bES2ZRD+chmFdCSvXXy66baHSwhPADzn9qbMcPVp9UhPJJwSaraNVpEjdQ99 lqBbbe+5+1hDxc3rSDtHhdI4/CgY21ULLDV9rBJOsVNb9ajfdX8M+2x8h49mzdSuZjeC7q3eaa7Z hgr3YIXj/Venb889FXub+LQrDYaj8BAiL5jFQnILcmRNvQqKmT+hjF/xX2nQKhQJf6bTkzf9WaqV 5qlFCMkcxwYrGVn82yBxWML42zCv7cNxOnG1JzP1z8wp9aL07RqiwQJumWIXQbdVRVMgPg3lk/eT 7hX6iDJAMS523lHdWsQdHt/uoAjm2HoIqk/SAvOcBOBsoeRwl62VsRM1amkroWfAoQvrZKYED0Dz cpKeybPTWh4kWD7hBU3Xkn8JWdnOKFTMOI/SEkE6batwN3I+MPuv5jbJ4UXG/Df9JgSMa0fqEyXQ mz8T8/vR7NWGFcWxOM5quPiOBuiG0QcKMxINBKgaysCpc9rQZUpK1wJlQ/Jc8QD/T3NzxDj/cleL pMSE7tASeWg5y3hGOKV+LDCQ5riXUIsPSK7/gAAhCV8qKjojTobppDn65kRGow9U9Ym0sVyxEbKW ysp0q4eFGYXEqYEpu0ETUeeqSZM0B/JBmyB8gDRxDd1FWWzB+nEXL5cEqCTYrYK4Cc3lgqx+ddOg DaQ2g2oHtNvZxD9T5cDbXLhsA+Oj9yI56COb1cuZfTojKhlZ3ngYacp5vSqTJoJ8xi4BFq/ygyo/ MHEpfeldP1nslzSqz2+iPbyeFRsZcxrTTRpLZdaytUPwjykknLgOMJ42LYOGeV5ec41ZCGxMVyAt cM5oPxky4ma5s5xwGiGYOIT/2z26vWD8B74tY/AY7svin1vH942Iw= X-QQ-XMRINFO: MPJ6Tf5t3I/ycC2BItcBVIA= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Mon, 6 May 2024 00:45:36 +0800 X-OQ-MSGID: <20240505164536.872683-10-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240505164536.872683-1-uk7b@foxmail.com> References: <20240505164536.872683-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 10/10] lavc/vp8dsp: R-V V loop_filter X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 2Rl3x1FBa+EM From: sunyuechi C908: vp8_loop_filter8uv_v_c: 745.5 vp8_loop_filter8uv_v_rvv_i32: 467.2 vp8_loop_filter16y_h_c: 674.2 vp8_loop_filter16y_h_rvv_i32: 553.0 vp8_loop_filter16y_v_c: 732.7 
vp8_loop_filter16y_v_rvv_i32: 324.5 --- libavcodec/riscv/vp8dsp_init.c | 4 +++ libavcodec/riscv/vp8dsp_rvv.S | 57 ++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 4f38abba93..35c1646dab 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -130,6 +130,10 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c) c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv; } + c->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_rvv; + c->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_rvv; + c->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_rvv; + c->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_rvv; c->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_rvv; c->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_rvv; diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index d7e8b6ae58..360d79bc22 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -230,6 +230,33 @@ endfunc vsra.vi v24, v24, 1 // (f1 + 1) >> 1; vadd.vv v8, v18, v24 vsub.vv v10, v20, v24 + .else + li t5, 27 + li t3, 9 + li a7, 18 + vwmul.vx v2, v11, t5 + vwmul.vx v6, v11, t3 + vwmul.vx v4, v11, a7 + vsetvlstatic16 \len + li a7, 63 + vzext.vf2 v14, v15 // p2 + vzext.vf2 v24, v10 // q2 + vadd.vx v2, v2, a7 + vadd.vx v4, v4, a7 + vadd.vx v6, v6, a7 + vsra.vi v2, v2, 7 // a0 + vsra.vi v12, v4, 7 // a1 + vsra.vi v6, v6, 7 // a2 + vadd.vv v14, v14, v6 // p2 + a2 + vsub.vv v22, v24, v6 // q2 - a2 + vsub.vv v10, v20, v12 // q1 - a1 + vadd.vv v4, v8, v2 // p0 + a0 + vsub.vv v6, v16, v2 // q0 - a0 + vadd.vv v8, v12, v18 // a1 + p1 + vmax.vx v4, v4, zero + vmax.vx v6, v6, zero + vmax.vx v14, v14, zero + vmax.vx v16, v22, zero .endif vmax.vx v8, v8, zero @@ -250,6 +277,17 @@ endfunc vsse8.v v6, (a6), \stride, v0.t vsse8.v v7, (t4), \stride, v0.t .endif + .if !\inner + vnclipu.wi v14, v14, 0 + vnclipu.wi v16, v16, 0 + .ifc 
\type,v + vse8.v v14, (t0), v0.t + vse8.v v16, (t6), v0.t + .else + vsse8.v v14, (t0), \stride, v0.t + vsse8.v v16, (t6), \stride, v0.t + .endif + .endif .endif .endm @@ -284,6 +322,25 @@ func ff_vp8_v_loop_filter8uv_inner_rvv, zve32x ret endfunc +func ff_vp8_v_loop_filter16_rvv, zve32x + vsetvlstatic8 16 + filter 16 v 1 0 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_h_loop_filter16_rvv, zve32x + vsetvlstatic8 16 + filter 16 h 1 0 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_v_loop_filter8uv_rvv, zve32x + vsetvlstatic8 8 + filter 8 v 1 0 a0 a2 a3 a4 a5 + filter 8 v 1 0 a1 a2 a3 a4 a5 + ret +endfunc + .macro bilin_h_load dst len vsetvlstatic8 \len + 1 vle8.v \dst, (a2)