From patchwork Tue May 7 16:54:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48635
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 8 May 2024 00:54:04 +0800
X-OQ-MSGID: <20240507165412.1306563-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v4 1/9] lavc/vp8dsp: R-V put_vp8_pixels
List-Id: FFmpeg development discussions and patches
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_pixels4_c: 78.0
vp8_put_pixels4_rvi: 33.7
vp8_put_pixels8_c: 278.0
vp8_put_pixels8_rvi: 55.0
vp8_put_pixels16_c: 999.0
vp8_put_pixels16_rvi: 86.7
---
 libavcodec/riscv/Makefile      |  1 +
 libavcodec/riscv/vp8dsp.h      | 75 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp8dsp_init.c | 22 ++++++++++
 libavcodec/riscv/vp8dsp_rvi.S  | 61 +++++++++++++++++++++++++++
 libavcodec/vp8dsp.c            |  2 +
 libavcodec/vp8dsp.h            |  1 +
 6 files changed, 162 insertions(+)
 create mode 100644 libavcodec/riscv/vp8dsp.h
 create mode 100644 libavcodec/riscv/vp8dsp_rvi.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 050c08ee61..526cb5c97c 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -61,6 +61,7 @@ RVV-OBJS-$(CONFIG_UTVIDEO_DECODER) += riscv/utvideodsp_rvv.o
 OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_init.o
 RVV-OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_rvv.o
 OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_init.o
+RV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvi.o
 RVV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvv.o
 OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9dsp_init.o
 RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o
diff --git a/libavcodec/riscv/vp8dsp.h b/libavcodec/riscv/vp8dsp.h
new file mode 100644
index 0000000000..971c5c0a96
--- /dev/null
+++ b/libavcodec/riscv/vp8dsp.h
@@ -0,0 +1,75 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_RISCV_VP8DSP_H
+#define AVCODEC_RISCV_VP8DSP_H
+
+#include "libavcodec/vp8dsp.h"
+
+#define VP8_LF_Y(hv, inner, opt)                                             \
+    void ff_vp8_##hv##_loop_filter16##inner##_##opt(uint8_t *dst,            \
+                                                    ptrdiff_t stride,        \
+                                                    int flim_E, int flim_I,  \
+                                                    int hev_thresh)
+
+#define VP8_LF_UV(hv, inner, opt)                                            \
+    void ff_vp8_##hv##_loop_filter8uv##inner##_##opt(uint8_t *dstU,          \
+                                                     uint8_t *dstV,          \
+                                                     ptrdiff_t stride,       \
+                                                     int flim_E, int flim_I, \
+                                                     int hev_thresh)
+
+#define VP8_LF_SIMPLE(hv, opt)                                               \
+    void ff_vp8_##hv##_loop_filter16_simple_##opt(uint8_t *dst,              \
+                                                  ptrdiff_t stride,          \
+                                                  int flim)
+
+#define VP8_LF_HV(inner, opt)                                                \
+    VP8_LF_Y(h, inner, opt);                                                 \
+    VP8_LF_Y(v, inner, opt);                                                 \
+    VP8_LF_UV(h, inner, opt);                                                \
+    VP8_LF_UV(v, inner, opt)
+
+#define VP8_LF(opt)                                                          \
+    VP8_LF_HV(, opt);                                                        \
+    VP8_LF_HV(_inner, opt);                                                  \
+    VP8_LF_SIMPLE(h, opt);                                                   \
+    VP8_LF_SIMPLE(v, opt)
+
+#define VP8_MC(n, opt)                                                       \
+    void ff_put_vp8_##n##_##opt(uint8_t *dst, ptrdiff_t dststride,           \
+                                const uint8_t *src, ptrdiff_t srcstride,     \
+                                int h, int x, int y)
+
+#define VP8_EPEL(w, opt)                                                     \
+    VP8_MC(pixels ## w, opt);                                                \
+    VP8_MC(epel ## w ## _h4, opt);                                           \
+    VP8_MC(epel ## w ## _h6, opt);                                           \
+    VP8_MC(epel ## w ## _v4, opt);                                           \
+    VP8_MC(epel ## w ## _h4v4, opt);                                         \
+    VP8_MC(epel ## w ## _h6v4, opt);                                         \
+    VP8_MC(epel ## w ## _v6, opt);                                           \
+    VP8_MC(epel ## w ## _h4v6, opt);                                         \
+    VP8_MC(epel ## w ## _h6v6, opt)
+
+#define VP8_BILIN(w, opt)                                                    \
+    VP8_MC(bilin ## w ## _h, opt);                                           \
+    VP8_MC(bilin ## w ## _v, opt);                                           \
+    VP8_MC(bilin ## w ## _hv, opt)
+
+#endif /* AVCODEC_RISCV_VP8DSP_H */
diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index af57aabb71..fa3feeacf7 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -24,11 +24,33 @@
 #include "libavutil/cpu.h"
 #include "libavutil/riscv/cpu.h"
 #include "libavcodec/vp8dsp.h"
+#include "vp8dsp.h"
 
 void ff_vp8_idct_dc_add_rvv(uint8_t *dst, int16_t block[16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add4y_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add4uv_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride);
 
+VP8_EPEL(16, rvi);
+VP8_EPEL(8, rvi);
+VP8_EPEL(4, rvi);
+
+av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
+{
+#if HAVE_RV
+    int flags = av_get_cpu_flags();
+    if (flags & AV_CPU_FLAG_RVI) {
+#if __riscv_xlen >= 64
+        c->put_vp8_epel_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
+        c->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
+        c->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
+        c->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
+#endif
+        c->put_vp8_epel_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
+        c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
+    }
+#endif
+}
+
 av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
 {
 #if HAVE_RVV
diff --git a/libavcodec/riscv/vp8dsp_rvi.S b/libavcodec/riscv/vp8dsp_rvi.S
new file mode 100644
index 0000000000..50ba4f293f
--- /dev/null
+++ b/libavcodec/riscv/vp8dsp_rvi.S
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2024 Institue of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+#if __riscv_xlen >= 64
+func ff_put_vp8_pixels16_rvi
+1:
+        addi            a4, a4, -1
+        ld              t0, (a2)
+        ld              t1, 8(a2)
+        sd              t0, (a0)
+        sd              t1, 8(a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+
+func ff_put_vp8_pixels8_rvi
+1:
+        addi            a4, a4, -1
+        ld              t0, (a2)
+        sd              t0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+#endif
+
+func ff_put_vp8_pixels4_rvi
+1:
+        addi            a4, a4, -1
+        lw              t0, (a2)
+        sw              t0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
diff --git a/libavcodec/vp8dsp.c b/libavcodec/vp8dsp.c
index df7bd12424..f7c9c9899c 100644
--- a/libavcodec/vp8dsp.c
+++ b/libavcodec/vp8dsp.c
@@ -1402,6 +1402,8 @@ dsp->put_vp8_epel_pixels_tab[2][2][2] = put_vp8_epel4_h6v6_c;
     ff_vp78dsp_init_arm(dsp);
 #elif ARCH_PPC
     ff_vp78dsp_init_ppc(dsp);
+#elif ARCH_RISCV
+    ff_vp78dsp_init_riscv(dsp);
 #elif ARCH_X86
     ff_vp78dsp_init_x86(dsp);
 #endif
diff --git a/libavcodec/vp8dsp.h b/libavcodec/vp8dsp.h
index 30dc2c6cc1..3bf12b6b45 100644
--- a/libavcodec/vp8dsp.h
+++ b/libavcodec/vp8dsp.h
@@ -87,6 +87,7 @@ void ff_vp78dsp_init(VP8DSPContext *c);
 void ff_vp78dsp_init_aarch64(VP8DSPContext *c);
 void ff_vp78dsp_init_arm(VP8DSPContext *c);
 void ff_vp78dsp_init_ppc(VP8DSPContext *c);
+void ff_vp78dsp_init_riscv(VP8DSPContext *c);
 void ff_vp78dsp_init_x86(VP8DSPContext *c);
 
 void ff_vp8dsp_init(VP8DSPContext *c);

From patchwork Tue May 7 16:54:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48634
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 8 May 2024 00:54:05 +0800
X-OQ-MSGID: <20240507165412.1306563-2-uk7b@foxmail.com>
In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com>
References: <20240507165412.1306563-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v4 2/9] lavc/vp8dsp: R-V V put_bilin_h v
List-Id: FFmpeg development discussions and patches
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_bilin4_h_c: 367.0
vp8_put_bilin4_h_rvv_i32: 137.7
vp8_put_bilin4_v_c: 377.0
vp8_put_bilin4_v_rvv_i32: 137.7
vp8_put_bilin8_h_c: 1431.0
vp8_put_bilin8_h_rvv_i32: 297.5
vp8_put_bilin8_v_c: 1449.0
vp8_put_bilin8_v_rvv_i32: 297.5
vp8_put_bilin16_h_c: 2839.0
vp8_put_bilin16_h_rvv_i32: 344.7
vp8_put_bilin16_v_c: 2857.0
vp8_put_bilin16_v_rvv_i32: 344.7
---
 libavcodec/riscv/vp8dsp_init.c | 21 +++++++++++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 49 ++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index fa3feeacf7..afffa6de2f 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -34,6 +34,10 @@ VP8_EPEL(16, rvi);
 VP8_EPEL(8, rvi);
 VP8_EPEL(4, rvi);
 
+VP8_BILIN(16, rvv);
+VP8_BILIN(8, rvv);
+VP8_BILIN(4, rvv);
+
 av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
 {
 #if HAVE_RV
@@ -48,6 +52,23 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_epel_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
         c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
     }
+#if HAVE_RVV
+    if (flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
+        c->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilin16_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilin16_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilin8_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_rvv;
+
+        c->put_vp8_bilinear_pixels_tab[0][1][0] = ff_put_vp8_bilin16_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][0] = ff_put_vp8_bilin16_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][0] = ff_put_vp8_bilin8_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv;
+    }
+#endif
 #endif
 }
 
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 8a0773f964..ec8ff917b9 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -20,6 +20,18 @@
 
 #include "libavutil/riscv/asm.S"
 
+.macro vsetvlstatic8 len
+.if \len <= 4
+        vsetivli        zero, \len, e8, mf4, ta, ma
+.elseif \len <= 8
+        vsetivli        zero, \len, e8, mf2, ta, ma
+.elseif \len <= 16
+        vsetivli        zero, \len, e8, m1, ta, ma
+.elseif \len <= 31
+        vsetivli        zero, \len, e8, m2, ta, ma
+.endif
+.endm
+
 .macro vp8_idct_dc_add
         vlse32.v        v0, (a0), a2
         lh              a5, 0(a1)
@@ -71,3 +83,40 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x
 
         ret
 endfunc
+
+.macro bilin_load dst len type mn
+.ifc \type,v
+        add             t5, a2, a3
+.else
+        addi            t5, a2, 1
+.endif
+        vle8.v          \dst, (a2)
+        vle8.v          v2, (t5)
+        vwmulu.vx       v28, \dst, t1
+        vwmaccu.vx      v28, \mn, v2
+        vwaddu.wx       v24, v28, t4
+        vnsra.wi        \dst, v24, 3
+.endm
+
+.macro put_vp8_bilin_h_v len type mn
+func ff_put_vp8_bilin\len\()_\type\()_rvv, zve32x
+        vsetvlstatic8   \len
+        li              t1, 8
+        li              t4, 4
+        sub             t1, t1, \mn
+1:
+        addi            a4, a4, -1
+        bilin_load      v0, \len, \type, \mn
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+.endm
+
+.irp len 16,8,4
+put_vp8_bilin_h_v \len h a5
+put_vp8_bilin_h_v \len v a6
+.endr

From patchwork Tue May 7 16:54:06 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48639
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 8 May 2024 00:54:06 +0800
X-OQ-MSGID: <20240507165412.1306563-3-uk7b@foxmail.com>
In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com>
References: <20240507165412.1306563-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v4 3/9] lavc/vp8dsp: R-V V put_bilin_hv
List-Id: FFmpeg development discussions and patches
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_bilin4_hv_c: 561.0
vp8_put_bilin4_hv_rvv_i32: 232.7
vp8_put_bilin8_hv_c: 2162.7
vp8_put_bilin8_hv_rvv_i32: 506.7
vp8_put_bilin16_hv_c: 4769.7
vp8_put_bilin16_hv_rvv_i32: 556.7
---
 libavcodec/riscv/vp8dsp_init.c | 13 +++++++++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 26 ++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index afffa6de2f..9627105fc8 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -67,6 +67,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv;
+
+        c->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilin16_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilin8_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv;
+        c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index ec8ff917b9..4f232c7707 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -116,7 +116,33 @@ func ff_put_vp8_bilin\len\()_\type\()_rvv, zve32x
 endfunc
 .endm
 
+.macro put_vp8_bilin_hv len
+func ff_put_vp8_bilin\len\()_hv_rvv, zve32x
+        vsetvlstatic8   \len
+        li              t3, 8
+        sub             t1, t3, a5
+        sub             t2, t3, a6
+        li              t4, 4
+        bilin_load      v4, \len, h, a5
+        add             a2, a2, a3
+1:
+        addi            a4, a4, -1
+        vwmulu.vx       v20, v4, t2
+        bilin_load      v4, \len, h, a5
+        vwmaccu.vx      v20, a6, v4
+        vwaddu.wx       v24, v20, t4
+        vnsra.wi        v0, v24, 3
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+.endm
+
 .irp len 16,8,4
 put_vp8_bilin_h_v \len h a5
 put_vp8_bilin_h_v \len v a6
+put_vp8_bilin_hv \len
 .endr

From patchwork Tue May 7 16:54:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48636
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 8 May 2024 00:54:07 +0800
X-OQ-MSGID: <20240507165412.1306563-4-uk7b@foxmail.com>
In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com>
References: <20240507165412.1306563-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v4 4/9] lavc/vp8dsp: R-V V put_epel h
List-Id: FFmpeg development discussions and patches
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_epel4_h4_c: 10.7
vp8_put_epel4_h4_rvv_i32: 5.0
vp8_put_epel4_h6_c: 15.0
vp8_put_epel4_h6_rvv_i32: 6.2
vp8_put_epel8_h4_c: 43.2
vp8_put_epel8_h4_rvv_i32: 11.2
vp8_put_epel8_h6_c: 57.5
vp8_put_epel8_h6_rvv_i32: 13.5 vp8_put_epel16_h4_c: 92.5 vp8_put_epel16_h4_rvv_i32: 13.7 vp8_put_epel16_h6_c: 139.0 vp8_put_epel16_h6_rvv_i32: 16.5 --- libavcodec/riscv/vp8dsp_init.c | 10 ++++ libavcodec/riscv/vp8dsp_rvv.S | 87 ++++++++++++++++++++++++++++++++++ 2 files changed, 97 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 9627105fc8..a4b7d49932 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -33,6 +33,9 @@ void ff_vp8_idct_dc_add4uv_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t str VP8_EPEL(16, rvi); VP8_EPEL(8, rvi); VP8_EPEL(4, rvi); +VP8_EPEL(16, rvv); +VP8_EPEL(8, rvv); +VP8_EPEL(4, rvv); VP8_BILIN(16, rvv); VP8_BILIN(8, rvv); @@ -80,6 +83,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv; c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv; c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv; + + c->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_rvv; + c->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_rvv; + c->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_rvv; + c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv; + c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv; + c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv; } #endif #endif diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 4f232c7707..629d7a23d5 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -32,6 +32,16 @@ .endif .endm +.macro vsetvlstatic16 len +.if \len <= 4 + vsetivli zero, \len, e16, mf2, ta, ma +.elseif \len <= 8 + vsetivli zero, \len, e16, m1, ta, ma +.elseif \len <= 16 + vsetivli zero, \len, e16, m2, ta, ma +.endif +.endm + .macro vp8_idct_dc_add vlse32.v v0, (a0), a2 lh a5, 0(a1) @@ -141,8 +151,85 @@ func ff_put_vp8_bilin\len\()_hv_rvv, zve32x endfunc .endm +const 
subpel_filters + .byte 0, -6, 123, 12, -1, 0 + .byte 2, -11, 108, 36, -8, 1 + .byte 0, -9, 93, 50, -6, 0 + .byte 3, -16, 77, 77, -16, 3 + .byte 0, -6, 50, 93, -9, 0 + .byte 1, -8, 36, 108, -11, 2 + .byte 0, -1, 12, 123, -6, 0 +endconst + +.macro epel_filter size + lla t2, subpel_filters + addi t0, a5, -1 + li t1, 6 + mul t0, t0, t1 + add t0, t0, t2 + .irp n 1,2,3,4 + lb t\n, \n(t0) + .endr +.ifc \size,6 + lb t5, 5(t0) + lb t0, (t0) +.endif +.endm + +.macro epel_load dst len size + addi t6, a2, -1 + addi a7, a2, 1 + vle8.v v24, (a2) + vle8.v v22, (t6) + vle8.v v26, (a7) + addi a7, a7, 1 + vle8.v v28, (a7) + vwmulu.vx v16, v24, t2 + vwmulu.vx v20, v26, t3 +.ifc \size,6 + addi t6, t6, -1 + addi a7, a7, 1 + vle8.v v24, (t6) + vle8.v v26, (a7) + vwmaccu.vx v16, t0, v24 + vwmaccu.vx v16, t5, v26 +.endif + li t6, 64 + vwmaccsu.vx v16, t1, v22 + vwmaccsu.vx v16, t4, v28 + vwadd.wx v16, v16, t6 + vsetvlstatic16 \len + vwadd.vv v24, v16, v20 + vnsra.wi v24, v24, 7 + vmax.vx v24, v24, zero + vsetvlstatic8 \len + vnclipu.wi \dst, v24, 0 +.endm + +.macro epel_load_inc dst len size + epel_load \dst \len \size + add a2, a2, a3 +.endm + +.macro epel len size type +func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x + epel_filter \size + vsetvlstatic8 \len +1: + addi a4, a4, -1 + epel_load_inc v30 \len \size + vse8.v v30, (a0) + add a0, a0, a1 + bnez a4, 1b + + ret +endfunc +.endm + .irp len 16,8,4 put_vp8_bilin_h_v \len h a5 put_vp8_bilin_h_v \len v a6 put_vp8_bilin_hv \len +epel \len 6 h +epel \len 4 h .endr From patchwork Tue May 7 16:54:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48637 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:9c99:b0:1af:836d:81b3 with SMTP id mj25csp38636pzb; Tue, 7 May 2024 09:55:21 -0700 (PDT) X-Forwarded-Encrypted: i=2; 
AJvYcCV0dXNi1pSX43wqruP1bjcxBAkcxlg4ULq86BigljMUtWZWWqu2/ZHcvADd3Yu7yQ89PNn8TDJViCiZBlwi7ARh0iuDl2gtkZ7Ywg== X-Google-Smtp-Source: AGHT+IH4RWElMdIkw1rDiAMXOp6PHUFU5HfbBG7+1ZLKZtU3wFQk914bZVFOn/tAwcFSmTmcSvSF X-Received: by 2002:a17:906:33c7:b0:a59:aa7a:3b16 with SMTP id a640c23a62f3a-a59fa863e42mr33798766b.4.1715100921005; Tue, 07 May 2024 09:55:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1715100920; cv=none; d=google.com; s=arc-20160816; b=gVghnuY2A/vmAhzKHcg2U/uovB2DQK3414tehvwY68GYydA9t6b7lDvXBI5VTrMy1P wQBIaXQU8uXas+URKshuldZClflbpwIl7fm4ze84wd8GnzbjQokpN9BUkPNZbLOiRD8+ czey2RTGz6xBwXVU9Wo0NGZmaA7H70SHzVYzt8VmCOFv9B//sPmaWlM3eQhU7jTS7HFh P5r+HEQbH88Fq1ov8ZH4XbJP4vTJknuHyW0bsqR2ynPXCKx/gKqpu6FinSsQHa1PGGwM qVzTc4cu95yYfbDG3Zf0yQjRmr2yMVhd8bERTQg8UWVheX+j+59xnxNu2hszVrH0XI9n wa5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date :to:from:message-id:dkim-signature:delivered-to; bh=efp8lnszmnowdjVN9Js0wfk18FEqVux3RwBtFfUySG0=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=mVHKMYcmQ8W55ylWQHGsq6vRZtZS37OGXShvnSKhJX+AlY8L0F/3wjFmvIQveQ1D2X igmlckC+GpbbCJ02etUhW14xxe1cLwHwVhYBjzFvAXraVSbJl7TCulziNQm0cCwCF0KC 3gcyMzgVOXPWfPsEtoU7gjTaGgAnsPY9l2RchA8qFxvlO6pR/Th+8+S45DbmSCHYxNLL pg6jIpYOb/Rr6siUHkJAMySFT3iVZ6X5h6iEnHCpp0Sr7h34xhZudKmeLmHv0t5L7XSK QlHEbHMGw1OTH/9EPYuDkZ2cX4FqA6j/BTGrThVbLbr98Pxxu9uzAw9cmiVwZa2qa+2c zz3Q==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=AGFeKUkd; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from 
ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id hr16-20020a1709073f9000b00a59de4b838esi1720190ejc.349.2024.05.07.09.55.20; Tue, 07 May 2024 09:55:20 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=AGFeKUkd; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1733068D7B9; Tue, 7 May 2024 19:54:47 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out162-62-57-49.mail.qq.com (out162-62-57-49.mail.qq.com [162.62.57.49]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id EE05868D669 for ; Tue, 7 May 2024 19:54:37 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1715100869; bh=XAqySEFTTLPIO/7Jm0vMc8Hg36NYdCgq2zFbA6E7Lqo=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=AGFeKUkd+LfxB4px4pf1sNZFJmcWKuPtluXv5xVMObU5HJTsC3nnYV8CN4XmMCrPU 4VDEr1vj63E0JoPYCFyEfY/6kCMoFrokdPWjnpNlGdmambR2SP0WdPBGAFekFuosDa C6t/H2FcQDbA0fp/WgVUDAzrdg/WoFwEb0/pSv/A= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrszb9-1.qq.com (NewEsmtp) with SMTP id D99244C2; Wed, 08 May 2024 00:54:25 +0800 X-QQ-mid: xmsmtpt1715100868thumva8be Message-ID: X-QQ-XMAILINFO: Ml8T/6JCI2CtKqynsXhowHgJ7zz8J9B1ITEaK3huh+ivNDTmehX1B4g75dJOun Mj2HwFCwu/DytCv4KruX0h5h86ou4IoS+6HKYESyj8IIJl9U6sJKaWr//9FtWUtoY5hp3+Sj4YfR TwTpOQOYKRqRb32m0IzXvTUgfOSRbANW6sPIrqCRy8wfnaeZbUtN/RhdFS9CJExM06Ct89SsYa2N 
QDapoHPfKlXExjeuGlK6PYfKOlX9p37T1XBi843D93zgkmsZErxnySuXeya6CDLQzj8GSflQWBCt plmsv9wnA5SUsxiP6L7/WHEQaSuQPDazWRn8XFAuPePuuyamENnOh2m4NMb5RYYSw6joXvAPx/Pj Mv+1Bayuwr6En/QMqQElphKC9Cp9HdKtbve2n9NLXXJzktUbGLDKHArfLrRCO5e5gxRfndPWxtWq YXehpdipEXAy8IzAFNg8jJPaurNd8Zh+81bztu6Z+qP5E1+AhkKtGhVZdackgIax794xLYe1NSMk AZK5o17Yc/HmOECcFTAIDxPFvAafMkZL5xAHmovX0VKGjdJTG9agY7jCE7tpeAQUnsbAUbT8bBtT 2mHuF3zmi0wssbwI7H0hz0p0b12Oy4UE4fgckeHVIUZHXe0ZMd27TGHcNnaJt/0lMBljrlnznhv0 K/ecHxtrQlVKceSY1nfNhrFP6K87KCMOtwoM6hg50vFFO4Ytg9zjb+r593JNn2A6qKE679bmkHRL KEZUWj4FtSXzM6ffTTXVJrHO06xqSXYX80qVvst8ATUSfGR/m/8qw+bha7AbHyotZ1gcVf88LCvw Hs4I9WKiJfIvkeOrNbxPMg6IdJc5SCCq3birZrG5aSM8vs0F5FNvwQ0wujYTHNM8iLZEkQx3oqAU Fl8v2kdN1wBx+f02Um/u9LY7rrXEesiGimjusz3vcHAdmdGBmXD2LoP9XO43bOm6NXxAsfZRiIVv 2N+4WXno8= X-QQ-XMRINFO: MSVp+SPm3vtS1Vd6Y4Mggwc= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Wed, 8 May 2024 00:54:08 +0800 X-OQ-MSGID: <20240507165412.1306563-5-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com> References: <20240507165412.1306563-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v4 5/9] lavc/vp8dsp: R-V V put_epel v X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: yor9GyWb8A7Z From: sunyuechi C908: vp8_put_epel4_v4_c: 11.0 vp8_put_epel4_v4_rvv_i32: 5.0 vp8_put_epel4_v6_c: 16.5 vp8_put_epel4_v6_rvv_i32: 6.2 vp8_put_epel8_v4_c: 43.7 vp8_put_epel8_v4_rvv_i32: 11.2 vp8_put_epel8_v6_c: 68.7 vp8_put_epel8_v6_rvv_i32: 13.2 vp8_put_epel16_v4_c: 92.5 vp8_put_epel16_v4_rvv_i32: 13.7 vp8_put_epel16_v6_c: 135.7 vp8_put_epel16_v6_rvv_i32: 16.5 --- libavcodec/riscv/vp8dsp_init.c | 7 +++++++ libavcodec/riscv/vp8dsp_rvv.S | 34 
+++++++++++++++++++++++----------- 2 files changed, 30 insertions(+), 11 deletions(-) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index a4b7d49932..dc3e087f01 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -90,6 +90,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv; c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv; c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv; + + c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv; + c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv; + c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv; + c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv; + c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv; + c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv; } #endif #endif diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 629d7a23d5..4d7a9f6a2d 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -161,9 +161,13 @@ const subpel_filters .byte 0, -1, 12, 123, -6, 0 endconst -.macro epel_filter size +.macro epel_filter size type lla t2, subpel_filters +.ifc \type,v + addi t0, a6, -1 +.else addi t0, a5, -1 +.endif li t1, 6 mul t0, t0, t1 add t0, t0, t2 @@ -176,19 +180,25 @@ endconst .endif .endm -.macro epel_load dst len size - addi t6, a2, -1 - addi a7, a2, 1 +.macro epel_load dst len size type +.ifc \type,v + mv a5, a3 +.else + li a5, 1 +.endif + sub t6, a2, a5 + add a7, a2, a5 + vle8.v v24, (a2) vle8.v v22, (t6) vle8.v v26, (a7) - addi a7, a7, 1 + add a7, a7, a5 vle8.v v28, (a7) vwmulu.vx v16, v24, t2 vwmulu.vx v20, v26, t3 .ifc \size,6 - addi t6, t6, -1 - addi a7, a7, 1 + sub t6, t6, a5 + add a7, a7, a5 vle8.v v24, (t6) vle8.v v26, (a7) vwmaccu.vx v16, t0, v24 @@ -206,18 +216,18 @@ endconst vnclipu.wi \dst, v24, 0 .endm -.macro epel_load_inc dst len 
size - epel_load \dst \len \size +.macro epel_load_inc dst len size type + epel_load \dst \len \size \type add a2, a2, a3 .endm .macro epel len size type func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x - epel_filter \size + epel_filter \size \type vsetvlstatic8 \len 1: addi a4, a4, -1 - epel_load_inc v30 \len \size + epel_load_inc v30 \len \size \type vse8.v v30, (a0) add a0, a0, a1 bnez a4, 1b @@ -232,4 +242,6 @@ put_vp8_bilin_h_v \len v a6 put_vp8_bilin_hv \len epel \len 6 h epel \len 4 h +epel \len 6 v +epel \len 4 v .endr From patchwork Tue May 7 16:54:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48638 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:9c99:b0:1af:836d:81b3 with SMTP id mj25csp38753pzb; Tue, 7 May 2024 09:55:34 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCVeqV1QRqITMFbwDi65pI14ROVO2jBvkSPtjI7n8g1t8QVg616lBjcr53nCMr3bXFk9dqKkpx8QaN1tcqWCvnDEE2Q6Jd05ivAYsw== X-Google-Smtp-Source: AGHT+IFfqxK+rvfW91+PjIVDqrbt4Y7VNU62cj2c1Uw3RLNdpGwzg+pIHBJReB+0OSJ68RTjmiXV X-Received: by 2002:a50:aad1:0:b0:572:9dbf:1538 with SMTP id 4fb4d7f45d1cf-5731da81838mr196023a12.31.1715100934169; Tue, 07 May 2024 09:55:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1715100934; cv=none; d=google.com; s=arc-20160816; b=vVMPeOpjmklgbByJqTlQim0nLP2/xQ6btFNtL/Pem/gE1vNmHUX3R7fq6Z5ONMVYAW eC8Au5ZnIVn+Map1jGFuyZ+SAgwjwD4KH4MzZNhbWgsY3jrKd7G4YEOnOK22X/AAaqvy wpPLkn8HSLGgDfzpbkfdiN5I6XLb7rNqr4cgacqggYsF1ctMaKsEsX35xJILtqhC6fsl eu/83G6pDfa6rUiglYWiiL9Ysx/uC6pGJSWMpJRHYr8FVuZR1AcUorpiHHhfTkd32EOr vHip+1imxcjR01cwbZBg9wxiGIJNAwd80Jb9sUTfTuOg9q3ih8dcJwGw12Tehsa49mXc T64g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date 
:to:from:message-id:dkim-signature:delivered-to; bh=tUonxl1l7MzN0m3+CQY4+JsYO6HPuEmENgSa3Ck9Rxg=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=T8sRziiOdr2e8L0Bga9YS+ArB3+Yh2NDRbuE4WAth2c8j4gDhwLv9HkjcH1op3dK67 Nh4pvLFmNM51kvzZinnLUuUcuZnY3bOxJasd2Dzsv53BdCp61sbFwT/bBxgYldFw7wKe nzTq3heGkdVHPcjc9qS5ZtmT4y3JoMx6L2XwkDiX1hVI3hOXlI2Ij1xoeUJX3j9WMSdk 4+HTQ2yhatjNTmCKfBWmxIVEdqzpeDZR4Ruuv7XwgWGu2qoxmlOUzJDx16jU5Pgbk8Bl ghRu8IiFCstnh/X3lck8yo8ikyv2uh9tmEb4siAWXJSYcgX5JSrl/39ata7ggYazye7B a1kQ==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b="XE/S2ljt"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id m17-20020aa7c491000000b005721240c40bsi6063263edq.492.2024.05.07.09.55.31; Tue, 07 May 2024 09:55:34 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b="XE/S2ljt"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4268768D7CC; Tue, 7 May 2024 19:54:48 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out162-62-58-216.mail.qq.com (out162-62-58-216.mail.qq.com [162.62.58.216]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 
99A5268D752 for ; Tue, 7 May 2024 19:54:38 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1715100870; bh=wRvKYaw9HlWJZ4GtcmW7GqvyNO9M2pAUARalWKn/mXA=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=XE/S2ljtRyzahgaNcqfTgwv42G1TYaqOJz3mTB5suiPRaaW8WrZcDm2ZkSbFjTchY luOTxkrKcyddp8iAXVB1tXFMnVbbJ/DMP8zs/IS9zWyRmuVNIYWofNfF8dMOjGbVwm CrC/K5RFMC8Rn5ihzsMTRBvW57MUNujq29xgB0gk= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrszb9-1.qq.com (NewEsmtp) with SMTP id D99244C2; Wed, 08 May 2024 00:54:25 +0800 X-QQ-mid: xmsmtpt1715100869trrmxyrdg Message-ID: X-QQ-XMAILINFO: NCmjBvJFq6XNJhLqOAHxhs7bT2tF0aTZQqtcy2qwNEJehZotsDc+i4cZxHCsaW 06rVYhUUbv2GPPSLd7Kjf6JerWFKYbxRNNKaW+lcVP1B8+QQTyr4qg1+eWua6d1LSdmY9v97s8fA 5xAbwY+dcfLs89XTmZqF2ZK2aXU0+eZTdMcoPORlPm8tK4LbFag3uNogDfF4DBwbZayaIz2+c+dm oK+A5zBO56yvaa1kdyKFWn0zY0GykyzxtxxCUKEIkKz40RjTKId04o2gB5PNzI86Ev6FPbLmdG05 87gEx7y4OFU3xp1BIPME7BOfOiOoU6dKz6amIjAp76ZVWolrdINLis/vQq/hyQ6nO+JmlbzklgTA omlPJ7U0M1xioLT89clDxXB4uJcnlkH0xwSn6PR/QEXzG5fa3sxAlEQOTaX+lV8WGpseDvn//BDG ndMoX1PsneUy/VJg6hAcJq9mbKezfg1HrUbBH9WB4xdibJRqf942ItA0v887yKcckpAhbLD7+ssH m7I/mQaIBgzpFT9HMrF+IZPX7NqavNn8yQOtoBLsNfXS3VdusGudVPALVkr4kaTJ7k9c1oTcbYM9 BQUcNBKZXRhyMuCMriuD70xemQ9W9fMJk7h52I3BZ6yayHt8B8lyjUZ1quQjQKZ591M/UDLnM8AA vH4wp0bOE5wHuxzaNrsq46gkqPZwk8n6NSxXoyQc+X2eVkU979masVFMdaZMeLlOkuozhFSr/8im 4vb9pgr0g2t5C1B55UG9l4duCbVlQX3H2sGjwRTHr0lQTVsZKC/7NS8e55LMlxJvGf6DhRBJ9Lz/ 4vfR5lBDze8+LWiGIs3F4jm89Pqmbw02zkNezwFn8dCHoNi1usTWLuWkzVVdUgUQpqbRSfD/jhAY udNO9ztPQw/ski/L9nKvisaFTUAi6qrMqVUrY4cMC+JH4y5E9jVaE4pfD9sKnO3Q== X-QQ-XMRINFO: NI4Ajvh11aEj8Xl/2s1/T8w= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Wed, 8 May 2024 00:54:09 +0800 X-OQ-MSGID: <20240507165412.1306563-6-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com> References: <20240507165412.1306563-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] 
[PATCH v4 6/9] lavc/vp8dsp: R-V V put_epel hv X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: OVydYmYFl+wD From: sunyuechi C908: vp8_put_epel4_h4v4_c: 20.0 vp8_put_epel4_h4v4_rvv_i32: 11.0 vp8_put_epel4_h4v6_c: 25.2 vp8_put_epel4_h4v6_rvv_i32: 13.5 vp8_put_epel4_h6v4_c: 22.2 vp8_put_epel4_h6v4_rvv_i32: 14.5 vp8_put_epel4_h6v6_c: 29.0 vp8_put_epel4_h6v6_rvv_i32: 15.7 vp8_put_epel8_h4v4_c: 73.0 vp8_put_epel8_h4v4_rvv_i32: 22.2 vp8_put_epel8_h4v6_c: 90.5 vp8_put_epel8_h4v6_rvv_i32: 26.7 vp8_put_epel8_h6v4_c: 85.0 vp8_put_epel8_h6v4_rvv_i32: 27.2 vp8_put_epel8_h6v6_c: 104.7 vp8_put_epel8_h6v6_rvv_i32: 29.5 vp8_put_epel16_h4v4_c: 145.5 vp8_put_epel16_h4v4_rvv_i32: 26.5 vp8_put_epel16_h4v6_c: 190.7 vp8_put_epel16_h4v6_rvv_i32: 47.5 vp8_put_epel16_h6v4_c: 173.7 vp8_put_epel16_h6v4_rvv_i32: 33.2 vp8_put_epel16_h6v6_c: 222.2 vp8_put_epel16_h6v6_rvv_i32: 35.5 --- libavcodec/riscv/vp8dsp_init.c | 13 ++++ libavcodec/riscv/vp8dsp_rvv.S | 123 +++++++++++++++++++++++++++------ 2 files changed, 115 insertions(+), 21 deletions(-) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index dc3e087f01..463c8fa0a2 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -97,6 +97,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv; c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv; c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv; + + c->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_rvv; + c->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_rvv; + c->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_rvv; + 
c->put_vp8_epel_pixels_tab[0][2][1] = ff_put_vp8_epel16_h4v6_rvv; + c->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_rvv; + c->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_rvv; + c->put_vp8_epel_pixels_tab[0][1][1] = ff_put_vp8_epel16_h4v4_rvv; + c->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_rvv; + c->put_vp8_epel_pixels_tab[2][1][1] = ff_put_vp8_epel4_h4v4_rvv; + c->put_vp8_epel_pixels_tab[0][1][2] = ff_put_vp8_epel16_h6v4_rvv; + c->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_rvv; + c->put_vp8_epel_pixels_tab[2][1][2] = ff_put_vp8_epel4_h6v4_rvv; } #endif #endif diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 4d7a9f6a2d..fba72f8c15 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -161,26 +161,26 @@ const subpel_filters .byte 0, -1, 12, 123, -6, 0 endconst -.macro epel_filter size type - lla t2, subpel_filters +.macro epel_filter size type regtype + lla \regtype\()2, subpel_filters .ifc \type,v - addi t0, a6, -1 + addi \regtype\()0, a6, -1 .else - addi t0, a5, -1 + addi \regtype\()0, a5, -1 .endif - li t1, 6 - mul t0, t0, t1 - add t0, t0, t2 + li \regtype\()1, 6 + mul \regtype\()0, \regtype\()0, \regtype\()1 + add \regtype\()0, \regtype\()0, \regtype\()2 .irp n 1,2,3,4 - lb t\n, \n(t0) + lb \regtype\n, \n(\regtype\()0) .endr .ifc \size,6 - lb t5, 5(t0) - lb t0, (t0) + lb \regtype\()5, 5(\regtype\()0) + lb \regtype\()0, (\regtype\()0) .endif .endm -.macro epel_load dst len size type +.macro epel_load dst len size type from_mem regtype .ifc \type,v mv a5, a3 .else @@ -189,24 +189,35 @@ endconst sub t6, a2, a5 add a7, a2, a5 +.if \from_mem vle8.v v24, (a2) vle8.v v22, (t6) vle8.v v26, (a7) add a7, a7, a5 vle8.v v28, (a7) - vwmulu.vx v16, v24, t2 - vwmulu.vx v20, v26, t3 + vwmulu.vx v16, v24, \regtype\()2 + vwmulu.vx v20, v26, \regtype\()3 .ifc \size,6 sub t6, t6, a5 add a7, a7, a5 vle8.v v24, (t6) vle8.v v26, (a7) - vwmaccu.vx v16, t0, v24 - vwmaccu.vx 
v16, t5, v26 + vwmaccu.vx v16, \regtype\()0, v24 + vwmaccu.vx v16, \regtype\()5, v26 +.endif + vwmaccsu.vx v16, \regtype\()1, v22 + vwmaccsu.vx v16, \regtype\()4, v28 +.else + vwmulu.vx v16, v4, \regtype\()2 + vwmulu.vx v20, v6, \regtype\()3 + .ifc \size,6 + vwmaccu.vx v16, \regtype\()0, v0 + vwmaccu.vx v16, \regtype\()5, v10 + .endif + vwmaccsu.vx v16, \regtype\()1, v2 + vwmaccsu.vx v16, \regtype\()4, v8 .endif li t6, 64 - vwmaccsu.vx v16, t1, v22 - vwmaccsu.vx v16, t4, v28 vwadd.wx v16, v16, t6 vsetvlstatic16 \len vwadd.vv v24, v16, v20 @@ -216,18 +227,18 @@ endconst vnclipu.wi \dst, v24, 0 .endm -.macro epel_load_inc dst len size type - epel_load \dst \len \size \type +.macro epel_load_inc dst len size type from_mem regtype + epel_load \dst \len \size \type \from_mem \regtype add a2, a2, a3 .endm .macro epel len size type func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x - epel_filter \size \type + epel_filter \size \type t vsetvlstatic8 \len 1: addi a4, a4, -1 - epel_load_inc v30 \len \size \type + epel_load_inc v30 \len \size \type 1 t vse8.v v30, (a0) add a0, a0, a1 bnez a4, 1b @@ -236,6 +247,72 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x endfunc .endm +.macro epel_hv len hsize vsize +func ff_put_vp8_epel\len\()_h\hsize\()v\vsize\()_rvv, zve32x +#if __riscv_xlen == 64 + addi sp, sp, -48 + .irp n 0,1,2,3,4,5 + sd s\n, \n\()<<3(sp) + .endr +#else + addi sp, sp, -24 + .irp n 0,1,2,3,4,5 + sw s\n, \n\()<<2(sp) + .endr +#endif + sub a2, a2, a3 + epel_filter \hsize h t + epel_filter \vsize v s + vsetvlstatic8 \len +.if \hsize == 6 || \vsize == 6 + sub a2, a2, a3 + epel_load_inc v0 \len \hsize h 1 t +.endif + epel_load_inc v2 \len \hsize h 1 t + epel_load_inc v4 \len \hsize h 1 t + epel_load_inc v6 \len \hsize h 1 t + epel_load_inc v8 \len \hsize h 1 t +.if \hsize == 6 || \vsize == 6 + epel_load_inc v10 \len \hsize h 1 t +.endif + addi a4, a4, -1 +1: + addi a4, a4, -1 + epel_load v30 \len \vsize v 0 s + vse8.v v30, (a0) +.if \hsize == 6 || 
\vsize == 6 + vmv.v.v v0, v2 +.endif + vmv.v.v v2, v4 + vmv.v.v v4, v6 + vmv.v.v v6, v8 +.if \hsize == 6 || \vsize == 6 + vmv.v.v v8, v10 + epel_load_inc v10 \len \hsize h 1 t +.else + epel_load_inc v8 \len 4 h 1 t +.endif + add a0, a0, a1 + bnez a4, 1b + epel_load v30 \len \vsize v 0 s + vse8.v v30, (a0) + +#if __riscv_xlen == 64 + .irp n 0,1,2,3,4,5 + ld s\n, \n\()<<3(sp) + .endr + addi sp, sp, 48 +#else + .irp n 0,1,2,3,4,5 + lw s\n, \n\()<<2(sp) + .endr + addi sp, sp, 24 +#endif + + ret +endfunc +.endm + .irp len 16,8,4 put_vp8_bilin_h_v \len h a5 put_vp8_bilin_h_v \len v a6 @@ -244,4 +321,8 @@ epel \len 6 h epel \len 4 h epel \len 6 v epel \len 4 v +epel_hv \len 6 6 +epel_hv \len 4 4 +epel_hv \len 6 4 +epel_hv \len 4 6 .endr From patchwork Tue May 7 16:54:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48640 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:9c99:b0:1af:836d:81b3 with SMTP id mj25csp38880pzb; Tue, 7 May 2024 09:55:50 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCWjvaFVlmFoOjfdRJMgT5Qs1BXK08zs3iS8H4J0Ekg6QoeNfw5cls4aY2GKkkAtYqGL0SIYhBo6JwcJ6VL60AhalvzSo9bC3HUHHw== X-Google-Smtp-Source: AGHT+IEMBN9SlZuLQIUitLkzMi3CsslSzny72EjZiLuDHwV2pdoeQu4IA19SlHy0zQqFhw/krWmV X-Received: by 2002:a17:906:3296:b0:a59:a3ef:21f9 with SMTP id a640c23a62f3a-a59fb9ce6e8mr2774766b.52.1715100950349; Tue, 07 May 2024 09:55:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1715100950; cv=none; d=google.com; s=arc-20160816; b=OJqdlx+1hg7ICjDgah8H+QUw4UZOApmRB2qU3fsxxwURU3erXX4u+ZAhTKASJDBURV 7P+s0ziCgfCSV334OBwF0YihwXgKbSf96sc87bHmwXya5KCAzcDTmMML6Njuu7QDNVnS l+BE2jbV5zqejoqem64LnCqZ7yq5UYd22CvD6RgfK2eCKXz6If0jKtIDFNosDOcwWW85 jCYjwtggfJ0bC9dB7rGdpOylNsmiL5vgq8ciWTX/mg6ILIsAuQjkC1itL3+dhQNAUuIW encBCYJ4eb+ekDfG8696IzF/voKLaGM/71I107LEmXMbFheGKKYpnmRG5HYqwNOf+E5u UKQw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; 
s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date :to:from:message-id:dkim-signature:delivered-to; bh=BvpSyC1cCv5/PnCuNSlXvefwNwM+Urwyn9H+kfk2mk0=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=aG+81RzRWhR0cSpq34cSNij/mJtXjX03B6wIMGHNuUlwxHYtyQTWe7REPAMgTateJo KXNP/BqyOyDoUvMtgrTuuoaLMt7gR//RBJAHv4C5FMAdJ+bc5omNGMRLNyGNw2msPOxo SCvnogFEGoMxR6G3jXXx7q4z3NAKK7tdqzpFv3qM6P7hn64s9caly7AI8c0PYSya5ApR xrNSn5YQ1S7359tstWnFFiD46KKXQk+k90PtXqcnaZo9SKh97Wd4EAQ8fWN5EbzzYAZ+ p97Pmas9cuLNEaU0NhpkT54Il/Bpqdx19ECKLWQ6FAz4Pj2AQOb/DvP1QYr9a+rM/+j1 asVA==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=WdWY1m2M; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id z24-20020a170906075800b00a59ea3280e2si975495ejb.852.2024.05.07.09.55.49; Tue, 07 May 2024 09:55:50 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=WdWY1m2M; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id ABEA268D7D9; Tue, 7 May 2024 19:54:50 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out162-62-58-216.mail.qq.com (out162-62-58-216.mail.qq.com [162.62.58.216]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 793F568D4FF for ; Tue, 7 May 2024 19:54:39 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1715100871; bh=W5Fy/o1bsw1FcKe1AKhlWE4ThNZZ5rhC4JNZzh78098=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=WdWY1m2MOzwUYh/j625+us071YMTYzPJzFM3euSi6XjRWYv0BKxBBHCu8Zu3HXmps oRjylVB6eqwW8lGsbcVzQV3OMPWn7Z5zmZhgwPWs1B3h39yvehh0s4DxT5qo4JWEEp KmkjpZjmVmwyZZj6d+5UPP7Bw/ui31wUXw9kEWy4= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrszb9-1.qq.com (NewEsmtp) with SMTP id D99244C2; Wed, 08 May 2024 00:54:25 +0800 X-QQ-mid: xmsmtpt1715100870tkfunmpox Message-ID: X-QQ-XMAILINFO: OUrMHMu9XZHvIvoaXaVaIStSyfN8Em1/2IWIgAkFzIoHYFPwNlB9lzN36ciISS 8c1JeuTJxId2dOV4y/gQ8UWtNLW0r+MDtNIMKBvGqZAinAsGyqEkND8rfv0QQwcTT8xNA+li48+h QGnFZC9McUhaL7fTB8KYQ2rQ2iGMqzkdQMqOJ5qnrkcxuW1mhs7GVsLCKcq8CZJ90+4O/+ttxdDX 6ZL3HCndamhNY3AZbjtlKHHDd+c1mFh/jFlq3vW5GJKB3sT3lvyen8WQ1Pn091ZVxCnDNrHuMgzl 
pCO0mSX/Q9AXxUo7acIfsLHjgm7VUHmGdaJV/zsHs6aWSvFwDw+kGBfnH53KplMr7fAoJuI2Qk7Y 8PwIyzwm8np8lhJkUN2OljQ/4gQjbRJ2Ife1V/LaThIGFSgWpkRNDdJp5oM8G9lwP5rOqZyBFALj aVNTH/iHFtlr728aXzp/tD4fffk7GTmz8ZZFYniLw/byJJRd0Aq4h2F5Z0JZoqkWuSn29hqmL+xk TeibuWKWfYtUhxqdlcmgWY8vfZg+/jQ58Uu4992tO/AlBb5/yNWYdpW7VjM12JfJCaxabKS2K2zA DsScFwan6AO7BuPrZvVkCv3xLa8P7ge+rbebqsYZgiNXnBnd6yRJgViE4rpES1XZy4Kgc/wgi2NJ UwnRFycKl0IYlS14JnAJaDBKSmTXclgWWX6m4EiqwHmN7ZpCF0hQMzncAq7fd/9EWPD20k/mlFbf lE49u0U7UkJv/SbnIQCmrOorv+GLUVhtOVfKN5BVe2HDKU5jGfFpmLTsB1jj0Q+KMNnWn8uPltZr OLyH5Ljt6RY3kKY70zvNipS7KmFfm9xuUrATQ06QTuMIjnq9FdYtn0Nge5b27tbyvqPysHrxhptT FaPMZKg3T47P/PhPAdx6THii8UZ9Fq2LXne++g02mHDgtvOjzrsTE69cx/hZSLnA== X-QQ-XMRINFO: NS+P29fieYNw95Bth2bWPxk= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Wed, 8 May 2024 00:54:10 +0800 X-OQ-MSGID: <20240507165412.1306563-7-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com> References: <20240507165412.1306563-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v4 7/9] lavc/vp8dsp: R-V V loop_filter_simple X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: NG72A7964hed From: sunyuechi C908: vp8_loop_filter_simple_h_c: 416.0 vp8_loop_filter_simple_h_rvv_i32: 187.5 vp8_loop_filter_simple_v_c: 429.7 vp8_loop_filter_simple_v_rvv_i32: 104.0 --- libavcodec/riscv/vp8dsp_init.c | 5 ++ libavcodec/riscv/vp8dsp_rvv.S | 85 ++++++++++++++++++++++++++++++++++ 2 files changed, 90 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 463c8fa0a2..3acfe75d67 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -41,6 +41,8 @@ 
VP8_BILIN(16, rvv); VP8_BILIN(8, rvv); VP8_BILIN(4, rvv); +VP8_LF(rvv); + av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) { #if HAVE_RV @@ -126,6 +128,9 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c) if (flags & AV_CPU_FLAG_RVB_ADDR) { c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv; } + + c->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter16_simple_rvv; + c->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter16_simple_rvv; } #endif } diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index fba72f8c15..4260545c6d 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -94,6 +94,91 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x ret endfunc +.macro filter_fmin len a f1 p0f2 q0f1 + vsetvlstatic16 \len + vsext.vf2 \q0f1, \a + vmin.vx \p0f2, \q0f1, a7 + vmin.vx \q0f1, \q0f1, t3 + vadd.vi \p0f2, \p0f2, 3 + vadd.vi \q0f1, \q0f1, 4 + vsra.vi \p0f2, \p0f2, 3 + vsra.vi \f1, \q0f1, 3 + vadd.vv \p0f2, \p0f2, v8 + vsub.vv \q0f1, v16, \f1 + vmax.vx \p0f2, \p0f2, zero + vmax.vx \q0f1, \q0f1, zero +.endm + +.macro filter len type normal inner dst stride fE fI thresh +.ifc \type,v + slli a6, \stride, 1 + sub t2, \dst, a6 + add t4, \dst, \stride + sub t1, \dst, \stride + vle8.v v1, (t2) + vle8.v v11, (t4) + vle8.v v17, (t1) + vle8.v v22, (\dst) +.else + addi t1, \dst, -1 + addi a6, \dst, -2 + addi t4, \dst, 1 + vlse8.v v1, (a6), \stride + vlse8.v v11, (t4), \stride + vlse8.v v17, (t1), \stride + vlse8.v v22, (\dst), \stride +.endif + vwsubu.vv v12, v1, v11 // p1-q1 + vwsubu.vv v24, v22, v17 // q0-p0 + vnclip.wi v23, v12, 0 + vsetvlstatic16 \len + // vp8_simple_limit(dst + i, stride, flim) + li a7, 2 + vneg.v v18, v12 + vmax.vv v18, v18, v12 + vneg.v v8, v24 + vmax.vv v8, v8, v24 + vsrl.vi v18, v18, 1 + vmacc.vx v18, a7, v8 + vmsleu.vx v0, v18, \fE + + li t5, 3 + li a7, 124 + li t3, 123 + vsext.vf2 v4, v23 + vzext.vf2 v8, v17 // p0 + vzext.vf2 v16, v22 // q0 + vmul.vx v30, v24, t5 + vadd.vv v12, v30, v4 + vsetvlstatic8 \len + 
vnclip.wi v11, v12, 0 + filter_fmin \len v11 v24 v4 v6 + vsetvlstatic8 \len + vnclipu.wi v4, v4, 0 + vnclipu.wi v6, v6, 0 + +.ifc \type,v + vse8.v v4, (t1), v0.t + vse8.v v6, (\dst), v0.t +.else + vsse8.v v4, (t1), \stride, v0.t + vsse8.v v6, (\dst), \stride, v0.t +.endif + +.endm + +func ff_vp8_v_loop_filter16_simple_rvv, zve32x + vsetvlstatic8 16 + filter 16 v 0 0 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_h_loop_filter16_simple_rvv, zve32x + vsetvlstatic8 16 + filter 16 h 0 0 a0 a1 a2 a3 a4 + ret +endfunc + .macro bilin_load dst len type mn .ifc \type,v add t5, a2, a3 From patchwork Tue May 7 16:54:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48641 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:9c99:b0:1af:836d:81b3 with SMTP id mj25csp38962pzb; Tue, 7 May 2024 09:56:00 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCV7a2a7K7DzEkqBTtfCLylVJC+qL5jQbZvNPZrRM7R/eJEQ+kd5A4Ny3voseeT8DWMNVgAOLacaGEr56tflhnOJcQG4jNJISdiGIQ== X-Google-Smtp-Source: AGHT+IG9iuW564f77B8k5BRcSpCmJyLeekfxI/bOooV2/BPTkjX/3gLlWA6siAfdgBvuf62m8DUH X-Received: by 2002:a17:906:1d13:b0:a59:9c2f:c7d4 with SMTP id a640c23a62f3a-a59e4cf83d6mr273725266b.19.1715100960422; Tue, 07 May 2024 09:56:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1715100960; cv=none; d=google.com; s=arc-20160816; b=gqq0b5BX6q4R0CKzBJ8a58EnJ2yVVN+MyZGuvHEmb5CR5k9Wb3EsjNXwSECRcQ9Xx5 ycanUGBbWhy7kwDTuL1A3F6rgnalxsAAhDRVIEZPAps0ca0i2ynvGqZcaZhtLMWkKp33 8bkHhyhtXjn4oLmTL8MVfXUDy95xXnCPTX0I4hoxZ19cktHUmrUC1/bDgii0dCppbYTZ KugwkKRUdbrVjgbgB1hFPmnkThsGWERGblrnI8wK+WVxwtCMAWoxuugmsDqlGxE+Zv5D f81Z0aOrSSC+iYPdKGdZCXwN3xRJk0ItHmHEWOiqIqwQSFAqw8hywfqIrDKKE4CmoZU8 0qaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe 
:list-id:precedence:subject:mime-version:references:in-reply-to:date :to:from:message-id:dkim-signature:delivered-to; bh=qJ9sOIcAr7BA0x20e3zp8Bxd9JFTrAEgTDsRMZDfyvs=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=Zecb7IpTRC5NxHnsNEYUPm8jee5s5oKxMM5cNPbB94z5197ATRx+j+KcR5+K1AqMUH bp0gbSgFVdBOHkwsbW7RN3sb1UD/htNaR66WtHxQaCnuQg053bvPXbHtta+SwoMhLgWe 1DtXYeO2RBoqXsMab4jypE3Dbp6kYBKFpBKgMyxkt+/0Uj5bZ8QjewgxnwfbUxCbDh38 9IPLIybhWiaMTFlLFOvWwOA2RMdc4Y3heOQ2CbESfrxgwfyBkgGNxxKUUV+4/L/ydhoy Wwlj9ZVECGc7pGB0rFdNS5ijli3hvB4o4H/nTJR64VQYq46fsEj2/wJFaAJWIXmATEmg jrRw==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=HwBmGBgb; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id ca11-20020a170906a3cb00b00a59c2a29930si3226172ejb.965.2024.05.07.09.55.59; Tue, 07 May 2024 09:56:00 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=HwBmGBgb; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F173B68D7DF; Tue, 7 May 2024 19:54:51 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out162-62-58-211.mail.qq.com (out162-62-58-211.mail.qq.com [162.62.58.211]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id B198568D7AF for ; Tue, 7 May 2024 19:54:40 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1715100871; bh=P/SLLleASiRFwco2p4DXe0CxdCTLtjVsvuw3szcBCfU=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=HwBmGBgbPaJ6pqoipVI02d/ZBjLtDIThHw12rY5WYwTdHyaXIWF0/3wpB28LANpiQ Dd85IQlN+q6JwK5pOgDkffVkN5+/8MCTg2hwzQslD4O75wma2p+3TMGxxC8wkDzYlZ nFQOhLoIZBYGhHJdwPlZOid1/XyY8CoKLngvPkH8= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrszb9-1.qq.com (NewEsmtp) with SMTP id D99244C2; Wed, 08 May 2024 00:54:25 +0800 X-QQ-mid: xmsmtpt1715100871t43qmdii8 Message-ID: X-QQ-XMAILINFO: Nqk65Hg7/x2M6NY1WF+cKnW7WVJmvhNulsKMPEnion/2QUvg8glP1YcucqbXDj qPthRKHwJ440t9edJyq/cdyC0xTiX448fUAHXfDP70pHsUudoVpLVQxuANOy+CGrSN8Lkoa9SnLz MjBIExtCTH7gadjfV2BnQRPHwzoQcotzrdYinehw07agql5Pgr/DDDniCYfajZijnH01Z++aQQWG fDG3qIMoo9iPsGRHMRhDnxo1KRXtXkcIQfluPGFVlmIlftHRAPTLLj6+MIHzBSMe97ZvNPRSoSNA 
K/j4CTEfsaGMW753RA+m0y+XlVY+s8kjX4YeVKLx3/bYbPIWvY4loHsr/VJJlMLk5SvhQbwLmQ1N KBPI2xyL0MihhNWXxtPmDqd8+FTtazqTRaloHVDb8PhmotzaBi4e1gJStYy2duLRI7J6Nq1ovXGe BrUcFO6aTcU/mMz3nTV9eRCdzdEsxDS9eLsO+NqOyEkX6u8mcBKgK65du2BiBOpcAtKKZHpGxpPS SR1X2AQIN+QKFZ9TKJ//b4FIO67CIRK/haV3dT780dLzFA3qg6cDACJAFj5geVZydxt7Pb4qZNRn 1LmjJ6Q9Dca0a9rARYyH0/xu4XoVReUNa9iTEdOtgRzBoFzTQdTm0afH7eK4am0flO0ML6bMD8XO GRVBkhTp0dRlSDrBHsX7l9Fsjx8IvK67X1P4tLTIUQN4UwLAK1RxulLsBgsDrOoD4T2rlTh63K6k c2iZov+vrpqlHJxlfmYMRw1ZNGsVHnDwKf3XItdacnjGuHQ4hDEAdhwf3aXrdte4GQdJ8lJDm46v m7tY8EAIoQ7HOpu4YFjXWNxCsjrxJinFo7JnFZODX835K58kyJA9kMW2dnzoaU1mVZaxWqeppH/A SlwLX3Sv8luWXyVwo/rFanCyWU/TUsy4P/RcLcKiQtpc4AXkr1TJP0iVmPilDpZQ== X-QQ-XMRINFO: OD9hHCdaPRBwq3WW+NvGbIU= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Wed, 8 May 2024 00:54:11 +0800 X-OQ-MSGID: <20240507165412.1306563-8-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com> References: <20240507165412.1306563-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v4 8/9] lavc/vp8dsp: R-V V loop_filter_inner X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 2v20ioQc+Yy2 From: sunyuechi C908: vp8_loop_filter8uv_inner_v_c: 738.2 vp8_loop_filter8uv_inner_v_rvv_i32: 455.2 vp8_loop_filter16y_inner_h_c: 685.0 vp8_loop_filter16y_inner_h_rvv_i32: 497.0 vp8_loop_filter16y_inner_v_c: 743.7 vp8_loop_filter16y_inner_v_rvv_i32: 295.7 --- libavcodec/riscv/vp8dsp_init.c | 4 ++ libavcodec/riscv/vp8dsp_rvv.S | 104 +++++++++++++++++++++++++++++++++ 2 files changed, 108 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 3acfe75d67..2adff1052a 100644 --- 
a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -129,6 +129,10 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c) c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv; } + c->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_rvv; + c->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_rvv; + c->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_rvv; + c->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter16_simple_rvv; c->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter16_simple_rvv; } diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 4260545c6d..a930d0b8f4 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -94,6 +94,13 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x ret endfunc +.macro filter_abs dst diff fI + vneg.v v8, \diff + vmax.vv \dst, v8, \diff + vmsleu.vx v8, \dst, \fI + vmand.mm v27, v27, v8 +.endm + .macro filter_fmin len a f1 p0f2 q0f1 vsetvlstatic16 \len vsext.vf2 \q0f1, \a @@ -119,6 +126,16 @@ endfunc vle8.v v11, (t4) vle8.v v17, (t1) vle8.v v22, (\dst) + .if \normal + sub t3, t2, a6 + sub t0, t1, a6 + add t6, \dst, a6 + add a7, t4, a6 + vle8.v v2, (t3) + vle8.v v15, (t0) + vle8.v v10, (t6) + vle8.v v14, (a7) + .endif .else addi t1, \dst, -1 addi a6, \dst, -2 @@ -127,9 +144,27 @@ endfunc vlse8.v v11, (t4), \stride vlse8.v v17, (t1), \stride vlse8.v v22, (\dst), \stride + .if \normal + addi t5, \dst, -4 + addi t0, \dst, -3 + addi t6, \dst, 2 + addi a7, \dst, 3 + vlse8.v v2, (t5), \stride + vlse8.v v15, (t0), \stride + vlse8.v v10, (t6), \stride + vlse8.v v14, (a7), \stride + .endif .endif vwsubu.vv v12, v1, v11 // p1-q1 vwsubu.vv v24, v22, v17 // q0-p0 +.if \normal + vwsubu.vv v30, v1, v17 + vwsubu.vv v20, v11, v22 + vwsubu.vv v28, v1, v15 + vwsubu.vv v4, v2, v15 + vwsubu.vv v6, v10, v11 + vwsubu.vv v2, v14, v10 +.endif vnclip.wi v23, v12, 0 vsetvlstatic16 \len // vp8_simple_limit(dst + i, stride, flim) @@ -141,6 +176,25 @@ endfunc vsrl.vi v18, v18, 1 
vmacc.vx v18, a7, v8 vmsleu.vx v0, v18, \fE +.if \normal + vneg.v v18, v30 + vmax.vv v30, v18, v30 + vmsleu.vx v27, v30, \fI + filter_abs v18 v28 \fI + filter_abs v18 v4 \fI + filter_abs v18 v6 \fI + filter_abs v18 v2 \fI + filter_abs v20 v20 \fI + vmand.mm v27, v0, v27 // vp8_simple_limit && normal + + vmsgtu.vx v20, v20, \thresh // hev + vmsgtu.vx v3, v30, \thresh + vmor.mm v3, v3, v20 // v3 = hev: > thresh + vzext.vf2 v18, v1 // v18 = p1 + vmand.mm v0, v27, v3 // v0 = normal && hev + vzext.vf2 v20, v11 // v20 = q1 + vmnot.m v3, v3 // v3 = !hev +.endif li t5, 3 li a7, 124 @@ -165,6 +219,37 @@ endfunc vsse8.v v6, (\dst), \stride, v0.t .endif +.if \normal + vmand.mm v0, v27, v3 // vp8_normal_limit && !hev + + .if \inner + vnclip.wi v30, v30, 0 + filter_fmin \len v30 v24 v4 v6 + vadd.vi v24, v24, 1 + vsra.vi v24, v24, 1 // (f1 + 1) >> 1; + vadd.vv v8, v18, v24 + vsub.vv v10, v20, v24 + .endif + + vmax.vx v8, v8, zero + vmax.vx v10, v10, zero + vsetvlstatic8 \len + vnclipu.wi v4, v4, 0 + vnclipu.wi v5, v6, 0 + vnclipu.wi v6, v8, 0 + vnclipu.wi v7, v10, 0 + .ifc \type,v + vse8.v v4, (t1), v0.t + vse8.v v5, (\dst), v0.t + vse8.v v6, (t2), v0.t + vse8.v v7, (t4), v0.t + .else + vsse8.v v4, (t1), \stride, v0.t + vsse8.v v5, (\dst), \stride, v0.t + vsse8.v v6, (a6), \stride, v0.t + vsse8.v v7, (t4), \stride, v0.t + .endif +.endif .endm func ff_vp8_v_loop_filter16_simple_rvv, zve32x @@ -179,6 +264,25 @@ func ff_vp8_h_loop_filter16_simple_rvv, zve32x ret endfunc +func ff_vp8_h_loop_filter16_inner_rvv, zve32x + vsetvlstatic8 16 + filter 16 h 1 1 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_v_loop_filter16_inner_rvv, zve32x + vsetvlstatic8 16 + filter 16 v 1 1 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_v_loop_filter8uv_inner_rvv, zve32x + vsetvlstatic8 8 + filter 8 v 1 1 a0 a2 a3 a4 a5 + filter 8 v 1 1 a1 a2 a3 a4 a5 + ret +endfunc + .macro bilin_load dst len type mn .ifc \type,v add t5, a2, a3 From patchwork Tue May 7 16:54:12 2024 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48642 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:9c99:b0:1af:836d:81b3 with SMTP id mj25csp39032pzb; Tue, 7 May 2024 09:56:10 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCXgZrD9nAn1o/wrskRW9tTky2wgjLHUsKiHE+uKyTeB4j9Us7YKQ8VswRpYT342+yMw5JoDFCGZBlXapOaVJebXB5oUXb1EVpGBww== X-Google-Smtp-Source: AGHT+IGDjPOxkf9SXLg/YQ4JpUK9V+Z7fFrs069uyJsHXr3nNVN1odE1pbK76ayAKoAa9jHYzmQO X-Received: by 2002:a17:906:f1d5:b0:a58:e8c7:c0b8 with SMTP id a640c23a62f3a-a59fb94bf36mr4242466b.7.1715100969956; Tue, 07 May 2024 09:56:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1715100969; cv=none; d=google.com; s=arc-20160816; b=W1cp/ZlJSpII6EsPJqBeQjFDSyY6otlO8hu4R432KzT1lsJMyuqL6FX6xgrjkgspOP UShDvHaXRWgFo4YA4n4j1M+KOhXToU2ALnb4xOACeO4A1LABbx0zrU15pHU6KOz57rjy osEnxigY0bSvGPwzszna4exloyzpZNU+gGRgkSatIqceG3z6tCqN4vKzphZK3tgqH1O7 j27DBW6H+cmaXRuyzB7HCLakU+EuJHUd/75r3c9MM5ZaOzhrtVPCsWtijn9kcSDViwCS AwtJxqzXns1rnEyhu0omFmp/0W13GN/IRqTXa7Whx2r2ItKKTRohLZINIqzB3uLZ7lzC pwSg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date :to:from:message-id:dkim-signature:delivered-to; bh=duVVcIRCf3EOvJyqtymK6K62no+Bedpf6l/IHkw61ow=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=oTbk/Bj/Wnu/mct9DArhSmQVeTUKm7nLC0+a72tGJUdoOt2812iU/5LwU0hqO2tnaK uPspz4IGvchpey1ycLQ1d6GkqOrATJMUHkHlCSL15uWmGl/pwVnAn2z0+6bzoaZBuN+A O9IOc2yLbykbVQWY2V34w6zozu1EwZ34UQW6wpdlHAFIRAwPYImor9FgFyzGxiLLhbUn NxJmdvjOOeA5108kvzoTvoYPWynxl5VUIAsaJzSL6vBzhXdc6JQPYNj7qY12TzKqMaoQ JFtznOKsj+DuYSP+B6sXKYRKrO4zJxpFE9rEX5+Wv6znD7EM7cXGpJPnvakjKdpMVXII /nqQ==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) 
header.i=@foxmail.com header.s=s201512 header.b=Y0MYeSdC; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id ka17-20020a170907991100b00a59c4104b7bsi3056803ejc.504.2024.05.07.09.56.09; Tue, 07 May 2024 09:56:09 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b=Y0MYeSdC; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6D1FD68D7E5; Tue, 7 May 2024 19:54:53 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out162-62-58-216.mail.qq.com (out162-62-58-216.mail.qq.com [162.62.58.216]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id F2C4868D7B9 for ; Tue, 7 May 2024 19:54:40 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1715100872; bh=zF3pAM7TP+iFST7+b5Egx/hFRzgpEyO3c9+kYxR7anE=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=Y0MYeSdCOI0uls+YKarnoxO2mYlys7VPBlvrzeLZmC/ywAgfSv56X3ASKxExURNtw 88bC57njc6bpgxTs5Csg5DkI+vmFuQf+DoNEWvPiOGmxIW/c9PwcpIpbxV8SpEePzs G281oM3E6M+eXx16Iyht2kxnYTkTY1ceWH0r7wpQ= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrszb9-1.qq.com (NewEsmtp) with SMTP id D99244C2; Wed, 08 May 2024 00:54:25 +0800 X-QQ-mid: 
xmsmtpt1715100871tvrn52bap Message-ID: X-QQ-XMAILINFO: ODuejhpHzoC0wpKmrluxrfjH5nWtU+8eN2VI3STxuZBaoZGdKsMVSoBBu0ekX0 lkMIp1/GiUG1M0C5M/90qliPzTxkozaqCSVlUHb7CzugqGFLU7Jgwnl8L08LcvowvO3qtU8iBGvv xm9JPWRgwKjHldoOxW/q1UwsT7gxWEolK0nomy2ecG04rStcCeQG/cBph6DTpKjko+9sDhvXTPI5 0Ptg3OTPZsRPmhZ+wP57e1iZy3fUP0jl1M+y2y9KOhp/f4zKFgxka5HYfk/y8wG/mfINlUVEdarj Xs3YtmMfRtybgDtruyUUtR7f9kAY+hAJNi7PMhH1dExpbz7n6X+icONoBW64WhWeG6VjFUiCYtup N+Jirb7szedyg9AGfDMEchMmF3ofeQ3KIZxhe8VwzLRLYXnMEX99RVIB/VaixTzDEBq2FKHo+9BL 7Wv7aPjlbaSk+Iqq9cFy4DWulQmCoAtL7+K5+8DqGa3Ndb/YXyjJjZSwRXNumYGAOINi23Gc/ONi EGO1gH/IwG83V4wHVNUkBPE4tnY4U3HSl8rz93E1iq+9DB4t8IY8JlrGBkhAhHdmzqmxceraubJi u8pkVkgKj+TZCADUHUo96eyK/H5BvatwwtmGGxj444veehcpQ5ZtCarLemBnKr32HAKzq3FDFsq5 KelES6PGwWmXiSGbhpoDHgGWqOKTiadTO2OhPNI75jliRGZ7HLE9ZQ9mjJO+0GsIg2laBoSoJrDE TcQMkxdS7i/GgqlyIW66Cqer8iDEbu+ShnlPtlrEx8pnubbsidXPb+0oe7e19QYAcPzgWP/2jj92 Sbg/XtSAUdNFLEOpxNcS1h5z7fwGw7xhjgknH266lqd4j5J4+FOKZ/6ENM9BcskakhlNXmmyOlkU 9h4gePWStsTpqJMqNkRjg8fsWofWIu+wcbvq/3iDtn81/GKKTXkLn3bPyGP+EYYqWVTNWddcsD X-QQ-XMRINFO: OD9hHCdaPRBwq3WW+NvGbIU= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Wed, 8 May 2024 00:54:12 +0800 X-OQ-MSGID: <20240507165412.1306563-9-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240507165412.1306563-1-uk7b@foxmail.com> References: <20240507165412.1306563-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v4 9/9] lavc/vp8dsp: R-V V loop_filter X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: xSpIZRRMMeWb From: sunyuechi C908: vp8_loop_filter8uv_v_c: 745.5 vp8_loop_filter8uv_v_rvv_i32: 467.2 vp8_loop_filter16y_h_c: 674.2 vp8_loop_filter16y_h_rvv_i32: 553.0 vp8_loop_filter16y_v_c: 732.7 
vp8_loop_filter16y_v_rvv_i32: 324.5 --- libavcodec/riscv/vp8dsp_init.c | 4 +++ libavcodec/riscv/vp8dsp_rvv.S | 57 ++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index 2adff1052a..1bb5aad518 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -129,6 +129,10 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c) c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv; } + c->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_rvv; + c->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_rvv; + c->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_rvv; + c->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_rvv; c->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_rvv; c->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_rvv; diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index a930d0b8f4..04c391c237 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -229,6 +229,33 @@ endfunc vsra.vi v24, v24, 1 // (f1 + 1) >> 1; vadd.vv v8, v18, v24 vsub.vv v10, v20, v24 + .else + li t5, 27 + li t3, 9 + li a7, 18 + vwmul.vx v2, v11, t5 + vwmul.vx v6, v11, t3 + vwmul.vx v4, v11, a7 + vsetvlstatic16 \len + li a7, 63 + vzext.vf2 v14, v15 // p2 + vzext.vf2 v24, v10 // q2 + vadd.vx v2, v2, a7 + vadd.vx v4, v4, a7 + vadd.vx v6, v6, a7 + vsra.vi v2, v2, 7 // a0 + vsra.vi v12, v4, 7 // a1 + vsra.vi v6, v6, 7 // a2 + vadd.vv v14, v14, v6 // p2 + a2 + vsub.vv v22, v24, v6 // q2 - a2 + vsub.vv v10, v20, v12 // q1 - a1 + vadd.vv v4, v8, v2 // p0 + a0 + vsub.vv v6, v16, v2 // q0 - a0 + vadd.vv v8, v12, v18 // a1 + p1 + vmax.vx v4, v4, zero + vmax.vx v6, v6, zero + vmax.vx v14, v14, zero + vmax.vx v16, v22, zero .endif vmax.vx v8, v8, zero @@ -249,6 +276,17 @@ endfunc vsse8.v v6, (a6), \stride, v0.t vsse8.v v7, (t4), \stride, v0.t .endif + .if !\inner + vnclipu.wi v14, v14, 0 + vnclipu.wi v16, v16, 0 + .ifc 
\type,v + vse8.v v14, (t0), v0.t + vse8.v v16, (t6), v0.t + .else + vsse8.v v14, (t0), \stride, v0.t + vsse8.v v16, (t6), \stride, v0.t + .endif + .endif .endif .endm @@ -283,6 +321,25 @@ func ff_vp8_v_loop_filter8uv_inner_rvv, zve32x ret endfunc +func ff_vp8_v_loop_filter16_rvv, zve32x + vsetvlstatic8 16 + filter 16 v 1 0 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_h_loop_filter16_rvv, zve32x + vsetvlstatic8 16 + filter 16 h 1 0 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_v_loop_filter8uv_rvv, zve32x + vsetvlstatic8 8 + filter 8 v 1 0 a0 a2 a3 a4 a5 + filter 8 v 1 0 a1 a2 a3 a4 a5 + ret +endfunc + .macro bilin_load dst len type mn .ifc \type,v add t5, a2, a3