From patchwork Mon May 6 03:38:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48563
Delivered-To: ffmpegpatchwork2@gmail.com
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 11:38:01 +0800
X-OQ-MSGID: <20240506033809.3790245-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
Subject: [FFmpeg-devel] [PATCH v3 1/9] lavc/vp8dsp: R-V put_vp8_pixels
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: sunyuechi
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 7xSiR37k5hKl

From: sunyuechi

C908:
vp8_put_pixels4_c: 78.0
vp8_put_pixels4_rvi: 33.7
vp8_put_pixels8_c: 278.0
vp8_put_pixels8_rvi: 55.0
vp8_put_pixels16_c: 999.0
vp8_put_pixels16_rvi: 86.7
---
 libavcodec/riscv/Makefile      |  1 +
 libavcodec/riscv/vp8dsp.h      | 75 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp8dsp_init.c | 22 ++++++++++
 libavcodec/riscv/vp8dsp_rvi.S  | 61 +++++++++++++++++++++++++++
 libavcodec/vp8dsp.c            |  2 +
 libavcodec/vp8dsp.h            |  1 +
 6 files changed, 162 insertions(+)
 create mode 100644 libavcodec/riscv/vp8dsp.h
 create mode 100644 libavcodec/riscv/vp8dsp_rvi.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 050c08ee61..526cb5c97c 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -61,6 +61,7 @@ RVV-OBJS-$(CONFIG_UTVIDEO_DECODER) += riscv/utvideodsp_rvv.o
 OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_init.o
 RVV-OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_rvv.o
 OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_init.o
+RV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvi.o
 RVV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvv.o
 OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9dsp_init.o
 RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o
diff --git a/libavcodec/riscv/vp8dsp.h b/libavcodec/riscv/vp8dsp.h
new file mode 100644
index 0000000000..971c5c0a96
--- /dev/null
+++ b/libavcodec/riscv/vp8dsp.h
@@ -0,0 +1,75 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_RISCV_VP8DSP_H
+#define AVCODEC_RISCV_VP8DSP_H
+
+#include "libavcodec/vp8dsp.h"
+
+#define VP8_LF_Y(hv, inner, opt) \
+    void ff_vp8_##hv##_loop_filter16##inner##_##opt(uint8_t *dst, \
+                                                    ptrdiff_t stride, \
+                                                    int flim_E, int flim_I, \
+                                                    int hev_thresh)
+
+#define VP8_LF_UV(hv, inner, opt) \
+    void ff_vp8_##hv##_loop_filter8uv##inner##_##opt(uint8_t *dstU, \
+                                                     uint8_t *dstV, \
+                                                     ptrdiff_t stride, \
+                                                     int flim_E, int flim_I, \
+                                                     int hev_thresh)
+
+#define VP8_LF_SIMPLE(hv, opt) \
+    void ff_vp8_##hv##_loop_filter16_simple_##opt(uint8_t *dst, \
+                                                  ptrdiff_t stride, \
+                                                  int flim)
+
+#define VP8_LF_HV(inner, opt) \
+    VP8_LF_Y(h, inner, opt); \
+    VP8_LF_Y(v, inner, opt); \
+    VP8_LF_UV(h, inner, opt); \
+    VP8_LF_UV(v, inner, opt)
+
+#define VP8_LF(opt) \
+    VP8_LF_HV(, opt); \
+    VP8_LF_HV(_inner, opt); \
+    VP8_LF_SIMPLE(h, opt); \
+    VP8_LF_SIMPLE(v, opt)
+
+#define VP8_MC(n, opt) \
+    void ff_put_vp8_##n##_##opt(uint8_t *dst, ptrdiff_t dststride, \
+                                const uint8_t *src, ptrdiff_t srcstride,\
+                                int h, int x, int y)
+
+#define VP8_EPEL(w, opt) \
+    VP8_MC(pixels ## w, opt); \
+    VP8_MC(epel ## w ## _h4, opt); \
+    VP8_MC(epel ## w ## _h6, opt); \
+    VP8_MC(epel ## w ## _v4, opt); \
+    VP8_MC(epel ## w ## _h4v4, opt); \
+    VP8_MC(epel ## w ## _h6v4, opt); \
+    VP8_MC(epel ## w ## _v6, opt); \
+    VP8_MC(epel ## w ## _h4v6, opt); \
+    VP8_MC(epel ## w ## _h6v6, opt)
+
+#define VP8_BILIN(w, opt) \
+    VP8_MC(bilin ## w ## _h, opt); \
+    VP8_MC(bilin ## w ## _v, opt); \
+    VP8_MC(bilin ## w ## _hv, opt)
+
+#endif /* AVCODEC_RISCV_VP8DSP_H */
diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index af57aabb71..fa3feeacf7 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -24,11 +24,33 @@
 #include "libavutil/cpu.h"
 #include "libavutil/riscv/cpu.h"
 #include "libavcodec/vp8dsp.h"
+#include "vp8dsp.h"
 
 void ff_vp8_idct_dc_add_rvv(uint8_t *dst, int16_t block[16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add4y_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add4uv_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride);
 
+VP8_EPEL(16, rvi);
+VP8_EPEL(8, rvi);
+VP8_EPEL(4, rvi);
+
+av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
+{
+#if HAVE_RV
+    int flags = av_get_cpu_flags();
+    if (flags & AV_CPU_FLAG_RVI) {
+#if __riscv_xlen >= 64
+        c->put_vp8_epel_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
+        c->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
+        c->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
+        c->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
+#endif
+        c->put_vp8_epel_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
+        c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
+    }
+#endif
+}
+
 av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
 {
 #if HAVE_RVV
diff --git a/libavcodec/riscv/vp8dsp_rvi.S b/libavcodec/riscv/vp8dsp_rvi.S
new file mode 100644
index 0000000000..50ba4f293f
--- /dev/null
+++ b/libavcodec/riscv/vp8dsp_rvi.S
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2024 Institue of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+#if __riscv_xlen >= 64
+func ff_put_vp8_pixels16_rvi
+1:
+        addi            a4, a4, -1
+        ld              t0, (a2)
+        ld              t1, 8(a2)
+        sd              t0, (a0)
+        sd              t1, 8(a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+
+func ff_put_vp8_pixels8_rvi
+1:
+        addi            a4, a4, -1
+        ld              t0, (a2)
+        sd              t0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+#endif
+
+func ff_put_vp8_pixels4_rvi
+1:
+        addi            a4, a4, -1
+        lw              t0, (a2)
+        sw              t0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
diff --git a/libavcodec/vp8dsp.c b/libavcodec/vp8dsp.c
index df7bd12424..f7c9c9899c 100644
--- a/libavcodec/vp8dsp.c
+++ b/libavcodec/vp8dsp.c
@@ -1402,6 +1402,8 @@ dsp->put_vp8_epel_pixels_tab[2][2][2] = put_vp8_epel4_h6v6_c;
     ff_vp78dsp_init_arm(dsp);
 #elif ARCH_PPC
     ff_vp78dsp_init_ppc(dsp);
+#elif ARCH_RISCV
+    ff_vp78dsp_init_riscv(dsp);
 #elif ARCH_X86
     ff_vp78dsp_init_x86(dsp);
 #endif
diff --git a/libavcodec/vp8dsp.h b/libavcodec/vp8dsp.h
index 30dc2c6cc1..3bf12b6b45 100644
--- a/libavcodec/vp8dsp.h
+++ b/libavcodec/vp8dsp.h
@@ -87,6 +87,7 @@ void ff_vp78dsp_init(VP8DSPContext *c);
 void ff_vp78dsp_init_aarch64(VP8DSPContext *c);
 void ff_vp78dsp_init_arm(VP8DSPContext *c);
 void ff_vp78dsp_init_ppc(VP8DSPContext *c);
+void ff_vp78dsp_init_riscv(VP8DSPContext *c);
 void ff_vp78dsp_init_x86(VP8DSPContext *c);
 
 void ff_vp8dsp_init(VP8DSPContext *c);

From patchwork Mon May 6 03:38:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48562
Delivered-To: ffmpegpatchwork2@gmail.com
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 11:38:02 +0800
X-OQ-MSGID: <20240506033809.3790245-2-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v3 2/9] lavc/vp8dsp: R-V V put_bilin_h v
Cc: sunyuechi
X-TUID: O7EfN0u5TXD4

From: sunyuechi

C908:
vp8_put_bilin4_h_c: 367.0
vp8_put_bilin4_h_rvv_i32: 137.7
vp8_put_bilin4_v_c: 377.0
vp8_put_bilin4_v_rvv_i32: 137.7
vp8_put_bilin8_h_c: 1431.0
vp8_put_bilin8_h_rvv_i32: 297.5
vp8_put_bilin8_v_c: 1449.0
vp8_put_bilin8_v_rvv_i32: 297.5
vp8_put_bilin16_h_c: 2839.0
vp8_put_bilin16_h_rvv_i32: 344.7
vp8_put_bilin16_v_c: 2857.0
vp8_put_bilin16_v_rvv_i32: 344.7
---
 libavcodec/riscv/vp8dsp_init.c | 21 +++++++++++++++
libavcodec/riscv/vp8dsp_rvv.S | 49 ++++++++++++++++++++++++++++++++++ 2 files changed, 70 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index fa3feeacf7..afffa6de2f 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -34,6 +34,10 @@ VP8_EPEL(16, rvi); VP8_EPEL(8, rvi); VP8_EPEL(4, rvi); +VP8_BILIN(16, rvv); +VP8_BILIN(8, rvv); +VP8_BILIN(4, rvv); + av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) { #if HAVE_RV @@ -48,6 +52,23 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_epel_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi; c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi; } +#if HAVE_RVV + if (flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) { + c->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilin16_h_rvv; + c->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilin16_h_rvv; + c->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilin8_h_rvv; + c->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_rvv; + c->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_rvv; + c->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_rvv; + + c->put_vp8_bilinear_pixels_tab[0][1][0] = ff_put_vp8_bilin16_v_rvv; + c->put_vp8_bilinear_pixels_tab[0][2][0] = ff_put_vp8_bilin16_v_rvv; + c->put_vp8_bilinear_pixels_tab[1][1][0] = ff_put_vp8_bilin8_v_rvv; + c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv; + c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv; + c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv; + } +#endif #endif } diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 8a0773f964..9bf969d794 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -20,6 +20,18 @@ #include "libavutil/riscv/asm.S" +.macro vsetvlstatic8 len +.if \len <= 4 + vsetivli zero, \len, e8, mf4, ta, ma +.elseif \len <= 8 + vsetivli zero, \len, e8, mf2, ta, 
ma +.elseif \len <= 16 + vsetivli zero, \len, e8, m1, ta, ma +.elseif \len <= 31 + vsetivli zero, \len, e8, m2, ta, ma +.endif +.endm + .macro vp8_idct_dc_add vlse32.v v0, (a0), a2 lh a5, 0(a1) @@ -71,3 +83,40 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x ret endfunc + +.macro bilin_load dst len type mn +.ifc \type,v + add t5, a2, a3 +.elseif \type == h + addi t5, a2, 1 +.endif + vle8.v \dst, (a2) + vle8.v v2, (t5) + vwmulu.vx v28, \dst, t1 + vwmaccu.vx v28, \mn, v2 + vwaddu.wx v24, v28, t4 + vnsra.wi \dst, v24, 3 +.endm + +.macro put_vp8_bilin_h_v len type mn +func ff_put_vp8_bilin\len\()_\type\()_rvv, zve32x + vsetvlstatic8 \len + li t1, 8 + li t4, 4 + sub t1, t1, \mn +1: + addi a4, a4, -1 + bilin_load v0, \len, \type, \mn + vse8.v v0, (a0) + add a2, a2, a3 + add a0, a0, a1 + bnez a4, 1b + + ret +endfunc +.endm + +.irp len 16,8,4 +put_vp8_bilin_h_v \len h a5 +put_vp8_bilin_h_v \len v a6 +.endr From patchwork Mon May 6 03:38:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48564 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:e68f:b0:1af:836d:81b3 with SMTP id mz15csp1152586pzb; Sun, 5 May 2024 20:38:56 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCULxTvhkxfJNJoHPtNYL8nO/5EfDhzumFnNXojfivYpedVtRgIGrDPIvIIZHaKH+O0+7I8/eC6nYiYjW+Bt0JxKpGK0S2WmbsT43g== X-Google-Smtp-Source: AGHT+IEGUPA9oZQv4894/rKljz/FDZeQnws2mFOdUlI91MO9e3Ei6QES+ZoXpBHLZS2lQy2J9vH2 X-Received: by 2002:a17:906:d15a:b0:a59:d0fc:7ac5 with SMTP id br26-20020a170906d15a00b00a59d0fc7ac5mr582889ejb.32.1714966735799; Sun, 05 May 2024 20:38:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1714966735; cv=none; d=google.com; s=arc-20160816; b=v8utQphPkDzqutTs+ueexobItnei7ecQ9haanYqhWPKTJBnAOByCvdtIyGvYCpKRS7 riXXrcnBDyqw6Z/uBYETR4q/ShBTwJT0LLDetdPmO93d8Jswd4Uf/5El4fKZwNLRqq09 5kor6Wmmk0SR1RzNDaS5xNvXGDvvHViNLzODLCDxxwNQeVTtBsb3OCAL9YvvEm+hdLTY 
j+kjP9uPqIjIv32+31rqKSjuWstZXigtCFU5rYlEtZZW5WOPgXau0wJpJ68ZbNevUztK +DlhpGdaavwOKiGNXqgwOPuZveHA3+7yn5k/KghGkrDYJY4VkDM2XlDsS5wTskK4pzMm GJ6g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to:date :to:from:message-id:dkim-signature:delivered-to; bh=CDmG1RbItQOBcycEn7DbDGOsfbaThMKU+nDlCtlp8Yk=; fh=D0bFwGkf4X22/D/bfeDVrXKIx7S6kcXsNzy10j8ORbQ=; b=LL7+fOzkqOyVGGOr0NEGgE04vxZWBYy1dhqK9N1dQCLns/qCuFC4ec9iewGPXBPccu Mw0pbL4tIiwHcwtP+7qrO1Fu74gmrtQX0Rj9Z39Ku8fO1UyURt6izpAcvajSCF+C+APJ LRLTFVHj2sxTGC9BsU7nIxQmUAOOk01nPoKgf3JV5VZOWzWPxjtYL5IbSfQuYSlE8RMA CSBNbZKWESMNrb/JqH0SaH0cgL/KJXgytGqfvs1EiYxJlKMeFgJutOwM7Shd9OUpMFUq bqY1OLQNrR11SfQp4w8/HEbgsRd6zBfGDZa8NgdPE80EqCJ024jLsQACN3aAXCpInPwD aGDw==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b="y/guu6h4"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id qk35-20020a1709077fa300b00a59ca336855si1094153ejc.422.2024.05.05.20.38.55; Sun, 05 May 2024 20:38:55 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@foxmail.com header.s=s201512 header.b="y/guu6h4"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=foxmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B481068D5EC; Mon, 6 May 2024 06:38:34 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out203-205-221-236.mail.qq.com (out203-205-221-236.mail.qq.com [203.205.221.236]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 8908968D57A for ; Mon, 6 May 2024 06:38:25 +0300 (EEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=foxmail.com; s=s201512; t=1714966695; bh=Lk3se95zPVF5Fb0IL6sgO61Bx3DmRtyKpnOU5FYWdMI=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=y/guu6h4HlKDpevdeXeAhEoVY3A2++N0AjbOwAZXqfKCD0IXVFgWDsRPk+vL3pwiF fE/GE0bHBxS0J+uV330DH+GUFKZN/+vSKS0D/heM3wo3WNN0Hd34zXgzZvIIPswl9e HTTGwJpw6Bb4bXXjN4x05APs0wcdyX0jSfvhNU7g= Received: from localhost.localdomain ([42.56.223.122]) by newxmesmtplogicsvrsza15-1.qq.com (NewEsmtp) with SMTP id 98B83073; Mon, 06 May 2024 11:38:11 +0800 X-QQ-mid: xmsmtpt1714966694tjecf56xs Message-ID: X-QQ-XMAILINFO: MNHTiO1x6sV31V5QipgusFjbXdBw4u/oiLf23AyGurDSAS/w2rFCFLLJUCUnn/ 6nQYTk9NjOZmddqw+Z22PNQtDvbwfZRintUvvj/USHGcXZB7qxMc9/Uu/HPav2EnY9RvMNpHZXbX tI6qejRZ1nlJYnA9SwtNbAFRwnyCKXf7SMKsSqGJ424YEHZUdh6HTdAMeVntc27tXss7SaZq8N8+ /koayZBqLCHo0QFhmnDNgfylUc5kuVuJk+3YfXSKOloaq8iv+PorGeV0c469DA53wIouS332fM1+ 
cyuK4aKkukQ8mj5n0/PC72ArtfCw7OWRsLvRBPinF3Fv08C0aGxuuea7c3W0ieicskXp4Oo1Arnc Cjs9Vy4TW5eQ0g1Z/myIf+KnTdMrVv7np6I+VtsEGxllgs0WEBKJteOKEPY/mT272wyNR1iHiiOm IMT8fW8VFQ8tvez2rocMvxIuNy57sI9JidmVTtgiGnEAcbF0zmZbOooSVBR+tuvqx2oIe/1A5jC5 jhlW49l3U8mXfeIaCsCdvLrppDS3/usF7mMDNaskpKegw0MaQj0W4YHETwUgiHTxCWFobrBSTxlD VVvDqNrLh8QNCSEHeVKo4Z40qWoEuET85WGZfO6Kz5BXnQ9TE0QgycW0CSd0aOpHGR539HzWBnxr R0+DufbdBjWcHL1PLk9tCIqQ5HElY8juOOIeIr1UngEcOg4gp316BMraR+WqHqtdMRKx/b37lSmq htakz/pTaW7Vqg0ryjjXBPQYTMMffP8FNGWtqGTBLBwNYkKdeZAtCl6HMNwrxlMu0tp0WJJtlTC8 Y0w4IL3dH07HxAEEyE92Sz+2vqNhIxHgFqkwkAIBmCkozuKEpIm/Kz7oGnzUGohPH3WrpHoIBdv7 3ytK1Z3RcDB5Qmq4JpMyQ54WntqbkwBTlrv4nqh7s6 X-QQ-XMRINFO: M/715EihBoGSf6IYSX1iLFg= From: uk7b@foxmail.com To: ffmpeg-devel@ffmpeg.org Date: Mon, 6 May 2024 11:38:03 +0800 X-OQ-MSGID: <20240506033809.3790245-3-uk7b@foxmail.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com> References: <20240506033809.3790245-1-uk7b@foxmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 3/9] lavc/vp8dsp: R-V V put_bilin_hv X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: sunyuechi Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: irHPmJcJhr4T From: sunyuechi C908: vp8_put_bilin4_hv_c: 561.0 vp8_put_bilin4_hv_rvv_i32: 232.7 vp8_put_bilin8_hv_c: 2162.7 vp8_put_bilin8_hv_rvv_i32: 506.7 vp8_put_bilin16_hv_c: 4769.7 vp8_put_bilin16_hv_rvv_i32: 556.7 --- libavcodec/riscv/vp8dsp_init.c | 13 +++++++++++++ libavcodec/riscv/vp8dsp_rvv.S | 26 ++++++++++++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c index afffa6de2f..9627105fc8 100644 --- a/libavcodec/riscv/vp8dsp_init.c +++ b/libavcodec/riscv/vp8dsp_init.c @@ -67,6 +67,19 @@ 
av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c) c->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_rvv; c->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_rvv; c->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_rvv; + + c->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilin16_hv_rvv; + c->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilin16_hv_rvv; + c->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilin16_hv_rvv; + c->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilin16_hv_rvv; + c->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilin8_hv_rvv; + c->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilin8_hv_rvv; + c->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilin8_hv_rvv; + c->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilin8_hv_rvv; + c->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilin4_hv_rvv; + c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv; + c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv; + c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv; } #endif #endif diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S index 9bf969d794..d30e4cab07 100644 --- a/libavcodec/riscv/vp8dsp_rvv.S +++ b/libavcodec/riscv/vp8dsp_rvv.S @@ -116,7 +116,33 @@ func ff_put_vp8_bilin\len\()_\type\()_rvv, zve32x endfunc .endm +.macro put_vp8_bilin_hv len +func ff_put_vp8_bilin\len\()_hv_rvv, zve32x + vsetvlstatic8 \len + li t3, 8 + sub t1, t3, a5 + sub t2, t3, a6 + li t4, 4 + bilin_load v4, \len, h, a5 + add a2, a2, a3 +1: + addi a4, a4, -1 + vwmulu.vx v20, v4, t2 + bilin_load v4, \len, h, a5 + vwmaccu.vx v20, a6, v4 + vwaddu.wx v24, v20, t4 + vnsra.wi v0, v24, 3 + vse8.v v0, (a0) + add a2, a2, a3 + add a0, a0, a1 + bnez a4, 1b + + ret +endfunc +.endm + .irp len 16,8,4 put_vp8_bilin_h_v \len h a5 put_vp8_bilin_h_v \len v a6 +put_vp8_bilin_hv \len .endr From patchwork Mon May 6 03:38:04 2024 Content-Type: text/plain; 
 charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48569
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 11:38:04 +0800
X-OQ-MSGID: <20240506033809.3790245-4-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v3 4/9] lavc/vp8dsp: R-V V put_epel h
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_epel4_h4_c: 10.7
vp8_put_epel4_h4_rvv_i32: 5.0
vp8_put_epel4_h6_c: 15.0
vp8_put_epel4_h6_rvv_i32: 6.2
vp8_put_epel8_h4_c: 43.2
vp8_put_epel8_h4_rvv_i32: 11.2
vp8_put_epel8_h6_c: 57.5
vp8_put_epel8_h6_rvv_i32: 13.5
vp8_put_epel16_h4_c: 92.5
vp8_put_epel16_h4_rvv_i32: 13.7
vp8_put_epel16_h6_c: 139.0
vp8_put_epel16_h6_rvv_i32: 16.5
---
 libavcodec/riscv/vp8dsp_init.c | 10 ++++
 libavcodec/riscv/vp8dsp_rvv.S  | 87 ++++++++++++++++++++++++++++++++++
 2 files changed, 97 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 9627105fc8..a4b7d49932 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -33,6 +33,9 @@ void ff_vp8_idct_dc_add4uv_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t str
 VP8_EPEL(16, rvi);
 VP8_EPEL(8, rvi);
 VP8_EPEL(4, rvi);
+VP8_EPEL(16, rvv);
+VP8_EPEL(8, rvv);
+VP8_EPEL(4, rvv);
 
 VP8_BILIN(16, rvv);
 VP8_BILIN(8, rvv);
@@ -80,6 +83,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_rvv;
         c->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_rvv;
+
+        c->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_rvv;
+        c->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_rvv;
+        c->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_rvv;
+        c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
+        c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
+        c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index d30e4cab07..30955a7b95 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -32,6 +32,16 @@
 .endif
 .endm
 
+.macro vsetvlstatic16 len
+.if \len <= 4
+        vsetivli        zero, \len, e16, mf2, ta, ma
+.elseif \len <= 8
+        vsetivli        zero, \len, e16, m1, ta, ma
+.elseif \len <= 16
+        vsetivli        zero, \len, e16, m2, ta, ma
+.endif
+.endm
+
 .macro vp8_idct_dc_add
         vlse32.v        v0, (a0), a2
         lh              a5, 0(a1)
@@ -141,8 +151,85 @@ func ff_put_vp8_bilin\len\()_hv_rvv, zve32x
 endfunc
 .endm
 
+const subpel_filters
+        .byte 0, -6, 123, 12, -1, 0
+        .byte 2, -11, 108, 36, -8, 1
+        .byte 0, -9, 93, 50, -6, 0
+        .byte 3, -16, 77, 77, -16, 3
+        .byte 0, -6, 50, 93, -9, 0
+        .byte 1, -8, 36, 108, -11, 2
+        .byte 0, -1, 12, 123, -6, 0
+endconst
+
+.macro epel_filter size
+        lla             t2, subpel_filters
+        addi            t0, a5, -1
+        li              t1, 6
+        mul             t0, t0, t1
+        add             t0, t0, t2
+        .irp n 1,2,3,4
+        lb              t\n, \n(t0)
+        .endr
+.ifc \size,6
+        lb              t5, 5(t0)
+        lb              t0, (t0)
+.endif
+.endm
+
+.macro epel_load dst len size
+        addi            t6, a2, -1
+        addi            a7, a2, 1
+        vle8.v          v24, (a2)
+        vle8.v          v22, (t6)
+        vle8.v          v26, (a7)
+        addi            a7, a7, 1
+        vle8.v          v28, (a7)
+        vwmulu.vx       v16, v24, t2
+        vwmulu.vx       v20, v26, t3
+.ifc \size,6
+        addi            t6, t6, -1
+        addi            a7, a7, 1
+        vle8.v          v24, (t6)
+        vle8.v          v26, (a7)
+        vwmaccu.vx      v16, t0, v24
+        vwmaccu.vx      v16, t5, v26
+.endif
+        li              t6, 64
+        vwmaccsu.vx     v16, t1, v22
+        vwmaccsu.vx     v16, t4, v28
+        vwadd.wx        v16, v16, t6
+        vsetvlstatic16 \len
+        vwadd.vv        v24, v16, v20
+        vnsra.wi        v24, v24, 7
+        vmax.vx         v24, v24, zero
+        vsetvlstatic8 \len
+        vnclipu.wi      \dst, v24, 0
+.endm
+
+.macro epel_load_inc dst len size
+        epel_load \dst \len \size
+        add             a2, a2, a3
+.endm
+
+.macro epel len size type
+func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
+        epel_filter \size
+        vsetvlstatic8 \len
+1:
+        addi            a4, a4, -1
+        epel_load_inc v30 \len \size
+        vse8.v          v30, (a0)
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+endfunc
+.endm
+
 .irp len 16,8,4
 put_vp8_bilin_h_v \len h a5
 put_vp8_bilin_h_v \len v a6
 put_vp8_bilin_hv \len
+epel \len 6 h
+epel \len 4 h
 .endr

From patchwork Mon May 6 03:38:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48565
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 11:38:05 +0800
X-OQ-MSGID: <20240506033809.3790245-5-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v3 5/9] lavc/vp8dsp: R-V V put_epel v
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_epel4_v4_c: 11.0
vp8_put_epel4_v4_rvv_i32: 5.0
vp8_put_epel4_v6_c: 16.5
vp8_put_epel4_v6_rvv_i32: 6.2
vp8_put_epel8_v4_c: 43.7
vp8_put_epel8_v4_rvv_i32: 11.2
vp8_put_epel8_v6_c: 68.7
vp8_put_epel8_v6_rvv_i32: 13.2
vp8_put_epel16_v4_c: 92.5
vp8_put_epel16_v4_rvv_i32: 13.7
vp8_put_epel16_v6_c: 135.7
vp8_put_epel16_v6_rvv_i32: 16.5
---
 libavcodec/riscv/vp8dsp_init.c |  7 +++++++
 libavcodec/riscv/vp8dsp_rvv.S  | 34 +++++++++++++++++++++++----------
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index a4b7d49932..dc3e087f01 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -90,6 +90,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_rvv;
         c->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_rvv;
         c->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_rvv;
+
+        c->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_rvv;
+        c->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_rvv;
+        c->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_rvv;
+        c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
+        c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
+        c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 30955a7b95..bf268e4d8d 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -161,9 +161,13 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst
 
-.macro epel_filter size
+.macro epel_filter size type
         lla             t2, subpel_filters
+.ifc \type,v
+        addi            t0, a6, -1
+.elseif \type == h
         addi            t0, a5, -1
+.endif
         li              t1, 6
         mul             t0, t0, t1
         add             t0, t0, t2
@@ -176,19 +180,25 @@ endconst
 .endif
 .endm
 
-.macro epel_load dst len size
-        addi            t6, a2, -1
-        addi            a7, a2, 1
+.macro epel_load dst len size type
+.ifc \type,v
+        mv              a5, a3
+.else
+        li              a5, 1
+.endif
+        sub             t6, a2, a5
+        add             a7, a2, a5
+
         vle8.v          v24, (a2)
         vle8.v          v22, (t6)
         vle8.v          v26, (a7)
-        addi            a7, a7, 1
+        add             a7, a7, a5
         vle8.v          v28, (a7)
         vwmulu.vx       v16, v24, t2
         vwmulu.vx       v20, v26, t3
 .ifc \size,6
-        addi            t6, t6, -1
-        addi            a7, a7, 1
+        sub             t6, t6, a5
+        add             a7, a7, a5
         vle8.v          v24, (t6)
         vle8.v          v26, (a7)
         vwmaccu.vx      v16, t0, v24
@@ -206,18 +216,18 @@ endconst
         vnclipu.wi      \dst, v24, 0
 .endm
 
-.macro epel_load_inc dst len size
-        epel_load \dst \len \size
+.macro epel_load_inc dst len size type
+        epel_load \dst \len \size \type
         add             a2, a2, a3
 .endm
 
 .macro epel len size type
 func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-        epel_filter \size
+        epel_filter \size \type
         vsetvlstatic8 \len
 1:
         addi            a4, a4, -1
-        epel_load_inc v30 \len \size
+        epel_load_inc v30 \len \size \type
         vse8.v          v30, (a0)
         add             a0, a0, a1
         bnez            a4, 1b
@@ -232,4 +242,6 @@ put_vp8_bilin_h_v \len v a6
 put_vp8_bilin_hv \len
 epel \len 6 h
 epel \len 4 h
+epel \len 6 v
+epel \len 4 v
 .endr

From patchwork Mon May 6 03:38:06 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48566
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 11:38:06 +0800
X-OQ-MSGID: <20240506033809.3790245-6-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v3 6/9] lavc/vp8dsp: R-V V put_epel hv
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_epel4_h4v4_c: 20.0
vp8_put_epel4_h4v4_rvv_i32: 11.0
vp8_put_epel4_h4v6_c: 25.2
vp8_put_epel4_h4v6_rvv_i32: 13.5
vp8_put_epel4_h6v4_c: 22.2
vp8_put_epel4_h6v4_rvv_i32: 14.5
vp8_put_epel4_h6v6_c: 29.0
vp8_put_epel4_h6v6_rvv_i32: 15.7
vp8_put_epel8_h4v4_c: 73.0
vp8_put_epel8_h4v4_rvv_i32: 22.2
vp8_put_epel8_h4v6_c: 90.5
vp8_put_epel8_h4v6_rvv_i32: 26.7
vp8_put_epel8_h6v4_c: 85.0
vp8_put_epel8_h6v4_rvv_i32: 27.2
vp8_put_epel8_h6v6_c: 104.7
vp8_put_epel8_h6v6_rvv_i32: 29.5
vp8_put_epel16_h4v4_c: 145.5
vp8_put_epel16_h4v4_rvv_i32: 26.5
vp8_put_epel16_h4v6_c: 190.7
vp8_put_epel16_h4v6_rvv_i32: 47.5
vp8_put_epel16_h6v4_c: 173.7
vp8_put_epel16_h6v4_rvv_i32: 33.2
vp8_put_epel16_h6v6_c: 222.2
vp8_put_epel16_h6v6_rvv_i32: 35.5
---
 libavcodec/riscv/vp8dsp_init.c |  13 ++++
 libavcodec/riscv/vp8dsp_rvv.S  | 117 +++++++++++++++++++++++++++------
 2 files changed, 109 insertions(+), 21 deletions(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index dc3e087f01..463c8fa0a2 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -97,6 +97,19 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
         c->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_rvv;
         c->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_rvv;
         c->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_rvv;
+
+        c->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_rvv;
+        c->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_rvv;
+        c->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_rvv;
+        c->put_vp8_epel_pixels_tab[0][2][1] = ff_put_vp8_epel16_h4v6_rvv;
+        c->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_rvv;
+        c->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_rvv;
+        c->put_vp8_epel_pixels_tab[0][1][1] = ff_put_vp8_epel16_h4v4_rvv;
+        c->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_rvv;
+        c->put_vp8_epel_pixels_tab[2][1][1] = ff_put_vp8_epel4_h4v4_rvv;
+        c->put_vp8_epel_pixels_tab[0][1][2] = ff_put_vp8_epel16_h6v4_rvv;
+        c->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_rvv;
+        c->put_vp8_epel_pixels_tab[2][1][2] = ff_put_vp8_epel4_h6v4_rvv;
     }
 #endif
 #endif
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index bf268e4d8d..baa8152830 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -161,26 +161,26 @@ const subpel_filters
         .byte 0, -1, 12, 123, -6, 0
 endconst
 
-.macro epel_filter size type
-        lla             t2, subpel_filters
+.macro epel_filter size type regtype
+        lla             \regtype\()2, subpel_filters
 .ifc \type,v
-        addi            t0, a6, -1
+        addi            \regtype\()0, a6, -1
 .elseif \type == h
-        addi            t0, a5, -1
+        addi            \regtype\()0, a5, -1
 .endif
-        li              t1, 6
-        mul             t0, t0, t1
-        add             t0, t0, t2
+        li              \regtype\()1, 6
+        mul             \regtype\()0, \regtype\()0, \regtype\()1
+        add             \regtype\()0, \regtype\()0, \regtype\()2
         .irp n 1,2,3,4
-        lb              t\n, \n(t0)
+        lb              \regtype\n, \n(\regtype\()0)
         .endr
 .ifc \size,6
-        lb              t5, 5(t0)
-        lb              t0, (t0)
+        lb              \regtype\()5, 5(\regtype\()0)
+        lb              \regtype\()0, (\regtype\()0)
 .endif
 .endm
 
-.macro epel_load dst len size type
+.macro epel_load dst len size type from_mem regtype
 .ifc \type,v
         mv              a5, a3
 .else
@@ -189,24 +189,35 @@ endconst
         sub             t6, a2, a5
         add             a7, a2, a5
 
+.if \from_mem
         vle8.v          v24, (a2)
         vle8.v          v22, (t6)
         vle8.v          v26, (a7)
         add             a7, a7, a5
         vle8.v          v28, (a7)
-        vwmulu.vx       v16, v24, t2
-        vwmulu.vx       v20, v26, t3
+        vwmulu.vx       v16, v24, \regtype\()2
+        vwmulu.vx       v20, v26, \regtype\()3
 .ifc \size,6
         sub             t6, t6, a5
         add             a7, a7, a5
         vle8.v          v24, (t6)
         vle8.v          v26, (a7)
-        vwmaccu.vx      v16, t0, v24
-        vwmaccu.vx      v16, t5, v26
+        vwmaccu.vx      v16, \regtype\()0, v24
+        vwmaccu.vx      v16, \regtype\()5, v26
+.endif
+        vwmaccsu.vx     v16, \regtype\()1, v22
+        vwmaccsu.vx     v16, \regtype\()4, v28
+.else
+        vwmulu.vx       v16, v4, \regtype\()2
+        vwmulu.vx       v20, v6, \regtype\()3
+        .ifc \size,6
+        vwmaccu.vx      v16, \regtype\()0, v0
+        vwmaccu.vx      v16, \regtype\()5, v10
+        .endif
+        vwmaccsu.vx     v16, \regtype\()1, v2
+        vwmaccsu.vx     v16, \regtype\()4, v8
 .endif
         li              t6, 64
-        vwmaccsu.vx     v16, t1, v22
-        vwmaccsu.vx     v16, t4, v28
         vwadd.wx        v16, v16, t6
         vsetvlstatic16 \len
         vwadd.vv        v24, v16, v20
@@ -216,18 +227,18 @@ endconst
         vnclipu.wi      \dst, v24, 0
 .endm
 
-.macro epel_load_inc dst len size type
-        epel_load \dst \len \size \type
+.macro epel_load_inc dst len size type from_mem regtype
+        epel_load \dst \len \size \type \from_mem \regtype
         add             a2, a2, a3
 .endm
 
 .macro epel len size type
 func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
-        epel_filter \size \type
+        epel_filter \size \type t
         vsetvlstatic8 \len
 1:
         addi            a4, a4, -1
-        epel_load_inc v30 \len \size \type
+        epel_load_inc v30 \len \size \type 1 t
         vse8.v          v30, (a0)
         add             a0, a0, a1
         bnez            a4, 1b
@@ -236,6 +247,66 @@ func ff_put_vp8_epel\len\()_\type\()\size\()_rvv, zve32x
 endfunc
 .endm
 
+.macro epel_hv len hsize vsize
+func ff_put_vp8_epel\len\()_h\hsize\()v\vsize\()_rvv, zve32x
+        addi            sp, sp, -48
+        .irp n 0,1,2,3,4,5
+#if __riscv_xlen >= 64
+        sd              s\n, \n\()<<3(sp)
+#else
+        sw              s\n, \n\()<<3(sp)
+#endif
+        .endr
+        sub             a2, a2, a3
+        epel_filter \hsize h t
+        epel_filter \vsize v s
+        vsetvlstatic8 \len
+.if \hsize == 6 || \vsize == 6
+        sub             a2, a2, a3
+        epel_load_inc v0 \len \hsize h 1 t
+.endif
+        epel_load_inc v2 \len \hsize h 1 t
+        epel_load_inc v4 \len \hsize h 1 t
+        epel_load_inc v6 \len \hsize h 1 t
+        epel_load_inc v8 \len \hsize h 1 t
+.if \hsize == 6 || \vsize == 6
+        epel_load_inc v10 \len \hsize h 1 t
+.endif
+        addi            a4, a4, -1
+1:
+        addi            a4, a4, -1
+        epel_load v30 \len \vsize v 0 s
+        vse8.v          v30, (a0)
+.if \hsize == 6 || \vsize == 6
+        vmv.v.v         v0, v2
+.endif
+        vmv.v.v         v2, v4
+        vmv.v.v         v4, v6
+        vmv.v.v         v6, v8
+.if \hsize == 6 || \vsize == 6
+        vmv.v.v         v8, v10
+        epel_load_inc v10 \len \hsize h 1 t
+.else
+        epel_load_inc v8 \len 4 h 1 t
+.endif
+        add             a0, a0, a1
+        bnez            a4, 1b
+        epel_load v30 \len \vsize v 0 s
+        vse8.v          v30, (a0)
+
+        .irp n 0,1,2,3,4,5
+#if __riscv_xlen >= 64
+        ld              s\n, \n\()<<3(sp)
+#else
+        lw              s\n, \n\()<<3(sp)
+#endif
+        .endr
+        addi            sp, sp, 48
+
+        ret
+endfunc
+.endm
+
 .irp len 16,8,4
 put_vp8_bilin_h_v \len h a5
 put_vp8_bilin_h_v \len v a6
@@ -244,4 +315,8 @@ epel \len 6 h
 epel \len 4 h
 epel \len 6 v
 epel \len 4 v
+epel_hv \len 6 6
+epel_hv \len 4 4
+epel_hv \len 6 4
+epel_hv \len 4 6
 .endr

From patchwork Mon May 6 03:38:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48567
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 11:38:07 +0800
X-OQ-MSGID: <20240506033809.3790245-7-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v3 7/9] lavc/vp8dsp: R-V V loop_filter_simple
Cc: sunyuechi

From: sunyuechi

C908:
vp8_loop_filter_simple_h_c: 416.0
vp8_loop_filter_simple_h_rvv_i32: 187.5
vp8_loop_filter_simple_v_c: 429.7
vp8_loop_filter_simple_v_rvv_i32: 104.0
---
 libavcodec/riscv/vp8dsp_init.c |  5 ++
 libavcodec/riscv/vp8dsp_rvv.S  | 85 ++++++++++++++++++++++++++++++++++
 2 files changed, 90 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 463c8fa0a2..3acfe75d67 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -41,6 +41,8 @@ VP8_BILIN(16, rvv);
 VP8_BILIN(8, rvv);
 VP8_BILIN(4, rvv);
 
+VP8_LF(rvv);
+
 av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
 {
 #if HAVE_RV
@@ -126,6 +128,9 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
         if (flags & AV_CPU_FLAG_RVB_ADDR) {
             c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv;
         }
+
+        c->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter16_simple_rvv;
+        c->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter16_simple_rvv;
     }
 #endif
 }
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index baa8152830..2ac79a3b77 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -94,6 +94,91 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x
         ret
 endfunc
 
+.macro filter_fmin len a f1 p0f2 q0f1
+        vsetvlstatic16 \len
+        vsext.vf2       \q0f1, \a
+        vmin.vx         \p0f2, \q0f1, a7
+        vmin.vx         \q0f1, \q0f1, t3
+        vadd.vi         \p0f2, \p0f2, 3
+        vadd.vi         \q0f1, \q0f1, 4
+        vsra.vi         \p0f2, \p0f2, 3
+        vsra.vi         \f1, \q0f1, 3
+        vadd.vv         \p0f2, \p0f2, v8
+        vsub.vv         \q0f1, v16, \f1
+        vmax.vx         \p0f2, \p0f2, zero
+        vmax.vx         \q0f1, \q0f1, zero
+.endm
+
+.macro filter len type normal inner dst stride fE fI thresh
+.ifc \type,v
+        slli            a6, \stride, 1
+        sub             t2, \dst, a6
+        add             t4, \dst, \stride
+        sub             t1, \dst, \stride
+        vle8.v          v1, (t2)
+        vle8.v          v11, (t4)
+        vle8.v          v17, (t1)
+        vle8.v          v22, (\dst)
+.else
+        addi            t1, \dst, -1
+        addi            a6, \dst, -2
+        addi            t4, \dst, 1
+        vlse8.v         v1, (a6), \stride
+        vlse8.v         v11, (t4), \stride
+        vlse8.v         v17, (t1), \stride
+        vlse8.v         v22, (\dst), \stride
+.endif
+        vwsubu.vv       v12, v1, v11             // p1-q1
+        vwsubu.vv       v24, v22, v17            // q0-p0
+        vnclip.wi       v23, v12, 0
+        vsetvlstatic16 \len
+        // vp8_simple_limit(dst + i, stride, flim)
+        li              a7, 2
+        vneg.v          v18, v12
+        vmax.vv         v18, v18, v12
+        vneg.v          v8, v24
+
vmax.vv v8, v8, v24 + vsrl.vi v18, v18, 1 + vmacc.vx v18, a7, v8 + vmsleu.vx v0, v18, \fE + + li t5, 3 + li a7, 124 + li t3, 123 + vsext.vf2 v4, v23 + vzext.vf2 v8, v17 // p0 + vzext.vf2 v16, v22 // q0 + vmul.vx v30, v24, t5 + vadd.vv v12, v30, v4 + vsetvlstatic8 \len + vnclip.wi v11, v12, 0 + filter_fmin \len v11 v24 v4 v6 + vsetvlstatic8 \len + vnclipu.wi v4, v4, 0 + vnclipu.wi v6, v6, 0 + +.ifc \type,v + vse8.v v4, (t1), v0.t + vse8.v v6, (\dst), v0.t +.else + vsse8.v v4, (t1), \stride, v0.t + vsse8.v v6, (\dst), \stride, v0.t +.endif + +.endm + +func ff_vp8_v_loop_filter16_simple_rvv, zve32x + vsetvlstatic8 16 + filter 16 v 0 0 a0 a1 a2 a3 a4 + ret +endfunc + +func ff_vp8_h_loop_filter16_simple_rvv, zve32x + vsetvlstatic8 16 + filter 16 h 0 0 a0 a1 a2 a3 a4 + ret +endfunc + .macro bilin_load dst len type mn .ifc \type,v add t5, a2, a3 From patchwork Mon May 6 03:38:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: uk7b@foxmail.com X-Patchwork-Id: 48570 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:e68f:b0:1af:836d:81b3 with SMTP id mz15csp1152791pzb; Sun, 5 May 2024 20:39:48 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCV7eSFdweiH3qAWKXK0Q1D5+vMG3IdWAQ2rzDc4KF/mjGT2+egpqEdZiKhv7NaKPHGafW3tAuLnia+7A/bMj1V2ZUYomy8m4GoY0g== X-Google-Smtp-Source: AGHT+IGWQtvV3bsxNA69Tc3QUGErBNvHT+uvlO/hiT7QW7r1OfwChD1FbStWjcTBigwYqSLJL+0v X-Received: by 2002:ac2:5051:0:b0:51f:40a6:234a with SMTP id a17-20020ac25051000000b0051f40a6234amr5154375lfm.4.1714966788396; Sun, 05 May 2024 20:39:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1714966788; cv=none; d=google.com; s=arc-20160816; b=qrUuh23/+BWMvkBa4C+fafxgTjv4GuvLNkh+7XTQ1Xvkeiu6I1eaSFRbtmXjvjQkgb eiTCSWwk9CkWHnKmaL2K90SOIGD6IPwjcvJs3vv8I7wsaFNlZ3qaQOjjJlhi5OYWyD5O +YsG5R1cySwrHoSCv+YBG9Te/3S/s3yeOQl8AWtK0tlwpHu636HBY7574B5MTKBk5wjb U4WyPeHhFgUQBUdFK9lhSqb8v/c44IVXC70eToKxoHTySTwYkV7AEBtcBxUNCtd9+iZr 
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 11:38:08 +0800
X-OQ-MSGID: <20240506033809.3790245-8-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v3 8/9] lavc/vp8dsp: R-V V loop_filter_inner
Cc: sunyuechi

From: sunyuechi

C908:
vp8_loop_filter8uv_inner_v_c: 738.2
vp8_loop_filter8uv_inner_v_rvv_i32: 455.2
vp8_loop_filter16y_inner_h_c: 685.0
vp8_loop_filter16y_inner_h_rvv_i32: 497.0
vp8_loop_filter16y_inner_v_c: 743.7
vp8_loop_filter16y_inner_v_rvv_i32: 295.7
---
 libavcodec/riscv/vp8dsp_init.c |   4 ++
 libavcodec/riscv/vp8dsp_rvv.S  | 104 +++++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 3acfe75d67..2adff1052a 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -129,6 +129,10 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
             c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv;
         }
 
+        c->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_rvv;
+        c->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_rvv;
+        c->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_rvv;
+
         c->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter16_simple_rvv;
         c->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter16_simple_rvv;
     }
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 2ac79a3b77..21fa232325 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -94,6 +94,13 @@ func ff_vp8_idct_dc_add4uv_rvv, zve32x
         ret
 endfunc
 
+.macro filter_abs dst diff fI
+        vneg.v          v8, \diff
+        vmax.vv         \dst, v8, \diff
+        vmsleu.vx       v8, \dst, \fI
+        vmand.mm        v27, v27, v8
+.endm
+
 .macro filter_fmin len a f1 p0f2 q0f1
         vsetvlstatic16  \len
         vsext.vf2       \q0f1, \a
@@ -119,6 +126,16 @@ endfunc
         vle8.v          v11, (t4)
         vle8.v          v17, (t1)
         vle8.v          v22, (\dst)
+        .if \normal
+        sub             t3, t2, a6
+        sub             t0, t1, a6
+        add             t6, \dst, a6
+        add             a7, t4, a6
+        vle8.v          v2, (t3)
+        vle8.v          v15, (t0)
+        vle8.v          v10, (t6)
+        vle8.v          v14, (a7)
+        .endif
 .else
         addi            t1, \dst, -1
         addi            a6, \dst, -2
@@ -127,9 +144,27 @@ endfunc
         vlse8.v         v11, (t4), \stride
         vlse8.v         v17, (t1), \stride
         vlse8.v         v22, (\dst), \stride
+        .if \normal
+        addi            t5, \dst, -4
+        addi            t0, \dst, -3
+        addi            t6, \dst, 2
+        addi            a7, \dst, 3
+        vlse8.v         v2, (t5), \stride
+        vlse8.v         v15, (t0), \stride
+        vlse8.v         v10, (t6), \stride
+        vlse8.v         v14, (a7), \stride
+        .endif
 .endif
         vwsubu.vv       v12, v1, v11 // p1-q1
         vwsubu.vv       v24, v22, v17 // q0-p0
+.if \normal
+        vwsubu.vv       v30, v1, v17
+        vwsubu.vv       v20, v11, v22
+        vwsubu.vv       v28, v1, v15
+        vwsubu.vv       v4, v2, v15
+        vwsubu.vv       v6, v10, v11
+        vwsubu.vv       v2, v14, v10
+.endif
         vnclip.wi       v23, v12, 0
         vsetvlstatic16  \len
         // vp8_simple_limit(dst + i, stride, flim)
@@ -141,6 +176,25 @@ endfunc
         vsrl.vi         v18, v18, 1
         vmacc.vx        v18, a7, v8
         vmsleu.vx       v0, v18, \fE
+.if \normal
+        vneg.v          v18, v30
+        vmax.vv         v30, v18, v30
+        vmsleu.vx       v27, v30, \fI
+        filter_abs      v18 v28 \fI
+        filter_abs      v18 v4 \fI
+        filter_abs      v18 v6 \fI
+        filter_abs      v18 v2 \fI
+        filter_abs      v20 v20 \fI
+        vmand.mm        v27, v0, v27 // vp8_simple_limit && normal
+
+        vmsgtu.vx       v20, v20, \thresh // hev
+        vmsgtu.vx       v3, v30, \thresh
+        vmor.mm         v3, v3, v20 // v3 = hev: > thresh
+        vzext.vf2       v18, v1 // v18 = p1
+        vmand.mm        v0, v27, v3 // v0 = normal && hev
+        vzext.vf2       v20, v11 // v20 = q1
+        vmnot.m         v3, v3 // v3 = !hev
+.endif
 
         li              t5, 3
         li              a7, 124
@@ -165,6 +219,37 @@ endfunc
         vsse8.v         v6, (\dst), \stride, v0.t
 .endif
 
+.if \normal
+        vmand.mm        v0, v27, v3 // vp8_normal_limit && !hev
+
+        .if \inner
+        vnclip.wi       v30, v30, 0
+        filter_fmin     \len v30 v24 v4 v6
+        vadd.vi         v24, v24, 1
+        vsra.vi         v24, v24, 1 // (f1 + 1) >> 1;
+        vadd.vv         v8, v18, v24
+        vsub.vv         v10, v20, v24
+        .endif
+
+        vmax.vx         v8, v8, zero
+        vmax.vx         v10, v10, zero
+        vsetvlstatic8   \len
+        vnclipu.wi      v4, v4, 0
+        vnclipu.wi      v5, v6, 0
+        vnclipu.wi      v6, v8, 0
+        vnclipu.wi      v7, v10, 0
+        .ifc \type,v
+        vse8.v          v4, (t1), v0.t
+        vse8.v          v5, (\dst), v0.t
+        vse8.v          v6, (t2), v0.t
+        vse8.v          v7, (t4), v0.t
+        .else
+        vsse8.v         v4, (t1), \stride, v0.t
+        vsse8.v         v5, (\dst), \stride, v0.t
+        vsse8.v         v6, (a6), \stride, v0.t
+        vsse8.v         v7, (t4), \stride, v0.t
+        .endif
+.endif
 .endm
 
 func ff_vp8_v_loop_filter16_simple_rvv, zve32x
@@ -179,6 +264,25 @@ func ff_vp8_h_loop_filter16_simple_rvv, zve32x
         ret
 endfunc
 
+func ff_vp8_h_loop_filter16_inner_rvv, zve32x
+        vsetvlstatic8   16
+        filter          16 h 1 1 a0 a1 a2 a3 a4
+        ret
+endfunc
+
+func ff_vp8_v_loop_filter16_inner_rvv, zve32x
+        vsetvlstatic8   16
+        filter          16 v 1 1 a0 a1 a2 a3 a4
+        ret
+endfunc
+
+func ff_vp8_v_loop_filter8uv_inner_rvv, zve32x
+        vsetvlstatic8   8
+        filter          8 v 1 1 a0 a2 a3 a4 a5
+        filter          8 v 1 1 a1 a2 a3 a4 a5
+        ret
+endfunc
+
 .macro bilin_load dst len type mn
 .ifc \type,v
         add             t5, a2, a3

From patchwork Mon May 6 03:38:09 2024
Content-Type: text/plain;
 charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48568
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 11:38:09 +0800
X-OQ-MSGID: <20240506033809.3790245-9-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240506033809.3790245-1-uk7b@foxmail.com>
References: <20240506033809.3790245-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v3 9/9] lavc/vp8dsp: R-V V loop_filter
Cc: sunyuechi

From: sunyuechi

C908:
vp8_loop_filter8uv_v_c: 745.5
vp8_loop_filter8uv_v_rvv_i32: 467.2
vp8_loop_filter16y_h_c: 674.2
vp8_loop_filter16y_h_rvv_i32: 553.0
vp8_loop_filter16y_v_c: 732.7
vp8_loop_filter16y_v_rvv_i32: 324.5
---
 libavcodec/riscv/vp8dsp_init.c |  4 +++
 libavcodec/riscv/vp8dsp_rvv.S  | 57 ++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index 2adff1052a..1bb5aad518 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -129,6 +129,10 @@ av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
             c->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_rvv;
         }
 
+        c->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_rvv;
+        c->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_rvv;
+        c->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_rvv;
+
         c->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_rvv;
         c->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_rvv;
         c->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_rvv;
diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
index 21fa232325..567dc96f76 100644
--- a/libavcodec/riscv/vp8dsp_rvv.S
+++ b/libavcodec/riscv/vp8dsp_rvv.S
@@ -229,6 +229,33 @@ endfunc
         vsra.vi         v24, v24, 1 // (f1 + 1) >> 1;
         vadd.vv         v8, v18, v24
         vsub.vv         v10, v20, v24
+        .else
+        li              t5, 27
+        li              t3, 9
+        li              a7, 18
+        vwmul.vx        v2, v11, t5
+        vwmul.vx        v6, v11, t3
+        vwmul.vx        v4, v11, a7
+        vsetvlstatic16  \len
+        li              a7, 63
+        vzext.vf2       v14, v15 // p2
+        vzext.vf2       v24, v10 // q2
+        vadd.vx         v2, v2, a7
+        vadd.vx         v4, v4, a7
+        vadd.vx         v6, v6, a7
+        vsra.vi         v2, v2, 7 // a0
+        vsra.vi         v12, v4, 7 // a1
+        vsra.vi         v6, v6, 7 // a2
+        vadd.vv         v14, v14, v6 // p2 + a2
+        vsub.vv         v22, v24, v6 // q2 - a2
+        vsub.vv         v10, v20, v12 // q1 - a1
+        vadd.vv         v4, v8, v2 // p0 + a0
+        vsub.vv         v6, v16, v2 // q0 - a0
+        vadd.vv         v8, v12, v18 // a1 + p1
+        vmax.vx         v4, v4, zero
+        vmax.vx         v6, v6, zero
+        vmax.vx         v14, v14, zero
+        vmax.vx         v16, v22, zero
         .endif
 
         vmax.vx         v8, v8, zero
@@ -249,6 +276,17 @@ endfunc
         vsse8.v         v6, (a6), \stride, v0.t
         vsse8.v         v7, (t4), \stride, v0.t
         .endif
+        .if !\inner
+        vnclipu.wi      v14, v14, 0
+        vnclipu.wi      v16, v16, 0
+        .ifc \type,v
+        vse8.v          v14, (t0), v0.t
+        vse8.v          v16, (t6), v0.t
+        .else
+        vsse8.v         v14, (t0), \stride, v0.t
+        vsse8.v         v16, (t6), \stride, v0.t
+        .endif
+        .endif
 .endif
 .endm
 
@@ -283,6 +321,25 @@ func ff_vp8_v_loop_filter8uv_inner_rvv, zve32x
         ret
 endfunc
 
+func ff_vp8_v_loop_filter16_rvv, zve32x
+        vsetvlstatic8   16
+        filter          16 v 1 0 a0 a1 a2 a3 a4
+        ret
+endfunc
+
+func ff_vp8_h_loop_filter16_rvv, zve32x
+        vsetvlstatic8   16
+        filter          16 h 1 0 a0 a1 a2 a3 a4
+        ret
+endfunc
+
+func ff_vp8_v_loop_filter8uv_rvv, zve32x
+        vsetvlstatic8   8
+        filter          8 v 1 0 a0 a2 a3 a4 a5
+        filter          8 v 1 0 a1 a2 a3 a4 a5
+        ret
+endfunc
+
 .macro bilin_load dst len type mn
 .ifc \type,v
         add             t5, a2, a3
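
Reviewer note (not part of the patch): the simple path of the filter macro (normal=0) appears to follow the usual scalar VP8 simple loop filter, so a per-pixel C sketch may help when reading the vector code. This is an illustration under assumptions, not code taken from the series. The helper names clip_int8, clip_uint8, simple_limit and simple_filter are hypothetical, and step stands for the distance between p0 and q0 (the stride for the v variant, 1 for the h variant). The constants 124 and 123 loaded into a7 and t3 correspond to the min(a + 3, 127) and min(a + 4, 127) clamps below.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration; not names from the patch. */
static int clip_int8(int v)
{
    return v < -128 ? -128 : v > 127 ? 127 : v;
}

static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Edge test: 2*|p0 - q0| + (|p1 - q1| >> 1) <= flim.
 * In the vector code this is the mask written to v0 by vmsleu.vx. */
static int simple_limit(const uint8_t *p, ptrdiff_t step, int flim)
{
    int p1 = p[-2 * step], p0 = p[-step], q0 = p[0], q1 = p[step];

    return 2 * abs(p0 - q0) + (abs(p1 - q1) >> 1) <= flim;
}

/* Filters the two pixels next to the edge (p0 and q0) at one position.
 * Assumes >> on a negative int is an arithmetic shift, matching vsra.vi;
 * C leaves this implementation-defined. */
static void simple_filter(uint8_t *p, ptrdiff_t step, int flim)
{
    int p1 = p[-2 * step], p0 = p[-step], q0 = p[0], q1 = p[step];
    int a, f1, f2;

    if (!simple_limit(p, step, flim))
        return;                       /* masked-off lanes are not stored */

    a  = clip_int8(3 * (q0 - p0) + clip_int8(p1 - q1));
    f1 = (a + 4 > 127 ? 127 : a + 4) >> 3;
    f2 = (a + 3 > 127 ? 127 : a + 3) >> 3;

    p[-step] = clip_uint8(p0 + f2);   /* p0 + f2, clamped to 0..255 */
    p[0]     = clip_uint8(q0 - f1);   /* q0 - f1, clamped to 0..255 */
}
```

For example, with p1 = p0 = 100, q0 = q1 = 120 and flim = 50, the edge test evaluates to 50 <= 50, a is 40, f1 and f2 are both 5, and the pixels become p0 = 105 and q0 = 115. With flim = 49 the pixels are left untouched, which is the behaviour of the v0.t masked stores.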