From patchwork Fri Jul 5 18:28:16 2024
Content-Type: text/plain; charset="utf-8"
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 50373
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 5 Jul 2024 21:28:16 +0300
Message-ID: <20240705182816.27464-1-remi@remlab.net>
X-Mailer: git-send-email 2.45.2
Subject: [FFmpeg-devel] [PATCH] lavc/h264dsp: R-V V 8-bit h264_weight_pixels

There are two implementations here:
- a generic scalable one
  processing one column at a time,
- a specialised one processing one (fixed-size) row at a time.

Unsurprisingly, the generic one works out better with smaller widths.
With larger widths, the gains from filling vectors are outweighed by
the extra cost of strided loads and stores. In other words, memory
accesses become the bottleneck.

T-Head C908:
h264_weight2_8_c:        54.2
h264_weight2_8_rvv_i32:  17.5
h264_weight4_8_c:       102.0
h264_weight4_8_rvv_i32:  34.7
h264_weight8_8_c:       213.7
h264_weight8_8_rvv_i32:  79.7
h264_weight16_8_c:      401.0
h264_weight16_8_rvv_i32: 74.2

SpacemiT X60:
h264_weight2_8_c:        48.5
h264_weight2_8_rvv_i32:  11.7
h264_weight4_8_c:        90.5
h264_weight4_8_rvv_i32:  23.7
h264_weight8_8_c:       175.0
h264_weight8_8_rvv_i32:  58.0
h264_weight16_8_c:      342.2
h264_weight16_8_rvv_i32: 66.0
---
 libavcodec/riscv/h264dsp_init.c |  7 +++
 libavcodec/riscv/h264dsp_rvv.S  | 77 +++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)

diff --git a/libavcodec/riscv/h264dsp_init.c b/libavcodec/riscv/h264dsp_init.c
index bf9743eb6b..e1b725dcbb 100644
--- a/libavcodec/riscv/h264dsp_init.c
+++ b/libavcodec/riscv/h264dsp_init.c
@@ -21,12 +21,15 @@
 #include "config.h"
 
 #include
+#include
 
 #include "libavutil/attributes.h"
 #include "libavutil/cpu.h"
 #include "libavutil/riscv/cpu.h"
 #include "libavcodec/h264dsp.h"
 
+extern const h264_weight_func ff_h264_weight_funcs_8_rvv[];
+
 void ff_h264_v_loop_filter_luma_8_rvv(uint8_t *pix, ptrdiff_t stride,
                                       int alpha, int beta, int8_t *tc0);
 void ff_h264_h_loop_filter_luma_8_rvv(uint8_t *pix, ptrdiff_t stride,
@@ -60,6 +63,10 @@ av_cold void ff_h264dsp_init_riscv(H264DSPContext *dsp, const int bit_depth,
 # if HAVE_RVV
     if (flags & AV_CPU_FLAG_RVV_I32) {
         if (bit_depth == 8 && ff_rv_vlen_least(128)) {
+            memcpy(dsp->weight_h264_pixels_tab,
+                   ff_h264_weight_funcs_8_rvv,
+                   sizeof (dsp->weight_h264_pixels_tab));
+
             dsp->h264_v_loop_filter_luma = ff_h264_v_loop_filter_luma_8_rvv;
             dsp->h264_h_loop_filter_luma = ff_h264_h_loop_filter_luma_8_rvv;
             dsp->h264_h_loop_filter_luma_mbaff =
diff --git a/libavcodec/riscv/h264dsp_rvv.S b/libavcodec/riscv/h264dsp_rvv.S
index 96a8a0a8a3..ab85bfbd69 100644
--- a/libavcodec/riscv/h264dsp_rvv.S
+++ b/libavcodec/riscv/h264dsp_rvv.S
@@ -26,6 +26,83 @@
 
 #include "libavutil/riscv/asm.S"
 
+func ff_h264_weight_pixels_simple_8_rvv, zve32x
+        csrwi   vxrm, 0
+        sll     a5, a5, a3
+1:
+        vsetvli zero, a6, e32, m4, ta, ma
+        vle8.v  v8, (a0)
+        addi    a2, a2, -1
+        vmv.v.x v16, a5
+        vsetvli zero, zero, e16, m2, ta, ma
+        vzext.vf2   v24, v8
+        vwmaccsu.vx v16, a4, v24
+        vnclip.wi   v16, v16, 0
+        vmax.vx     v16, v16, zero
+        vsetvli zero, zero, e8, m1, ta, ma
+        vnclipu.wx  v8, v16, a3
+        vse8.v  v8, (a0)
+        add     a0, a0, a1
+        bnez    a2, 1b
+
+        ret
+endfunc
+
+func ff_h264_weight_pixels_8_rvv, zve32x
+        csrwi   vxrm, 0
+        sll     a5, a5, a3
+1:
+        mv      t0, a0
+        mv      t6, a6
+2:
+        vsetvli t2, a2, e32, m8, ta, ma
+        vlse8.v v8, (t0), a1
+        addi    t6, t6, -1
+        vmv.v.x v16, a5
+        vsetvli zero, zero, e16, m4, ta, ma
+        vzext.vf2   v24, v8
+        vwmaccsu.vx v16, a4, v24
+        vnclip.wi   v16, v16, 0
+        vmax.vx     v16, v16, zero
+        vsetvli zero, zero, e8, m2, ta, ma
+        vnclipu.wx  v8, v16, a3
+        vsse8.v v8, (t0), a1
+        addi    t0, t0, 1
+        bnez    t6, 2b
+
+        mul     t3, a1, t2
+        sub     a2, a2, t2
+        add     a0, a0, t3
+        bnez    a2, 1b
+
+        ret
+endfunc
+
+.irp    w, 16, 8, 4, 2
+func ff_h264_weight_pixels\w\()_8_rvv, zve32x
+        li      a6, \w
+        .if     \w == 16
+        j       ff_h264_weight_pixels_simple_8_rvv
+        .else
+        j       ff_h264_weight_pixels_8_rvv
+        .endif
+endfunc
+.endr
+
+        .global ff_h264_weight_funcs_8_rvv
+        .hidden ff_h264_weight_funcs_8_rvv
+const ff_h264_weight_funcs_8_rvv
+        .irp    w, 16, 8, 4, 2
+#if __riscv_xlen == 32
+        .word   ff_h264_weight_pixels\w\()_8_rvv
+#elif __riscv_xlen == 64
+        .dword  ff_h264_weight_pixels\w\()_8_rvv
+#else
+        .qword  ff_h264_weight_pixels\w\()_8_rvv
+#endif
+        .endr
+endconst
+
         .variant_cc ff_h264_loop_filter_luma_8_rvv
 func ff_h264_loop_filter_luma_8_rvv, zve32x
         # p2: v8, p1: v9, p0: v10, q0: v11, q1: v12, q2: v13

From patchwork Fri Jul 5 20:23:49 2024
Content-Type: text/plain;
 charset="utf-8"
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 50374
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 5 Jul 2024 23:23:49 +0300
Message-ID: <20240705202349.51307-1-remi@remlab.net>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240705182816.27464-1-remi@remlab.net>
References: <20240705182816.27464-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 2/2] lavc/h264dsp: R-V V 8-bit h264_biweight_pixels

T-Head C908:
h264_biweight2_8_c:        65.2
h264_biweight2_8_rvv_i32:  24.0
h264_biweight4_8_c:       135.2
h264_biweight4_8_rvv_i32:  48.0
h264_biweight8_8_c:       231.5
h264_biweight8_8_rvv_i32: 104.7
h264_biweight16_8_c:      454.0
h264_biweight16_8_rvv_i32: 93.7

SpacemiT X60:
h264_biweight2_8_c:        57.7
h264_biweight2_8_rvv_i32:  16.7
h264_biweight4_8_c:       106.0
h264_biweight4_8_rvv_i32:  33.7
h264_biweight8_8_c:       205.7
h264_biweight8_8_rvv_i32:  77.7
h264_biweight16_8_c:      403.5
h264_biweight16_8_rvv_i32: 83.2
---
 libavcodec/riscv/h264dsp_init.c | 14 ++++--
 libavcodec/riscv/h264dsp_rvv.S  | 78 +++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/libavcodec/riscv/h264dsp_init.c b/libavcodec/riscv/h264dsp_init.c
index e1b725dcbb..88afec8df0 100644
--- a/libavcodec/riscv/h264dsp_init.c
+++ b/libavcodec/riscv/h264dsp_init.c
@@ -28,7 +28,10 @@
 #include "libavutil/riscv/cpu.h"
 #include "libavcodec/h264dsp.h"
 
-extern const h264_weight_func ff_h264_weight_funcs_8_rvv[];
+extern const struct {
+    const h264_weight_func weight;
+    const h264_biweight_func biweight;
+} ff_h264_weight_funcs_8_rvv[];
 
 void ff_h264_v_loop_filter_luma_8_rvv(uint8_t *pix, ptrdiff_t stride,
                                       int alpha, int beta, int8_t *tc0);
@@ -63,9 +66,12 @@ av_cold void ff_h264dsp_init_riscv(H264DSPContext *dsp, const int bit_depth,
 # if HAVE_RVV
     if (flags & AV_CPU_FLAG_RVV_I32) {
         if (bit_depth == 8 && ff_rv_vlen_least(128)) {
-            memcpy(dsp->weight_h264_pixels_tab,
-                   ff_h264_weight_funcs_8_rvv,
-                   sizeof (dsp->weight_h264_pixels_tab));
+            for (int i = 0; i < 4; i++) {
+                dsp->weight_h264_pixels_tab[i] =
+                    ff_h264_weight_funcs_8_rvv[i].weight;
+                dsp->biweight_h264_pixels_tab[i] =
+                    ff_h264_weight_funcs_8_rvv[i].biweight;
+            }
 
             dsp->h264_v_loop_filter_luma = ff_h264_v_loop_filter_luma_8_rvv;
             dsp->h264_h_loop_filter_luma = ff_h264_h_loop_filter_luma_8_rvv;
diff --git a/libavcodec/riscv/h264dsp_rvv.S b/libavcodec/riscv/h264dsp_rvv.S
index ab85bfbd69..6cbc699b21 100644
--- a/libavcodec/riscv/h264dsp_rvv.S
+++ b/libavcodec/riscv/h264dsp_rvv.S
@@ -48,6 +48,34 @@ func ff_h264_weight_pixels_simple_8_rvv, zve32x
         ret
 endfunc
 
+        .variant_cc ff_h264_biweight_pixels_simple_8_rvv
+func ff_h264_biweight_pixels_simple_8_rvv, zve32x
+        csrwi   vxrm, 0
+        sll     a7, a7, a3
+        addi    a4, a4, 1
+1:
+        vsetvli zero, t6, e32, m4, ta, ma
+        vle8.v  v8, (a0)
+        addi    a3, a3, -1
+        vle8.v  v12, (a1)
+        add     a1, a1, a2
+        vmv.v.x v16, a7
+        vsetvli zero, zero, e16, m2, ta, ma
+        vzext.vf2   v24, v8
+        vzext.vf2   v28, v12
+        vwmaccsu.vx v16, a5, v24
+        vwmaccsu.vx v16, a6, v28
+        vnclip.wi   v16, v16, 0
+        vmax.vx     v16, v16, zero
+        vsetvli zero, zero, e8, m1, ta, ma
+        vnclipu.wx  v8, v16, a4
+        vse8.v  v8, (a0)
+        add     a0, a0, a2
+        bnez    a3, 1b
+
+        ret
+endfunc
+
 func ff_h264_weight_pixels_8_rvv, zve32x
         csrwi   vxrm, 0
         sll     a5, a5, a3
@@ -78,6 +106,44 @@ func ff_h264_weight_pixels_8_rvv, zve32x
         ret
 endfunc
 
+        .variant_cc ff_h264_biweight_pixels_8_rvv
+func ff_h264_biweight_pixels_8_rvv, zve32x
+        csrwi   vxrm, 0
+        sll     a7, a7, a3
+        addi    a4, a4, 1
+1:
+        mv      t0, a0
+        mv      t1, a1
+        mv      t5, t6
+2:
+        vsetvli t2, a3, e32, m8, ta, ma
+        vlse8.v v8, (t0), a2
+        vlse8.v v12, (t1), a2
+        addi    t5, t5, -1
+        vmv.v.x v16, a7
+        vsetvli zero, zero, e16, m4, ta, ma
+        vzext.vf2   v24, v8
+        vzext.vf2   v28, v12
+        vwmaccsu.vx v16, a5, v24
+        vwmaccsu.vx v16, a6, v28
+        vnclip.wi   v16, v16, 0
+        vmax.vx     v16, v16, zero
+        vsetvli zero, zero, e8, m2, ta, ma
+        vnclipu.wx  v8, v16, a4
+        vsse8.v v8, (t0), a2
+        addi    t0, t0, 1
+        addi    t1, t1, 1
+        bnez    t5, 2b
+
+        mul     t3, a2, t2
+        sub     a3, a3, t2
+        add     a0, a0, t3
+        add     a1, a1, t3
+        bnez    a3, 1b
+
+        ret
+endfunc
+
 .irp    w, 16, 8, 4, 2
 func ff_h264_weight_pixels\w\()_8_rvv, zve32x
         li      a6, \w
@@ -87,6 +153,15 @@ func ff_h264_weight_pixels\w\()_8_rvv, zve32x
         j       ff_h264_weight_pixels_8_rvv
         .endif
 endfunc
+
+func ff_h264_biweight_pixels\w\()_8_rvv, zve32x
+        li      t6, \w
+        .if     \w == 16
+        j       ff_h264_biweight_pixels_simple_8_rvv
+        .else
+        j       ff_h264_biweight_pixels_8_rvv
+        .endif
+endfunc
 .endr
 
         .global ff_h264_weight_funcs_8_rvv
@@ -95,10 +170,13 @@ const ff_h264_weight_funcs_8_rvv
         .irp    w, 16, 8, 4, 2
 #if __riscv_xlen == 32
         .word   ff_h264_weight_pixels\w\()_8_rvv
+        .word   ff_h264_biweight_pixels\w\()_8_rvv
 #elif __riscv_xlen == 64
         .dword  ff_h264_weight_pixels\w\()_8_rvv
+        .dword  ff_h264_biweight_pixels\w\()_8_rvv
 #else
         .qword  ff_h264_weight_pixels\w\()_8_rvv
+        .qword  ff_h264_biweight_pixels\w\()_8_rvv
 #endif
         .endr
 endconst
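
For readers who do not read RVV fluently, the following is a rough scalar sketch of what each lane of the single-reference weight loops appears to compute. It is not part of the patches. All function names here are hypothetical, the argument order (log2_denom, weight, offset) is assumed from the usual h264_weight_func prototype, and the valid ranges (log2_denom in 0..7, weight and offset in -128..127) are assumed from H.264. The first helper is the textbook rounding formula; the second follows the instruction sequence step by step, relying on vxrm = 0 (round-to-nearest-up) to supply the rounding term that the formula adds explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook weighted prediction for one 8-bit pixel (hypothetical helper). */
static uint8_t weight_pixel_ref(uint8_t pix, int log2_denom, int weight, int offset)
{
    int v = pix * weight + offset * (1 << log2_denom);

    if (log2_denom)
        v += 1 << (log2_denom - 1);    /* rounding term */
    if (v < 0)
        return 0;                      /* a negative value clips to 0 anyway */
    v >>= log2_denom;
    return v > 255 ? 255 : (uint8_t)v;
}

/* Per-lane model of the vector sequence (hypothetical helper). */
static uint8_t weight_pixel_rvv_model(uint8_t pix, int log2_denom, int weight, int offset)
{
    int32_t acc = offset * (1 << log2_denom);   /* sll + vmv.v.x           */

    acc += weight * pix;                        /* vwmaccsu.vx (s16 * u16) */
    if (acc > INT16_MAX)                        /* vnclip.wi ..., 0        */
        acc = INT16_MAX;
    if (acc < INT16_MIN)
        acc = INT16_MIN;
    if (acc < 0)                                /* vmax.vx ..., zero       */
        acc = 0;
    if (log2_denom)                             /* vnclipu.wx, vxrm = 0    */
        acc = (acc + (1 << (log2_denom - 1))) >> log2_denom;
    return acc > 255 ? 255 : (uint8_t)acc;      /* unsigned saturation     */
}

/* Apply the model to a width x height block in place (hypothetical helper). */
static void weight_block_model(uint8_t *block, ptrdiff_t stride, int width,
                               int height, int log2_denom, int weight, int offset)
{
    for (int y = 0; y < height; y++, block += stride)
        for (int x = 0; x < width; x++)
            block[x] = weight_pixel_rvv_model(block[x], log2_denom, weight, offset);
}

/* Count disagreements between the two helpers over a grid of inputs. */
static long count_mismatches(void)
{
    long bad = 0;

    for (int d = 0; d <= 7; d++)
        for (int w = -128; w <= 127; w++)
            for (int o = -128; o <= 127; o += 15)
                for (int p = 0; p <= 255; p++)
                    bad += weight_pixel_ref((uint8_t)p, d, w, o) !=
                           weight_pixel_rvv_model((uint8_t)p, d, w, o);
    return bad;
}
```

Under those assumed ranges, the intermediate saturation to 16 bits should not change the result, since 32767 >> 7 is already 255. The bi-weight variants look like the same pipeline with a second multiply-accumulate and a shift of log2_denom + 1.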