From patchwork Fri Oct 27 19:25:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44392
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 27 Oct 2023 22:25:35 +0300
Message-ID: <20231027192540.27373-1-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH 1/6] lavc/pixblockdsp: rename unaligned R-V V functions

---
 libavcodec/riscv/pixblockdsp_init.c | 26 +++++++++++++++-----------
 libavcodec/riscv/pixblockdsp_rvv.S  |  6 +++---
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c
index aa39a8a665..8f24281217 100644
--- a/libavcodec/riscv/pixblockdsp_init.c
+++ b/libavcodec/riscv/pixblockdsp_init.c
@@ -32,12 +32,12 @@ void ff_get_pixels_8_rvi(int16_t *block, const uint8_t *pixels,
 void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels,
                           ptrdiff_t stride);
 
-void ff_get_pixels_8_rvv(int16_t *block, const uint8_t *pixels,
-                         ptrdiff_t stride);
-void ff_get_pixels_16_rvv(int16_t *block, const uint8_t *pixels,
-                          ptrdiff_t stride);
-void ff_diff_pixels_rvv(int16_t *block, const uint8_t *s1, const uint8_t *s2,
-                        ptrdiff_t stride);
+void ff_get_pixels_unaligned_8_rvv(int16_t *block, const uint8_t *pixels,
+                                   ptrdiff_t stride);
+void ff_get_pixels_unaligned_16_rvv(int16_t *block, const uint8_t *pixels,
+                                    ptrdiff_t stride);
+void ff_diff_pixels_unaligned_rvv(int16_t *block, const uint8_t *s1,
+                                  const uint8_t *s2, ptrdiff_t stride);
 
 av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
                                        AVCodecContext *avctx,
@@ -54,12 +54,16 @@ av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
 
 #if HAVE_RVV
     if ((cpu_flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16) {
-        if (high_bit_depth)
-            c->get_pixels_unaligned = c->get_pixels = ff_get_pixels_16_rvv;
-        else
-            c->get_pixels_unaligned = c->get_pixels = ff_get_pixels_8_rvv;
+        if (high_bit_depth) {
+            c->get_pixels = ff_get_pixels_unaligned_16_rvv;
+            c->get_pixels_unaligned = ff_get_pixels_unaligned_16_rvv;
+        } else {
+            c->get_pixels = ff_get_pixels_unaligned_8_rvv;
+            c->get_pixels_unaligned = ff_get_pixels_unaligned_8_rvv;
+        }
 
-        c->diff_pixels_unaligned = c->diff_pixels = ff_diff_pixels_rvv;
+        c->diff_pixels = ff_diff_pixels_unaligned_rvv;
+        c->diff_pixels_unaligned = ff_diff_pixels_unaligned_rvv;
     }
 #endif
 }
diff --git a/libavcodec/riscv/pixblockdsp_rvv.S b/libavcodec/riscv/pixblockdsp_rvv.S
index 1a364e6dab..e3a2fcc6ef 100644
--- a/libavcodec/riscv/pixblockdsp_rvv.S
+++ b/libavcodec/riscv/pixblockdsp_rvv.S
@@ -20,7 +20,7 @@
 
 #include "libavutil/riscv/asm.S"
 
-func ff_get_pixels_8_rvv, zve32x
+func ff_get_pixels_unaligned_8_rvv, zve32x
         vsetivli    zero, 8, e8, mf2, ta, ma
         vlsseg8e8.v v16, (a1), a2
         vwcvtu.x.x.v v8, v16
@@ -35,14 +35,14 @@ func ff_get_pixels_8_rvv, zve32x
         ret
 endfunc
 
-func ff_get_pixels_16_rvv, zve32x
+func ff_get_pixels_unaligned_16_rvv, zve32x
         vsetivli    zero, 8, e16, m1, ta, ma
         vlsseg8e16.v v0, (a1), a2
         vsseg8e16.v v0, (a0)
         ret
 endfunc
 
-func ff_diff_pixels_rvv, zve32x
+func ff_diff_pixels_unaligned_rvv, zve32x
         vsetivli    zero, 8, e8, mf2, ta, ma
         vlsseg8e8.v v16, (a1), a3
         vlsseg8e8.v v24, (a2), a3

From patchwork Fri Oct 27 19:25:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44391
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 27 Oct 2023 22:25:36 +0300
Message-ID: <20231027192540.27373-2-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH 2/6] lavc/pixblockdsp: aligned R-V V 8-bit functions

If the scan lines are aligned, we can load each row as a 64-bit value,
thus avoiding segmentation. And then we can factor the conversion or
subtraction.

In principle, the same optimisation should be possible for high depth,
but would require 128-bit elements, for which no FFmpeg CPU flag
exists.
---
 libavcodec/riscv/pixblockdsp_init.c | 11 +++++++++++
 libavcodec/riscv/pixblockdsp_rvv.S  | 21 +++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c
index 8f24281217..7d259a032f 100644
--- a/libavcodec/riscv/pixblockdsp_init.c
+++ b/libavcodec/riscv/pixblockdsp_init.c
@@ -32,10 +32,14 @@ void ff_get_pixels_8_rvi(int16_t *block, const uint8_t *pixels,
 void ff_get_pixels_16_rvi(int16_t *block, const uint8_t *pixels,
                           ptrdiff_t stride);
 
+void ff_get_pixels_8_rvv(int16_t *block, const uint8_t *pixels,
+                         ptrdiff_t stride);
 void ff_get_pixels_unaligned_8_rvv(int16_t *block, const uint8_t *pixels,
                                    ptrdiff_t stride);
 void ff_get_pixels_unaligned_16_rvv(int16_t *block, const uint8_t *pixels,
                                     ptrdiff_t stride);
+void ff_diff_pixels_rvv(int16_t *block, const uint8_t *s1,
+                        const uint8_t *s2, ptrdiff_t stride);
 void ff_diff_pixels_unaligned_rvv(int16_t *block, const uint8_t *s1,
                                   const uint8_t *s2, ptrdiff_t stride);
 
@@ -64,6 +68,13 @@ av_cold void ff_pixblockdsp_init_riscv(PixblockDSPContext *c,
 
         c->diff_pixels = ff_diff_pixels_unaligned_rvv;
         c->diff_pixels_unaligned = ff_diff_pixels_unaligned_rvv;
+
+        if (cpu_flags & AV_CPU_FLAG_RVV_I64) {
+            if (!high_bit_depth)
+                c->get_pixels = ff_get_pixels_8_rvv;
+
+            c->diff_pixels = ff_diff_pixels_rvv;
+        }
     }
 #endif
 }
diff --git a/libavcodec/riscv/pixblockdsp_rvv.S b/libavcodec/riscv/pixblockdsp_rvv.S
index e3a2fcc6ef..80c7415acf 100644
--- a/libavcodec/riscv/pixblockdsp_rvv.S
+++ b/libavcodec/riscv/pixblockdsp_rvv.S
@@ -20,6 +20,16 @@
 
 #include "libavutil/riscv/asm.S"
 
+func ff_get_pixels_8_rvv, zve64x
+        vsetivli    zero, 8, e8, mf2, ta, ma
+        li          t0, 8 * 8
+        vlse64.v    v16, (a1), a2
+        vsetvli     zero, t0, e8, m4, ta, ma
+        vwcvtu.x.x.v v8, v16
+        vse16.v     v8, (a0)
+        ret
+endfunc
+
 func ff_get_pixels_unaligned_8_rvv, zve32x
         vsetivli    zero, 8, e8, mf2, ta, ma
         vlsseg8e8.v v16, (a1), a2
@@ -42,6 +52,17 @@ func ff_get_pixels_unaligned_16_rvv, zve32x
         ret
 endfunc
 
+func ff_diff_pixels_rvv, zve64x
+        vsetivli    zero, 8, e8, mf2, ta, ma
+        li          t0, 8 * 8
+        vlse64.v    v16, (a1), a3
+        vlse64.v    v24, (a2), a3
+        vsetvli     zero, t0, e8, m4, ta, ma
+        vwsubu.vv   v8, v16, v24
+        vse16.v     v8, (a0)
+        ret
+endfunc
+
 func ff_diff_pixels_unaligned_rvv, zve32x
         vsetivli    zero, 8, e8, mf2, ta, ma
         vlsseg8e8.v v16, (a1), a3

From patchwork Fri Oct 27 19:25:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44393
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 27 Oct 2023 22:25:37 +0300
Message-ID: <20231027192540.27373-3-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH 3/6] lavc/idctdsp: require Zve64x for R-V V functions

This will be required for the following changesets.
---
 libavcodec/riscv/idctdsp_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/riscv/idctdsp_init.c b/libavcodec/riscv/idctdsp_init.c
index e6e616a555..4106d90c55 100644
--- a/libavcodec/riscv/idctdsp_init.c
+++ b/libavcodec/riscv/idctdsp_init.c
@@ -39,7 +39,7 @@ av_cold void ff_idctdsp_init_riscv(IDCTDSPContext *c, AVCodecContext *avctx,
 #if HAVE_RVV
     int flags = av_get_cpu_flags();
 
-    if ((flags & AV_CPU_FLAG_RVV_I32) && ff_get_rv_vlenb() >= 16) {
+    if ((flags & AV_CPU_FLAG_RVV_I64) && ff_get_rv_vlenb() >= 16) {
         c->put_pixels_clamped = ff_put_pixels_clamped_rvv;
         c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_rvv;
         c->add_pixels_clamped = ff_add_pixels_clamped_rvv;

From patchwork Fri Oct 27 19:25:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44394
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 27 Oct 2023 22:25:38 +0300
Message-ID: <20231027192540.27373-4-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH 4/6] lavc/idctdsp: improve R-V V put_signed_pixels_clamped

This follows the same idea as with pixblockdsp, but applied at the other
end, whilst writing data at the end of the function.
---
 libavcodec/riscv/idctdsp_rvv.S | 27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S
index 06e64e6529..4ff72f48d2 100644
--- a/libavcodec/riscv/idctdsp_rvv.S
+++ b/libavcodec/riscv/idctdsp_rvv.S
@@ -42,24 +42,15 @@ func ff_put_pixels_clamped_rvv, zve32x
         ret
 endfunc
 
-func ff_put_signed_pixels_clamped_rvv, zve32x
-        vsetivli    zero, 8, e16, m1, ta, ma
-        vlseg8e16.v v24, (a0)
-
-        li          t1, 128
-        vsetivli    zero, 8, e8, mf2, ta, ma
-        vnclip.wi   v16, v24, 0
-        vnclip.wi   v17, v25, 0
-        vnclip.wi   v18, v26, 0
-        vnclip.wi   v19, v27, 0
-        vnclip.wi   v20, v28, 0
-        vnclip.wi   v21, v29, 0
-        vnclip.wi   v22, v30, 0
-        vnclip.wi   v23, v31, 0
-        vsetvli     t0, zero, e8, m8, ta, ma
-        vadd.vx     v16, v16, t1
-        vsetivli    zero, 8, e8, mf2, ta, ma
-        vssseg8e8.v v16, (a1), a2
+func ff_put_signed_pixels_clamped_rvv, zve64x
+        li          t0, 8 * 8
+        vsetvli     zero, t0, e8, m4, ta, ma
+        vle16.v     v24, (a0)
+        li          t1, 128
+        vnclip.wi   v16, v24, 0
+        vadd.vx     v16, v16, t1
+        vsetivli    zero, 8, e8, mf2, ta, ma
+        vsse64.v    v16, (a1), a2
         ret
 endfunc
 

From patchwork Fri Oct 27 19:25:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44395
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 27 Oct 2023 22:25:39 +0300
Message-ID: <20231027192540.27373-5-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH 5/6] lavc/idctdsp: improve R-V V add_pixels_clamped

---
 libavcodec/riscv/idctdsp_rvv.S | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S
index 4ff72f48d2..fafdddb174 100644
--- a/libavcodec/riscv/idctdsp_rvv.S
+++ b/libavcodec/riscv/idctdsp_rvv.S
@@ -23,7 +23,6 @@
 func ff_put_pixels_clamped_rvv, zve32x
         vsetivli    zero, 8, e16, m1, ta, ma
         vlseg8e16.v v24, (a0)
-1:
         /* RVV only has signed-signed and unsigned-unsigned clipping.
          * We need two steps for signed-to-unsigned clipping. */
         vsetvli     t0, zero, e16, m8, ta, ma
@@ -54,17 +53,18 @@ func ff_put_signed_pixels_clamped_rvv, zve64x
         ret
 endfunc
 
-func ff_add_pixels_clamped_rvv, zve32x
-        vsetivli    zero, 8, e8, mf2, ta, ma
-        vlseg8e16.v v24, (a0)
-        vlsseg8e8.v v16, (a1), a2
-        vwaddu.wv   v24, v24, v16
-        vwaddu.wv   v25, v25, v17
-        vwaddu.wv   v26, v26, v18
-        vwaddu.wv   v27, v27, v19
-        vwaddu.wv   v28, v28, v20
-        vwaddu.wv   v29, v29, v21
-        vwaddu.wv   v30, v30, v22
-        vwaddu.wv   v31, v31, v23
-        j           1b
+func ff_add_pixels_clamped_rvv, zve64x
+        vsetivli    zero, 8, e8, mf2, ta, ma
+        li          t0, 8 * 8
+        vlse64.v    v16, (a1), a2
+        vsetvli     zero, t0, e8, m4, ta, ma
+        vle16.v     v24, (a0)
+        vwaddu.wv   v24, v24, v16
+        vsetvli     zero, zero, e16, m8, ta, ma
+        vmax.vx     v24, v24, zero
+        vsetvli     zero, zero, e8, m4, ta, ma
+        vnclipu.wi  v16, v24, 0
+        vsetivli    zero, 8, e8, mf2, ta, ma
+        vsse64.v    v16, (a1), a2
+        ret
 endfunc

From patchwork Fri Oct 27 19:25:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 44396
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 27 Oct 2023 22:25:40 +0300
Message-ID: <20231027192540.27373-6-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH 6/6] lavc/idctdsp: improve R-V V put_pixels_clamped

---
 libavcodec/riscv/idctdsp_rvv.S | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S
index fafdddb174..e93e6b5e7a 100644
--- a/libavcodec/riscv/idctdsp_rvv.S
+++ b/libavcodec/riscv/idctdsp_rvv.S
@@ -20,24 +20,17 @@
 
 #include "libavutil/riscv/asm.S"
 
-func ff_put_pixels_clamped_rvv, zve32x
-        vsetivli    zero, 8, e16, m1, ta, ma
-        vlseg8e16.v v24, (a0)
+func ff_put_pixels_clamped_rvv, zve64x
+        li          t0, 8 * 8
+        vsetvli     zero, t0, e16, m8, ta, ma
+        vle16.v     v24, (a0)
         /* RVV only has signed-signed and unsigned-unsigned clipping.
          * We need two steps for signed-to-unsigned clipping. */
-        vsetvli     t0, zero, e16, m8, ta, ma
-        vmax.vx     v24, v24, zero
-
-        vsetivli    zero, 8, e8, mf2, ta, ma
-        vnclipu.wi  v16, v24, 0
-        vnclipu.wi  v17, v25, 0
-        vnclipu.wi  v18, v26, 0
-        vnclipu.wi  v19, v27, 0
-        vnclipu.wi  v20, v28, 0
-        vnclipu.wi  v21, v29, 0
-        vnclipu.wi  v22, v30, 0
-        vnclipu.wi  v23, v31, 0
-        vssseg8e8.v v16, (a1), a2
+        vmax.vx     v24, v24, zero
+        vsetvli     zero, zero, e8, m4, ta, ma
+        vnclipu.wi  v16, v24, 0
+        vsetivli    zero, 8, e8, mf2, ta, ma
+        vsse64.v    v16, (a1), a2
         ret
 endfunc
 
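
Reading aid for patches 4/6 to 6/6 (not part of the series): the three
idctdsp routines take an 8x8 block of int16_t coefficients and write 8
rows of 8 bytes, one row every `stride` bytes. The vector code above
handles all 64 elements at once and stores each finished row as a single
64-bit element. A scalar C sketch of the per-element arithmetic, assuming
the usual (block, pixels, stride) argument order seen in a0/a1/a2, could
look like the following. The ref_* names are hypothetical and exist only
for this illustration; they are not FFmpeg symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg symbol): clamp an int to [0, 255]. */
static uint8_t ref_clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Model of put_pixels_clamped: clamp each coefficient to [0, 255]. */
static void ref_put_pixels_clamped(const int16_t *block, uint8_t *pixels,
                                   ptrdiff_t stride)
{
    assert(block && pixels);
    for (int y = 0; y < 8; y++, pixels += stride)
        for (int x = 0; x < 8; x++)
            pixels[x] = ref_clip_uint8(block[8 * y + x]);
}

/* Model of put_signed_pixels_clamped: clamp to [-128, 127], then add 128. */
static void ref_put_signed_pixels_clamped(const int16_t *block,
                                          uint8_t *pixels, ptrdiff_t stride)
{
    assert(block && pixels);
    for (int y = 0; y < 8; y++, pixels += stride)
        for (int x = 0; x < 8; x++) {
            int v = block[8 * y + x];

            v = v < -128 ? -128 : v > 127 ? 127 : v;
            pixels[x] = (uint8_t)(v + 128);
        }
}

/* Model of add_pixels_clamped: add to the existing pixel, clamp to [0, 255]. */
static void ref_add_pixels_clamped(const int16_t *block, uint8_t *pixels,
                                   ptrdiff_t stride)
{
    assert(block && pixels);
    for (int y = 0; y < 8; y++, pixels += stride)
        for (int x = 0; x < 8; x++)
            pixels[x] = ref_clip_uint8(pixels[x] + block[8 * y + x]);
}
```

Such a model is only meant for comparing outputs against the assembly on
a few hand-made blocks; bytes between the end of a row and the next
stride boundary are left untouched, as with the strided 64-bit stores.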