From patchwork Wed Sep 28 15:29:59 2022
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38439
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 28 Sep 2022 18:29:59 +0300
Message-Id: <20220928153001.30025-1-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <12088142.O9o76ZdvQC@basile.remlab.net>
References: <12088142.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 1/3] sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions

From: Rémi Denis-Courmont <remi@remlab.net>

---
 libswscale/rgb2rgb.c           |  2 +
 libswscale/rgb2rgb.h           |  1 +
 libswscale/riscv/Makefile      |  2 +
 libswscale/riscv/rgb2rgb.c     | 47 ++++++++++++++++++++
 libswscale/riscv/rgb2rgb_rvv.S | 78 ++++++++++++++++++++++++++++++++++
 5 files changed, 130 insertions(+)
 create mode 100644 libswscale/riscv/Makefile
 create mode 100644 libswscale/riscv/rgb2rgb.c
 create mode 100644 libswscale/riscv/rgb2rgb_rvv.S

diff --git a/libswscale/rgb2rgb.c b/libswscale/rgb2rgb.c
index 3af775b389..e98fdac8ea 100644
--- a/libswscale/rgb2rgb.c
+++ b/libswscale/rgb2rgb.c
@@ -139,6 +139,8 @@ av_cold void ff_sws_rgb2rgb_init(void)
     rgb2rgb_init_c();
 #if ARCH_AARCH64
     rgb2rgb_init_aarch64();
+#elif ARCH_RISCV
+    rgb2rgb_init_riscv();
 #elif ARCH_X86
     rgb2rgb_init_x86();
 #elif ARCH_LOONGARCH64
diff --git a/libswscale/rgb2rgb.h b/libswscale/rgb2rgb.h
index db85bfc42f..f3951d523e 100644
--- a/libswscale/rgb2rgb.h
+++ b/libswscale/rgb2rgb.h
@@ -167,6 +167,7 @@ extern void (*yuyvtoyuv422)(uint8_t *ydst, uint8_t *udst, uint8_t *vdst, const u
 void ff_sws_rgb2rgb_init(void);
 
 void rgb2rgb_init_aarch64(void);
+void rgb2rgb_init_riscv(void);
 void rgb2rgb_init_x86(void);
 void rgb2rgb_init_loongarch(void);
 
diff --git a/libswscale/riscv/Makefile b/libswscale/riscv/Makefile
new file mode 100644
index 0000000000..214d877b62
--- /dev/null
+++ b/libswscale/riscv/Makefile
@@ -0,0 +1,2 @@
+OBJS += riscv/rgb2rgb.o
+RVV-OBJS += riscv/rgb2rgb_rvv.o
diff --git a/libswscale/riscv/rgb2rgb.c b/libswscale/riscv/rgb2rgb.c
new file mode 100644
index 0000000000..5654154494
--- /dev/null
+++ b/libswscale/riscv/rgb2rgb.c
@@ -0,0 +1,47 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libswscale/rgb2rgb.h"
+
+void ff_shuffle_bytes_0321_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_shuffle_bytes_2103_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_shuffle_bytes_1230_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_shuffle_bytes_3012_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_shuffle_bytes_3210_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+
+av_cold void rgb2rgb_init_riscv(void)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVV_I32) {
+        shuffle_bytes_0321 = ff_shuffle_bytes_0321_rvv;
+        shuffle_bytes_2103 = ff_shuffle_bytes_2103_rvv;
+        shuffle_bytes_1230 = ff_shuffle_bytes_1230_rvv;
+        shuffle_bytes_3012 = ff_shuffle_bytes_3012_rvv;
+        shuffle_bytes_3210 = ff_shuffle_bytes_3210_rvv;
+    }
+#endif
+}
diff --git a/libswscale/riscv/rgb2rgb_rvv.S b/libswscale/riscv/rgb2rgb_rvv.S
new file mode 100644
index 0000000000..3eb11262c0
--- /dev/null
+++ b/libswscale/riscv/rgb2rgb_rvv.S
@@ -0,0 +1,78 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_shuffle_bytes_0321_rvv, zve32x
+        addi    t1, a0, 3
+        addi    t2, a0, 2
+        addi    t3, a0, 1
+1:
+        srai    a2, a2, 2
+        li      t4, 4
+2:
+        vsetvli    t0, a2, e8, m1, ta, ma
+        sub        a2, a2, t0
+        vlse8.v    v8, (a0), t4
+        sh2add     a0, t0, a0
+        vlse8.v    v9, (t1), t4
+        sh2add     t1, t0, t1
+        vlse8.v    v10, (t2), t4
+        sh2add     t2, t0, t2
+        vlse8.v    v11, (t3), t4
+        sh2add     t3, t0, t3
+        vsseg4e8.v v8, (a1)
+        sh2add     a1, t0, a1
+        bnez       a2, 2b
+
+        ret
+endfunc
+
+func ff_shuffle_bytes_2103_rvv, zve32x
+        addi    t1, a0, 1
+        addi    t2, a0, 0
+        addi    t3, a0, 3
+        addi    a0, a0, 2
+        j       1b
+endfunc
+
+func ff_shuffle_bytes_1230_rvv, zve32x
+        addi    t1, a0, 2
+        addi    t2, a0, 3
+        addi    t3, a0, 0
+        addi    a0, a0, 1
+        j       1b
+endfunc
+
+func ff_shuffle_bytes_3012_rvv, zve32x
+        addi    t1, a0, 0
+        addi    t2, a0, 1
+        addi    t3, a0, 2
+        addi    a0, a0, 3
+        j       1b
+endfunc
+
+func ff_shuffle_bytes_3210_rvv, zve32x
+        addi    t1, a0, 2
+        addi    t2, a0, 1
+        addi    t3, a0, 0
+        addi    a0, a0, 3
+        j       1b
+endfunc

From patchwork Wed Sep 28 15:30:00 2022
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38440
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 28 Sep 2022 18:30:00 +0300
Message-Id: <20220928153001.30025-2-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <12088142.O9o76ZdvQC@basile.remlab.net>
References: <12088142.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 2/3] sws/rgb2rgb: RISC-V V interleaveBytes

From: Rémi Denis-Courmont <remi@remlab.net>

---
 libswscale/riscv/rgb2rgb.c     |  4 ++++
 libswscale/riscv/rgb2rgb_rvv.S | 26 ++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/libswscale/riscv/rgb2rgb.c b/libswscale/riscv/rgb2rgb.c
index 5654154494..32c1546827 100644
--- a/libswscale/riscv/rgb2rgb.c
+++ b/libswscale/riscv/rgb2rgb.c
@@ -30,6 +30,9 @@ void ff_shuffle_bytes_2103_rvv(const uint8_t *src, uint8_t *dst, int src_len);
 void ff_shuffle_bytes_1230_rvv(const uint8_t *src, uint8_t *dst, int src_len);
 void ff_shuffle_bytes_3012_rvv(const uint8_t *src, uint8_t *dst, int src_len);
 void ff_shuffle_bytes_3210_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_interleave_bytes_rvv(const uint8_t *src1, const uint8_t *src2,
+                             uint8_t *dst, int width, int height, int s1stride,
+                             int s2stride, int dstride);
 
 av_cold void rgb2rgb_init_riscv(void)
 {
@@ -42,6 +45,7 @@ av_cold void rgb2rgb_init_riscv(void)
         shuffle_bytes_1230 = ff_shuffle_bytes_1230_rvv;
         shuffle_bytes_3012 = ff_shuffle_bytes_3012_rvv;
         shuffle_bytes_3210 = ff_shuffle_bytes_3210_rvv;
+        interleaveBytes = ff_interleave_bytes_rvv;
     }
 #endif
 }
diff --git a/libswscale/riscv/rgb2rgb_rvv.S b/libswscale/riscv/rgb2rgb_rvv.S
index 3eb11262c0..7f8c2efd80 100644
--- a/libswscale/riscv/rgb2rgb_rvv.S
+++ b/libswscale/riscv/rgb2rgb_rvv.S
@@ -76,3 +76,29 @@ func ff_shuffle_bytes_3210_rvv, zve32x
         addi    a0, a0, 3
         j       1b
 endfunc
+
+func ff_interleave_bytes_rvv, zve32x
+1:
+        mv      t0, a0
+        mv      t1, a1
+        mv      t2, a2
+        mv      t3, a3
+        addi    a4, a4, -1
+2:
+        vsetvli    t4, t3, e8, ta, ma
+        sub        t3, t3, t4
+        vle8.v     v8, (t0)
+        add        t0, t4, t0
+        vle8.v     v9, (t1)
+        add        t1, t4, t1
+        vsseg2e8.v v8, (t2)
+        sh1add     t2, t4, t2
+        bnez       t4, 2b
+
+        add     a0, a0, a5
+        add     a1, a1, a6
+        add     a2, a2, a7
+        bnez    a4, 1b
+
+        ret
+endfunc

From patchwork Wed Sep 28 15:30:01 2022
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38441
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 28 Sep 2022 18:30:01 +0300
Message-Id: <20220928153001.30025-3-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <12088142.O9o76ZdvQC@basile.remlab.net>
References: <12088142.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 3/3] sws/rgb2rgb: RISC-V 64-bit V packed YUYV/UYVY to planar 4:2:2

From: Rémi Denis-Courmont <remi@remlab.net>

This is currently 64-bit only because the stack spilling code would not
assemble on RV32I (and it would corrupt s0 and s1 on RV128I, in theory).
This could be added later in the unlikely event that someone wants it.
---
 libswscale/riscv/rgb2rgb.c     | 10 +++++++
 libswscale/riscv/rgb2rgb_rvv.S | 53 ++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/libswscale/riscv/rgb2rgb.c b/libswscale/riscv/rgb2rgb.c
index 32c1546827..93bc6b6245 100644
--- a/libswscale/riscv/rgb2rgb.c
+++ b/libswscale/riscv/rgb2rgb.c
@@ -33,6 +33,12 @@ void ff_shuffle_bytes_3210_rvv(const uint8_t *src, uint8_t *dst, int src_len);
 void ff_interleave_bytes_rvv(const uint8_t *src1, const uint8_t *src2,
                              uint8_t *dst, int width, int height, int s1stride,
                              int s2stride, int dstride);
+void ff_uyvytoyuv422_rvv(uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
+                         const uint8_t *src, int width, int height,
+                         int ystride, int uvstride, int src_stride);
+void ff_yuyvtoyuv422_rvv(uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
+                         const uint8_t *src, int width, int height,
+                         int ystride, int uvstride, int src_stride);
 
 av_cold void rgb2rgb_init_riscv(void)
 {
@@ -46,6 +52,10 @@ av_cold void rgb2rgb_init_riscv(void)
         shuffle_bytes_3012 = ff_shuffle_bytes_3012_rvv;
         shuffle_bytes_3210 = ff_shuffle_bytes_3210_rvv;
         interleaveBytes = ff_interleave_bytes_rvv;
+#   if (__riscv_xlen == 64)
+        uyvytoyuv422 = ff_uyvytoyuv422_rvv;
+        yuyvtoyuv422 = ff_yuyvtoyuv422_rvv;
+#   endif
     }
 #endif
 }
diff --git a/libswscale/riscv/rgb2rgb_rvv.S b/libswscale/riscv/rgb2rgb_rvv.S
index 7f8c2efd80..5626d906eb 100644
--- a/libswscale/riscv/rgb2rgb_rvv.S
+++ b/libswscale/riscv/rgb2rgb_rvv.S
@@ -102,3 +102,56 @@ func ff_interleave_bytes_rvv, zve32x
 
         ret
 endfunc
+
+#if (__riscv_xlen == 64)
+.macro  yuy2_to_i422p v_y0, v_y1, v_u, v_v
+        addi    sp, sp, -16
+        sd      s0, (sp)
+        sd      s1, 8(sp)
+        addi    a4, a4, 1
+        lw      s0, 16(sp)
+        srai    a4, a4, 1 // pixel width -> chroma width
+        li      s1, 2
+1:
+        mv      t4, a4
+        mv      t3, a3
+        mv      t0, a0
+        addi    t6, a0, 1
+        mv      t1, a1
+        mv      t2, a2
+        addi    a5, a5, -1
+2:
+        vsetvli    t5, t4, e8, m1, ta, ma
+        sub        t4, t4, t5
+        vlseg4e8.v v8, (t3)
+        sh2add     t3, t5, t3
+        vsse8.v    \v_y0, (t0), s1
+        sh1add     t0, t5, t0
+        vsse8.v    \v_y1, (t6), s1
+        sh1add     t6, t5, t6
+        vse8.v     \v_u, (t1)
+        add        t1, t5, t1
+        vse8.v     \v_v, (t2)
+        add        t2, t5, t2
+        bnez       t4, 2b
+
+        add     a3, a3, s0
+        add     a0, a0, a6
+        add     a1, a1, a7
+        add     a2, a2, a7
+        bnez    a5, 1b
+
+        ld      s1, 8(sp)
+        ld      s0, (sp)
+        addi    sp, sp, 16
+        ret
+.endm
+
+func ff_uyvytoyuv422_rvv, zve32x
+        yuy2_to_i422p v9, v11, v8, v10
+endfunc
+
+func ff_yuyvtoyuv422_rvv, zve32x
+        yuy2_to_i422p v8, v10, v9, v11
+endfunc
+#endif
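
Reviewer note, not part of the series: for anyone checking the vector
code by hand, here is a scalar C sketch of what the three kernel families
appear to compute, read off the load/store patterns in the assembly
above. The ref_* names and the idx/uyvy parameters are hypothetical and
exist only for this illustration; they are not FFmpeg API. Like the
assembly, the shuffle ignores a trailing partial 4-byte group, and the
4:2:2 unpacker rounds the chroma width up, so for an odd width it writes
one luma sample past width.

```c
#include <stdint.h>

/* Hypothetical reference: permute every 4-byte group of src into dst.
 * idx[k] is the source byte copied to output position k, so {0, 3, 2, 1}
 * corresponds to the "0321" variant. src_len is in bytes. */
void ref_shuffle_bytes(const uint8_t *src, uint8_t *dst, int src_len,
                       const uint8_t idx[4])
{
    for (int i = 0; i + 3 < src_len; i += 4)
        for (int k = 0; k < 4; k++)
            dst[i + k] = src[i + idx[k]];
}

/* Hypothetical reference: interleave two byte planes row by row,
 * src1 bytes at even output offsets and src2 bytes at odd ones. */
void ref_interleave_bytes(const uint8_t *src1, const uint8_t *src2,
                          uint8_t *dst, int width, int height,
                          int s1stride, int s2stride, int dstride)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            dst[2 * x]     = src1[x];
            dst[2 * x + 1] = src2[x];
        }
        src1 += s1stride;
        src2 += s2stride;
        dst  += dstride;
    }
}

/* Hypothetical reference: unpack packed 4:2:2 into planar Y, U and V.
 * uyvy == 0 reads Y0 U Y1 V (YUYV); uyvy != 0 reads U Y0 V Y1 (UYVY). */
void ref_packed_to_yuv422(uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
                          const uint8_t *src, int width, int height,
                          int ystride, int uvstride, int src_stride,
                          int uyvy)
{
    int cw = (width + 1) >> 1; /* chroma width, rounded up as in the asm */

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < cw; x++) {
            const uint8_t *p = src + 4 * x;

            ydst[2 * x]     = p[uyvy ? 1 : 0];
            ydst[2 * x + 1] = p[uyvy ? 3 : 2];
            udst[x]         = p[uyvy ? 0 : 1];
            vdst[x]         = p[uyvy ? 2 : 3];
        }
        src  += src_stride;
        ydst += ystride;
        udst += uvstride;
        vdst += uvstride;
    }
}
```

Running the vector functions and these sketches on the same random
buffers and comparing the outputs byte for byte is one way to
sanity-check the permutation order of each shuffle variant and the
register assignment in the two yuy2_to_i422p instantiations.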