From patchwork Wed Sep 28 15:29:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38439
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 28 Sep 2022 18:29:59 +0300
Message-Id: <20220928153001.30025-1-remi@remlab.net>
In-Reply-To: <12088142.O9o76ZdvQC@basile.remlab.net>
References: <12088142.O9o76ZdvQC@basile.remlab.net>
Subject: [FFmpeg-devel] [PATCH 1/3] sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions

From: Rémi Denis-Courmont

---
 libswscale/rgb2rgb.c           |  2 +
 libswscale/rgb2rgb.h           |  1 +
 libswscale/riscv/Makefile      |  2 +
 libswscale/riscv/rgb2rgb.c     | 47 ++++++++++++++++++++
 libswscale/riscv/rgb2rgb_rvv.S | 78 ++++++++++++++++++++++++++++++++++
 5 files changed, 130 insertions(+)
 create mode 100644 libswscale/riscv/Makefile
 create mode 100644 libswscale/riscv/rgb2rgb.c
 create mode 100644 libswscale/riscv/rgb2rgb_rvv.S

diff --git a/libswscale/rgb2rgb.c b/libswscale/rgb2rgb.c
index 3af775b389..e98fdac8ea 100644
--- a/libswscale/rgb2rgb.c
+++ b/libswscale/rgb2rgb.c
@@ -139,6 +139,8 @@ av_cold void ff_sws_rgb2rgb_init(void)
     rgb2rgb_init_c();
 #if ARCH_AARCH64
     rgb2rgb_init_aarch64();
+#elif ARCH_RISCV
+    rgb2rgb_init_riscv();
 #elif ARCH_X86
     rgb2rgb_init_x86();
 #elif ARCH_LOONGARCH64
diff --git a/libswscale/rgb2rgb.h b/libswscale/rgb2rgb.h
index db85bfc42f..f3951d523e 100644
--- a/libswscale/rgb2rgb.h
+++ b/libswscale/rgb2rgb.h
@@ -167,6 +167,7 @@ extern void (*yuyvtoyuv422)(uint8_t *ydst, uint8_t *udst, uint8_t *vdst, const u
 void ff_sws_rgb2rgb_init(void);
 
 void rgb2rgb_init_aarch64(void);
+void rgb2rgb_init_riscv(void);
 void rgb2rgb_init_x86(void);
 void rgb2rgb_init_loongarch(void);
 
diff --git a/libswscale/riscv/Makefile b/libswscale/riscv/Makefile
new file mode 100644
index 0000000000..214d877b62
--- /dev/null
+++ b/libswscale/riscv/Makefile
@@ -0,0 +1,2 @@
+OBJS += riscv/rgb2rgb.o
+RVV-OBJS += riscv/rgb2rgb_rvv.o
diff --git a/libswscale/riscv/rgb2rgb.c b/libswscale/riscv/rgb2rgb.c
new file mode 100644
index 0000000000..5654154494
--- /dev/null
+++ b/libswscale/riscv/rgb2rgb.c
@@ -0,0 +1,47 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "config.h"
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libswscale/rgb2rgb.h"
+
+void ff_shuffle_bytes_0321_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_shuffle_bytes_2103_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_shuffle_bytes_1230_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_shuffle_bytes_3012_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+void ff_shuffle_bytes_3210_rvv(const uint8_t *src, uint8_t *dst, int src_len);
+
+av_cold void rgb2rgb_init_riscv(void)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVV_I32) {
+        shuffle_bytes_0321 = ff_shuffle_bytes_0321_rvv;
+        shuffle_bytes_2103 = ff_shuffle_bytes_2103_rvv;
+        shuffle_bytes_1230 = ff_shuffle_bytes_1230_rvv;
+        shuffle_bytes_3012 = ff_shuffle_bytes_3012_rvv;
+        shuffle_bytes_3210 = ff_shuffle_bytes_3210_rvv;
+    }
+#endif
+}
diff --git a/libswscale/riscv/rgb2rgb_rvv.S b/libswscale/riscv/rgb2rgb_rvv.S
new file mode 100644
index 0000000000..3eb11262c0
--- /dev/null
+++ b/libswscale/riscv/rgb2rgb_rvv.S
@@ -0,0 +1,78 @@
+/*
+ * Copyright © 2022 Rémi Denis-Courmont.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_shuffle_bytes_0321_rvv, zve32x
+        addi    t1, a0, 3
+        addi    t2, a0, 2
+        addi    t3, a0, 1
+1:
+        srai    a2, a2, 2
+        li      t4, 4
+2:
+        vsetvli t0, a2, e8, m1, ta, ma
+        sub     a2, a2, t0
+        vlse8.v v8, (a0), t4
+        sh2add  a0, t0, a0
+        vlse8.v v9, (t1), t4
+        sh2add  t1, t0, t1
+        vlse8.v v10, (t2), t4
+        sh2add  t2, t0, t2
+        vlse8.v v11, (t3), t4
+        sh2add  t3, t0, t3
+        vsseg4e8.v v8, (a1)
+        sh2add  a1, t0, a1
+        bnez    a2, 2b
+
+        ret
+endfunc
+
+func ff_shuffle_bytes_2103_rvv, zve32x
+        addi    t1, a0, 1
+        addi    t2, a0, 0
+        addi    t3, a0, 3
+        addi    a0, a0, 2
+        j       1b
+endfunc
+
+func ff_shuffle_bytes_1230_rvv, zve32x
+        addi    t1, a0, 2
+        addi    t2, a0, 3
+        addi    t3, a0, 0
+        addi    a0, a0, 1
+        j       1b
+endfunc
+
+func ff_shuffle_bytes_3012_rvv, zve32x
+        addi    t1, a0, 0
+        addi    t2, a0, 1
+        addi    t3, a0, 2
+        addi    a0, a0, 3
+        j       1b
+endfunc
+
+func ff_shuffle_bytes_3210_rvv, zve32x
+        addi    t1, a0, 2
+        addi    t2, a0, 1
+        addi    t3, a0, 0
+        addi    a0, a0, 3
+        j       1b
+endfunc
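
For readers checking the permutations: each entry point sets a0, t1, t2 and t3 to the source offsets that feed output bytes 0 to 3. The shared loop then does four stride-4 byte loads (vlse8.v) into v8..v11 and interleaves them again with one segmented store (vsseg4e8.v). A scalar model of that behaviour might look like the sketch below. It is only an illustration and not part of the patch; the function name shuffle_bytes_ref and the order[] parameter are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical scalar model, not part of the patch: within every
 * 4-byte group, output byte k is copied from input byte order[k].
 * order = {0, 3, 2, 1} corresponds to the "0321" variant, and so on.
 */
static void shuffle_bytes_ref(const uint8_t *src, uint8_t *dst, int src_len,
                              const int order[4])
{
    /* Like the assembly (srai a2, a2, 2), only whole 4-byte groups
     * are processed; any trailing 1 to 3 bytes are left untouched. */
    int groups = src_len >> 2;

    for (int i = 0; i < groups; i++)
        for (int k = 0; k < 4; k++)
            dst[4 * i + k] = src[4 * i + order[k]];
}
```

Comparing the output of such a model against each ff_shuffle_bytes_*_rvv function on random buffers would be one way to sanity-check the offsets loaded in the five entry points.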