From patchwork Sun Dec 3 14:40:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 44888
From: flow gg
Date: Sun, 3 Dec 2023 22:40:08 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] lavc/vc1dsp: R-V V inv_trans

c910
vc1dsp.vc1_inv_trans_4x4_dc_c: 84.0
vc1dsp.vc1_inv_trans_4x4_dc_rvv_i32: 74.0
vc1dsp.vc1_inv_trans_4x8_dc_c: 150.2
vc1dsp.vc1_inv_trans_4x8_dc_rvv_i32: 83.5
vc1dsp.vc1_inv_trans_8x4_dc_c: 129.0
vc1dsp.vc1_inv_trans_8x4_dc_rvv_i64: 75.7
vc1dsp.vc1_inv_trans_8x8_dc_c: 254.7
vc1dsp.vc1_inv_trans_8x8_dc_rvv_i64: 90.5

From cba93503a6f0753b56c1d0cb00f642b3982ee656 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Fri, 1 Dec 2023 10:07:40 +0800
Subject: [PATCH] lavc/vc1dsp: R-V V inv_trans

c910
vc1dsp.vc1_inv_trans_4x4_dc_c: 84.0
vc1dsp.vc1_inv_trans_4x4_dc_rvv_i32: 74.0
vc1dsp.vc1_inv_trans_4x8_dc_c: 150.2
vc1dsp.vc1_inv_trans_4x8_dc_rvv_i32: 83.5
vc1dsp.vc1_inv_trans_8x4_dc_c: 129.0
vc1dsp.vc1_inv_trans_8x4_dc_rvv_i64: 75.7
vc1dsp.vc1_inv_trans_8x8_dc_c: 254.7
vc1dsp.vc1_inv_trans_8x8_dc_rvv_i64: 90.5
---
 libavcodec/riscv/Makefile      |   2 +
 libavcodec/riscv/vc1dsp_init.c |  47 +++++++++++++
 libavcodec/riscv/vc1dsp_rvv.S  | 123 +++++++++++++++++++++++++++++++++
 libavcodec/vc1dsp.c            |   2 +
 libavcodec/vc1dsp.h            |   1 +
 5 files changed, 175 insertions(+)
 create mode 100644 libavcodec/riscv/vc1dsp_init.c
 create mode 100644 libavcodec/riscv/vc1dsp_rvv.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 2d0e6c19c8..442c5961ea 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -39,5 +39,7 @@ OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_init.o \
 RVV-OBJS-$(CONFIG_PIXBLOCKDSP) += riscv/pixblockdsp_rvv.o
 OBJS-$(CONFIG_UTVIDEO_DECODER) += riscv/utvideodsp_init.o
 RVV-OBJS-$(CONFIG_UTVIDEO_DECODER) += riscv/utvideodsp_rvv.o
+OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_init.o
+RVV-OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_rvv.o
 OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o
 RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o
diff --git a/libavcodec/riscv/vc1dsp_init.c b/libavcodec/riscv/vc1dsp_init.c
new file mode 100644
index 0000000000..88e0434f0e
--- /dev/null
+++ b/libavcodec/riscv/vc1dsp_init.c
@@ -0,0 +1,47 @@
+/*
+ * Copyright (c) 2023 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <stdint.h>
+
+#include "libavutil/attributes.h"
+#include "libavutil/cpu.h"
+#include "libavutil/riscv/cpu.h"
+#include "libavcodec/vc1.h"
+
+void ff_vc1_inv_trans_8x8_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_vc1_inv_trans_4x8_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_vc1_inv_trans_8x4_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_vc1_inv_trans_4x4_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+
+av_cold void ff_vc1dsp_init_riscv(VC1DSPContext *dsp)
+{
+#if HAVE_RVV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVV_I64) {
+        dsp->vc1_inv_trans_8x8_dc = ff_vc1_inv_trans_8x8_dc_rvv;
+        dsp->vc1_inv_trans_8x4_dc = ff_vc1_inv_trans_8x4_dc_rvv;
+    }
+    if (flags & AV_CPU_FLAG_RVV_I32) {
+        dsp->vc1_inv_trans_4x8_dc = ff_vc1_inv_trans_4x8_dc_rvv;
+        dsp->vc1_inv_trans_4x4_dc = ff_vc1_inv_trans_4x4_dc_rvv;
+    }
+#endif
+}
diff --git a/libavcodec/riscv/vc1dsp_rvv.S b/libavcodec/riscv/vc1dsp_rvv.S
new file mode 100644
index 0000000000..8a6b27192a
--- /dev/null
+++ b/libavcodec/riscv/vc1dsp_rvv.S
@@ -0,0 +1,123 @@
+/*
+ * Copyright (c) 2023 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_vc1_inv_trans_8x8_dc_rvv, zve64x
+        lh            t2, (a2)
+        li            t1, 3
+        mul           t2, t2, t1
+        addi          t2, t2, 1
+        srai          t2, t2, 1
+        mul           t2, t2, t1
+        addi          t2, t2, 16
+        srai          t2, t2, 5
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vlse64.v      v0, (a0), a1
+        li            t0, 8*8
+        vsetvli       zero, t0, e16, m8, ta, ma
+        vmv.v.x       v8, t2
+        vsetvli       zero, t0, e8, m4, ta, ma
+        vwaddu.wv     v8, v8, v0
+        vsetvli       zero, t0, e16, m8, ta, ma
+        vmax.vx       v8, v8, zero
+        vsetvli       zero, t0, e8, m4, ta, ma
+        vnclipu.wi    v0, v8, 0
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vsse64.v      v0, (a0), a1
+        ret
+endfunc
+
+func ff_vc1_inv_trans_4x8_dc_rvv, zve32x
+        lh            t2, (a2)
+        li            t1, 17
+        mul           t2, t2, t1
+        addi          t2, t2, 4
+        srai          t2, t2, 3
+        li            t1, 12
+        mul           t2, t2, t1
+        addi          t2, t2, 64
+        srai          t2, t2, 7
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vlse32.v      v0, (a0), a1
+        li            t0, 4*8
+        vsetvli       zero, t0, e16, m4, ta, ma
+        vmv.v.x       v4, t2
+        vsetvli       zero, t0, e8, m2, ta, ma
+        vwaddu.wv     v4, v4, v0
+        vsetvli       zero, t0, e16, m4, ta, ma
+        vmax.vx       v4, v4, zero
+        vsetvli       zero, t0, e8, m2, ta, ma
+        vnclipu.wi    v0, v4, 0
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vsse32.v      v0, (a0), a1
+        ret
+endfunc
+
+func ff_vc1_inv_trans_8x4_dc_rvv, zve64x
+        lh            t2, (a2)
+        li            t1, 3
+        mul           t2, t2, t1
+        addi          t2, t2, 1
+        srai          t2, t2, 1
+        li            t1, 17
+        mul           t2, t2, t1
+        addi          t2, t2, 64
+        srai          t2, t2, 7
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vlse64.v      v0, (a0), a1
+        li            t0, 8*4
+        vsetvli       zero, t0, e16, m4, ta, ma
+        vmv.v.x       v4, t2
+        vsetvli       zero, t0, e8, m2, ta, ma
+        vwaddu.wv     v4, v4, v0
+        vsetvli       zero, t0, e16, m4, ta, ma
+        vmax.vx       v4, v4, zero
+        vsetvli       zero, t0, e8, m2, ta, ma
+        vnclipu.wi    v0, v4, 0
+        vsetivli      zero, 8, e8, mf2, ta, ma
+        vsse64.v      v0, (a0), a1
+        ret
+endfunc
+
+func ff_vc1_inv_trans_4x4_dc_rvv, zve32x
+        lh            t2, (a2)
+        li            t1, 17
+        mul           t2, t2, t1
+        addi          t2, t2, 4
+        srai          t2, t2, 3
+        mul           t2, t2, t1
+        addi          t2, t2, 64
+        srai          t2, t2, 7
+        vsetivli      zero, 4, e8, mf2, ta, ma
+        vlse32.v      v0, (a0), a1
+        li            t0, 4*4
+        vsetvli       zero, t0, e16, m2, ta, ma
+        vmv.v.x       v2, t2
+        vsetvli       zero, t0, e8, m1, ta, ma
+        vwaddu.wv     v2, v2, v0
+        vsetvli       zero, t0, e16, m2, ta, ma
+        vmax.vx       v2, v2, zero
+        vsetvli       zero, t0, e8, m1, ta, ma
+        vnclipu.wi    v0, v2, 0
+        vsetivli      zero, 4, e8, mf2, ta, ma
+        vsse32.v      v0, (a0), a1
+        ret
+endfunc
diff --git a/libavcodec/vc1dsp.c b/libavcodec/vc1dsp.c
index 62c8eb21fa..2caa3c6863 100644
--- a/libavcodec/vc1dsp.c
+++ b/libavcodec/vc1dsp.c
@@ -1039,6 +1039,8 @@ av_cold void ff_vc1dsp_init(VC1DSPContext *dsp)
     ff_vc1dsp_init_arm(dsp);
 #elif ARCH_PPC
     ff_vc1dsp_init_ppc(dsp);
+#elif ARCH_RISCV
+    ff_vc1dsp_init_riscv(dsp);
 #elif ARCH_X86
     ff_vc1dsp_init_x86(dsp);
 #elif ARCH_MIPS
diff --git a/libavcodec/vc1dsp.h b/libavcodec/vc1dsp.h
index 7ed1776ca7..e3b90d2b62 100644
--- a/libavcodec/vc1dsp.h
+++ b/libavcodec/vc1dsp.h
@@ -89,6 +89,7 @@ void ff_vc1dsp_init(VC1DSPContext* c);
 void ff_vc1dsp_init_aarch64(VC1DSPContext* dsp);
 void ff_vc1dsp_init_arm(VC1DSPContext* dsp);
 void ff_vc1dsp_init_ppc(VC1DSPContext *c);
+void ff_vc1dsp_init_riscv(VC1DSPContext *c);
 void ff_vc1dsp_init_x86(VC1DSPContext* dsp);
 void ff_vc1dsp_init_mips(VC1DSPContext* dsp);
 void ff_vc1dsp_init_loongarch(VC1DSPContext* dsp);
-- 
2.43.0
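
For anyone checking the assembly by hand, here is a minimal scalar C sketch of what the four DC routines in this patch appear to compute, read directly from the instruction sequences in vc1dsp_rvv.S. It is not the FFmpeg C reference. The helper names (scale_dc_8x8 and friends, clip_u8, add_dc_block) are hypothetical and exist only for this illustration. It assumes that `>>` on a negative int is an arithmetic shift, matching `srai`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar prologues: two multiply/round/shift stages per block size.
 * The constants are copied from the li/mul/addi/srai sequences.
 * Assumption: >> on negative values is arithmetic, like srai. */
static int scale_dc_8x8(int dc)
{
    dc = (3 * dc + 1) >> 1;
    return (3 * dc + 16) >> 5;
}

static int scale_dc_4x8(int dc)
{
    dc = (17 * dc + 4) >> 3;
    return (12 * dc + 64) >> 7;
}

static int scale_dc_8x4(int dc)
{
    dc = (3 * dc + 1) >> 1;
    return (17 * dc + 64) >> 7;
}

static int scale_dc_4x4(int dc)
{
    dc = (17 * dc + 4) >> 3;
    return (17 * dc + 64) >> 7;
}

/* vmax.vx against zero followed by vnclipu.wi amounts to a clamp
 * to the 0..255 range of an unsigned 8-bit pixel. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Add the scaled DC value to every pixel of a w x h block whose rows
 * are stride bytes apart; this models the strided load, widening add,
 * clamp, and strided store. */
static void add_dc_block(uint8_t *dest, ptrdiff_t stride, int dc, int w, int h)
{
    assert(w > 0 && h > 0);
    for (int y = 0; y < h; y++, dest += stride)
        for (int x = 0; x < w; x++)
            dest[x] = clip_u8(dest[x] + dc);
}
```

Under these assumptions, the 8x8 routine would correspond to `add_dc_block(dest, stride, scale_dc_8x8(block[0]), 8, 8)`, and the other three sizes follow the same pattern with their own scaling helper and dimensions.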