From patchwork Sun May 12 10:55:14 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 48812
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 12 May 2024 13:55:14 +0300
Message-ID: <20240512105515.24624-2-remi@remlab.net>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240512105515.24624-1-remi@remlab.net>
References: <20240512105515.24624-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 2/2] lavc/startcode: add R-V V startcode_find_candidate
List-Id: FFmpeg development discussions and patches

---
 libavcodec/riscv/Makefile        |  1 +
 libavcodec/riscv/h264dsp_init.c  |  3 +++
 libavcodec/riscv/startcode_rvv.S | 46 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vc1dsp_init.c   | 16 +++++++-----
 4 files changed, 60 insertions(+), 6 deletions(-)
 create mode 100644 libavcodec/riscv/startcode_rvv.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 319ea6427b..1b52d60dbf 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -53,6 +53,7 @@ RVV-OBJS-$(CONFIG_RV34DSP) += riscv/rv34dsp_rvv.o
 OBJS-$(CONFIG_RV40_DECODER) += riscv/rv40dsp_init.o
 RVV-OBJS-$(CONFIG_RV40_DECODER) += riscv/rv40dsp_rvv.o
 RV-OBJS-$(CONFIG_STARTCODE) += riscv/startcode_rvb.o
+RVV-OBJS-$(CONFIG_STARTCODE) += riscv/startcode_rvv.o
 OBJS-$(CONFIG_SVQ1_ENCODER) += riscv/svqenc_init.o
 RVV-OBJS-$(CONFIG_SVQ1_ENCODER) += riscv/svqenc_rvv.o
 OBJS-$(CONFIG_TAK_DECODER) += riscv/takdsp_init.o
diff --git a/libavcodec/riscv/h264dsp_init.c b/libavcodec/riscv/h264dsp_init.c
index 60c84734cd..d5984f1805 100644
--- a/libavcodec/riscv/h264dsp_init.c
+++ b/libavcodec/riscv/h264dsp_init.c
@@ -27,6 +27,7 @@
 #include "libavcodec/h264dsp.h"
 
 extern int ff_startcode_find_candidate_rvb(const uint8_t *, int);
+extern int ff_startcode_find_candidate_rvv(const uint8_t *, int);
 
 av_cold void ff_h264dsp_init_riscv(H264DSPContext *dsp, const int bit_depth,
                                    const int chroma_format_idc)
@@ -36,5 +37,7 @@ av_cold void ff_h264dsp_init_riscv(H264DSPContext *dsp, const int bit_depth,
 
     if (flags & AV_CPU_FLAG_RVB_BASIC)
         dsp->startcode_find_candidate = ff_startcode_find_candidate_rvb;
+    if (flags & AV_CPU_FLAG_RVV_I32)
+        dsp->startcode_find_candidate = ff_startcode_find_candidate_rvv;
 #endif
 }
diff --git a/libavcodec/riscv/startcode_rvv.S b/libavcodec/riscv/startcode_rvv.S
new file mode 100644
index 0000000000..7c43b1d7f3
--- /dev/null
+++ b/libavcodec/riscv/startcode_rvv.S
@@ -0,0 +1,46 @@
+/*
+ * Copyright © 2024 Rémi Denis-Courmont.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *    this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ *    this list of conditions and the following disclaimer in the documentation
+ *    and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_startcode_find_candidate_rvv, zve32x
+        mv      t0, a0
+1:
+        vsetvli t1, a1, e8, m8, ta, ma
+        vle8.v  v8, (t0)
+        sub     a1, a1, t1
+        vmseq.vi v0, v8, 0
+        vfirst.m t2, v0 // index of the first zero byte, or -1 if none
+        bgez    t2, 3f
+        add     t0, t0, t1
+        bnez    a1, 1b
+2:
+        sub     a0, t0, a0
+        ret
+3:
+        add     t0, t0, t2 // only add the index when a zero byte was found
+        j       2b
+endfunc
diff --git a/libavcodec/riscv/vc1dsp_init.c b/libavcodec/riscv/vc1dsp_init.c
index 7868091978..03357e262b 100644
--- a/libavcodec/riscv/vc1dsp_init.c
+++ b/libavcodec/riscv/vc1dsp_init.c
@@ -30,6 +30,7 @@ void ff_vc1_inv_trans_4x8_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block
 void ff_vc1_inv_trans_8x4_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
 void ff_vc1_inv_trans_4x4_dc_rvv(uint8_t *dest, ptrdiff_t stride, int16_t *block);
 int ff_startcode_find_candidate_rvb(const uint8_t *, int);
+int ff_startcode_find_candidate_rvv(const uint8_t *, int);
 
 av_cold void ff_vc1dsp_init_riscv(VC1DSPContext *dsp)
 {
@@ -39,13 +40,16 @@ av_cold void ff_vc1dsp_init_riscv(VC1DSPContext *dsp)
     if (flags & AV_CPU_FLAG_RVB_BASIC)
         dsp->startcode_find_candidate = ff_startcode_find_candidate_rvb;
 # if HAVE_RVV
-    if (flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
-        dsp->vc1_inv_trans_4x8_dc = ff_vc1_inv_trans_4x8_dc_rvv;
-        dsp->vc1_inv_trans_4x4_dc = ff_vc1_inv_trans_4x4_dc_rvv;
-        if (flags & AV_CPU_FLAG_RVV_I64) {
-            dsp->vc1_inv_trans_8x8_dc = ff_vc1_inv_trans_8x8_dc_rvv;
-            dsp->vc1_inv_trans_8x4_dc = ff_vc1_inv_trans_8x4_dc_rvv;
+    if (flags & AV_CPU_FLAG_RVV_I32) {
+        if (ff_get_rv_vlenb() >= 16) {
+            dsp->vc1_inv_trans_4x8_dc = ff_vc1_inv_trans_4x8_dc_rvv;
+            dsp->vc1_inv_trans_4x4_dc = ff_vc1_inv_trans_4x4_dc_rvv;
+            if (flags & AV_CPU_FLAG_RVV_I64) {
+                dsp->vc1_inv_trans_8x8_dc = ff_vc1_inv_trans_8x8_dc_rvv;
+                dsp->vc1_inv_trans_8x4_dc = ff_vc1_inv_trans_8x4_dc_rvv;
+            }
         }
+        dsp->startcode_find_candidate = ff_startcode_find_candidate_rvv;
     }
 # endif
 #endif
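
For readers less familiar with RVV, the behaviour expected from the strip-mined loop in startcode_rvv.S can be modelled in portable C roughly as follows. This is only an illustrative sketch and not part of the patch: model_find_candidate and CHUNK are hypothetical names, and CHUNK merely stands in for the vector length that vsetvli picks at run time. The model returns the offset of the first zero byte, or size when the buffer contains none, which is what callers of startcode_find_candidate expect.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the vector length that vsetvli would return. */
#define CHUNK 16

/* Scalar model: offset of the first zero byte in buf, or size if none. */
static int model_find_candidate(const uint8_t *buf, int size)
{
    const uint8_t *p = buf;

    while (size > 0) {
        int vl = size < CHUNK ? size : CHUNK; /* models vsetvli */
        int first = -1;                       /* vfirst.m yields -1 if no match */

        /* Models vle8.v + vmseq.vi + vfirst.m on one chunk. */
        for (int i = 0; i < vl; i++) {
            if (p[i] == 0) {
                first = i;
                break;
            }
        }

        size -= vl;
        if (first >= 0)
            return (int)(p - buf) + first;    /* zero byte found in this chunk */
        p += vl;                              /* advance to the next chunk */
    }
    return (int)(p - buf);                    /* no zero byte: equals size */
}
```

Comparing the assembly against a model like this on buffers with no zero byte, and on an empty buffer, is a cheap way to exercise the "not found" exit of the loop.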