From patchwork Sun May 5 16:45:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48542
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 May 2024 00:45:27 +0800
X-OQ-MSGID: <20240505164536.872683-1-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
Subject: [FFmpeg-devel] [PATCH 01/10] lavc/vp8dsp: R-V put_vp8_pixels
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: sunyuechi

From: sunyuechi

C908:
vp8_put_pixels4_c: 78.0
vp8_put_pixels4_rvi: 33.7
vp8_put_pixels8_c: 278.0
vp8_put_pixels8_rvi: 55.0
vp8_put_pixels16_c: 999.0
vp8_put_pixels16_rvi: 86.7
---
 libavcodec/riscv/Makefile      |  1 +
 libavcodec/riscv/vp8dsp.h      | 75 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp8dsp_init.c | 22 ++++++++++
 libavcodec/riscv/vp8dsp_rvi.S  | 61 +++++++++++++++++++++++++++
 libavcodec/vp8dsp.c            |  2 +
 libavcodec/vp8dsp.h            |  1 +
 6 files changed, 162 insertions(+)
 create mode 100644 libavcodec/riscv/vp8dsp.h
 create mode 100644 libavcodec/riscv/vp8dsp_rvi.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index 050c08ee61..526cb5c97c 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -61,6 +61,7 @@ RVV-OBJS-$(CONFIG_UTVIDEO_DECODER) += riscv/utvideodsp_rvv.o
 OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_init.o
 RVV-OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_rvv.o
 OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_init.o
+RV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvi.o
 RVV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvv.o
 OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9dsp_init.o
 RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o
diff --git a/libavcodec/riscv/vp8dsp.h b/libavcodec/riscv/vp8dsp.h
new file mode 100644
index 0000000000..971c5c0a96
--- /dev/null
+++ b/libavcodec/riscv/vp8dsp.h
@@ -0,0 +1,75 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_RISCV_VP8DSP_H
+#define AVCODEC_RISCV_VP8DSP_H
+
+#include "libavcodec/vp8dsp.h"
+
+#define VP8_LF_Y(hv, inner, opt)                                            \
+    void ff_vp8_##hv##_loop_filter16##inner##_##opt(uint8_t *dst,           \
+                                                    ptrdiff_t stride,       \
+                                                    int flim_E, int flim_I, \
+                                                    int hev_thresh)
+
+#define VP8_LF_UV(hv, inner, opt)                                           \
+    void ff_vp8_##hv##_loop_filter8uv##inner##_##opt(uint8_t *dstU,         \
+                                                     uint8_t *dstV,         \
+                                                     ptrdiff_t stride,      \
+                                                     int flim_E, int flim_I, \
+                                                     int hev_thresh)
+
+#define VP8_LF_SIMPLE(hv, opt)                                              \
+    void ff_vp8_##hv##_loop_filter16_simple_##opt(uint8_t *dst,             \
+                                                  ptrdiff_t stride,         \
+                                                  int flim)
+
+#define VP8_LF_HV(inner, opt)                                               \
+    VP8_LF_Y(h, inner, opt);                                                \
+    VP8_LF_Y(v, inner, opt);                                                \
+    VP8_LF_UV(h, inner, opt);                                               \
+    VP8_LF_UV(v, inner, opt)
+
+#define VP8_LF(opt)                                                         \
+    VP8_LF_HV(, opt);                                                       \
+    VP8_LF_HV(_inner, opt);                                                 \
+    VP8_LF_SIMPLE(h, opt);                                                  \
+    VP8_LF_SIMPLE(v, opt)
+
+#define VP8_MC(n, opt)                                                      \
+    void ff_put_vp8_##n##_##opt(uint8_t *dst, ptrdiff_t dststride,          \
+                                const uint8_t *src, ptrdiff_t srcstride,    \
+                                int h, int x, int y)
+
+#define VP8_EPEL(w, opt)                                                    \
+    VP8_MC(pixels ## w, opt);                                               \
+    VP8_MC(epel ## w ## _h4, opt);                                          \
+    VP8_MC(epel ## w ## _h6, opt);                                          \
+    VP8_MC(epel ## w ## _v4, opt);                                          \
+    VP8_MC(epel ## w ## _h4v4, opt);                                        \
+    VP8_MC(epel ## w ## _h6v4, opt);                                        \
+    VP8_MC(epel ## w ## _v6, opt);                                          \
+    VP8_MC(epel ## w ## _h4v6, opt);                                        \
+    VP8_MC(epel ## w ## _h6v6, opt)
+
+#define VP8_BILIN(w, opt)                                                   \
+    VP8_MC(bilin ## w ## _h, opt);                                          \
+    VP8_MC(bilin ## w ## _v, opt);                                          \
+    VP8_MC(bilin ## w ## _hv, opt)
+
+#endif /* AVCODEC_RISCV_VP8DSP_H */
diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index af57aabb71..fa3feeacf7 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -24,11 +24,33 @@
 #include "libavutil/cpu.h"
 #include "libavutil/riscv/cpu.h"
 #include "libavcodec/vp8dsp.h"
+#include "vp8dsp.h"
 
 void ff_vp8_idct_dc_add_rvv(uint8_t *dst, int16_t block[16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add4y_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add4uv_rvv(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride);
 
+VP8_EPEL(16, rvi);
+VP8_EPEL(8,  rvi);
+VP8_EPEL(4,  rvi);
+
+av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
+{
+#if HAVE_RV
+    int flags = av_get_cpu_flags();
+    if (flags & AV_CPU_FLAG_RVI) {
+#if __riscv_xlen >= 64
+        c->put_vp8_epel_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
+        c->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
+        c->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
+        c->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
+#endif
+        c->put_vp8_epel_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
+        c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvi;
+    }
+#endif
+}
+
 av_cold void ff_vp8dsp_init_riscv(VP8DSPContext *c)
 {
 #if HAVE_RVV
diff --git a/libavcodec/riscv/vp8dsp_rvi.S b/libavcodec/riscv/vp8dsp_rvi.S
new file mode 100644
index 0000000000..50ba4f293f
--- /dev/null
+++ b/libavcodec/riscv/vp8dsp_rvi.S
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2024 Institue of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+#if __riscv_xlen >= 64
+func ff_put_vp8_pixels16_rvi
+1:
+        addi    a4, a4, -1
+        ld      t0, (a2)
+        ld      t1, 8(a2)
+        sd      t0, (a0)
+        sd      t1, 8(a0)
+        add     a2, a2, a3
+        add     a0, a0, a1
+        bnez    a4, 1b
+
+        ret
+endfunc
+
+func ff_put_vp8_pixels8_rvi
+1:
+        addi    a4, a4, -1
+        ld      t0, (a2)
+        sd      t0, (a0)
+        add     a2, a2, a3
+        add     a0, a0, a1
+        bnez    a4, 1b
+
+        ret
+endfunc
+#endif
+
+func ff_put_vp8_pixels4_rvi
+1:
+        addi    a4, a4, -1
+        lw      t0, (a2)
+        sw      t0, (a0)
+        add     a2, a2, a3
+        add     a0, a0, a1
+        bnez    a4, 1b
+
+        ret
+endfunc
diff --git a/libavcodec/vp8dsp.c b/libavcodec/vp8dsp.c
index df7bd12424..f7c9c9899c 100644
--- a/libavcodec/vp8dsp.c
+++ b/libavcodec/vp8dsp.c
@@ -1402,6 +1402,8 @@ dsp->put_vp8_epel_pixels_tab[2][2][2] = put_vp8_epel4_h6v6_c;
     ff_vp78dsp_init_arm(dsp);
 #elif ARCH_PPC
     ff_vp78dsp_init_ppc(dsp);
+#elif ARCH_RISCV
+    ff_vp78dsp_init_riscv(dsp);
 #elif ARCH_X86
     ff_vp78dsp_init_x86(dsp);
 #endif
diff --git a/libavcodec/vp8dsp.h b/libavcodec/vp8dsp.h
index 30dc2c6cc1..3bf12b6b45 100644
--- a/libavcodec/vp8dsp.h
+++ b/libavcodec/vp8dsp.h
@@ -87,6 +87,7 @@ void ff_vp78dsp_init(VP8DSPContext *c);
 void ff_vp78dsp_init_aarch64(VP8DSPContext *c);
 void ff_vp78dsp_init_arm(VP8DSPContext *c);
 void ff_vp78dsp_init_ppc(VP8DSPContext *c);
+void ff_vp78dsp_init_riscv(VP8DSPContext *c);
 void ff_vp78dsp_init_x86(VP8DSPContext *c);
 
 void ff_vp8dsp_init(VP8DSPContext *c);
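
Note for reviewers: the three copy loops in vp8dsp_rvi.S appear to be equivalent to the plain C below. This is only an illustrative sketch, not code from the patch. `put_pixels_ref` and its `width` parameter are hypothetical names introduced here, and the sketch assumes the same argument order as the VP8_MC prototype (dst, dststride, src, srcstride, h), with the unused x and y arguments dropped.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference helper (not part of FFmpeg or of this patch).
 * Copies h rows of `width` bytes from src to dst, advancing each pointer
 * by its own stride after every row. The assembly does the same with one
 * lw/sw pair (width 4), one ld/sd pair (width 8) or two ld/sd pairs
 * (width 16) per row. */
static void put_pixels_ref(uint8_t *dst, ptrdiff_t dststride,
                           const uint8_t *src, ptrdiff_t srcstride,
                           int h, int width)
{
    /* Unlike this loop, the assembly decrements h before testing it,
     * so it relies on the caller passing h >= 1. */
    for (int row = 0; row < h; row++) {
        memcpy(dst, src, (size_t)width);
        src += srcstride;
        dst += dststride;
    }
}
```

Calling `put_pixels_ref(dst, dststride, src, srcstride, h, 16)` should produce the same output as ff_put_vp8_pixels16_rvi for h >= 1, which may be handy when checking the assembly by hand.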