From patchwork Fri Mar 22 06:03:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47303
From: flow gg
Date: Fri, 22 Mar 2024 14:03:47 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 1/7] lavc/vp9dsp: R-V mc copy_avg

(This should be used after applying these patches)

```
[FFmpeg-devel] [PATCH 1/4] lavc/vp9dsp: R-V V ipred dc 1-4
```

From ea81872215165ff859a0b5b2e003c5c678ea8ed0 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Thu, 21 Mar 2024 22:01:18 +0800
Subject: [PATCH 1/7] lavc/vp9dsp: R-V mc copy_avg

vp9_avg4_8bpp_c: 1.2
vp9_avg4_8bpp_rvv_i64: 1.0
vp9_avg8_8bpp_c: 3.7
vp9_avg8_8bpp_rvv_i64: 1.5
vp9_avg16_8bpp_c: 14.7
vp9_avg16_8bpp_rvv_i64: 3.5
vp9_avg32_8bpp_c: 57.7
vp9_avg32_8bpp_rvv_i64: 10.0
vp9_avg64_8bpp_c: 229.0
vp9_avg64_8bpp_rvv_i64: 31.7
vp9_put4_8bpp_c: 0.7
vp9_put4_8bpp_rvi: 0.2
vp9_put8_8bpp_c: 2.5
vp9_put8_8bpp_rvi: 0.5
vp9_put16_8bpp_c: 16.5
vp9_put16_8bpp_rvv_i64: 1.7
vp9_put32_8bpp_c: 37.2
vp9_put32_8bpp_rvv_i64: 5.7
vp9_put64_8bpp_c: 91.2
vp9_put64_8bpp_rvv_i64: 19.7
---
 libavcodec/riscv/Makefile      |  4 ++-
 libavcodec/riscv/vp9_mc_rvi.S  | 43 +++++++++++++++++++++++
 libavcodec/riscv/vp9_mc_rvv.S  | 64 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp_init.c | 47 +++++++++++++++++++++++++
 4 files changed, 157 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/riscv/vp9_mc_rvi.S
 create mode 100644 libavcodec/riscv/vp9_mc_rvv.S

diff --git a/libavcodec/riscv/Makefile b/libavcodec/riscv/Makefile
index c237e60800..dce1236b84 100644
--- a/libavcodec/riscv/Makefile
+++ b/libavcodec/riscv/Makefile
@@ -61,6 +61,8 @@ RVV-OBJS-$(CONFIG_VC1DSP) += riscv/vc1dsp_rvv.o
 OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_init.o
 RVV-OBJS-$(CONFIG_VP8DSP) += riscv/vp8dsp_rvv.o
 OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9dsp_init.o
-RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o
+RV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_mc_rvi.o
+RVV-OBJS-$(CONFIG_VP9_DECODER) += riscv/vp9_intra_rvv.o \
+                                  riscv/vp9_mc_rvv.o
 OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_init.o
 RVV-OBJS-$(CONFIG_VORBIS_DECODER) += riscv/vorbisdsp_rvv.o
diff --git a/libavcodec/riscv/vp9_mc_rvi.S b/libavcodec/riscv/vp9_mc_rvi.S
new file mode 100644
index 0000000000..03d8dbbbae
--- /dev/null
+++ b/libavcodec/riscv/vp9_mc_rvi.S
@@ -0,0 +1,43 @@
+/*
+ * Copyright (c) 2024 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+func ff_copy8_rvi
+1:
+        addi            a4, a4, -1
+        ld              t4, (a2)
+        sd              t4, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+        ret
+endfunc
+
+func ff_copy4_rvi
+1:
+        addi            a4, a4, -1
+        lw              t4, (a2)
+        sw              t4, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
new file mode 100644
index 0000000000..ba9ec3431f
--- /dev/null
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -0,0 +1,64 @@
+/*
+ * Copyright (c) 2024 Institute of Software Chinese Academy of Sciences (ISCAS).
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/riscv/asm.S"
+
+.macro copy_avg len type
+.ifc \type,avg
+        csrwi           vxrm, 0
+.endif
+.ifc \len,64
+        li              t5, 64
+        vsetvli         t0, t5, e8, m4, ta, ma
+.elseif \len == 32
+        li              t5, 32
+        vsetvli         t0, t5, e8, m2, ta, ma
+.elseif \len == 16
+        vsetivli        t0, 16, e8, m1, ta, ma
+.elseif \len == 8
+        vsetivli        t0, 8, e8, mf2, ta, ma
+.elseif \len == 4
+        vsetivli        t0, 4, e8, mf4, ta, ma
+.endif
+1:
+        addi            a4, a4, -1
+        vle8.v          v8, (a2)
+.ifc \type,avg
+        vle8.v          v16, (a0)
+        vaaddu.vv       v8, v8, v16
+.endif
+        vse8.v          v8, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+        ret
+.endm
+
+.irp len 64, 32, 16
+func ff_copy\len\()_rvv, zve32x
+        copy_avg        \len copy
+endfunc
+.endr
+
+.irp len 64, 32, 16, 8, 4
+func ff_avg\len\()_rvv, zve32x
+        copy_avg        \len avg
+endfunc
+.endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index f08c8f6a42..c602c38bb2 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -65,7 +65,54 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
 #endif
 }
 
+static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
+{
+#if HAVE_RV
+    int flags = av_get_cpu_flags();
+
+    if (flags & AV_CPU_FLAG_RVI) {
+        dsp->mc[3][FILTER_8TAP_SMOOTH][0][0][0] = ff_copy8_rvi;
+        dsp->mc[3][FILTER_8TAP_REGULAR][0][0][0] = ff_copy8_rvi;
+        dsp->mc[3][FILTER_8TAP_SHARP][0][0][0] = ff_copy8_rvi;
+        dsp->mc[3][FILTER_BILINEAR][0][0][0] = ff_copy8_rvi;
+        dsp->mc[4][FILTER_8TAP_SMOOTH][0][0][0] = ff_copy4_rvi;
+        dsp->mc[4][FILTER_8TAP_REGULAR][0][0][0] = ff_copy4_rvi;
+        dsp->mc[4][FILTER_8TAP_SHARP][0][0][0] = ff_copy4_rvi;
+        dsp->mc[4][FILTER_BILINEAR][0][0][0] = ff_copy4_rvi;
+    }
+
+#if HAVE_RVV
+    if (bpp == 8 && flags & AV_CPU_FLAG_RVV_I64 && ff_get_rv_vlenb() >= 16) {
+
+#define init_fpel(idx1, idx2, sz, type)                                      \
+    dsp->mc[idx1][FILTER_8TAP_SMOOTH ][idx2][0][0] = ff_##type##sz##_rvv;    \
+    dsp->mc[idx1][FILTER_8TAP_REGULAR][idx2][0][0] = ff_##type##sz##_rvv;    \
+    dsp->mc[idx1][FILTER_8TAP_SHARP  ][idx2][0][0] = ff_##type##sz##_rvv;    \
+    dsp->mc[idx1][FILTER_BILINEAR    ][idx2][0][0] = ff_##type##sz##_rvv
+
+#define init_copy_avg(idx, sz) \
+    init_fpel(idx, 0, sz, copy); \
+    init_fpel(idx, 1, sz, avg)
+
+#define init_avg(idx, sz) \
+    init_fpel(idx, 1, sz, avg)
+
+    init_copy_avg(0, 64);
+    init_copy_avg(1, 32);
+    init_copy_avg(2, 16);
+    init_avg(3, 8);
+    init_avg(4, 4);
+
+#undef init_copy_avg
+#undef init_avg
+#undef init_fpel
+    }
+#endif
+#endif
+}
+
 av_cold void ff_vp9dsp_init_riscv(VP9DSPContext *dsp, int bpp, int bitexact)
 {
     vp9dsp_intrapred_init_rvv(dsp, bpp);
+    vp9dsp_mc_init_rvv(dsp, bpp);
 }
-- 
2.44.0

From patchwork Fri Mar 22 06:04:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47302
From: flow gg
Date: Fri, 22 Mar 2024 14:04:10 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 2/7] lavc/vp9dsp: R-V V mc bilin h

From 7ad03f4bc70e4c334d8e52dce2ea2b6f09a9a244 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Thu, 21 Mar 2024 22:11:26 +0800
Subject: [PATCH 2/7] lavc/vp9dsp: R-V V mc bilin h

C908:
vp9_avg_bilin_4h_8bpp_c: 5.5
vp9_avg_bilin_4h_8bpp_rvv_i64: 2.5
vp9_avg_bilin_8h_8bpp_c: 19.7
vp9_avg_bilin_8h_8bpp_rvv_i64: 5.0
vp9_avg_bilin_16h_8bpp_c: 78.2
vp9_avg_bilin_16h_8bpp_rvv_i64: 10.0
vp9_avg_bilin_32h_8bpp_c: 325.2
vp9_avg_bilin_32h_8bpp_rvv_i64: 28.5
vp9_avg_bilin_64h_8bpp_c: 1266.2
vp9_avg_bilin_64h_8bpp_rvv_i64: 115.0
vp9_put_bilin_4h_8bpp_c: 4.5
vp9_put_bilin_4h_8bpp_rvv_i64: 2.2
vp9_put_bilin_8h_8bpp_c: 16.7
vp9_put_bilin_8h_8bpp_rvv_i64: 4.2
vp9_put_bilin_16h_8bpp_c: 65.2
vp9_put_bilin_16h_8bpp_rvv_i64: 8.7
vp9_put_bilin_32h_8bpp_c: 273.5
vp9_put_bilin_32h_8bpp_rvv_i64: 26.7
vp9_put_bilin_64h_8bpp_c: 1041.0
vp9_put_bilin_64h_8bpp_rvv_i64: 87.2
---
 libavcodec/riscv/vp9_mc_rvv.S  | 73 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp_init.c | 17 ++++++++
 2 files changed, 90 insertions(+)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index ba9ec3431f..a97807633e 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -51,6 +51,72 @@
         ret
 .endm
 
+.macro bilin_h_load dst len type
+.ifc \len,4
+        vsetivli        zero, 5, e8, mf2, ta, ma
+.elseif \len == 8
+        vsetivli        zero, 9, e8, m1, ta, ma
+.elseif \len == 16
+        vsetivli        zero, 17, e8, m2, ta, ma
+.elseif \len == 32
+        li              t0, 33
+        vsetvli         zero, t0, e8, m4, ta, ma
+.elseif \len == 64
+        li              t0, 65
+        vsetvli         zero, t0, e8, m8, ta, ma
+.endif
+
+        vle8.v          v8, (a2)
+        vslide1down.vx  v0, v8, t5
+
+.ifc \len,4
+        vsetivli        zero, 4, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetivli        zero, 8, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetivli        zero, 16, e8, m1, ta, ma
+.elseif \len == 32
+        li              t0, 32
+        vsetvli         zero, t0, e8, m2, ta, ma
+.elseif \len == 64
+        li              t0, 64
+        vsetvli         zero, t0, e8, m4, ta, ma
+.endif
+
+        vwmulu.vx       v16, v0, a5
+        vwmaccsu.vx     v16, t1, v8
+        vwadd.wx        v16, v16, t4
+        vnsra.wi        v16, v16, 4
+        vadd.vv         \dst, v16, v8
+
+.ifc \type,put
+        vadd.vv         \dst, v16, v8
+.elseif \type == avg
+        vadd.vv         v16, v16, v8
+        vle8.v          \dst, (a0)
+        vaaddu.vv       \dst, \dst, v16
+.endif
+
+.endm
+
+.macro bilin_h len type
+.ifc \type,avg
+        csrwi           vxrm, 0
+.endif
+        li              t4, 8
+        li              t5, 1
+        neg             t1, a5
+1:
+        addi            a4, a4, -1
+        bilin_h_load    v0, \len, \type
+        vse8.v          v0, (a0)
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+
+        ret
+.endm
+
 .irp len 64, 32, 16
 func ff_copy\len\()_rvv, zve32x
         copy_avg        \len copy
@@ -61,4 +127,11 @@ endfunc
 func ff_avg\len\()_rvv, zve32x
         copy_avg        \len avg
 endfunc
+
+func ff_put_bilin_\len\()h_rvv, zve32x
+        bilin_h         \len put
+endfunc
+func ff_avg_bilin_\len\()h_rvv, zve32x
+        bilin_h         \len avg
+endfunc
 .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index c602c38bb2..d6d6fb52cc 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -106,6 +106,23 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
 #undef init_copy_avg
 #undef init_avg
 #undef init_fpel
+
+#define init_subpel1(idx1, idx2, idxh, idxv, sz, dir, type) \
+    dsp->mc[idx1][FILTER_BILINEAR ][idx2][idxh][idxv] = \
+        ff_##type##_bilin_##sz##dir##_rvv;
+
+#define init_subpel2(idx, idxh, idxv, dir, type) \
+    init_subpel1(0, idx, idxh, idxv, 64, dir, type); \
+    init_subpel1(1, idx, idxh, idxv, 32, dir, type); \
+    init_subpel1(2, idx, idxh, idxv, 16, dir, type); \
+    init_subpel1(3, idx, idxh, idxv, 8, dir, type); \
+    init_subpel1(4, idx, idxh, idxv, 4, dir, type)
+
+#define init_subpel3(idx, type) \
+    init_subpel2(idx, 1, 0, h, type)
+
+    init_subpel3(0, put);
+    init_subpel3(1, avg);
     }
 #endif
 #endif
-- 
2.44.0

From patchwork Fri Mar 22 06:04:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47300
From: flow gg
Date: Fri, 22 Mar 2024 14:04:38 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 3/7] lavc/vp9dsp: R-V V mc tap h

The instruction ordering looks suboptimal in places because, when len == 32, operations such as hv only just fit in the available vector registers, which leaves little room to reorder them. A separate function for len < 32 would be possible, but it is unlikely to make a significant difference, so this has not been done yet.
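
For anyone following the arithmetic rather than the register allocation, here is a minimal scalar sketch in C of what the horizontal 8-tap path appears to compute per pixel, as far as it can be read from the assembly in this patch: a multiply-accumulate over src[x-3] .. src[x+4], a rounding term of 64, a right shift by 7, a clamp to 0..255 and, for avg, a round-up average with dst (the effect of vaaddu with vxrm = 0). The names tap8_h_sketch and clip_u8, the int16_t taps parameter and the avg flag are hypothetical and exist only for this illustration; this is neither the FFmpeg C reference nor code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Hypothetical scalar model of the horizontal 8-tap filter (8 bpp).
 * taps[] is one 8-entry row of a subpel filter table; each row sums
 * to 128, hence the +64 rounding term and the shift by 7.
 * src must be readable from src[-3] up to src[w + 3] on every row.
 * If avg is nonzero the result is averaged with dst, rounding up.
 */
static void tap8_h_sketch(uint8_t *dst, ptrdiff_t dst_stride,
                          const uint8_t *src, ptrdiff_t src_stride,
                          int w, int h, const int16_t taps[8], int avg)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int sum = 64;
            for (int k = 0; k < 8; k++)
                sum += taps[k] * src[x + k - 3];
            /* Clamp negatives before shifting, then clamp the top end. */
            uint8_t px = clip_u8(sum < 0 ? 0 : sum >> 7);
            dst[x] = avg ? (uint8_t)((dst[x] + px + 1) >> 1) : px;
        }
        dst += dst_stride;
        src += src_stride;
    }
}
```

Passing the first row of any of the tables (0, 0, 0, 128, 0, 0, 0, 0) should copy src unchanged, which makes a quick sanity check for the rounding and the shift.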
From d9044b400f5a161928a920f0399e5e0715f0c8e6 Mon Sep 17 00:00:00 2001 From: sunyuechi Date: Thu, 21 Mar 2024 22:53:59 +0800 Subject: [PATCH 3/7] lavc/vp9dsp: R-V V mc tap h C908: vp9_avg_8tap_smooth_4h_8bpp_c: 12.7 vp9_avg_8tap_smooth_4h_8bpp_rvv_i64: 5.0 vp9_avg_8tap_smooth_8h_8bpp_c: 48.5 vp9_avg_8tap_smooth_8h_8bpp_rvv_i64: 9.2 vp9_avg_8tap_smooth_16h_8bpp_c: 191.7 vp9_avg_8tap_smooth_16h_8bpp_rvv_i64: 21.0 vp9_avg_8tap_smooth_32h_8bpp_c: 780.0 vp9_avg_8tap_smooth_32h_8bpp_rvv_i64: 66.5 vp9_avg_8tap_smooth_64h_8bpp_c: 3123.7 vp9_avg_8tap_smooth_64h_8bpp_rvv_i64: 264.2 vp9_put_8tap_smooth_4h_8bpp_c: 11.0 vp9_put_8tap_smooth_4h_8bpp_rvv_i64: 4.2 vp9_put_8tap_smooth_8h_8bpp_c: 42.0 vp9_put_8tap_smooth_8h_8bpp_rvv_i64: 8.2 vp9_put_8tap_smooth_16h_8bpp_c: 165.5 vp9_put_8tap_smooth_16h_8bpp_rvv_i64: 19.7 vp9_put_8tap_smooth_32h_8bpp_c: 659.0 vp9_put_8tap_smooth_32h_8bpp_rvv_i64: 64.0 vp9_put_8tap_smooth_64h_8bpp_c: 2682.0 vp9_put_8tap_smooth_64h_8bpp_rvv_i64: 272.2 --- libavcodec/riscv/vp9_mc_rvv.S | 232 +++++++++++++++++++++++++++++++++ libavcodec/riscv/vp9dsp_init.c | 8 +- 2 files changed, 239 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S index a97807633e..eacc174bc4 100644 --- a/libavcodec/riscv/vp9_mc_rvv.S +++ b/libavcodec/riscv/vp9_mc_rvv.S @@ -123,6 +123,230 @@ func ff_copy\len\()_rvv, zve32x endfunc .endr +subpel_filters_regular: + .byte 0, 0, 0, 128, 0, 0, 0, 0 + .byte 0, 1, -5, 126, 8, -3, 1, 0 + .byte -1, 3, -10, 122, 18, -6, 2, 0 + .byte -1, 4, -13, 118, 27, -9, 3, -1 + .byte -1, 4, -16, 112, 37, -11, 4, -1 + .byte -1, 5, -18, 105, 48, -14, 4, -1 + .byte -1, 5, -19, 97, 58, -16, 5, -1 + .byte -1, 6, -19, 88, 68, -18, 5, -1 + .byte -1, 6, -19, 78, 78, -19, 6, -1 + .byte -1, 5, -18, 68, 88, -19, 6, -1 + .byte -1, 5, -16, 58, 97, -19, 5, -1 + .byte -1, 4, -14, 48, 105, -18, 5, -1 + .byte -1, 4, -11, 37, 112, -16, 4, -1 + .byte -1, 3, -9, 27, 118, -13, 4, -1 + .byte 0, 2, -6, 18, 122, -10, 3, -1 + 
+        .byte 0, 1, -3, 8, 126, -5, 1, 0
+subpel_filters_sharp:
+        .byte 0, 0, 0, 128, 0, 0, 0, 0
+        .byte -1, 3, -7, 127, 8, -3, 1, 0
+        .byte -2, 5, -13, 125, 17, -6, 3, -1
+        .byte -3, 7, -17, 121, 27, -10, 5, -2
+        .byte -4, 9, -20, 115, 37, -13, 6, -2
+        .byte -4, 10, -23, 108, 48, -16, 8, -3
+        .byte -4, 10, -24, 100, 59, -19, 9, -3
+        .byte -4, 11, -24, 90, 70, -21, 10, -4
+        .byte -4, 11, -23, 80, 80, -23, 11, -4
+        .byte -4, 10, -21, 70, 90, -24, 11, -4
+        .byte -3, 9, -19, 59, 100, -24, 10, -4
+        .byte -3, 8, -16, 48, 108, -23, 10, -4
+        .byte -2, 6, -13, 37, 115, -20, 9, -4
+        .byte -2, 5, -10, 27, 121, -17, 7, -3
+        .byte -1, 3, -6, 17, 125, -13, 5, -2
+        .byte 0, 1, -3, 8, 127, -7, 3, -1
+subpel_filters_smooth:
+        .byte 0, 0, 0, 128, 0, 0, 0, 0
+        .byte -3, -1, 32, 64, 38, 1, -3, 0
+        .byte -2, -2, 29, 63, 41, 2, -3, 0
+        .byte -2, -2, 26, 63, 43, 4, -4, 0
+        .byte -2, -3, 24, 62, 46, 5, -4, 0
+        .byte -2, -3, 21, 60, 49, 7, -4, 0
+        .byte -1, -4, 18, 59, 51, 9, -4, 0
+        .byte -1, -4, 16, 57, 53, 12, -4, -1
+        .byte -1, -4, 14, 55, 55, 14, -4, -1
+        .byte -1, -4, 12, 53, 57, 16, -4, -1
+        .byte 0, -4, 9, 51, 59, 18, -4, -1
+        .byte 0, -4, 7, 49, 60, 21, -3, -2
+        .byte 0, -4, 5, 46, 62, 24, -3, -2
+        .byte 0, -4, 4, 43, 63, 26, -2, -2
+        .byte 0, -3, 2, 41, 63, 29, -2, -2
+        .byte 0, -3, 1, 38, 64, 32, -1, -3
+
+.macro epel_filter name type regtype
+        lla \regtype\()2, subpel_filters_\name
+        li \regtype\()1, 8
+        mul \regtype\()0, a5, \regtype\()1
+        add \regtype\()0, \regtype\()0, \regtype\()2
+        .irp n 1,2,3,4,5,6
+        lb \regtype\n, \n(\regtype\()0)
+        .endr
+.ifc \regtype,t
+        lb a7, 7(\regtype\()0)
+.elseif \regtype == s
+        lb s7, 7(\regtype\()0)
+.endif
+        lb \regtype\()0, 0(\regtype\()0)
+.endm
+
+.macro epel_load dst len do name type from_mem regtype
+        li a5, 64
+.ifc \from_mem, 1
+        vle8.v v22, (a2)
+        addi a2, a2, -1
+        vle8.v v20, (a2)
+        addi a2, a2, 2
+        vle8.v v24, (a2)
+        addi a2, a2, 1
+        vle8.v v26, (a2)
+        addi a2, a2, 1
+        vle8.v v28, (a2)
+        addi a2, a2, 1
+        vle8.v v30, (a2)
+
+.ifc \name,smooth
+        vwmulu.vx v16, v24, \regtype\()4
+        vwmaccu.vx v16, \regtype\()2, v20
+        vwmaccu.vx v16, \regtype\()5, v26
+        vwmaccsu.vx v16, \regtype\()6, v28
+.else
+        vwmulu.vx v16, v28, \regtype\()6
+        vwmaccsu.vx v16, \regtype\()2, v20
+        vwmaccsu.vx v16, \regtype\()5, v26
+.endif
+
+.ifc \regtype,t
+        vwmaccsu.vx v16, a7, v30
+.elseif \regtype == s
+        vwmaccsu.vx v16, s7, v30
+.endif
+
+        addi a2, a2, -6
+        vle8.v v28, (a2)
+        addi a2, a2, -1
+        vle8.v v26, (a2)
+        addi a2, a2, 3
+
+.ifc \name,smooth
+        vwmaccsu.vx v16, \regtype\()1, v28
+.else
+        vwmaccu.vx v16, \regtype\()1, v28
+        vwmulu.vx v28, v24, \regtype\()4
+.endif
+        vwmaccsu.vx v16, \regtype\()0, v26
+        vwmulu.vx v20, v22, \regtype\()3
+.else
+.ifc \name,smooth
+        vwmulu.vx v16, v8, \regtype\()4
+        vwmaccu.vx v16, \regtype\()2, v4
+        vwmaccu.vx v16, \regtype\()5, v10
+        vwmaccsu.vx v16, \regtype\()6, v12
+        vwmaccsu.vx v16, \regtype\()1, v2
+.else
+        vwmulu.vx v16, v2, \regtype\()1
+        vwmaccu.vx v16, \regtype\()6, v12
+        vwmaccsu.vx v16, \regtype\()5, v10
+        vwmaccsu.vx v16, \regtype\()2, v4
+        vwmulu.vx v28, v8, \regtype\()4
+.endif
+        vwmaccsu.vx v16, \regtype\()0, v0
+        vwmulu.vx v20, v6, \regtype\()3
+
+.ifc \regtype,t
+        vwmaccsu.vx v16, a7, v14
+.elseif \regtype == s
+        vwmaccsu.vx v16, s7, v14
+.endif
+
+.endif
+        vwadd.wx v16, v16, a5
+.ifc \len,4
+        vsetvli zero, zero, e16, mf2, ta, ma
+.elseif \len == 8
+        vsetvli zero, zero, e16, m1, ta, ma
+.elseif \len == 16
+        vsetvli zero, zero, e16, m2, ta, ma
+.else
+        vsetvli zero, zero, e16, m4, ta, ma
+.endif
+
+.ifc \name,smooth
+        vwadd.vv v24, v16, v20
+.else
+        vwadd.vv v24, v16, v28
+        vwadd.wv v24, v24, v20
+.endif
+        vnsra.wi v24, v24, 7
+        vmax.vx v24, v24, zero
+.ifc \len,4
+        vsetvli zero, zero, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetvli zero, zero, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetvli zero, zero, e8, m1, ta, ma
+.else
+        vsetvli zero, zero, e8, m2, ta, ma
+.endif
+
+.ifc \do,put
+        vnclipu.wi \dst, v24, 0
+.elseif \do == avg
+        vle8.v \dst, (a0)
+        vnclipu.wi v24, v24, 0
+        vaaddu.vv \dst, \dst, v24
+.endif
+
+.endm
+
+.macro epel_load_inc dst len do name type from_mem regtype
+        epel_load \dst \len \do \name \type \from_mem \regtype
+        add a2, a2, a3
+.endm
+
+.macro epel len do name type
+        epel_filter \name \type t
+
+.ifc \len,4
+        vsetivli zero, 4, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetivli zero, 8, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetivli zero, 16, e8, m1, ta, ma
+.else
+        li a5, 32
+        vsetvli zero, a5, e8, m2, ta, ma
+.endif
+.ifc \do,avg
+        csrwi vxrm, 0
+.endif
+
+1:
+        addi a4, a4, -1
+        epel_load v30 \len \do \name \type 1 t
+        vse8.v v30, (a0)
+.ifc \len,64
+        addi a0, a0, 32
+        addi a2, a2, 32
+        epel_load v30 \len \do \name \type 1 t
+        vse8.v v30, (a0)
+        addi a0, a0, -32
+        addi a2, a2, -32
+.endif
+        add a2, a2, a3
+        add a0, a0, a1
+        bnez a4, 1b
+
+        ret
+.endm
+
+.macro gen_epel len do name type
+func ff_\do\()_8tap_\name\()_\len\()\type\()_rvv, zve32x
+        epel \len \do \name \type
+endfunc
+.endm
+
 .irp len 64, 32, 16, 8, 4
 func ff_avg\len\()_rvv, zve32x
         copy_avg \len avg
@@ -134,4 +358,12 @@ endfunc
 func ff_avg_bilin_\len\()h_rvv, zve32x
         bilin_h \len avg
 endfunc
+
+.irp name regular sharp smooth
+        .irp do put avg
+                .irp type h
+                        gen_epel \len \do \name \type
+                .endr
+        .endr
+.endr
 .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index d6d6fb52cc..413b203e5f 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -109,7 +109,13 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
 
 #define init_subpel1(idx1, idx2, idxh, idxv, sz, dir, type) \
     dsp->mc[idx1][FILTER_BILINEAR    ][idx2][idxh][idxv] = \
-        ff_##type##_bilin_##sz##dir##_rvv;
+        ff_##type##_bilin_##sz##dir##_rvv; \
+    dsp->mc[idx1][FILTER_8TAP_SMOOTH ][idx2][idxh][idxv] = \
+        ff_##type##_8tap_smooth_##sz##dir##_rvv; \
+    dsp->mc[idx1][FILTER_8TAP_REGULAR][idx2][idxh][idxv] = \
+        ff_##type##_8tap_regular_##sz##dir##_rvv; \
+    dsp->mc[idx1][FILTER_8TAP_SHARP  ][idx2][idxh][idxv] = \
+        ff_##type##_8tap_sharp_##sz##dir##_rvv;

 #define init_subpel2(idx, idxh, idxv, dir, type) \
     init_subpel1(0, idx, idxh, idxv, 64, dir, type); \
-- 
2.44.0

From patchwork Fri Mar 22 06:04:52 2024
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47301
From: flow gg
Date: Fri, 22 Mar 2024 14:04:52 +0800
Subject: [FFmpeg-devel] [PATCH 4/7] lavc/vp9dsp: R-V V mc bilin v

From eb004dcf5cc6a3c379cb6cb7b8592afa65626c5c Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Thu, 21 Mar 2024 23:00:19 +0800
Subject: [PATCH 4/7] lavc/vp9dsp: R-V V mc bilin v

C908:
vp9_avg_bilin_4v_8bpp_c: 5.5
vp9_avg_bilin_4v_8bpp_rvv_i64: 2.2
vp9_avg_bilin_8v_8bpp_c: 20.7
vp9_avg_bilin_8v_8bpp_rvv_i64: 4.2
vp9_avg_bilin_16v_8bpp_c: 82.2
vp9_avg_bilin_16v_8bpp_rvv_i64: 9.0
vp9_avg_bilin_32v_8bpp_c: 342.5
vp9_avg_bilin_32v_8bpp_rvv_i64: 27.0
vp9_avg_bilin_64v_8bpp_c: 1319.2
vp9_avg_bilin_64v_8bpp_rvv_i64: 93.2
vp9_put_bilin_4v_8bpp_c: 4.7
vp9_put_bilin_4v_8bpp_rvv_i64: 1.7
vp9_put_bilin_8v_8bpp_c: 17.7
vp9_put_bilin_8v_8bpp_rvv_i64: 3.2
vp9_put_bilin_16v_8bpp_c: 69.2
vp9_put_bilin_16v_8bpp_rvv_i64: 7.5
vp9_put_bilin_32v_8bpp_c: 274.2
vp9_put_bilin_32v_8bpp_rvv_i64: 23.2
vp9_put_bilin_64v_8bpp_c: 1109.5
vp9_put_bilin_64v_8bpp_rvv_i64: 82.2
---
 libavcodec/riscv/vp9_mc_rvv.S | 49 +++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index eacc174bc4..9458a2e82b 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -117,6 +117,49 @@
         ret
 .endm
 
+.macro bilin_v len type
+.ifc \type,avg
+        csrwi vxrm, 0
+.endif
+.ifc \len,4
+        vsetivli zero, 4, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetivli zero, 8, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetivli zero, 16, e8, m1, ta, ma
+.elseif \len == 32
+        li t0, 32
+        vsetvli zero, t0, e8, m2, ta, ma
+.elseif \len == 64
+        li t0, 64
+        vsetvli zero, t0, e8, m4, ta, ma
+.endif
+        li t4, 8
+        neg t1, a6
+1:
+        add t2, a2, a3
+        addi a4, a4, -1
+        vle8.v v0, (a2)
+        vle8.v v8, (t2)
+.ifc \type,avg
+        vle8.v v16, (a0)
+.endif
+        vwmulu.vx v24, v8, a6
+        vwmaccsu.vx v24, t1, v0
+        vwadd.wx v24, v24, t4
+        vnsra.wi v24, v24, 4
+        vadd.vv v0, v24, v0
+.ifc \type,avg
+        vaaddu.vv v0, v0, v16
+.endif
+        vse8.v v0, (a0)
+        add a2, a2, a3
+        add a0, a0, a1
+        bnez a4, 1b
+
+        ret
+.endm
+
 .irp len 64, 32, 16
 func ff_copy\len\()_rvv, zve32x
         copy_avg \len copy
@@ -358,6 +401,12 @@ endfunc
 func ff_avg_bilin_\len\()h_rvv, zve32x
         bilin_h \len avg
 endfunc
+func ff_put_bilin_\len\()v_rvv, zve32x
+        bilin_v \len put
+endfunc
+func ff_avg_bilin_\len\()v_rvv, zve32x
+        bilin_v \len avg
+endfunc
 
 .irp name regular sharp smooth
         .irp do put avg
-- 
2.44.0

From patchwork Fri Mar 22 06:05:17 2024
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47304
From: flow gg
Date: Fri, 22 Mar 2024 14:05:17 +0800
Subject: [FFmpeg-devel] [PATCH 5/7] lavc/vp9dsp: R-V V mc tap v

From 94aacf6d1d49cc009669f89c91db71038a13285d Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Thu, 21 Mar 2024 23:08:01 +0800
Subject: [PATCH 5/7] lavc/vp9dsp: R-V V mc tap v

C908:
vp9_avg_8tap_smooth_4v_8bpp_c: 13.7
vp9_avg_8tap_smooth_4v_8bpp_rvv_i64: 5.0
vp9_avg_8tap_smooth_8v_8bpp_c: 49.7
vp9_avg_8tap_smooth_8v_8bpp_rvv_i64: 9.2
vp9_avg_8tap_smooth_16v_8bpp_c: 191.5
vp9_avg_8tap_smooth_16v_8bpp_rvv_i64: 21.2
vp9_avg_8tap_smooth_32v_8bpp_c: 770.5
vp9_avg_8tap_smooth_32v_8bpp_rvv_i64: 66.0
vp9_avg_8tap_smooth_64v_8bpp_c: 3068.0
vp9_avg_8tap_smooth_64v_8bpp_rvv_i64: 262.5
vp9_put_8tap_smooth_4v_8bpp_c: 12.0
vp9_put_8tap_smooth_4v_8bpp_rvv_i64: 4.5
vp9_put_8tap_smooth_8v_8bpp_c: 43.7
vp9_put_8tap_smooth_8v_8bpp_rvv_i64: 8.5
vp9_put_8tap_smooth_16v_8bpp_c: 168.7
vp9_put_8tap_smooth_16v_8bpp_rvv_i64: 20.0
vp9_put_8tap_smooth_32v_8bpp_c: 681.5
vp9_put_8tap_smooth_32v_8bpp_rvv_i64: 63.7
vp9_put_8tap_smooth_64v_8bpp_c: 2692.7
vp9_put_8tap_smooth_64v_8bpp_rvv_i64: 253.5
---
 libavcodec/riscv/vp9_mc_rvv.S  | 32 +++++++++++++++++++++++++++++++-
 libavcodec/riscv/vp9dsp_init.c |  3 ++-
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 9458a2e82b..2d4b56516f 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -221,7 +221,11 @@ subpel_filters_smooth:
 .macro epel_filter name type regtype
         lla \regtype\()2, subpel_filters_\name
         li \regtype\()1, 8
+.ifc \type,v
+        mul \regtype\()0, a6, \regtype\()1
+.elseif \type == h
         mul \regtype\()0, a5, \regtype\()1
+.endif
         add \regtype\()0, \regtype\()0, \regtype\()2
         .irp n 1,2,3,4,5,6
         lb \regtype\n, \n(\regtype\()0)
@@ -238,6 +242,19 @@ subpel_filters_smooth:
         li a5, 64
 .ifc \from_mem, 1
         vle8.v v22, (a2)
+.ifc \type,v
+        sub a2, a2, a3
+        vle8.v v20, (a2)
+        add a2, a2, a3
+        add a2, a2, a3
+        vle8.v v24, (a2)
+        add a2, a2, a3
+        vle8.v v26, (a2)
+        add a2, a2, a3
+        vle8.v v28, (a2)
+        add a2, a2, a3
+        vle8.v v30, (a2)
+.elseif \type == h
         addi a2, a2, -1
         vle8.v v20, (a2)
         addi a2, a2, 2
@@ -248,6 +265,7 @@ subpel_filters_smooth:
         vle8.v v28, (a2)
         addi a2, a2, 1
         vle8.v v30, (a2)
+.endif
 
 .ifc \name,smooth
         vwmulu.vx v16, v24, \regtype\()4
@@ -266,11 +284,23 @@ subpel_filters_smooth:
         vwmaccsu.vx v16, s7, v30
 .endif
 
+.ifc \type,v
+        .rept 6
+        sub a2, a2, a3
+        .endr
+        vle8.v v28, (a2)
+        sub a2, a2, a3
+        vle8.v v26, (a2)
+        .rept 3
+        add a2, a2, a3
+        .endr
+.elseif \type == h
         addi a2, a2, -6
         vle8.v v28, (a2)
         addi a2, a2, -1
         vle8.v v26, (a2)
         addi a2, a2, 3
+.endif
 
 .ifc \name,smooth
         vwmaccsu.vx v16, \regtype\()1, v28
@@ -410,7 +440,7 @@ endfunc
 
 .irp name regular sharp smooth
         .irp do put avg
-                .irp type h
+                .irp type h v
                         gen_epel \len \do \name \type
                 .endr
         .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 413b203e5f..da09918796 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -125,7 +125,8 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
     init_subpel1(4, idx, idxh, idxv, 4, dir, type)
 
 #define init_subpel3(idx, type) \
-    init_subpel2(idx, 1, 0, h, type)
+    init_subpel2(idx, 1, 0, h, type); \
+    init_subpel2(idx, 0, 1, v, type)
 
     init_subpel3(0, put);
     init_subpel3(1, avg);
-- 
2.44.0

From patchwork Fri Mar 22 06:05:36 2024
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47305
From: flow gg
Date: Fri, 22 Mar 2024 14:05:36 +0800
Subject: [FFmpeg-devel] [PATCH 6/7] lavc/vp9dsp: R-V V mc bilin hv

From 5df2835fd182378b78530e001669c65f3638946d Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Thu, 21 Mar 2024 23:14:10 +0800
Subject: [PATCH 6/7] lavc/vp9dsp: R-V V mc bilin hv

C908:
vp9_avg_bilin_4hv_8bpp_c: 10.7
vp9_avg_bilin_4hv_8bpp_rvv_i64: 4.5
vp9_avg_bilin_8hv_8bpp_c: 38.7
vp9_avg_bilin_8hv_8bpp_rvv_i64: 8.2
vp9_avg_bilin_16hv_8bpp_c: 147.2
vp9_avg_bilin_16hv_8bpp_rvv_i64: 32.2
vp9_avg_bilin_32hv_8bpp_c: 590.7
vp9_avg_bilin_32hv_8bpp_rvv_i64: 47.5
vp9_avg_bilin_64hv_8bpp_c: 2323.7
vp9_avg_bilin_64hv_8bpp_rvv_i64: 153.5
vp9_put_bilin_4hv_8bpp_c: 10.0
vp9_put_bilin_4hv_8bpp_rvv_i64: 3.7
vp9_put_bilin_8hv_8bpp_c: 35.2
vp9_put_bilin_8hv_8bpp_rvv_i64: 7.2
vp9_put_bilin_16hv_8bpp_c: 133.7
vp9_put_bilin_16hv_8bpp_rvv_i64: 14.2
vp9_put_bilin_32hv_8bpp_c: 521.7
vp9_put_bilin_32hv_8bpp_rvv_i64: 43.0
vp9_put_bilin_64hv_8bpp_c: 2098.0
vp9_put_bilin_64hv_8bpp_rvv_i64: 144.5
---
 libavcodec/riscv/vp9_mc_rvv.S | 37 +++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 2d4b56516f..1fad17266d 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -160,6 +160,37 @@
         ret
 .endm
 
+.macro bilin_hv len type
+.ifc \type,avg
+        csrwi vxrm, 0
+.endif
+        neg t1, a5
+        neg t2, a6
+        li t4, 8
+        li t5, 1
+        bilin_h_load v24, \len, put
+        add a2, a2, a3
+1:
+        addi a4, a4, -1
+        bilin_h_load v4, \len, put
+        vwmulu.vx v16, v4, a6
+        vwmaccsu.vx v16, t2, v24
+        vwadd.wx v16, v16, t4
+        vnsra.wi v16, v16, 4
+        vadd.vv v0, v16, v24
+.ifc \type,avg
+        vle8.v v16, (a0)
+        vaaddu.vv v0, v0, v16
+.endif
+        vse8.v v0, (a0)
+        vmv.v.v v24, v4
+        add a2, a2, a3
+        add a0, a0, a1
+        bnez a4, 1b
+
+        ret
+.endm
+
 .irp len 64, 32, 16
 func ff_copy\len\()_rvv, zve32x
         copy_avg \len copy
@@ -437,6 +468,12 @@ endfunc
 func ff_avg_bilin_\len\()v_rvv, zve32x
         bilin_v \len avg
 endfunc
+func ff_put_bilin_\len\()hv_rvv, zve32x
+        bilin_hv \len put
+endfunc
+func ff_avg_bilin_\len\()hv_rvv, zve32x
+        bilin_hv \len avg
+endfunc
 
 .irp name regular sharp smooth
         .irp do put avg
-- 
2.44.0

From patchwork Fri Mar 22 06:05:52 2024
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 47306
AJvYcCUDqALW7U20PbjFDegtS/sKqRv9DUY3yVaq3FURZ3SOPIxfYLPviG0qloc/27Or/VwfuDrRPV8LwhE500+W6U0KihiDTMU52Hc4ew== X-Google-Smtp-Source: AGHT+IFFqOuvrgbaVmsnxOBfnV3IZdKIt+VSvyxXNWUKOHf5R8WtaPkUbMAk3/fp1ggd7bp2a9tZ X-Received: by 2002:a50:950a:0:b0:568:75c5:8fb with SMTP id u10-20020a50950a000000b0056875c508fbmr237788eda.3.1711087575213; Thu, 21 Mar 2024 23:06:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1711087575; cv=none; d=google.com; s=arc-20160816; b=fFkn1vqG85TAmn9utaae/eJPZSubgE93Rb0rhj/1jtcFSYiT0RCtVncoBrxrjPp1hc 1VarwNCi1ed7j+5MSzIEO2DUZo2lRD1pxHP8OLhluuQDmNbcaEDHPTNtg7nWQo/XE24u kboGg+N8uTq3QrWinNdqXv3aA5JWyIhkLGrZRiJz+T9KIRrU9KMbkH4tTw3NWT1R0EFR ia6/nvqLmPEyfZrwXhKAYsQt7NxQtE0iVaDOLuKhvcr7lieohMlEftPbHgXTAAUVK1wc rNVRHA1gv2ETD8vwwl1utv30pn5nP2DQKbXjSUAyBF8aCOjjmWukXAI6k1HwMVnOtgZY BuSw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:subject:to :message-id:date:from:mime-version:dkim-signature:delivered-to; bh=wRajecvhrHGyibr33jIATN8pCf1/hgTiCP3g6savhAI=; fh=e5zN9xSzcxLA6bGo3lF+CqTbY/oLwzApV03EO/RBfgQ=; b=U3IYPF4ald64BVK/emo/vInA8BR7XVSYlpS6ZMn4TL1HxDyos+EzE/vVpYW0IdLrm1 ry64wdP7/U0WBDtLRIS501bNmpw94SiGLznv7vSAhiSYtJzwWb9xzTiVDgOYKzwnbwBz HPmIqgrJA+Qr7vOnd44/cVvyCo3vMWBGKUpQFu9dfvOglEtw2uEVzac4jPdmegVUvMkb Xxya5U8lZyUTYHQnJB/Y9k9lv3hByQawoIjN5ZlzyV2CAxN8M8VaZqwW2tdI5P3U8xIM eVfvfEtizRsZq6QeFOzkwN/C7347SqyVXBpjZIL8rT4LSrX8YpjvJvJFpLpRv72w/jPq 6WrA==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=ZhLUnikm; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id t19-20020a50ab53000000b0056bcba9d410si593507edc.372.2024.03.21.23.06.14; Thu, 21 Mar 2024 23:06:15 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=ZhLUnikm; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 9828368D5A3; Fri, 22 Mar 2024 08:06:12 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-qv1-f53.google.com (mail-qv1-f53.google.com [209.85.219.53]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 6036C68D534 for ; Fri, 22 Mar 2024 08:06:06 +0200 (EET) Received: by mail-qv1-f53.google.com with SMTP id 6a1803df08f44-69185f093f5so14134566d6.3 for ; Thu, 21 Mar 2024 23:06:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1711087564; x=1711692364; darn=ffmpeg.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=6ZP9EzW8eXVkj1cpLEYS74BqtHOSJEBDZEwolKsX24w=; b=ZhLUnikmJkD9thhWrnigPGxVU7DMainpZKm9hoRiYujrlWmkzigbBU7kpB6CLApwH7 jFU/k+8D/LZfou3eedflsrNegKB/7xJlF8K8QSOy920XLmJJLRt+jYHUYEeXJVS46H8g OaAV7EOcj5RgSH0NZYlqpJDgTLe/SUmgqw8eisA4+VjcrJop3t6afkLftxVPS/wMumOf PiEOnIO+SD807lsFnF8gu7VKenKq0fNJ+VLO9TLMc2oxXF8WVhRjt2H08k3QOutzz8aS Z85Gt62SxcUCfYs/nykWVcbW/hrgF/Zg9GdWJvh+FRkfrWHYEGHMqwPjT6GewOtEkDuZ GHhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711087564; x=1711692364; 
h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=6ZP9EzW8eXVkj1cpLEYS74BqtHOSJEBDZEwolKsX24w=; b=tW0um8IK/o8GuKfA6A9AbVYjsBiyhDn82SAxrH7Yp6wLmBLiU0MMS3QipnQn0xKRkW tXlcOh9r3vpqqXBhbPcTptAuHOWFVJgPkm5q9zkWkl0Z1i1GX1gmfRnxfeWDy6v2OdHd ZszZQuBVXSZVhgcdnufRuTZAnrRaTmuNo5yRMI2Jjbrp1Dd2bKkLm9e7cXoBnmwa0vZd Us2oy20/S/+6SRrCP9TvZx/3RYvlQlC3anX/Vl96CRXNcSnTQqg/6oDHkpQUw6oTf0RN /fVH8MuE7/AMSLIyYd835/C6S4MRCOYwVN78C/fqRV6JdONjtj13oapAaAhdW7V9+E9M +Pww==
X-Gm-Message-State: AOJu0Yz8kSQSBustknvM9HmtRJIHfSNSjuqoQvUobIYVmdZ4JngOhOHo fAn7B4aPnlonPiX/wIZL/c67PtM7CU3RR4c/doUjVNl/gf05AAQ4YXam7WfwwyonFLwmbty9g/T YpdX6OMTArAEGWdHjrqaV2/kmpWt0BRUJOjs=
X-Received: by 2002:a05:6214:27ec:b0:696:6712:66a0 with SMTP id jt12-20020a05621427ec00b00696671266a0mr1150330qvb.61.1711087564532; Thu, 21 Mar 2024 23:06:04 -0700 (PDT)
MIME-Version: 1.0
From: flow gg
Date: Fri, 22 Mar 2024 14:05:52 +0800
Message-ID:
To: FFmpeg development discussions and patches
X-Content-Filtered-By: Mailman/MimeDel 2.1.29
Subject: [FFmpeg-devel] [PATCH 7/7] lavc/vp9dsp: R-V V mc tap hv
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 0Bznb5POaBag

From 5d29de366bab4736b1e05e2167d976d344dd8c44 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Thu, 21 Mar 2024 23:21:18 +0800
Subject: [PATCH 7/7] lavc/vp9dsp: R-V V mc tap hv

C908:
vp9_avg_8tap_smooth_4hv_8bpp_c: 32.2
vp9_avg_8tap_smooth_4hv_8bpp_rvv_i64: 15.2
vp9_avg_8tap_smooth_8hv_8bpp_c: 98.5
vp9_avg_8tap_smooth_8hv_8bpp_rvv_i64: 23.5
vp9_avg_8tap_smooth_16hv_8bpp_c: 355.5
vp9_avg_8tap_smooth_16hv_8bpp_rvv_i64: 46.2
vp9_avg_8tap_smooth_32hv_8bpp_c: 1270.7
vp9_avg_8tap_smooth_32hv_8bpp_rvv_i64: 133.2
vp9_avg_8tap_smooth_64hv_8bpp_c: 4936.5
vp9_avg_8tap_smooth_64hv_8bpp_rvv_i64: 521.7
vp9_put_8tap_smooth_4hv_8bpp_c: 30.2
vp9_put_8tap_smooth_4hv_8bpp_rvv_i64: 14.2
vp9_put_8tap_smooth_8hv_8bpp_c: 91.5
vp9_put_8tap_smooth_8hv_8bpp_rvv_i64: 22.7
vp9_put_8tap_smooth_16hv_8bpp_c: 330.0
vp9_put_8tap_smooth_16hv_8bpp_rvv_i64: 45.0
vp9_put_8tap_smooth_32hv_8bpp_c: 1296.5
vp9_put_8tap_smooth_32hv_8bpp_rvv_i64: 131.0
vp9_put_8tap_smooth_64hv_8bpp_c: 4497.7
vp9_put_8tap_smooth_64hv_8bpp_rvv_i64: 513.2
---
 libavcodec/riscv/vp9_mc_rvv.S  | 79 ++++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp_init.c |  3 +-
 2 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/libavcodec/riscv/vp9_mc_rvv.S b/libavcodec/riscv/vp9_mc_rvv.S
index 1fad17266d..0b054db522 100644
--- a/libavcodec/riscv/vp9_mc_rvv.S
+++ b/libavcodec/riscv/vp9_mc_rvv.S
@@ -445,12 +445,90 @@ subpel_filters_smooth:
         ret
 .endm
 
+.macro epel_hv_once len name do
+        sub             a2, a2, a3
+        sub             a2, a2, a3
+        sub             a2, a2, a3
+        .irp n 0 2 4 6 8 10 12 14
+        epel_load_inc   v\n \len put \name h 1 t
+        .endr
+        addi            a4, a4, -1
+1:
+        addi            a4, a4, -1
+        epel_load       v30 \len \do \name v 0 s
+        vse8.v          v30, (a0)
+        vmv.v.v         v0, v2
+        vmv.v.v         v2, v4
+        vmv.v.v         v4, v6
+        vmv.v.v         v6, v8
+        vmv.v.v         v8, v10
+        vmv.v.v         v10, v12
+        vmv.v.v         v12, v14
+        epel_load       v14 \len put \name h 1 t
+        add             a2, a2, a3
+        add             a0, a0, a1
+        bnez            a4, 1b
+        epel_load       v30 \len \do \name v 0 s
+        vse8.v          v30, (a0)
+.endm
+
+.macro epel_hv do name len
+        addi            sp, sp, -64
+        .irp n 0,1,2,3,4,5,6,7
+        sd              s\n, \n\()<<3(sp)
+        .endr
+.ifc \len,64
+        addi            sp, sp, -48
+        .irp n 0,1,2,3,4,5
+        sd              a\n, \n\()<<3(sp)
+        .endr
+.endif
+.ifc \do,avg
+        csrwi           vxrm, 0
+.endif
+        epel_filter     \name h t
+        epel_filter     \name v s
+.ifc \len,4
+        vsetivli        zero, 4, e8, mf4, ta, ma
+.elseif \len == 8
+        vsetivli        zero, 8, e8, mf2, ta, ma
+.elseif \len == 16
+        vsetivli        zero, 16, e8, m1, ta, ma
+.else
+        li              a6, 32
+        vsetvli         zero, a6, e8, m2, ta, ma
+.endif
+        epel_hv_once    \len \name \do
+.ifc \len,64
+        .irp n 0,1,2,3,4,5
+        ld              a\n, \n\()<<3(sp)
+        .endr
+        addi            sp, sp, 48
+        addi            a0, a0, 32
+        addi            a2, a2, 32
+        epel_filter     \name h t
+        epel_hv_once    \len \name \do
+.endif
+        .irp n 0,1,2,3,4,5,6,7
+        ld              s\n, \n\()<<3(sp)
+        .endr
+        addi            sp, sp, 64
+
+        ret
+.endm
+
 .macro gen_epel len do name type
 func ff_\do\()_8tap_\name\()_\len\()\type\()_rvv, zve32x
         epel            \len \do \name \type
 endfunc
 .endm
 
+.macro gen_epelhv len name do
+func ff_\do\()_8tap_\name\()_\len\()hv_rvv, zve32x
+        epel_hv         \do \name \len
+endfunc
+.endm
+
 .irp len 64, 32, 16, 8, 4
 func ff_avg\len\()_rvv, zve32x
         copy_avg \len avg
@@ -480,6 +558,7 @@ endfunc
         .irp type h v
         gen_epel \len \do \name \type
         .endr
+        gen_epelhv \len \name \do
         .endr
         .endr
         .endr
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index da09918796..d27f5e7b85 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -126,7 +126,8 @@ static av_cold void vp9dsp_mc_init_rvv(VP9DSPContext *dsp, int bpp)
 
 #define init_subpel3(idx, type)         \
     init_subpel2(idx, 1, 0, h, type);   \
-    init_subpel2(idx, 0, 1, v, type)
+    init_subpel2(idx, 0, 1, v, type);   \
+    init_subpel2(idx, 1, 1, hv, type)
 
     init_subpel3(0, put);
     init_subpel3(1, avg);
-- 
2.44.0