From patchwork Sat Mar 2 07:42:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 46691
From: flow gg
Date: Sat, 2 Mar 2024 15:42:21 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 2/4] lavc/vp9dsp: R-V V ipred vert
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

From 7abd262daa281cee412a905ea75a5f10dd0b1fbe Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Fri, 1 Mar 2024 18:38:43 +0800
Subject: [PATCH 2/4] lavc/vp9dsp: R-V V ipred vert

C908:
vp9_vert_8x8_8bpp_c: 22.0
vp9_vert_8x8_8bpp_rvv_i64: 18.5
vp9_vert_16x16_8bpp_c: 71.2
vp9_vert_16x16_8bpp_rvv_i32: 50.7
vp9_vert_32x32_8bpp_c: 300.2
vp9_vert_32x32_8bpp_rvv_i32: 136.7
---
 libavcodec/riscv/vp9_intra_rvv.S | 35 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  6 ++++++
 libavcodec/riscv/vp9dsp_init.c   |  3 +++
 3 files changed, 44 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index b3b0470cfc..88b54f37b0 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -199,3 +199,38 @@ endfunc
 func ff_dc_top_8x8_rvv, zve64x
         dc8x8 top
 endfunc
+
+func ff_v_32x32_rvv, zve32x
+        vsetivli        zero, 8, e8, mf2, ta, ma
+        vle32.v         v8, (a3)
+
+        .rept 31
+        vse32.v         v8, (a0)
+        add             a0, a0, a1
+        .endr
+        vse32.v         v8, (a0)
+
+        ret
+endfunc
+
+func ff_v_16x16_rvv, zve32x
+        vsetivli        zero, 4, e8, mf4, ta, ma
+        vle32.v         v8, (a3)
+
+        .rept 15
+        vse32.v         v8, (a0)
+        add             a0, a0, a1
+        .endr
+        vse32.v         v8, (a0)
+
+        ret
+endfunc
+
+func ff_v_8x8_rvv, zve64x
+        ld              t0, (a3)
+        vsetivli        zero, 8, e64, m4, ta, ma
+        vmv.v.x         v8, t0
+        vsse64.v        v8, (a0), a1
+
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index abd57bd836..ae4fb266d0 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -60,5 +60,11 @@ void ff_dc_129_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                          const uint8_t *a);
 void ff_dc_129_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                        const uint8_t *a);
+void ff_v_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_v_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_v_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                  const uint8_t *a);
 
 #endif  // #ifndef AVCODEC_RISCV_VP9DSP_RISCV_H
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 69ab39004c..9c550d40b5 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -36,6 +36,7 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_8X8][DC_128_PRED] = ff_dc_128_8x8_rvv;
         dsp->intra_pred[TX_8X8][DC_129_PRED] = ff_dc_129_8x8_rvv;
         dsp->intra_pred[TX_8X8][TOP_DC_PRED] = ff_dc_top_8x8_rvv;
+        dsp->intra_pred[TX_8X8][VERT_PRED] = ff_v_8x8_rvv;
     }
 
     if (bpp == 8 && flags & AV_CPU_FLAG_RVV_I32 && ff_get_rv_vlenb() >= 16) {
@@ -51,6 +52,8 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_16X16][DC_129_PRED] = ff_dc_129_16x16_rvv;
         dsp->intra_pred[TX_32X32][TOP_DC_PRED] = ff_dc_top_32x32_rvv;
         dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
+        dsp->intra_pred[TX_32X32][VERT_PRED] = ff_v_32x32_rvv;
+        dsp->intra_pred[TX_16X16][VERT_PRED] = ff_v_16x16_rvv;
     }
 #endif
 }
-- 
2.44.0
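
Note for readers of this patch: the three RVV routines above load the row of pixels above the block through a3 (the `a` argument in the prototypes) and store that same row to every line of `dst`, stepping by `stride`. A minimal scalar sketch of that behaviour is given below for comparison. `vert_pred_ref` and its `size` parameter are hypothetical names used only for illustration; this is not FFmpeg's C implementation and is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar reference for 8bpp vertical intra prediction:
 * every row of the size x size block is a copy of the row above it.
 * Illustration only, not taken from FFmpeg. */
static void vert_pred_ref(uint8_t *dst, ptrdiff_t stride,
                          const uint8_t *left, const uint8_t *top, int size)
{
    (void)left;        /* the left column is not used by this mode */
    assert(size > 0);  /* 8, 16 or 32 for the sizes touched by the patch */

    for (int y = 0; y < size; y++)
        memcpy(dst + y * stride, top, (size_t)size);
}
```

Comparing the output of such a reference against each `ff_v_NxN_rvv` function on random input is one way to sanity-check the assembly independently of the benchmark numbers.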