From patchwork Sat Mar 2 07:42:36 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 46693
From: flow gg
Date: Sat, 2 Mar 2024 15:42:36 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH 3/4] lavc/vp9dsp: R-V V ipred hor
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

From 173072b33d3237b924f3fa342e20558d96a72457 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Sat, 2 Mar 2024 08:35:39 +0800
Subject: [PATCH 3/4] lavc/vp9dsp: R-V V ipred hor

C908:
vp9_hor_8x8_8bpp_c: 74.7
vp9_hor_8x8_8bpp_rvv_i32: 35.7
vp9_hor_16x16_8bpp_c: 175.5
vp9_hor_16x16_8bpp_rvv_i32: 80.2
vp9_hor_32x32_8bpp_c: 510.2
vp9_hor_32x32_8bpp_rvv_i32: 264.0
---
 libavcodec/riscv/vp9_intra_rvv.S | 56 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  6 ++++
 libavcodec/riscv/vp9dsp_init.c   |  3 ++
 3 files changed, 65 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index 88b54f37b0..35fb0ebe10 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -234,3 +234,59 @@ func ff_v_8x8_rvv, zve64x
 
         ret
 endfunc
+
+func ff_h_32x32_rvv, zve32x
+        li           t0, 32
+        addi         a2, a2, 31
+        vsetvli      zero, t0, e8, m2, ta, ma
+
+        .rept 2
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        lbu          t1, (a2)
+        addi         a2, a2, -1
+        vmv.v.x      v\n, t1
+        .endr
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        vse8.v       v\n, (a0)
+        add          a0, a0, a1
+        .endr
+        .endr
+
+        ret
+endfunc
+
+func ff_h_16x16_rvv, zve32x
+        addi         a2, a2, 15
+        vsetivli     zero, 16, e8, m1, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
+        lbu          t1, (a2)
+        addi         a2, a2, -1
+        vmv.v.x      v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
+        vse8.v       v\n, (a0)
+        add          a0, a0, a1
+        .endr
+        vse8.v       v23, (a0)
+
+        ret
+endfunc
+
+func ff_h_8x8_rvv, zve32x
+        addi         a2, a2, 7
+        vsetivli     zero, 8, e8, mf2, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15
+        lbu          t1, (a2)
+        addi         a2, a2, -1
+        vmv.v.x      v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14
+        vse8.v       v\n, (a0)
+        add          a0, a0, a1
+        .endr
+        vse8.v       v15, (a0)
+
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index ae4fb266d0..2b2e0db0d8 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -66,5 +66,11 @@ void ff_v_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                     const uint8_t *a);
 void ff_v_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                   const uint8_t *a);
+void ff_h_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                  const uint8_t *a);
 
 #endif  // #ifndef AVCODEC_RISCV_VP9DSP_RISCV_H
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index 9c550d40b5..16aeeb260a 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -54,6 +54,9 @@ static av_cold void vp9dsp_intrapred_init_rvv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
         dsp->intra_pred[TX_32X32][VERT_PRED] = ff_v_32x32_rvv;
         dsp->intra_pred[TX_16X16][VERT_PRED] = ff_v_16x16_rvv;
+        dsp->intra_pred[TX_32X32][HOR_PRED] = ff_h_32x32_rvv;
+        dsp->intra_pred[TX_16X16][HOR_PRED] = ff_h_16x16_rvv;
+        dsp->intra_pred[TX_8X8][HOR_PRED] = ff_h_8x8_rvv;
     }
 #endif
 }
-- 
2.44.0
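
Note for reviewers: the three ff_h_NxN_rvv routines appear to implement plain
horizontal intra prediction for 8-bit pixels. Each one advances the left-edge
pointer (a2) to its last byte, walks it backwards one byte per row, splats that
byte across a vector, and stores one row per vector. A minimal scalar sketch of
the same behaviour follows, assuming that reading of the assembly. The name
hor_pred_ref and its size parameter are hypothetical and are not part of the
patch or of FFmpeg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of ff_h_8x8_rvv / ff_h_16x16_rvv / ff_h_32x32_rvv.
 * dst:    top-left pixel of the block to fill
 * stride: distance in bytes between consecutive rows of dst
 * left:   size left-neighbour pixels, read back to front as the assembly does
 * size:   block edge length (8, 16 or 32 in the patch) */
static void hor_pred_ref(uint8_t *dst, ptrdiff_t stride,
                         const uint8_t *left, int size)
{
    assert(size > 0);

    for (int y = 0; y < size; y++) {
        /* The assembly starts at left + size - 1 and decrements the pointer,
         * so row y is filled with left[size - 1 - y]. */
        memset(dst, left[size - 1 - y], (size_t)size);
        dst += stride;
    }
}
```

Comparing the output of such a model against the vector routines on random
input is one way to sanity-check the row ordering.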