From patchwork Tue May 7 07:36:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: uk7b@foxmail.com
X-Patchwork-Id: 48610
From: uk7b@foxmail.com
To: ffmpeg-devel@ffmpeg.org
Cc: sunyuechi
Date: Tue, 7 May 2024 15:36:07 +0800
X-OQ-MSGID: <20240507073613.2871668-3-uk7b@foxmail.com>
X-Mailer: git-send-email 2.45.0
In-Reply-To: <20240507073613.2871668-1-uk7b@foxmail.com>
References: <20240507073613.2871668-1-uk7b@foxmail.com>
Subject: [FFmpeg-devel] [PATCH v2 3/9] lavc/vp9dsp: R-V V ipred hor
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

From: sunyuechi

C908:
vp9_hor_8x8_8bpp_c: 74.7
vp9_hor_8x8_8bpp_rvv_i32: 35.7
vp9_hor_16x16_8bpp_c: 175.5
vp9_hor_16x16_8bpp_rvv_i32: 80.2
vp9_hor_32x32_8bpp_c: 510.2
vp9_hor_32x32_8bpp_rvv_i32: 264.0
---
 libavcodec/riscv/vp9_intra_rvv.S | 56 ++++++++++++++++++++++++++++++++
 libavcodec/riscv/vp9dsp.h        |  6 ++++
 libavcodec/riscv/vp9dsp_init.c   |  3 ++
 3 files changed, 65 insertions(+)

diff --git a/libavcodec/riscv/vp9_intra_rvv.S b/libavcodec/riscv/vp9_intra_rvv.S
index db9774c263..dd9bc036e7 100644
--- a/libavcodec/riscv/vp9_intra_rvv.S
+++ b/libavcodec/riscv/vp9_intra_rvv.S
@@ -113,3 +113,59 @@ func_dc dc_left 8 left 3 0 zve64x
 func_dc dc_top 32 top 5 1 zve32x
 func_dc dc_top 16 top 4 1 zve32x
 func_dc dc_top 8 top 3 0 zve64x
+
+func ff_h_32x32_rvv, zve32x
+        li          t0, 32
+        addi        a2, a2, 31
+        vsetvli     zero, t0, e8, m2, ta, ma
+
+        .rept 2
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        lbu         t1, (a2)
+        addi        a2, a2, -1
+        vmv.v.x     v\n, t1
+        .endr
+        .irp n 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30
+        vse8.v      v\n, (a0)
+        add         a0, a0, a1
+        .endr
+        .endr
+
+        ret
+endfunc
+
+func ff_h_16x16_rvv, zve32x
+        addi        a2, a2, 15
+        vsetivli    zero, 16, e8, m1, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
+        lbu         t1, (a2)
+        addi        a2, a2, -1
+        vmv.v.x     v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
+        vse8.v      v\n, (a0)
+        add         a0, a0, a1
+        .endr
+        vse8.v      v23, (a0)
+
+        ret
+endfunc
+
+func ff_h_8x8_rvv, zve32x
+        addi        a2, a2, 7
+        vsetivli    zero, 8, e8, mf2, ta, ma
+
+        .irp n 8, 9, 10, 11, 12, 13, 14, 15
+        lbu         t1, (a2)
+        addi        a2, a2, -1
+        vmv.v.x     v\n, t1
+        .endr
+        .irp n 8, 9, 10, 11, 12, 13, 14
+        vse8.v      v\n, (a0)
+        add         a0, a0, a1
+        .endr
+        vse8.v      v15, (a0)
+
+        ret
+endfunc
diff --git a/libavcodec/riscv/vp9dsp.h b/libavcodec/riscv/vp9dsp.h
index b8ff282f8a..0ad961c7e0 100644
--- a/libavcodec/riscv/vp9dsp.h
+++ b/libavcodec/riscv/vp9dsp.h
@@ -66,6 +66,12 @@ void ff_v_16x16_rvi(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                     const uint8_t *a);
 void ff_v_8x8_rvi(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
                   const uint8_t *a);
+void ff_h_32x32_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_16x16_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                    const uint8_t *a);
+void ff_h_8x8_rvv(uint8_t *dst, ptrdiff_t stride, const uint8_t *l,
+                  const uint8_t *a);
 
 #define VP9_8TAP_RISCV_RVV_FUNC(SIZE, type, type_idx)                         \
 void ff_put_8tap_##type##_##SIZE##h_rvv(uint8_t *dst, ptrdiff_t dststride,   \
diff --git a/libavcodec/riscv/vp9dsp_init.c b/libavcodec/riscv/vp9dsp_init.c
index c10f8bbe41..7816b13fe0 100644
--- a/libavcodec/riscv/vp9dsp_init.c
+++ b/libavcodec/riscv/vp9dsp_init.c
@@ -59,6 +59,9 @@ static av_cold void vp9dsp_intrapred_init_riscv(VP9DSPContext *dsp, int bpp)
         dsp->intra_pred[TX_16X16][DC_129_PRED] = ff_dc_129_16x16_rvv;
         dsp->intra_pred[TX_32X32][TOP_DC_PRED] = ff_dc_top_32x32_rvv;
         dsp->intra_pred[TX_16X16][TOP_DC_PRED] = ff_dc_top_16x16_rvv;
+        dsp->intra_pred[TX_32X32][HOR_PRED] = ff_h_32x32_rvv;
+        dsp->intra_pred[TX_16X16][HOR_PRED] = ff_h_16x16_rvv;
+        dsp->intra_pred[TX_8X8][HOR_PRED] = ff_h_8x8_rvv;
     }
 #endif
 #endif
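
For readers following the assembly, here is a scalar sketch of what the three ff_h_*_rvv functions appear to compute, read directly off the code in this patch: a2 (the l argument) is advanced to l + size - 1 and then walked downwards, one byte per row, and each byte is broadcast across a full row of dst. The name hor_pred_ref and its size parameter are hypothetical and exist only for illustration; this is not the FFmpeg C reference, and the unused a (above) argument is omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical scalar model of the horizontal predictor in this patch.
 * Not part of the patch and not FFmpeg API.
 *
 * dst    : top-left pixel of the size x size output block
 * stride : distance in bytes between consecutive rows of dst
 * left   : left-neighbour bytes, consumed from left[size - 1] down to left[0]
 * size   : block edge length (8, 16 or 32 in the patch)
 */
static void hor_pred_ref(uint8_t *dst, ptrdiff_t stride,
                         const uint8_t *left, int size)
{
    for (int y = 0; y < size; y++) {
        /* Mirrors "lbu t1, (a2); addi a2, a2, -1; vmv.v.x vN, t1":
         * row y is filled with the byte at left[size - 1 - y]. */
        memset(dst, left[size - 1 - y], (size_t)size);
        /* Mirrors "add a0, a0, a1": step to the next output row. */
        dst += stride;
    }
}
```

The sketch only touches size bytes per row, so any padding between rows (stride > size) is left unmodified, which matches the vector stores running with vl set to the block width.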