From patchwork Tue May 14 19:35:56 2024
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 48878
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 14 May 2024 22:35:56 +0300
Message-ID: <20240514193557.32759-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 1/2] lavu/riscv: assembler macros for VTYPE fields

---
 libavutil/riscv/asm.S | 48 +++++++++++++++++++++++++++++--------------
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
index 14be5055f5..ecf3081e61 100644
--- a/libavutil/riscv/asm.S
+++ b/libavutil/riscv/asm.S
@@ -96,20 +96,38 @@
 .endm
 #endif

+#define VTYPE_E8 000
+#define VTYPE_E16 010
+#define VTYPE_E32 020
+#define VTYPE_E64 030
+
+#define VTYPE_MF8 05
+#define VTYPE_MF4 06
+#define VTYPE_MF2 07
+#define VTYPE_M1 00
+#define VTYPE_M2 01
+#define VTYPE_M4 02
+#define VTYPE_M8 03
+
+#define VTYPE_TU 0000
+#define VTYPE_TA 0100
+#define VTYPE_MU 0000
+#define VTYPE_MA 0200
+
 /* Convenience macro to load a Vector type (vtype) as immediate */
 .macro lvtypei rd, e, m=m1, tp=tu, mp=mu

 .ifc \e,e8
- .equ ei, 0
+ .equ ei, VTYPE_E8
 .else
 .ifc \e,e16
- .equ ei, 8
+ .equ ei, VTYPE_E16
 .else
 .ifc \e,e32
- .equ ei, 16
+ .equ ei, VTYPE_E32
 .else
 .ifc \e,e64
- .equ ei, 24
+ .equ ei, VTYPE_E64
 .else
 .error "Unknown element type"
 .endif
@@ -118,25 +136,25 @@
 .endif

 .ifc \m,m1
- .equ mi, 0
+ .equ mi, VTYPE_M1
 .else
 .ifc \m,m2
- .equ mi, 1
+ .equ mi, VTYPE_M2
 .else
 .ifc \m,m4
- .equ mi, 2
+ .equ mi, VTYPE_M4
 .else
 .ifc \m,m8
- .equ mi, 3
+ .equ mi, VTYPE_M8
 .else
 .ifc \m,mf8
- .equ mi, 5
+ .equ mi, VTYPE_MF8
 .else
 .ifc \m,mf4
- .equ mi, 6
+ .equ mi, VTYPE_MF4
 .else
 .ifc \m,mf2
- .equ mi, 7
+ .equ mi, VTYPE_MF2
 .else
 .error "Unknown multiplier"
 .equ mi, 3
@@ -149,20 +167,20 @@
 .endif

 .ifc \tp,tu
- .equ tpi, 0
+ .equ tpi, VTYPE_TU
 .else
 .ifc \tp,ta
- .equ tpi, 64
+ .equ tpi, VTYPE_TA
 .else
 .error "Unknown tail policy"
 .endif
 .endif

 .ifc \mp,mu
- .equ mpi, 0
+ .equ mpi, VTYPE_MU
 .else
 .ifc \mp,ma
- .equ mpi, 128
+ .equ mpi, VTYPE_MA
 .else
 .error "Unknown mask policy"
 .endif

From patchwork Tue May 14 19:35:57 2024
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 48877
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 14 May 2024 22:35:57 +0300
Message-ID: <20240514193557.32759-2-remi@remlab.net>
In-Reply-To: <20240514193557.32759-1-remi@remlab.net>
References: <20240514193557.32759-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: optimise RVV vector type for lpc16

This calculates the optimal vector type value at run-time based on the
hardware vector length and the FLAC LPC prediction order.
In this particular case, the additional computation is easily amortised
over the loop iterations:

T-Head C908:         C      V before   V after
flac_lpc_16_13:   14180.2   11229.0    7338.5
flac_lpc_16_16:   16833.2   11091.0    7248.5
flac_lpc_16_29:   28817.2   11455.7   10506.5
flac_lpc_16_32:   31059.7   10368.5   11305.2

With 128-bit vectors, improvements are expected for the first two
test cases only. For the other two, there is some overhead, but it is
below the noise level. Improvements should be more readily observable
with prediction orders of 8 or less, or on hardware with larger vector
sizes.

The same optimisation strategy should be applicable to LPC32
(and work-in-progress LPC33), but is left as a future exercise.
---
 libavcodec/riscv/flacdsp_init.c |  2 +-
 libavcodec/riscv/flacdsp_rvv.S  | 10 ++++++++--
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/libavcodec/riscv/flacdsp_init.c b/libavcodec/riscv/flacdsp_init.c
index 77ffd09244..097f938f04 100644
--- a/libavcodec/riscv/flacdsp_init.c
+++ b/libavcodec/riscv/flacdsp_init.c
@@ -71,7 +71,7 @@ av_cold void ff_flacdsp_init_riscv(FLACDSPContext *c, enum AVSampleFormat fmt,
 if ((flags & AV_CPU_FLAG_RVV_I32) && (flags & AV_CPU_FLAG_RVB_ADDR)) {
 int vlenb = ff_get_rv_vlenb();

- if (vlenb >= 16)
+ if ((flags & AV_CPU_FLAG_RVB_BASIC) && vlenb >= 16)
 c->lpc16 = ff_flac_lpc16_rvv;

 c->wasted32 = ff_flac_wasted32_rvv;
diff --git a/libavcodec/riscv/flacdsp_rvv.S b/libavcodec/riscv/flacdsp_rvv.S
index 8b9c626198..42cece9786 100644
--- a/libavcodec/riscv/flacdsp_rvv.S
+++ b/libavcodec/riscv/flacdsp_rvv.S
@@ -20,8 +20,14 @@

 #include "libavutil/riscv/asm.S"

-func ff_flac_lpc16_rvv, zve32x
- vsetvli zero, a2, e32, m8, ta, ma
+func ff_flac_lpc16_rvv, zve32x, zbb
+ csrr t0, vlenb
+ addi t2, a2, -1
+ clz t0, t0
+ clz t2, t2
+ addi t0, t0, VTYPE_E32 | VTYPE_M8 | VTYPE_TA | VTYPE_MA
+ sub t0, t0, t2 // t0 += log2(next_power_of_two(len) / vlenb) - 1
+ vsetvl zero, a2, t0
 vle32.v v8, (a1)
 sub a4, a4, a2
 vle32.v v16, (a0)
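
As a rough cross-check of the run-time vtype arithmetic in
ff_flac_lpc16_rvv, the C sketch below models the csrr/clz/addi/sub
sequence for XLEN = 64. It is only an illustration, not part of the
patch: the helper names clz64() and lpc16_vtype() are hypothetical, the
field values are copied from the VTYPE_* macros in patch 1/2, and it
assumes a power-of-two vlenb and a prediction order of at least 1.

```c
#include <assert.h>
#include <stdint.h>

/* Field encodings copied from the VTYPE_* macros in patch 1/2 (octal). */
#define VTYPE_E32 020
#define VTYPE_M8  03
#define VTYPE_TA  0100
#define VTYPE_MA  0200

/* Portable model of the Zbb clz instruction for XLEN = 64; clz(0) is 64. */
static unsigned clz64(uint64_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 64;
    while (!(x >> 63)) {
        x <<= 1;
        n++;
    }
    return n;
}

/*
 * Hypothetical helper mirroring the instruction sequence of the patch:
 *   t0 = clz(vlenb); t2 = clz(len - 1);
 *   t0 += E32 | M8 | TA | MA; t0 -= t2;
 * The result is the value that would be passed to vsetvl.
 */
static unsigned lpc16_vtype(uint64_t vlenb, uint64_t len)
{
    unsigned t0, t2;

    assert(len >= 1); /* assumption: the prediction order is never zero */
    t0 = clz64(vlenb);
    t2 = clz64(len - 1);
    t0 += VTYPE_E32 | VTYPE_M8 | VTYPE_TA | VTYPE_MA;
    return t0 - t2;
}
```

With vlenb = 16 (128-bit vectors), orders 13 and 16 give 0322 (e32, m4,
ta, ma) while orders 29 and 32 give 0323 (e32, m8, ta, ma), which is
consistent with the expectation above that only the first two test cases
benefit on that hardware.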