From patchwork Mon May 27 15:59:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 49298 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a59:542:0:b0:460:55fa:d5ed with SMTP id 63csp3357695vqf; Mon, 27 May 2024 08:59:57 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCX2lOHaixhrHicnsJYTJZQVM8gO6Q/Bsu6Uj/sANUffNt4dWmFNl71/viQf7JzVo8wJ1BreTKiAhhBoQFP0R66dExcsZeEKHVxg6A== X-Google-Smtp-Source: AGHT+IHr6w5H++VJQ4w2Cka50xoweTO3Pq7SyuJLFlaEHehX97qeZLFWERwbKE3ghiLpvAb8GTHV X-Received: by 2002:ac2:5f16:0:b0:51a:c3b8:b9cf with SMTP id 2adb3069b0e04-52967f06eb8mr7248810e87.69.1716825596856; Mon, 27 May 2024 08:59:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1716825596; cv=none; d=google.com; s=arc-20160816; b=rKGlrpgrQAKzhMvDnblzbxrsxISkm81mc3JFdBajdGQTQTJAZ1Xj9JvLqjQMZUEmS5 5Q6LwCYtzWxHguR/CSX/VslU2yJEUuM6kXpzRyiqLAvPiAIctQpLopP6qu7bKjCFW0zK lD12g6fGmDuigzFRVpSYU27jAAVOVlieVcNkju5nHPFw9OoJpvz/DZXNx6xJDDiFFVp5 jSU1Opl2Qe/bFvT7EAq2Wkne3RZu/jrH7O2Db5U6aoJx0HS9+qxrTSL/sMrU6Y89IMrw ZTVcvCpmayWQvr0lKHkHVL9fm0VazXK90fMXAa84wrFubpcgPs7O9HD41OJmq6mkwCn/ b72Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=SylfK/tBUtmy/gaOE/zDD1mWFr2mfTnJXDrJCBC0qkk=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=WptYlP2lQ/jO3Imggq+Ni479yHNygPUayX21fZ+nQY+VpyhcMo1V105PKoSqQQ94DU 0nuIAK5A9Oxswb5qnjSicftD7ayeWpHpNBfakAKf8mZNJE+t+vdyam7mQcd0DvqWmsFl XAi+dIMSb8pM2AWERHf+j4wozOgNaJCWa/5EuHjwUkhLRP7c/3ajwjkYx0i0s5Nv3SiG tMMfwwWaEKhThvcCHPQoTE95si68ND9jQWij9rF9BI5QIQd+N7KOQavXGQXUf9Ps7yXo SVVX1RV/5dq+Iun2HVX3XCLY+YfzpDgH0XMt97LfazNpFa9gQpJk7LmDW/DpQFId6sQ6 Gk+Q==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; 
spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id a640c23a62f3a-a626cc644dbsi387306766b.544.2024.05.27.08.59.56; Mon, 27 May 2024 08:59:56 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8E41D68CE05; Mon, 27 May 2024 18:59:53 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id D59BE68CE05 for ; Mon, 27 May 2024 18:59:46 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 68897C0069 for ; Mon, 27 May 2024 18:59:46 +0300 (EEST) From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= To: ffmpeg-devel@ffmpeg.org Date: Mon, 27 May 2024 18:59:45 +0300 Message-ID: <20240527155946.750660-1-remi@remlab.net> X-Mailer: git-send-email 2.45.1 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 1/2] lavc/lpc: fix off-by-one in R-V V compute_autocorr X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: hrhUZw5fjhur --- libavcodec/riscv/lpc_init.c | 2 +- libavcodec/riscv/lpc_rvv.S 
| 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/lpc_init.c b/libavcodec/riscv/lpc_init.c index f21eca4caa..d9bcbbe604 100644 --- a/libavcodec/riscv/lpc_init.c +++ b/libavcodec/riscv/lpc_init.c @@ -37,7 +37,7 @@ av_cold void ff_lpc_init_riscv(LPCContext *c) c->lpc_apply_welch_window = ff_lpc_apply_welch_window_rvv; if ((flags & AV_CPU_FLAG_RVB_BASIC) && - ff_get_rv_vlenb() >= c->max_order) + ff_get_rv_vlenb() > c->max_order) c->lpc_compute_autocorr = ff_lpc_compute_autocorr_rvv; } #endif diff --git a/libavcodec/riscv/lpc_rvv.S b/libavcodec/riscv/lpc_rvv.S index 024837102c..8cf79963f1 100644 --- a/libavcodec/riscv/lpc_rvv.S +++ b/libavcodec/riscv/lpc_rvv.S @@ -87,6 +87,7 @@ func ff_lpc_apply_welch_window_rvv, zve64d endfunc func ff_lpc_compute_autocorr_rvv, zve64d, zbb + addi a2, a2, 1 vtype_vli t1, a2, t2, e64, ta, ma li t0, 1 vsetvl zero, a2, t1 From patchwork Mon May 27 15:59:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= X-Patchwork-Id: 49299 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a59:542:0:b0:460:55fa:d5ed with SMTP id 63csp3357799vqf; Mon, 27 May 2024 09:00:05 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCUylwj43MsnZ9EdVpnpItfyv/74/+jmmrFk5ZNRwoddVXRLiH1X0MAbLNvuOkSODZdrnEY0ykUwtHoprkq0Uxn2BB9PAhaoX1yq6A== X-Google-Smtp-Source: AGHT+IEOLlft6QDcVNcWt+lf9p1Su/3SeIvWDpGx/PejIX5usUSHgy7EKXRiY89O4hcfsoVYo2fF X-Received: by 2002:a19:9148:0:b0:523:bbf5:4b36 with SMTP id 2adb3069b0e04-529646df136mr4677395e87.20.1716825605517; Mon, 27 May 2024 09:00:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1716825605; cv=none; d=google.com; s=arc-20160816; b=G0hJ1PxDwmWfDELswpt94vyXVDOaYbOSLVbM1rI9mOP1dda+3aYoQsSjMtEh/EgN/+ Ugfmt96Rv5ohhlfbxMdZAV1tnfY44nj4UitqKjHGv9GR6qeXB9hhqyBkHjdWDnzhQO4r 1jvzfnGBvWM+dBR/GTl1AZO9vVluvjOAi1Se/ovRhx4AH9HH+zS3s5odljl0/xkLW9UO 
q32tIbPloO6dYxy+VuO4FDqTIyPSklHI/UUUWn4agWvPCOcnA9spcc0nuuXmfD2C11NP ds6+bngwZpb4bq0pzI7ltyK2ygRzgp117BzVjOH5CtZAPPB4OaJoGA1TmSzjAg0cC9R2 ZDpA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :delivered-to; bh=oilfrY67mUiW664sEyMotSQXrSVqhIifrESOrTalJKE=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=bXCPVRsE+0KQtRUGr4yvlDkUhhfPsLFBRFPWeINlm1roWGXXMqmx0ZSXYWwovViMeP eubVRVhdDdFbg+mbpk4huS+D6HgBzTFylgIzAmv7nVbJHyrSpcMiXg8fWxqGH4Tz0ff9 v+KDDt9GU0EZ5vA4NcQCcXHZ0Tx6PHBz+gbjCwOCliOVTroSylfDmYgZhC/rBsrXUmfn FwDzUMTQyiZCCx6puYN6g/ePN0J0iFKTb9PjslGYJ7lO4YcfshcxfX+5VeBHzBu12gJZ ib2U1pteSDLJlT3Tawzi/pqPuAzi+RRsE4alYJdZi4NWz8+AKRyrf2GY4APpYFo26+/T HvxQ==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id 4fb4d7f45d1cf-57852384ad4si3996591a12.159.2024.05.27.09.00.05; Mon, 27 May 2024 09:00:05 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id A2EBB68D4BE; Mon, 27 May 2024 18:59:54 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from ursule.remlab.net (vps-a2bccee9.vps.ovh.net [51.75.19.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0C4E568D2B7 for ; Mon, 27 May 2024 18:59:47 +0300 (EEST) Received: from basile.remlab.net (localhost [IPv6:::1]) by ursule.remlab.net (Postfix) with ESMTP id 99737C0186 for ; Mon, 27 May 2024 18:59:46 +0300 (EEST) From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= To: ffmpeg-devel@ffmpeg.org Date: Mon, 27 May 2024 18:59:46 +0300 Message-ID: <20240527155946.750660-2-remi@remlab.net> X-Mailer: git-send-email 2.45.1 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 2/2] riscv: allow passing addend to vtype_vli macro X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: rrtpu06oWb4g

A constant (-1) is added to the length value, so we can have an addend
for free, and optimise the addition away if the addend is exactly 1.
---
 libavcodec/riscv/lpc_rvv.S | 2 +-
 libavutil/riscv/asm.S      | 9 ++++++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/libavcodec/riscv/lpc_rvv.S b/libavcodec/riscv/lpc_rvv.S
index 8cf79963f1..fe80305d9a 100644
--- a/libavcodec/riscv/lpc_rvv.S
+++ b/libavcodec/riscv/lpc_rvv.S
@@ -87,8 +87,8 @@ func ff_lpc_apply_welch_window_rvv, zve64d
 endfunc
 
 func ff_lpc_compute_autocorr_rvv, zve64d, zbb
+        vtype_vli t1, a2, t2, e64, ta, ma, 1
         addi    a2, a2, 1
-        vtype_vli t1, a2, t2, e64, ta, ma
         li      t0, 1
         vsetvl  zero, a2, t1
         fcvt.d.l ft0, t0
diff --git a/libavutil/riscv/asm.S b/libavutil/riscv/asm.S
index 1e6358dcb5..2cf4f7b7ab 100644
--- a/libavutil/riscv/asm.S
+++ b/libavutil/riscv/asm.S
@@ -196,18 +196,21 @@
      * @param ew element width: e8, e16, e32 or e64
      * @param tp tail policy: tu or ta
      * @param mp mask policty: mu or ma
+     * @param addend optional addend for the vector length register
      */
-    .macro  vtype_vli rd, rs, tmp, ew, tp=tu, mp=mu
+    .macro  vtype_vli rd, rs, tmp, ew, tp=tu, mp=mu, addend=0
         parse_vtype \ew, \tp, \mp
         /*
          * The difference between the CLZ's notionally equals the VLMUL value
          * for 4-bit elements. But we want the value for SEW_MAX-bit elements.
          */
         slli    \tmp, \rs, 1 + VSEW_MAX
+    .if \addend - 1
+        addi    \tmp, \tmp, \addend - 1
+    .endif
         csrr    \rd, vlenb
-        addi    \tmp, \tmp, -1
-        clz     \rd, \rd
         clz     \tmp, \tmp
+        clz     \rd, \rd
         sub     \rd, \rd, \tmp
         max     \rd, \rd, zero // VLMUL must be >= VSEW - VSEW_MAX
     .if vsew < VSEW_MAX
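
Reviewer note: the arithmetic in the vtype_vli hunk above can be checked
with a small C model. This is only a sketch under stated assumptions, not
code from the patch: it assumes 64-bit registers (RV64), e64 elements and
VSEW_MAX equal to 3, and it stops at the "max" instruction, so the
adjustment for narrower elements is not modelled. The names clz64 and
vlmul_e64 are hypothetical helpers introduced here for illustration.

```c
#include <stdint.h>

/* Assumption: log2 of the widest element size in bytes (8 bytes for e64). */
#define VSEW_MAX 3

/* Portable model of the Zbb "clz" instruction on a 64-bit register. */
static int clz64(uint64_t x)
{
    int n = 0;

    if (x == 0)
        return 64;
    while (!(x >> 63)) {
        x <<= 1;
        n++;
    }
    return n;
}

/*
 * Model of the instruction sequence in vtype_vli for e64 (hypothetical name):
 *   slli tmp, rs, 1 + VSEW_MAX
 *   addi tmp, tmp, addend - 1     (omitted by the macro when addend == 1)
 *   clz both values, subtract, clamp at zero
 * Returns log2(LMUL) for a vector of "len" elements, given VLEN/8 in vlenb.
 */
static int vlmul_e64(uint64_t vlenb, uint64_t len, int addend)
{
    uint64_t tmp = (len << (1 + VSEW_MAX)) + (uint64_t)(int64_t)(addend - 1);
    int vlmul = clz64(vlenb) - clz64(tmp);

    return vlmul > 0 ? vlmul : 0;
}
```

With vlenb = 16 (VLEN = 128), vlmul_e64(16, 2, 1) and vlmul_e64(16, 3, 0)
both select LMUL = 2: passing addend = 1 gives the same result as
incrementing the length before invoking the macro, which is how the
lpc_rvv.S hunk uses it. The sketch only exercises addend values 0 and 1;
it makes no claim about other values.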