From patchwork Sat Dec 16 08:50:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 45174
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 16 Dec 2023 10:50:56 +0200
Message-ID: <20231216085056.4939-1-remi@remlab.net>
X-Mailer: git-send-email 2.43.0
Subject: [FFmpeg-devel] [PATCH] lavc/vc1dsp: fix R-V V vector lengths
List-Id: FFmpeg development discussions and patches

The 8x4 and 4x4 cases use a needlessly large multiplier (unless/until we
care about embedded 64-bit-vector hardware). This is merely suboptimal.

The 8x4 case also uses an incorrect vector length, which leads to
incorrect behaviour on future/hypothetical hardware with 256-bit or
larger vectors.

Pointed-out-by: Martin Storsjö
---
 libavcodec/riscv/vc1dsp_rvv.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/riscv/vc1dsp_rvv.S b/libavcodec/riscv/vc1dsp_rvv.S
index 1a503ecc87..4a00945ead 100644
--- a/libavcodec/riscv/vc1dsp_rvv.S
+++ b/libavcodec/riscv/vc1dsp_rvv.S
@@ -68,7 +68,7 @@ endfunc
 
 func ff_vc1_inv_trans_8x4_dc_rvv, zve64x
         lh            t2, (a2)
-        vsetivli      zero, 8, e8, mf2, ta, ma
+        vsetivli      zero, 4, e8, mf4, ta, ma
         vlse64.v      v0, (a0), a1
         sh1add        t2, t2, t2
         addi          t2, t2, 1
@@ -84,14 +84,14 @@ func ff_vc1_inv_trans_8x4_dc_rvv, zve64x
         vmax.vx       v4, v4, zero
         vsetvli       zero, zero, e8, m2, ta, ma
         vnclipu.wi    v0, v4, 0
-        vsetivli      zero, 8, e8, mf2, ta, ma
+        vsetivli      zero, 4, e8, mf4, ta, ma
         vsse64.v      v0, (a0), a1
         ret
 endfunc
 
 func ff_vc1_inv_trans_4x4_dc_rvv, zve32x
         lh            t2, (a2)
-        vsetivli      zero, 4, e8, mf2, ta, ma
+        vsetivli      zero, 4, e8, mf4, ta, ma
         vlse32.v      v0, (a0), a1
         slli          t1, t2, 4
         add           t2, t2, t1
@@ -107,7 +107,7 @@ func ff_vc1_inv_trans_4x4_dc_rvv, zve32x
         vmax.vx       v2, v2, zero
         vsetvli       zero, zero, e8, m1, ta, ma
         vnclipu.wi    v0, v2, 0
-        vsetivli      zero, 4, e8, mf2, ta, ma
+        vsetivli      zero, 4, e8, mf4, ta, ma
         vsse32.v      v0, (a0), a1
         ret
 endfunc
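
For readers who want to check the arithmetic behind the vsetivli changes,
here is a rough sketch in Python. It is not part of the patch. It assumes
the RVV 1.0 rules VLMAX = LMUL * VLEN / SEW and EMUL = (EEW / SEW) * LMUL,
and it only models the cases where vl is fully determined (AVL <= VLMAX or
AVL >= 2 * VLMAX). The helper names vlmax, vl_for and emul are made up for
this illustration.

```python
from fractions import Fraction

def vlmax(vlen, sew, lmul):
    """Element capacity of one register group: LMUL * VLEN / SEW."""
    return int(Fraction(lmul) * vlen / sew)

def vl_for(avl, vlen, sew, lmul):
    """vl set by vsetivli; exact when AVL <= VLMAX or AVL >= 2 * VLMAX.
    The in-between range is implementation-defined and not modelled."""
    return min(avl, vlmax(vlen, sew, lmul))

def emul(eew, sew, lmul):
    """Effective LMUL of a load/store with explicit element width EEW."""
    return Fraction(eew, sew) * Fraction(lmul)

# (label, AVL, LMUL, EEW of the strided load/store), all with SEW = 8.
CASES = [
    ("8x4 before", 8, Fraction(1, 2), 64),
    ("8x4 after",  4, Fraction(1, 4), 64),
    ("4x4 before", 4, Fraction(1, 2), 32),
    ("4x4 after",  4, Fraction(1, 4), 32),
]

for label, avl, lmul, eew in CASES:
    for vlen in (64, 128, 256):
        print(f"{label}: VLEN={vlen:3d} VLMAX={vlmax(vlen, 8, lmul):2d} "
              f"vl={vl_for(avl, vlen, 8, lmul)} EMUL={emul(eew, 8, lmul)}")
```

Each element of vlse64.v/vsse64.v (or vlse32.v/vsse32.v) is one row of the
block, so vl is the number of rows accessed. The EMUL column shows the
register group size that the strided access ends up using, which is what
the smaller multiplier reduces. The VLEN=64 rows show why mf4 with 4
elements only fits on hardware with vectors of at least 128 bits.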