From patchwork Mon Jun 10 19:20:39 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 49782
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 10 Jun 2024 22:20:39 +0300
Message-ID: <20240610192039.27012-1-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240610190306.23569-1-remi@remlab.net>
References: <20240610190306.23569-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH] lavc/vc1dsp: match C block layout in inv_trans_4x8_rvv
List-Id: FFmpeg development discussions and patches

Although checkasm does not verify this, the decoder requires that the
transform updates the input block exactly like the C code does.
This fixes vc1-ism, vc1_ilaced_twomv, vc1_sa00040, vc1_sa10091,
vc1_sa10143, vc1_sa20021, vc1test_smm0005 and wmv3-drm-dec tests.
---
 libavcodec/riscv/vc1dsp_rvv.S | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/libavcodec/riscv/vc1dsp_rvv.S b/libavcodec/riscv/vc1dsp_rvv.S
index c4517d54f5..860b0cc5b1 100644
--- a/libavcodec/riscv/vc1dsp_rvv.S
+++ b/libavcodec/riscv/vc1dsp_rvv.S
@@ -303,15 +303,24 @@ func ff_vc1_inv_trans_4x8_rvv, zve32x
         vlsseg4e16.v v0, (a2), a3
         li      t1, 3
         jal     t0, ff_vc1_inv_trans_4_rvv
+        vssseg4e16.v v0, (a2), a3
+        vsetivli zero, 4, e16, mf2, ta, ma
         addi    t1, a2, 1 * 8 * 2
-        vse16.v v0, (a2)
+        vle16.v v0, (a2)
         addi    t2, a2, 2 * 8 * 2
-        vse16.v v1, (t1)
+        vle16.v v1, (t1)
         addi    t3, a2, 3 * 8 * 2
-        vse16.v v2, (t2)
-        vse16.v v3, (t3)
-        vsetivli zero, 4, e16, mf2, ta, ma
-        vlseg8e16.v v0, (a2)
+        vle16.v v2, (t2)
+        addi    t4, a2, 4 * 8 * 2
+        vle16.v v3, (t3)
+        addi    t5, a2, 5 * 8 * 2
+        vle16.v v4, (t4)
+        addi    t6, a2, 6 * 8 * 2
+        vle16.v v5, (t5)
+        addi    t1, a2, 7 * 8 * 2
+        vle16.v v6, (t6)
+        vle16.v v7, (t1)
+
         jal     t0, ff_vc1_inv_trans_8_rvv
         vadd.vi v4, v4, 1
         add     t0, a1, a0
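
The layout point in the commit message can be illustrated with a small C model. This is only a sketch, not FFmpeg code. `fake_pass` is a hypothetical stand-in for the 4-point transform, and all function names are made up for illustration. It assumes that the C reference writes the first-pass results back in place, row by row, and that the removed `vse16.v` sequence stored each result column as 8 contiguous values; both are readings of the diff, not statements from the patch. Under those assumptions, the second pass reads the same values from either layout, but the block memory left behind differs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { ROWS = 8, COLS = 4, STRIDE = 8 }; /* 4x8 block, 8 coefficients per row */

/* Hypothetical stand-in for the 4-point row transform. The arithmetic does
 * not matter here, only where each result is written. */
static int16_t fake_pass(int16_t x) { return (int16_t)(2 * x + 1); }

/* In-place, row-major write-back (assumed to match the C reference):
 * result (r, c) lands at block[r * STRIDE + c]. */
static void store_row_major(int16_t *block)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            block[r * STRIDE + c] = fake_pass(block[r * STRIDE + c]);
}

/* Pre-patch style write-back (assumed): result column c is stored as
 * 8 contiguous values starting at block[c * STRIDE]. */
static void store_column_major(int16_t *block)
{
    int16_t col[COLS][ROWS];

    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            col[c][r] = fake_pass(block[r * STRIDE + c]);
    for (int c = 0; c < COLS; c++)
        memcpy(&block[c * STRIDE], col[c], sizeof(col[c]));
}

/* Returns 1 if the second pass would read identical values from both
 * layouts; *same_memory is set to 1 only if the two blocks are
 * byte-for-byte identical afterwards. */
static int compare_layouts(int *same_memory)
{
    int16_t a[ROWS * STRIDE], b[ROWS * STRIDE];
    int same_inputs = 1;

    for (int i = 0; i < ROWS * STRIDE; i++)
        a[i] = b[i] = (int16_t)i;
    store_row_major(a);
    store_column_major(b);

    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            same_inputs &= a[r * STRIDE + c] == b[c * STRIDE + r];

    *same_memory = memcmp(a, b, sizeof(a)) == 0;
    return same_inputs;
}

/* The second pass sees the same data, yet the blocks differ in memory. */
static void check_layouts(void)
{
    int same_memory;

    assert(compare_layouts(&same_memory) == 1);
    assert(same_memory == 0);
}
```

Calling `check_layouts()` passes in this model: a test that compares only the final pixel output would accept either write-back, while a caller that later reads the coefficient block would see different contents.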