From patchwork Thu Oct 6 18:46:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 38587
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 6 Oct 2022 21:46:12 +0300
Message-Id: <20221006184612.51719-1-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
Subject: [FFmpeg-devel] [PATCH] lavc/aacpsdsp: fix clobber on RISC-V LP64D/ILP32D

Although the DSP function only uses single precision from RISC-V F,
the caller may leave double precision values in the spilled registers
if the calling convention supports double precision hardware floats.
Then, we need to save and restore FS registers as double precision.

Conversely, we do not need to save anything at all if an integer
calling convention is in use. However we can assume that single
precision floats are supported, since the Zve32f extension implies the
F extension. So for the sake of simplicity, we always save at least
single precision values.

In theory, we should even save quadruple precision values if the LP64Q
ABI is in use. I have yet to see a compiler that supports it though.
---
 libavcodec/riscv/aacpsdsp_rvv.S | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/libavcodec/riscv/aacpsdsp_rvv.S b/libavcodec/riscv/aacpsdsp_rvv.S
index 1d6e73fd2d..80bd19f6ad 100644
--- a/libavcodec/riscv/aacpsdsp_rvv.S
+++ b/libavcodec/riscv/aacpsdsp_rvv.S
@@ -55,9 +55,10 @@ endfunc
 
 func ff_ps_hybrid_analysis_rvv, zve32f
         /* We need 26 FP registers, for 20 scratch ones. Spill fs0-fs5. */
-        addi    sp, sp, -32
+        addi    sp, sp, -48
         .irp n, 0, 1, 2, 3, 4, 5
-        fsw     fs\n, (4 * \n)(sp)
+HWD     fsd     fs\n, (8 * \n)(sp)
+NOHWD   fsw     fs\n, (4 * \n)(sp)
         .endr
 
         .macro input, j, fd0, fd1, fd2, fd3
@@ -142,9 +143,10 @@ func ff_ps_hybrid_analysis_rvv, zve32f
         bnez    a4, 1b
 
         .irp n, 5, 4, 3, 2, 1, 0
-        flw     fs\n, (4 * \n)(sp)
+HWD     fld     fs\n, (8 * \n)(sp)
+NOHWD   flw     fs\n, (4 * \n)(sp)
         .endr
-        addi    sp, sp, 48
+        addi    sp, sp, 48
         ret
         .purgem input
         .purgem filter
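
For reference, the frame sizes in the diff can be checked with a little
arithmetic. The C sketch below is illustrative only and not part of the
patch: spill_frame_size() and spill_offset() are hypothetical helpers, and
the 16-byte value assumes the stack alignment required by the standard
RISC-V calling conventions. It reproduces why six 4-byte slots fit in the
old 32-byte frame while six 8-byte slots need 48 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the standard RISC-V calling conventions keep sp aligned
 * to 16 bytes, so a spill frame is rounded up to a multiple of 16. */
#define STACK_ALIGN 16

/* Hypothetical helper (not an FFmpeg function): bytes to reserve on the
 * stack for `nregs` callee-saved FP registers of `slot` bytes each. */
size_t spill_frame_size(size_t nregs, size_t slot)
{
    /* 4 = single precision (fsw/flw), 8 = double precision (fsd/fld) */
    assert(slot == 4 || slot == 8);
    size_t raw = nregs * slot;
    /* Round up to the next multiple of STACK_ALIGN (a power of two). */
    return (raw + STACK_ALIGN - 1) & ~(size_t)(STACK_ALIGN - 1);
}

/* Hypothetical helper: byte offset of register fs<n> inside the frame,
 * matching the (4 * \n)(sp) and (8 * \n)(sp) operands in the diff. */
size_t spill_offset(size_t n, size_t slot)
{
    return n * slot;
}
```

With six registers, 4-byte slots give 24 bytes rounded up to 32, and
8-byte slots give exactly 48. The patch reserves 48 bytes in both cases,
which leaves the single precision variant with unused space but keeps the
stack adjustment unconditional.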