From patchwork Fri Jan 13 20:18:44 2023
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 39998
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 13 Jan 2023 22:18:44 +0200
Message-Id: <20230113201844.49829-1-remi@remlab.net>
Subject: [FFmpeg-devel] [NOT FOR MERGE] [PATCH] lavc/bswapdsp: do not assume aligned input on RISC-V

This fixes the RISC-V B code so that it no longer assumes that the input
is aligned. Unfortunately, the whole idea behind the optimisation does
not really work if the input is unaligned, and the C code works just as
well, so the function simply tail-calls the C code in that case.

Notes:
- This does not fix the call prototypes, whose second parameter is
  expected to change to `const void *` separately.
- The RISC-V Vector code does not assume any alignment of either input
  or output buffers.
---
 libavcodec/bswapdsp.c           | 4 ++--
 libavcodec/bswapdsp.h           | 2 ++
 libavcodec/riscv/bswapdsp_rvb.S | 5 +++++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/libavcodec/bswapdsp.c b/libavcodec/bswapdsp.c
index f0ea2b55c5..901610c96d 100644
--- a/libavcodec/bswapdsp.c
+++ b/libavcodec/bswapdsp.c
@@ -22,7 +22,7 @@
 #include "libavutil/bswap.h"
 #include "bswapdsp.h"
 
-static void bswap_buf(uint32_t *dst, const uint32_t *src, int w)
+void ff_bswap32_buf(uint32_t *dst, const uint32_t *src, int w)
 {
     int i;
 
@@ -48,7 +48,7 @@ static void bswap16_buf(uint16_t *dst, const uint16_t *src, int len)
 
 av_cold void ff_bswapdsp_init(BswapDSPContext *c)
 {
-    c->bswap_buf = bswap_buf;
+    c->bswap_buf = ff_bswap32_buf;
     c->bswap16_buf = bswap16_buf;
 
 #if ARCH_RISCV
diff --git a/libavcodec/bswapdsp.h b/libavcodec/bswapdsp.h
index 6f4db66115..fa199b3be9 100644
--- a/libavcodec/bswapdsp.h
+++ b/libavcodec/bswapdsp.h
@@ -30,4 +30,6 @@ void ff_bswapdsp_init(BswapDSPContext *c);
 void ff_bswapdsp_init_riscv(BswapDSPContext *c);
 void ff_bswapdsp_init_x86(BswapDSPContext *c);
 
+void ff_bswap32_buf(uint32_t *dst, const uint32_t *src, int w);
+
 #endif /* AVCODEC_BSWAPDSP_H */
diff --git a/libavcodec/riscv/bswapdsp_rvb.S b/libavcodec/riscv/bswapdsp_rvb.S
index 91b47bf82d..795e44f478 100644
--- a/libavcodec/riscv/bswapdsp_rvb.S
+++ b/libavcodec/riscv/bswapdsp_rvb.S
@@ -23,7 +23,9 @@
 
 #if (__riscv_xlen >= 64)
 func ff_bswap32_buf_rvb, zbb
+        andi    t1, a1, 3
         andi    t0, a1, 4
+        bnez    t1, 6f
         beqz    t0, 1f
         /* Align a1 (input) to 64-bit */
         lwu     t0, (a1)
@@ -64,5 +66,8 @@ func ff_bswap32_buf_rvb, zbb
         sw      t0, -4(a0)
 5:
         ret
+
+6:      /* No worthy optimisation if unaligned */
+        tail    ff_bswap32_buf
 endfunc
 #endif
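
For illustration only (not part of the patch): the control flow added at
label 6 can be sketched in C as below. All names here are hypothetical
stand-ins rather than FFmpeg symbols. `bswap32_buf_generic` plays the role
of the C fallback and `bswap32_buf_aligned` the role of the Zbb routine.
The sketch also assumes the `const void *` source prototype mentioned in
the notes and reads through memcpy(), which the current C code does not do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable byte swap of a single 32-bit word. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Hypothetical generic path: memcpy() makes the load valid for any
 * source alignment. */
static void bswap32_buf_generic(uint32_t *dst, const void *src, int w)
{
    const unsigned char *s = src;

    for (int i = 0; i < w; i++) {
        uint32_t v;

        memcpy(&v, s + 4 * (size_t)i, sizeof(v));
        dst[i] = bswap32(v);
    }
}

/* Hypothetical stand-in for the optimised routine: only valid when src
 * is 32-bit aligned, like the word loads in the assembly. */
static void bswap32_buf_aligned(uint32_t *dst, const uint32_t *src, int w)
{
    for (int i = 0; i < w; i++)
        dst[i] = bswap32(src[i]);
}

/* Mirrors "andi t1, a1, 3; bnez t1, 6f": if either of the two low
 * address bits is set, hand over to the generic code. */
static void bswap32_buf_dispatch(uint32_t *dst, const void *src, int w)
{
    if ((uintptr_t)src & 3)
        bswap32_buf_generic(dst, src, w);
    else
        bswap32_buf_aligned(dst, src, w);
}
```

Calling bswap32_buf_dispatch() with a source pointer one byte past an
aligned buffer takes the generic branch, and with the aligned pointer it
takes the other one. Both produce the same byte-reversed words.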