From patchwork Mon Feb 7 20:47:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Martin Storsjö
X-Patchwork-Id: 34158
From: Martin Storsjö
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 7 Feb 2022 22:47:55 +0200
Message-Id: <20220207204755.1224825-1-martin@martin.st>
X-Mailer: git-send-email 2.25.1
Subject: [FFmpeg-devel] [PATCH] aarch64: h264dsp: Fix incorrectly indented code
List-Id: FFmpeg development discussions and patches
Cc: Martin Storsjö

Signed-off-by: Martin Storsjö
---
This should reduce the risk of anyone accidentally writing new code
based on an incorrect example.
---
 libavcodec/aarch64/h264dsp_neon.S | 176 +++++++++++++++---------------
 1 file changed, 88 insertions(+), 88 deletions(-)

diff --git a/libavcodec/aarch64/h264dsp_neon.S b/libavcodec/aarch64/h264dsp_neon.S
index 000ff762a3..ea221e6862 100644
--- a/libavcodec/aarch64/h264dsp_neon.S
+++ b/libavcodec/aarch64/h264dsp_neon.S
@@ -960,117 +960,117 @@ function ff_h264_h_loop_filter_chroma422_neon_10, export=1
 endfunc
 
 .macro h264_loop_filter_chroma_intra_10
- uabd v26.8h, v16.8h, v17.8h // abs(p0 - q0)
- uabd v27.8h, v18.8h, v16.8h // abs(p1 - p0)
- uabd v28.8h, v19.8h, v17.8h // abs(q1 - q0)
- cmhi v26.8h, v30.8h, v26.8h // < alpha
- cmhi v27.8h, v31.8h, v27.8h // < beta
- cmhi v28.8h, v31.8h, v28.8h // < beta
- and v26.16b, v26.16b, v27.16b
- and v26.16b, v26.16b, v28.16b
- mov x2, v26.d[0]
- mov x3, v26.d[1]
-
- shl v4.8h, v18.8h, #1
- shl v6.8h, v19.8h, #1
-
- adds x2, x2, x3
- b.eq 9f
-
- add v20.8h, v16.8h, v19.8h
- add v22.8h, v17.8h, v18.8h
- add v20.8h, v20.8h, v4.8h
- add v22.8h, v22.8h, v6.8h
- urshr v24.8h, v20.8h, #2
- urshr v25.8h, v22.8h, #2
- bit v16.16b, v24.16b, v26.16b
- bit v17.16b, v25.16b, v26.16b
+ uabd v26.8h, v16.8h, v17.8h // abs(p0 - q0)
+ uabd v27.8h, v18.8h, v16.8h // abs(p1 - p0)
+ uabd v28.8h, v19.8h, v17.8h // abs(q1 - q0)
+ cmhi v26.8h, v30.8h, v26.8h // < alpha
+ cmhi v27.8h, v31.8h, v27.8h // < beta
+ cmhi v28.8h, v31.8h, v28.8h // < beta
+ and v26.16b, v26.16b, v27.16b
+ and v26.16b, v26.16b, v28.16b
+ mov x2, v26.d[0]
+ mov x3, v26.d[1]
+
+ shl v4.8h, v18.8h, #1
+ shl v6.8h, v19.8h, #1
+
+ adds x2, x2, x3
+ b.eq 9f
+
+ add v20.8h, v16.8h, v19.8h
+ add v22.8h, v17.8h, v18.8h
+ add v20.8h, v20.8h, v4.8h
+ add v22.8h, v22.8h, v6.8h
+ urshr v24.8h, v20.8h, #2
+ urshr v25.8h, v22.8h, #2
+ bit v16.16b, v24.16b, v26.16b
+ bit v17.16b, v25.16b, v26.16b
 .endm
 
 function ff_h264_v_loop_filter_chroma_intra_neon_10, export=1
- h264_loop_filter_start_intra_10
- mov x9, x0
- sub x0, x0, x1, lsl #1
- ld1 {v18.8h}, [x0], x1
- ld1 {v17.8h}, [x9], x1
- ld1 {v16.8h}, [x0], x1
- ld1 {v19.8h}, [x9]
+ h264_loop_filter_start_intra_10
+ mov x9, x0
+ sub x0, x0, x1, lsl #1
+ ld1 {v18.8h}, [x0], x1
+ ld1 {v17.8h}, [x9], x1
+ ld1 {v16.8h}, [x0], x1
+ ld1 {v19.8h}, [x9]
 
- h264_loop_filter_chroma_intra_10
+ h264_loop_filter_chroma_intra_10
 
- sub x0, x9, x1, lsl #1
- st1 {v16.8h}, [x0], x1
- st1 {v17.8h}, [x0], x1
+ sub x0, x9, x1, lsl #1
+ st1 {v16.8h}, [x0], x1
+ st1 {v17.8h}, [x0], x1
 
 9:
- ret
+ ret
 endfunc
 
 function ff_h264_h_loop_filter_chroma_mbaff_intra_neon_10, export=1
- h264_loop_filter_start_intra_10
+ h264_loop_filter_start_intra_10
 
- sub x4, x0, #4
- sub x0, x0, #2
- add x9, x4, x1, lsl #1
- ld1 {v18.8h}, [x4], x1
- ld1 {v17.8h}, [x9], x1
- ld1 {v16.8h}, [x4], x1
- ld1 {v19.8h}, [x9], x1
+ sub x4, x0, #4
+ sub x0, x0, #2
+ add x9, x4, x1, lsl #1
+ ld1 {v18.8h}, [x4], x1
+ ld1 {v17.8h}, [x9], x1
+ ld1 {v16.8h}, [x4], x1
+ ld1 {v19.8h}, [x9], x1
 
- transpose_4x8H v18, v16, v17, v19, v26, v27, v28, v29
+ transpose_4x8H v18, v16, v17, v19, v26, v27, v28, v29
 
- h264_loop_filter_chroma_intra_10
+ h264_loop_filter_chroma_intra_10
 
- st2 {v16.h,v17.h}[0], [x0], x1
- st2 {v16.h,v17.h}[1], [x0], x1
- st2 {v16.h,v17.h}[2], [x0], x1
- st2 {v16.h,v17.h}[3], [x0], x1
+ st2 {v16.h,v17.h}[0], [x0], x1
+ st2 {v16.h,v17.h}[1], [x0], x1
+ st2 {v16.h,v17.h}[2], [x0], x1
+ st2 {v16.h,v17.h}[3], [x0], x1
 
 9:
- ret
+ ret
 endfunc
 
 function ff_h264_h_loop_filter_chroma_intra_neon_10, export=1
- h264_loop_filter_start_intra_10
- sub x4, x0, #4
- sub x0, x0, #2
+ h264_loop_filter_start_intra_10
+ sub x4, x0, #4
+ sub x0, x0, #2
 h_loop_filter_chroma420_intra_10:
- add x9, x4, x1, lsl #2
- ld1 {v18.4h}, [x4], x1
- ld1 {v18.d}[1], [x9], x1
- ld1 {v16.4h}, [x4], x1
- ld1 {v16.d}[1], [x9], x1
- ld1 {v17.4h}, [x4], x1
- ld1 {v17.d}[1], [x9], x1
- ld1 {v19.4h}, [x4], x1
- ld1 {v19.d}[1], [x9], x1
-
- transpose_4x8H v18, v16, v17, v19, v26, v27, v28, v29
-
- h264_loop_filter_chroma_intra_10
-
- st2 {v16.h,v17.h}[0], [x0], x1
- st2 {v16.h,v17.h}[1], [x0], x1
- st2 {v16.h,v17.h}[2], [x0], x1
- st2 {v16.h,v17.h}[3], [x0], x1
- st2 {v16.h,v17.h}[4], [x0], x1
- st2 {v16.h,v17.h}[5], [x0], x1
- st2 {v16.h,v17.h}[6], [x0], x1
- st2 {v16.h,v17.h}[7], [x0], x1
+ add x9, x4, x1, lsl #2
+ ld1 {v18.4h}, [x4], x1
+ ld1 {v18.d}[1], [x9], x1
+ ld1 {v16.4h}, [x4], x1
+ ld1 {v16.d}[1], [x9], x1
+ ld1 {v17.4h}, [x4], x1
+ ld1 {v17.d}[1], [x9], x1
+ ld1 {v19.4h}, [x4], x1
+ ld1 {v19.d}[1], [x9], x1
+
+ transpose_4x8H v18, v16, v17, v19, v26, v27, v28, v29
+
+ h264_loop_filter_chroma_intra_10
+
+ st2 {v16.h,v17.h}[0], [x0], x1
+ st2 {v16.h,v17.h}[1], [x0], x1
+ st2 {v16.h,v17.h}[2], [x0], x1
+ st2 {v16.h,v17.h}[3], [x0], x1
+ st2 {v16.h,v17.h}[4], [x0], x1
+ st2 {v16.h,v17.h}[5], [x0], x1
+ st2 {v16.h,v17.h}[6], [x0], x1
+ st2 {v16.h,v17.h}[7], [x0], x1
 
 9:
- ret
+ ret
 endfunc
 
 function ff_h264_h_loop_filter_chroma422_intra_neon_10, export=1
- h264_loop_filter_start_intra_10
- sub x4, x0, #4
- add x5, x0, x1, lsl #3
- sub x0, x0, #2
- mov x7, x30
- bl h_loop_filter_chroma420_intra_10
- mov x4, x9
- sub x0, x5, #2
- mov x30, x7
- b h_loop_filter_chroma420_intra_10
+ h264_loop_filter_start_intra_10
+ sub x4, x0, #4
+ add x5, x0, x1, lsl #3
+ sub x0, x0, #2
+ mov x7, x30
+ bl h_loop_filter_chroma420_intra_10
+ mov x4, x9
+ sub x0, x5, #2
+ mov x30, x7
+ b h_loop_filter_chroma420_intra_10
 endfunc