From patchwork Mon Apr 8 10:57:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Storsjö
X-Patchwork-Id: 47912
From: Martin Storsjö
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 8 Apr 2024 13:57:05 +0300
Message-Id: <20240408105705.99898-1-martin@martin.st>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
Subject: [FFmpeg-devel] [PATCH] aarch64: ac3dsp: Simplify the end of
 ff_ac3_sum_square_butterfly_float_neon
Cc: Geoff Hill

Before:                               Cortex A53     A72     A78
ac3_sum_square_bufferfly_float_neon:      1005.7   516.5   224.5
After:
ac3_sum_square_bufferfly_float_neon:       981.7   504.5   223.2
---
 libavcodec/aarch64/ac3dsp_neon.S | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/libavcodec/aarch64/ac3dsp_neon.S b/libavcodec/aarch64/ac3dsp_neon.S
index 20beb6cc50..7e97cc39f7 100644
--- a/libavcodec/aarch64/ac3dsp_neon.S
+++ b/libavcodec/aarch64/ac3dsp_neon.S
@@ -103,17 +103,9 @@ function ff_ac3_sum_square_butterfly_float_neon, export=1
         fmla            v3.4s, v17.4s, v17.4s
         subs            w3, w3, #4
         b.gt            1b
-        faddp           v0.4s, v0.4s, v0.4s
-        faddp           v0.2s, v0.2s, v0.2s
-        st1             {v0.s}[0], [x0], #4
-        faddp           v1.4s, v1.4s, v1.4s
-        faddp           v1.2s, v1.2s, v1.2s
-        st1             {v1.s}[0], [x0], #4
-        faddp           v2.4s, v2.4s, v2.4s
-        faddp           v2.2s, v2.2s, v2.2s
-        st1             {v2.s}[0], [x0], #4
-        faddp           v3.4s, v3.4s, v3.4s
-        faddp           v3.2s, v3.2s, v3.2s
-        st1             {v3.s}[0], [x0]
+        faddp           v0.4s, v0.4s, v1.4s
+        faddp           v2.4s, v2.4s, v3.4s
+        faddp           v0.4s, v0.4s, v2.4s
+        st1             {v0.4s}, [x0]
         ret
 endfunc
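
Note for readers of the diff: the change replaces four separate horizontal reductions, each followed by a single-lane store, with three pairwise adds that combine all four accumulators and one full-vector store. The following is a minimal scalar sketch in C of why the two sequences should produce the same four sums. It is not part of the patch. The type vec4 and the helpers faddp_4s, faddp_2s, reduce_old, reduce_new and reductions_match are hypothetical names introduced only for illustration; they model the lane arithmetic of the faddp forms used in the diff, not the real instructions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of a 128-bit NEON register holding four floats. */
typedef struct { float s[4]; } vec4;

/* Models "faddp Vd.4s, Vn.4s, Vm.4s": adjacent pairs of n fill the low
 * half of the result, adjacent pairs of m fill the high half. */
static vec4 faddp_4s(vec4 n, vec4 m)
{
    vec4 d = {{ n.s[0] + n.s[1], n.s[2] + n.s[3],
                m.s[0] + m.s[1], m.s[2] + m.s[3] }};
    return d;
}

/* Models "faddp Vd.2s, Vn.2s, Vm.2s": only the low two lanes of each
 * input are used; the upper half of the result is zeroed. */
static vec4 faddp_2s(vec4 n, vec4 m)
{
    vec4 d = {{ n.s[0] + n.s[1], m.s[0] + m.s[1], 0.0f, 0.0f }};
    return d;
}

/* Old tail: reduce each accumulator on its own, store lane 0 each time. */
static void reduce_old(const vec4 acc[4], float out[4])
{
    for (int i = 0; i < 4; i++) {
        vec4 t = faddp_4s(acc[i], acc[i]);
        t = faddp_2s(t, t);
        out[i] = t.s[0];
    }
}

/* New tail: three pairwise adds across the accumulators, one store. */
static void reduce_new(const vec4 acc[4], float out[4])
{
    vec4 v0 = faddp_4s(acc[0], acc[1]);
    vec4 v2 = faddp_4s(acc[2], acc[3]);
    v0 = faddp_4s(v0, v2);
    memcpy(out, v0.s, sizeof(v0.s));
}

/* Returns 1 if both tails give the same four sums for these inputs. */
static int reductions_match(const vec4 acc[4])
{
    float a[4], b[4];
    reduce_old(acc, a);
    reduce_new(acc, b);
    for (int i = 0; i < 4; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

In this model both tails compute (x0 + x1) + (x2 + x3) for each accumulator, so the order of the floating-point additions is unchanged and the outputs are expected to match exactly rather than only approximately. The new form simply shares each faddp between two accumulators instead of duplicating one input.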