From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 14 Jul 2023 19:10:40 +0300
Message-Id: <20230714161040.3138-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH] lavc/aacpsdsp: use restrict qualifier

Except for add_squares, telling the compiler that the output vector(s)
cannot alias helps quite a bit (cycles on SiFive U74-MC):

ps_add_squares_c:             98277.7
ps_add_squares_r:             98320.2
ps_hybrid_analysis_c:          3731.2
ps_hybrid_analysis_r:          2495.7
ps_hybrid_analysis_ileave_c:  20478.0
ps_hybrid_analysis_ileave_r:  16092.2
ps_hybrid_synthesis_deint_c:  19051.5
ps_hybrid_synthesis_deint_r:  15420.0
ps_mul_pair_single_c:        122941.2
ps_mul_pair_single_r:         91035.0
---
 libavcodec/aacpsdsp_template.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/libavcodec/aacpsdsp_template.c b/libavcodec/aacpsdsp_template.c
index c063788b89..7b3eb78db1 100644
--- a/libavcodec/aacpsdsp_template.c
+++ b/libavcodec/aacpsdsp_template.c
@@ -26,24 +26,25 @@
 #include "libavutil/attributes.h"
 #include "aacpsdsp.h"
 
-static void ps_add_squares_c(INTFLOAT *dst, const INTFLOAT (*src)[2], int n)
+static void ps_add_squares_c(INTFLOAT *av_restrict dst,
+                             const INTFLOAT (*src)[2], int n)
 {
-    int i;
-    for (i = 0; i < n; i++)
+    for (int i = 0; i < n; i++)
         dst[i] += (UINTFLOAT)AAC_MADD28(src[i][0], src[i][0], src[i][1], src[i][1]);
 }
 
-static void ps_mul_pair_single_c(INTFLOAT (*dst)[2], INTFLOAT (*src0)[2], INTFLOAT *src1,
+static void ps_mul_pair_single_c(INTFLOAT (*av_restrict dst)[2],
+                                 INTFLOAT (*src0)[2], INTFLOAT *src1,
                                  int n)
 {
-    int i;
-    for (i = 0; i < n; i++) {
+    for (int i = 0; i < n; i++) {
         dst[i][0] = AAC_MUL16(src0[i][0], src1[i]);
         dst[i][1] = AAC_MUL16(src0[i][1], src1[i]);
     }
 }
 
-static void ps_hybrid_analysis_c(INTFLOAT (*out)[2], INTFLOAT (*in)[2],
+static void ps_hybrid_analysis_c(INTFLOAT (*av_restrict out)[2],
+                                 INTFLOAT (*in)[2],
                                  const INTFLOAT (*filter)[8][2],
                                  ptrdiff_t stride, int n)
 {
@@ -76,13 +77,12 @@ static void ps_hybrid_analysis_c(INTFLOAT (*out)[2], INTFLOAT (*in)[2],
     }
 }
 
-static void ps_hybrid_analysis_ileave_c(INTFLOAT (*out)[32][2], INTFLOAT L[2][38][64],
-                                       int i, int len)
+static void ps_hybrid_analysis_ileave_c(INTFLOAT (*av_restrict out)[32][2],
+                                        INTFLOAT L[2][38][64],
+                                        int i, int len)
 {
-    int j;
-
     for (; i < 64; i++) {
-        for (j = 0; j < len; j++) {
+        for (int j = 0; j < len; j++) {
             out[i][j][0] = L[0][j][i];
             out[i][j][1] = L[1][j][i];
         }
@@ -90,13 +90,11 @@ static void ps_hybrid_analysis_ileave_c(INTFLOAT (*out)[32][2], INTFLOAT L[2][38
 }
 
 static void ps_hybrid_synthesis_deint_c(INTFLOAT out[2][38][64],
-                                       INTFLOAT (*in)[32][2],
-                                       int i, int len)
+                                        INTFLOAT (*av_restrict in)[32][2],
+                                        int i, int len)
 {
-    int n;
-
     for (; i < 64; i++) {
-        for (n = 0; n < len; n++) {
+        for (int n = 0; n < len; n++) {
             out[0][n][i] = in[i][n][0];
             out[1][n][i] = in[i][n][1];
         }
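
Illustration (not part of the patch): a minimal sketch of why the qualifier
can help, modelled on ps_mul_pair_single_c. The assumptions here are mine:
plain float stands in for INTFLOAT, a plain multiply stands in for AAC_MUL16,
the standard C99 restrict keyword is used where the patch uses FFmpeg's
av_restrict macro, and the name mul_pair_single is hypothetical.

```c
#include <assert.h>

/* Simplified stand-in for ps_mul_pair_single_c (illustration only):
 * float replaces INTFLOAT, '*' replaces AAC_MUL16. */
void mul_pair_single(float (*restrict dst)[2],
                     float (*src0)[2], float *src1, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        /* Without restrict, the store to dst[i][0] might overwrite
         * src1[i] or src0[i][1] (all are floats, so they may alias),
         * and the compiler has to reload them before the second
         * statement. With restrict on dst, it may keep src1[i] in a
         * register and is freer to reorder or vectorise the loop. */
        dst[i][0] = src0[i][0] * src1[i];
        dst[i][1] = src0[i][1] * src1[i];
    }
}
```

The qualifier is a promise made by the caller: passing a dst that overlaps
src0 or src1 is undefined behaviour, so a change like this is only safe if
every call site already uses distinct buffers.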