From patchwork Thu Sep 22 16:02:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 38151
From: remi@remlab.net
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 22 Sep 2022 19:02:36 +0300
Message-Id: <20220922160236.29347-1-remi@remlab.net>
X-Mailer: git-send-email 2.37.2
Subject: [FFmpeg-devel] [PATCH] lavc/aacpsdsp: use restrict qualifier
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

From: Rémi Denis-Courmont

Except for add_squares, telling the compiler that the output vector(s)
cannot alias helps quite a bit (cycles on SiFive U7-MC):

ps_add_squares_c: 98277.7
ps_add_squares_r: 98320.2
ps_hybrid_analysis_c: 3731.2
ps_hybrid_analysis_r: 2495.7
ps_hybrid_analysis_ileave_c: 20478.0
ps_hybrid_analysis_ileave_r: 16092.2
ps_hybrid_synthesis_deint_c: 19051.5
ps_hybrid_synthesis_deint_r: 15420.0
ps_mul_pair_single_c: 122941.2
ps_mul_pair_single_r: 91035.0
---
 libavcodec/aacpsdsp_template.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/libavcodec/aacpsdsp_template.c b/libavcodec/aacpsdsp_template.c
index 31ff718420..93b72ad2af 100644
--- a/libavcodec/aacpsdsp_template.c
+++ b/libavcodec/aacpsdsp_template.c
@@ -26,24 +26,25 @@
 #include "libavutil/attributes.h"
 #include "aacpsdsp.h"
 
-static void ps_add_squares_c(INTFLOAT *dst, const INTFLOAT (*src)[2], int n)
+static void ps_add_squares_c(INTFLOAT *av_restrict dst,
+                             const INTFLOAT (*src)[2], int n)
 {
-    int i;
-    for (i = 0; i < n; i++)
+    for (int i = 0; i < n; i++)
         dst[i] += (UINTFLOAT)AAC_MADD28(src[i][0], src[i][0], src[i][1], src[i][1]);
 }
 
-static void ps_mul_pair_single_c(INTFLOAT (*dst)[2], INTFLOAT (*src0)[2], INTFLOAT *src1,
+static void ps_mul_pair_single_c(INTFLOAT (*av_restrict dst)[2],
+                                 INTFLOAT (*src0)[2], INTFLOAT *src1,
                                  int n)
 {
-    int i;
-    for (i = 0; i < n; i++) {
+    for (int i = 0; i < n; i++) {
         dst[i][0] = AAC_MUL16(src0[i][0], src1[i]);
         dst[i][1] = AAC_MUL16(src0[i][1], src1[i]);
     }
 }
 
-static void ps_hybrid_analysis_c(INTFLOAT (*out)[2], INTFLOAT (*in)[2],
+static void ps_hybrid_analysis_c(INTFLOAT (*av_restrict out)[2],
+                                 INTFLOAT (*in)[2],
                                  const INTFLOAT (*filter)[8][2],
                                  ptrdiff_t stride, int n)
 {
@@ -73,13 +74,12 @@ static void ps_hybrid_analysis_c(INTFLOAT (*out)[2], INTFLOAT (*in)[2],
     }
 }
 
-static void ps_hybrid_analysis_ileave_c(INTFLOAT (*out)[32][2], INTFLOAT L[2][38][64],
-                                       int i, int len)
+static void ps_hybrid_analysis_ileave_c(INTFLOAT (*av_restrict out)[32][2],
+                                        INTFLOAT L[2][38][64],
+                                        int i, int len)
 {
-    int j;
-
     for (; i < 64; i++) {
-        for (j = 0; j < len; j++) {
+        for (int j = 0; j < len; j++) {
             out[i][j][0] = L[0][j][i];
             out[i][j][1] = L[1][j][i];
         }
@@ -87,13 +87,11 @@ static void ps_hybrid_analysis_ileave_c(INTFLOAT (*out)[32][2], INTFLOAT L[2][38
 }
 
 static void ps_hybrid_synthesis_deint_c(INTFLOAT out[2][38][64],
-                                       INTFLOAT (*in)[32][2],
-                                       int i, int len)
+                                        INTFLOAT (*av_restrict in)[32][2],
+                                        int i, int len)
 {
-    int n;
-
     for (; i < 64; i++) {
-        for (n = 0; n < len; n++) {
+        for (int n = 0; n < len; n++) {
             out[0][n][i] = in[i][n][0];
             out[1][n][i] = in[i][n][1];
         }
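
For readers who want to try the aliasing effect described in the commit message outside the FFmpeg tree, here is a minimal standalone sketch. It is not part of the patch and rests on several assumptions: plain float stands in for INTFLOAT, an ordinary multiplication stands in for AAC_MUL16, the C99 restrict keyword stands in for av_restrict, and the function names and the restrict_variants_agree() helper are hypothetical. It only shows that the two variants compute the same values when the buffers do not overlap; whether a compiler generates faster code for the restrict variant depends on the target and the optimization flags.

```c
#include <assert.h>

/* Baseline: dst may overlap src0 or src1, so every store to dst[i] could
 * change values the compiler would otherwise keep in registers or load
 * ahead of time. */
static void mul_pair_single_plain(float (*dst)[2], float (*src0)[2],
                                  float *src1, int n)
{
    for (int i = 0; i < n; i++) {
        dst[i][0] = src0[i][0] * src1[i];
        dst[i][1] = src0[i][1] * src1[i];
    }
}

/* Same loop, but the destination pointer is restrict-qualified: the caller
 * promises that memory written through dst is not accessed through src0 or
 * src1 during the call. Breaking that promise is undefined behaviour. */
static void mul_pair_single_restrict(float (*restrict dst)[2],
                                     float (*src0)[2], float *src1, int n)
{
    for (int i = 0; i < n; i++) {
        dst[i][0] = src0[i][0] * src1[i];
        dst[i][1] = src0[i][1] * src1[i];
    }
}

/* Hypothetical self-check: runs both variants on non-overlapping buffers
 * and returns 1 when every output element matches, 0 otherwise. */
static int restrict_variants_agree(void)
{
    float src0[4][2] = { {1, 2}, {3, 4}, {5, 6}, {7, 8} };
    float src1[4]    = { 0.5f, 2.0f, -1.0f, 0.25f };
    float a[4][2], b[4][2];

    mul_pair_single_plain(a, src0, src1, 4);
    mul_pair_single_restrict(b, src0, src1, 4);

    /* Spot-check one exactly representable product: 4 * 2.0 == 8.0. */
    assert(b[1][1] == 8.0f);

    for (int i = 0; i < 4; i++)
        if (a[i][0] != b[i][0] || a[i][1] != b[i][1])
            return 0;
    return 1;
}
```

Calling restrict_variants_agree() from a small main() and comparing the assembly of the two loops (for example with -O2 -S) is one way to see whether the target compiler takes advantage of the qualifier.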