From patchwork Sun Nov 19 12:54:13 2023
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 44720
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 19 Nov 2023 14:54:13 +0200
Message-ID: <20231119125413.14429-1-remi@remlab.net>
X-Mailer: git-send-email 2.42.0
Subject: [FFmpeg-devel] [PATCH] lavc/g722dsp: optimise R-V V apply_qmf
List-Id: FFmpeg development discussions and patches

This stores the constant coefficients deinterleaved, so that they can be
loaded directly with NF=0.
Unfortunately, we cannot optimise loading the input, due to insufficient
memory alignment (not 32-bit).

Before:
g722_apply_qmf_c:       82.5
g722_apply_qmf_rvv_i32: 78.2

After:
g722_apply_qmf_c:       82.5
g722_apply_qmf_rvv_i32: 65.2
---
 libavcodec/riscv/g722dsp_rvv.S | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/libavcodec/riscv/g722dsp_rvv.S b/libavcodec/riscv/g722dsp_rvv.S
index 350be8dc1f..981d5cecd8 100644
--- a/libavcodec/riscv/g722dsp_rvv.S
+++ b/libavcodec/riscv/g722dsp_rvv.S
@@ -24,7 +24,9 @@ func ff_g722_apply_qmf_rvv, zve32x
         lla         t0, qmf_coeffs
         vsetivli    zero, 12, e16, m2, ta, ma
         vlseg2e16.v v28, (a0)
-        vlseg2e16.v v24, (t0)
+        addi        t1, t0, 12 * 2
+        vle16.v     v24, (t0)
+        vle16.v     v26, (t1)
         vwmul.vv    v16, v28, v24
         vwmul.vv    v20, v30, v26
         vsetivli    zero, 12, e32, m4, ta, ma
@@ -41,26 +43,26 @@ endfunc
 const qmf_coeffs, align=2
         .short      3
         .short      -11
-        .short      -11
-        .short      53
         .short      12
-        .short      -156
         .short      32
-        .short      362
         .short      -210
-        .short      -805
         .short      951
         .short      3876
-        .short      3876
-        .short      951
         .short      -805
-        .short      -210
         .short      362
-        .short      32
         .short      -156
-        .short      12
         .short      53
         .short      -11
         .short      -11
+        .short      53
+        .short      -156
+        .short      362
+        .short      -805
+        .short      3876
+        .short      951
+        .short      -210
+        .short      32
+        .short      12
+        .short      -11
         .short      3
 endconst
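
For review purposes, the table reordering can be checked with a scalar C
model. This is only a sketch, not FFmpeg code: the names qmf_interleaved,
qmf_deinterleaved, deinterleave, tables_match, products_old and products_new
are made up for illustration, and only the two widening multiplies visible in
the first hunk are modelled (the later reduction is outside the diff context
and is not reproduced). It assumes the segment load puts even-indexed elements
in the first register group and odd-indexed ones in the second.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QMF_TAPS 24
#define QMF_HALF (QMF_TAPS / 2)

/* Table layout before the patch: even and odd taps interleaved. */
static const int16_t qmf_interleaved[QMF_TAPS] = {
    3, -11, -11, 53, 12, -156, 32, 362, -210, -805, 951, 3876,
    3876, 951, -805, -210, 362, 32, -156, 12, 53, -11, -11, 3,
};

/* Table layout after the patch: 12 even taps, then 12 odd taps. */
static const int16_t qmf_deinterleaved[QMF_TAPS] = {
    3, -11, 12, 32, -210, 951, 3876, -805, 362, -156, 53, -11,
    -11, 53, -156, 362, -805, 3876, 951, -210, 32, 12, -11, 3,
};

/* Write the even-indexed elements of src to dst[0..n/2) and the
 * odd-indexed elements to dst[n/2..n). */
static void deinterleave(const int16_t *src, int16_t *dst, size_t n)
{
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; i++) {
        dst[i]         = src[2 * i];
        dst[n / 2 + i] = src[2 * i + 1];
    }
}

/* Returns 1 if deinterleaving the old table yields the new table. */
static int tables_match(void)
{
    int16_t tmp[QMF_TAPS];

    deinterleave(qmf_interleaved, tmp, QMF_TAPS);
    for (size_t i = 0; i < QMF_TAPS; i++)
        if (tmp[i] != qmf_deinterleaved[i])
            return 0;
    return 1;
}

/* Model of the old path: input and coefficients are both split into
 * even/odd halves while loading (two segment loads). */
static void products_old(const int16_t *in,
                         int32_t even[QMF_HALF], int32_t odd[QMF_HALF])
{
    for (size_t i = 0; i < QMF_HALF; i++) {
        even[i] = (int32_t)in[2 * i]     * qmf_interleaved[2 * i];
        odd[i]  = (int32_t)in[2 * i + 1] * qmf_interleaved[2 * i + 1];
    }
}

/* Model of the new path: only the input is split while loading; the
 * coefficients come from two contiguous halves (two unit-stride loads,
 * the second at byte offset 12 * 2). */
static void products_new(const int16_t *in,
                         int32_t even[QMF_HALF], int32_t odd[QMF_HALF])
{
    for (size_t i = 0; i < QMF_HALF; i++) {
        even[i] = (int32_t)in[2 * i]     * qmf_deinterleaved[i];
        odd[i]  = (int32_t)in[2 * i + 1] * qmf_deinterleaved[QMF_HALF + i];
    }
}
```

Calling tables_match() confirms that the new .short sequence is exactly the
old one split into even and odd taps, and comparing the outputs of
products_old() and products_new() on any 24-sample input shows that the two
layouts feed identical operands to the widening multiplies.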