From patchwork Tue Dec 19 02:53:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: flow gg
X-Patchwork-Id: 45235
From: flow gg
Date: Tue, 19 Dec 2023 10:53:12 +0800
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] libavfilter/af_afir: R-V V dcmul_add
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

c908:
dcmul_add_c: 88.0
dcmul_add_rvv_f64: 46.2

Did not use vlseg2e64, because it is much slower than vlse64
Did not use vsseg2e64, because it is slightly slower than vsse64

From 80b6694bc29ed1c37852dc079a6d91a24dd6f18e Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Tue, 19 Dec 2023 09:11:28 +0800
Subject: [PATCH] libavfilter/af_afir: R-V V dcmul_add

c908:
dcmul_add_c: 88.0
dcmul_add_rvv_f64: 46.2
---
 libavfilter/riscv/af_afir_init.c |  3 +++
 libavfilter/riscv/af_afir_rvv.S  | 41 ++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/libavfilter/riscv/af_afir_init.c b/libavfilter/riscv/af_afir_init.c
index 52aa18c126..f9a76f108b 100644
--- a/libavfilter/riscv/af_afir_init.c
+++ b/libavfilter/riscv/af_afir_init.c
@@ -27,6 +27,8 @@
 
 void ff_fcmul_add_rvv(float *sum, const float *t, const float *c,
                       ptrdiff_t len);
+void ff_dcmul_add_rvv(double *sum, const double *t, const double *c,
+                      ptrdiff_t len);
 
 av_cold void ff_afir_init_riscv(AudioFIRDSPContext *s)
 {
@@ -36,6 +38,7 @@ av_cold void ff_afir_init_riscv(AudioFIRDSPContext *s)
     if (flags & AV_CPU_FLAG_RVV_F64) {
         if (flags & AV_CPU_FLAG_RVB_ADDR) {
             s->fcmul_add = ff_fcmul_add_rvv;
+            s->dcmul_add = ff_dcmul_add_rvv;
         }
     }
 #endif
diff --git a/libavfilter/riscv/af_afir_rvv.S b/libavfilter/riscv/af_afir_rvv.S
index 04ec2e50d8..d1fa6e22e5 100644
--- a/libavfilter/riscv/af_afir_rvv.S
+++ b/libavfilter/riscv/af_afir_rvv.S
@@ -53,3 +53,44 @@ func ff_fcmul_add_rvv, zve64f
 
         ret
 endfunc
+
+func ff_dcmul_add_rvv, zve64f
+1:
+        vsetvli     t0, a3, e64, m4, ta, ma
+        li          t1, 16
+        li          t2, 8
+        vlse64.v    v0, (a1), t1
+        add         a1, a1, t2
+        vlse64.v    v4, (a2), t1
+        add         a2, a2, t2
+        vlse64.v    v12, (a0), t1
+        add         a0, a0, t2
+        vfmacc.vv   v12, v0, v4
+        sub         a3, a3, t0
+        vlse64.v    v8, (a2), t1
+        sub         a2, a2, t2
+        sh3add      a2, t0, a2
+        vlse64.v    v16, (a0), t1
+        sub         a0, a0, t2
+        vfmacc.vv   v16, v0, v8
+        sh3add      a2, t0, a2
+        vlse64.v    v0, (a1), t1
+        sub         a1, a1, t2
+        sh3add      a1, t0, a1
+        vfnmsac.vv  v12, v0, v8
+        sh3add      a1, t0, a1
+        vfmacc.vv   v16, v0, v4
+        vsse64.v    v12, (a0), t1
+        add         a0, a0, t2
+        vsse64.v    v16, (a0), t1
+        sub         a0, a0, t2
+        sh3add      a0, t0, a0
+        sh3add      a0, t0, a0
+        bgtz        a3, 1b
+        fld         fa0, 0(a1)
+        fld         fa1, 0(a2)
+        fld         fa2, 0(a0)
+        fmadd.d     fa2, fa0, fa1, fa2
+        fsd         fa2, 0(a0)
+        ret
+endfunc
-- 
2.43.0
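
For anyone checking the assembly against a scalar path, the following is a minimal C sketch of what ff_dcmul_add_rvv appears to compute, read off the instruction sequence in the patch. The name dcmul_add_ref is hypothetical and is not an FFmpeg symbol. The layout is assumed to be interleaved (re, im) doubles with len counting complex pairs, followed by one trailing real-only element, which is what the 16-byte strides and the final fld/fmadd.d/fsd sequence suggest.

```c
#include <stddef.h>

/* Hypothetical scalar reference, not taken from the FFmpeg tree.
 * sum, t and c each hold len interleaved (re, im) pairs plus one
 * trailing real value, so each buffer has 2 * len + 1 doubles. */
static void dcmul_add_ref(double *sum, const double *t, const double *c,
                          ptrdiff_t len)
{
    ptrdiff_t n;

    for (n = 0; n < len; n++) {
        const double tre = t[2 * n], tim = t[2 * n + 1];
        const double cre = c[2 * n], cim = c[2 * n + 1];

        /* Complex multiply-accumulate: sum += t * c.
         * The vfmacc.vv/vfnmsac.vv pair on v12 forms the real part,
         * the two vfmacc.vv on v16 form the imaginary part. */
        sum[2 * n]     += tre * cre - tim * cim;
        sum[2 * n + 1] += tre * cim + tim * cre;
    }

    /* Trailing real-only element, matching the scalar tail after the loop. */
    sum[2 * n] += t[2 * n] * c[2 * n];
}
```

With len = 1, sum = {1, 2, 3}, t = {1, 2, 4} and c = {3, 5, 0.5}, the complex product (1 + 2i)(3 + 5i) = -7 + 11i is added to the first pair and 4 * 0.5 to the tail, giving {-6, 13, 5}.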