From patchwork Thu May 30 19:06:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= <remi@remlab.net>
X-Patchwork-Id: 49408
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?= <remi@remlab.net>
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 30 May 2024 22:06:55 +0300
Message-ID: <20240530190659.65309-1-remi@remlab.net>
X-Mailer: git-send-email 2.45.1
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCHv2 1/5] lavu/float_dsp: add double-precision
 scalar product
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
X-TUID: MbN7t9NtsNpo

The function pointer is appended to the end of the structure to preserve
backward binary compatibility. The structure is always allocated by
libavutil (through avpriv_float_dsp_alloc()), never by the user, so
increasing its size is safe.
---
 libavutil/float_dsp.c | 12 ++++++++++++
 libavutil/float_dsp.h | 31 ++++++++++++++++++++++++++++++-
 2 files changed, 42 insertions(+), 1 deletion(-)
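
Note (not part of the commit): a minimal standalone sketch of how a caller
might use the new pointer once a context has been set up. It does not
include libavutil; DemoDSPContext, demo_dsp_init(), demo_energy() and
scalarproduct_double_ref() are hypothetical names made up for illustration,
and the reference loop simply mirrors the C fallback in this patch.

```c
#include <stddef.h>

/* Stand-in for the C fallback added by this patch (hypothetical name). */
static double scalarproduct_double_ref(const double *v1, const double *v2,
                                       size_t len)
{
    double p = 0.0;

    for (size_t i = 0; i < len; i++)
        p += v1[i] * v2[i];

    return p;
}

/* Hypothetical miniature context: only the new pointer is modelled here. */
typedef struct DemoDSPContext {
    double (*scalarproduct_double)(const double *v1, const double *v2,
                                   size_t len);
} DemoDSPContext;

/* Install the C fallback, as the allocator does for the real context. */
static void demo_dsp_init(DemoDSPContext *ctx)
{
    ctx->scalarproduct_double = scalarproduct_double_ref;
}

/* Example caller: sum of squares, i.e. the product of a vector with itself. */
static double demo_energy(const DemoDSPContext *ctx, const double *v,
                          size_t len)
{
    return ctx->scalarproduct_double(v, v, len);
}
```

Callers go through the pointer rather than calling the C function directly,
so that an architecture-specific init function can later replace it without
any change on the caller side.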

diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c
index e9fb023466..08bbc85e3e 100644
--- a/libavutil/float_dsp.c
+++ b/libavutil/float_dsp.c
@@ -132,6 +132,17 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len)
     return p;
 }
 
+double ff_scalarproduct_double_c(const double *v1, const double *v2,
+                                 size_t len)
+{
+    double p = 0.0;
+
+    for (size_t i = 0; i < len; i++)
+        p += v1[i] * v2[i];
+
+    return p;
+}
+
 av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact)
 {
     AVFloatDSPContext *fdsp = av_mallocz(sizeof(AVFloatDSPContext));
@@ -149,6 +160,7 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact)
     fdsp->vector_fmul_reverse = vector_fmul_reverse_c;
     fdsp->butterflies_float = butterflies_float_c;
     fdsp->scalarproduct_float = avpriv_scalarproduct_float_c;
+    fdsp->scalarproduct_double = ff_scalarproduct_double_c;
 
 #if ARCH_AARCH64
     ff_float_dsp_init_aarch64(fdsp);
diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h
index 342a8715c5..5053aa240d 100644
--- a/libavutil/float_dsp.h
+++ b/libavutil/float_dsp.h
@@ -19,6 +19,8 @@
 #ifndef AVUTIL_FLOAT_DSP_H
 #define AVUTIL_FLOAT_DSP_H
 
+#include <stddef.h>
+
 typedef struct AVFloatDSPContext {
     /**
      * Calculate the entry wise product of two vectors of floats and store the result in
@@ -187,19 +189,46 @@ typedef struct AVFloatDSPContext {
      */
     void (*vector_dmul)(double *dst, const double *src0, const double *src1,
                         int len);
+
+    /**
+     * Calculate the scalar product of two vectors of doubles.
+     *
+     * @param v1  first vector
+     * @param v2  second vector
+     * @param len length of vectors
+     *
+     * @return inner product of the vectors
+     */
+    double (*scalarproduct_double)(const double *v1, const double *v2,
+                                   size_t len);
 } AVFloatDSPContext;
 
 /**
- * Return the scalar product of two vectors.
+ * Return the scalar product of two vectors of floats.
  *
  * @param v1  first input vector
+ *            constraints: 32-byte aligned
  * @param v2  first input vector
+ *            constraints: 32-byte aligned
  * @param len number of elements
+ *            constraints: multiple of 16
  *
  * @return sum of elementwise products
  */
 float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len);
 
+/**
+ * Return the scalar product of two vectors of doubles.
+ *
+ * @param v1  first input vector
+ * @param v2  second input vector
+ * @param len number of elements
+ *
+ * @return inner product of the vectors
+ */
+double ff_scalarproduct_double_c(const double *v1, const double *v2,
+                                 size_t len);
+
 void ff_float_dsp_init_aarch64(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_arm(AVFloatDSPContext *fdsp);
 void ff_float_dsp_init_ppc(AVFloatDSPContext *fdsp, int strict);