From patchwork Wed May 29 14:59:52 2024
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 49347
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 29 May 2024 17:59:52 +0300
Message-ID: <20240529145955.32189-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 1/4] lavu/float_dsp: add double-precision scalar product

The function pointer is appended to the structure for backward binary
compatibility. Fortunately, this is allocated by libavutil, not by the
user, so increasing the structure size is safe.
---
 doc/APIchanges        |  3 +++
 libavutil/float_dsp.c | 12 ++++++++++++
 libavutil/float_dsp.h | 14 ++++++++++++++
 libavutil/version.h   |  2 +-
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/doc/APIchanges b/doc/APIchanges
index 60f056b863..50c51c664f 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
 
 API changes, most recent first:
 
+2024-05-29 - xxxxxxxxxx - lavu 59.21.100 - float_dsp.h
+  Add AVFloatDSPContext.scalarproduct_double.
+
 2024-05-23 - xxxxxxxxxx - lavu 59.20.100 - channel_layout.h
   Add av_channel_layout_ambisonic_order().
 
diff --git a/libavutil/float_dsp.c b/libavutil/float_dsp.c
index e9fb023466..1c5bb05636 100644
--- a/libavutil/float_dsp.c
+++ b/libavutil/float_dsp.c
@@ -132,6 +132,17 @@ float avpriv_scalarproduct_float_c(const float *v1, const float *v2, int len)
     return p;
 }
 
+static double ff_scalarproduct_double_c(const double *v1, const double *v2,
+                                        size_t len)
+{
+    double p = 0.0;
+
+    for (size_t i = 0; i < len; i++)
+        p += v1[i] * v2[i];
+
+    return p;
+}
+
 av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact)
 {
     AVFloatDSPContext *fdsp = av_mallocz(sizeof(AVFloatDSPContext));
@@ -149,6 +160,7 @@ av_cold AVFloatDSPContext *avpriv_float_dsp_alloc(int bit_exact)
     fdsp->vector_fmul_reverse = vector_fmul_reverse_c;
     fdsp->butterflies_float = butterflies_float_c;
     fdsp->scalarproduct_float = avpriv_scalarproduct_float_c;
+    fdsp->scalarproduct_double = ff_scalarproduct_double_c;
 
 #if ARCH_AARCH64
     ff_float_dsp_init_aarch64(fdsp);
diff --git a/libavutil/float_dsp.h b/libavutil/float_dsp.h
index 342a8715c5..b6b5b0a3b3 100644
--- a/libavutil/float_dsp.h
+++ b/libavutil/float_dsp.h
@@ -19,6 +19,8 @@
 #ifndef AVUTIL_FLOAT_DSP_H
 #define AVUTIL_FLOAT_DSP_H
 
+#include <stddef.h>
+
 typedef struct AVFloatDSPContext {
     /**
      * Calculate the entry wise product of two vectors of floats and store the result in
@@ -187,6 +189,18 @@ typedef struct AVFloatDSPContext {
      */
     void (*vector_dmul)(double *dst, const double *src0, const double *src1,
                         int len);
+
+    /**
+     * Calculate the scalar product of two vectors of doubles.
+     *
+     * @param v1 first vector
+     * @param v2 second vector
+     * @param len length of vectors
+     *
+     * @return inner product of the vectors
+     */
+    double (*scalarproduct_double)(const double *v1, const double *v2,
+                                   size_t len);
 } AVFloatDSPContext;
 
 /**
diff --git a/libavutil/version.h b/libavutil/version.h
index 9c7146c228..9d08d56884 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@
  */
 
 #define LIBAVUTIL_VERSION_MAJOR 59
-#define LIBAVUTIL_VERSION_MINOR 20
+#define LIBAVUTIL_VERSION_MINOR 21
 #define LIBAVUTIL_VERSION_MICRO 100
 
 #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \

From patchwork Wed May 29 14:59:53 2024
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 49348
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 29 May 2024 17:59:53 +0300
Message-ID: <20240529145955.32189-2-remi@remlab.net>
In-Reply-To: <20240529145955.32189-1-remi@remlab.net>
References: <20240529145955.32189-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 2/4] lavf: get rid of bespoke double scalar products

---
 libavfilter/aap_template.c   | 14 +-------------
 libavfilter/anlms_template.c | 16 ++--------------
 libavfilter/arls_template.c  | 14 +-------------
 3 files changed, 4 insertions(+), 40 deletions(-)

diff --git a/libavfilter/aap_template.c b/libavfilter/aap_template.c
index ea9c815a89..0e0580fb32 100644
--- a/libavfilter/aap_template.c
+++ b/libavfilter/aap_template.c
@@ -36,18 +36,6 @@
 #define fn2(a,b) fn3(a,b)
 #define fn(a) fn2(a, SAMPLE_FORMAT)
 
-#if DEPTH == 64
-static double scalarproduct_double(const double *v1, const double *v2, int len)
-{
-    double p = 0.0;
-
-    for (int i = 0; i < len; i++)
-        p += v1[i] * v2[i];
-
-    return p;
-}
-#endif
-
 static ftype fn(fir_sample)(AudioAPContext *s, ftype sample, ftype *delay,
                             ftype *coeffs, ftype *tmp, int *offset)
 {
@@ -60,7 +48,7 @@ static ftype fn(fir_sample)(AudioAPContext *s, ftype sample, ftype *delay,
 #if DEPTH == 32
     output = s->fdsp->scalarproduct_float(delay, tmp, s->kernel_size);
 #else
-    output = scalarproduct_double(delay, tmp, s->kernel_size);
+    output = s->fdsp->scalarproduct_double(delay, tmp, s->kernel_size);
 #endif
 
     if (--(*offset) < 0)
diff --git a/libavfilter/anlms_template.c b/libavfilter/anlms_template.c
index b25df4fa18..a8d1dbfe0f 100644
--- a/libavfilter/anlms_template.c
+++ b/libavfilter/anlms_template.c
@@ -33,18 +33,6 @@
 #define fn2(a,b) fn3(a,b)
 #define fn(a) fn2(a, SAMPLE_FORMAT)
 
-#if DEPTH == 64
-static double scalarproduct_double(const double *v1, const double *v2, int len)
-{
-    double p = 0.0;
-
-    for (int i = 0; i < len; i++)
-        p += v1[i] * v2[i];
-
-    return p;
-}
-#endif
-
 static ftype fn(fir_sample)(AudioNLMSContext *s, ftype sample, ftype *delay,
                             ftype *coeffs, ftype *tmp, int *offset)
 {
@@ -58,7 +46,7 @@ static ftype fn(fir_sample)(AudioNLMSContext *s, ftype sample, ftype *delay,
 #if DEPTH == 32
     output = s->fdsp->scalarproduct_float(delay, tmp, s->kernel_size);
 #else
-    output = scalarproduct_double(delay, tmp, s->kernel_size);
+    output = s->fdsp->scalarproduct_double(delay, tmp, s->kernel_size);
 #endif
 
     if (--(*offset) < 0)
@@ -85,7 +73,7 @@ static ftype fn(process_sample)(AudioNLMSContext *s, ftype input, ftype desired,
 #if DEPTH == 32
     sum = s->fdsp->scalarproduct_float(delay, delay, s->kernel_size);
 #else
-    sum = scalarproduct_double(delay, delay, s->kernel_size);
+    sum = s->fdsp->scalarproduct_double(delay, delay, s->kernel_size);
 #endif
     norm = s->eps + sum;
     b = mu * e / norm;
diff --git a/libavfilter/arls_template.c b/libavfilter/arls_template.c
index d8b19d89a5..c67b48cf6f 100644
--- a/libavfilter/arls_template.c
+++ b/libavfilter/arls_template.c
@@ -39,18 +39,6 @@
 #define fn2(a,b) fn3(a,b)
 #define fn(a) fn2(a, SAMPLE_FORMAT)
 
-#if DEPTH == 64
-static double scalarproduct_double(const double *v1, const double *v2, int len)
-{
-    double p = 0.0;
-
-    for (int i = 0; i < len; i++)
-        p += v1[i] * v2[i];
-
-    return p;
-}
-#endif
-
 static ftype fn(fir_sample)(AudioRLSContext *s, ftype sample, ftype *delay,
                             ftype *coeffs, ftype *tmp, int *offset)
 {
@@ -64,7 +52,7 @@ static ftype fn(fir_sample)(AudioRLSContext *s, ftype sample, ftype *delay,
 #if DEPTH == 32
     output = s->fdsp->scalarproduct_float(delay, tmp, s->kernel_size);
 #else
-    output = scalarproduct_double(delay, tmp, s->kernel_size);
+    output = s->fdsp->scalarproduct_double(delay, tmp, s->kernel_size);
 #endif
 
     if (--(*offset) < 0)

From patchwork Wed May 29 14:59:54 2024
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 49349
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 29 May 2024 17:59:54 +0300
Message-ID: <20240529145955.32189-3-remi@remlab.net>
In-Reply-To: <20240529145955.32189-1-remi@remlab.net>
References: <20240529145955.32189-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 3/4] checkasm/float_dsp: add double-precision scalar product

---
 tests/checkasm/float_dsp.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/tests/checkasm/float_dsp.c b/tests/checkasm/float_dsp.c
index cadfa65e2a..296db1cff9 100644
--- a/tests/checkasm/float_dsp.c
+++ b/tests/checkasm/float_dsp.c
@@ -278,6 +278,22 @@ static void test_scalarproduct_float(const float *src0, const float *src1)
     bench_new(src0, src1, LEN);
 }
 
+static void test_scalarproduct_double(const double *src0, const double *src1)
+{
+    double cprod, oprod;
+
+    declare_func_float(double, const double *, const double *, size_t);
+
+    cprod = call_ref(src0, src1, LEN);
+    oprod = call_new(src0, src1, LEN);
+    if (!double_near_abs_eps(cprod, oprod, ARBITRARY_SCALARPRODUCT_CONST)) {
+        fprintf(stderr, "%- .12f - %- .12f = % .12g\n",
+                cprod, oprod, cprod - oprod);
+        fail();
+    }
+    bench_new(src0, src1, LEN);
+}
+
 void checkasm_check_float_dsp(void)
 {
     LOCAL_ALIGNED_32(float, src0, [LEN]);
@@ -334,6 +350,9 @@ void checkasm_check_float_dsp(void)
     if (check_func(fdsp->scalarproduct_float, "scalarproduct_float"))
         test_scalarproduct_float(src3, src4);
     report("scalarproduct_float");
+    if (check_func(fdsp->scalarproduct_double, "scalarproduct_double"))
+        test_scalarproduct_double(dbl_src0, dbl_src1);
+    report("scalarproduct_double");
 
     av_freep(&fdsp);
 }

From patchwork Wed May 29 14:59:55 2024
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 49350
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 29 May 2024 17:59:55 +0300
Message-ID: <20240529145955.32189-4-remi@remlab.net>
In-Reply-To: <20240529145955.32189-1-remi@remlab.net>
References: <20240529145955.32189-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 4/4] lavc/float_dsp: R-V V scalarproduct_double

C908:
scalarproduct_double_c:       39.2
scalarproduct_double_rvv_f64: 10.5

X60:
scalarproduct_double_c:       35.0
scalarproduct_double_rvv_f64:  5.2
---
 libavutil/riscv/float_dsp_init.c |  3 +++
 libavutil/riscv/float_dsp_rvv.S  | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/libavutil/riscv/float_dsp_init.c b/libavutil/riscv/float_dsp_init.c
index 585f237225..155496fa6b 100644
--- a/libavutil/riscv/float_dsp_init.c
+++ b/libavutil/riscv/float_dsp_init.c
@@ -46,6 +46,8 @@ void ff_vector_dmac_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
 void ff_vector_dmul_scalar_rvv(double *dst, const double *src, double mul,
                                 int len);
+double ff_scalarproduct_double_rvv(const double *v1, const double *v2,
+                                   size_t len);
 
 av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
 {
@@ -68,6 +70,7 @@ av_cold void ff_float_dsp_init_riscv(AVFloatDSPContext *fdsp)
             fdsp->vector_dmul = ff_vector_dmul_rvv;
             fdsp->vector_dmac_scalar = ff_vector_dmac_scalar_rvv;
             fdsp->vector_dmul_scalar = ff_vector_dmul_scalar_rvv;
+            fdsp->scalarproduct_double = ff_scalarproduct_double_rvv;
         }
     }
 #endif
diff --git a/libavutil/riscv/float_dsp_rvv.S b/libavutil/riscv/float_dsp_rvv.S
index 7cfc890bc2..4379534af7 100644
--- a/libavutil/riscv/float_dsp_rvv.S
+++ b/libavutil/riscv/float_dsp_rvv.S
@@ -237,3 +237,24 @@ NOHWD mv a2, a3
 
         ret
 endfunc
+
+func ff_scalarproduct_double_rvv, zve64f
+        vsetvli      t0, zero, e64, m8, ta, ma
+        vmv.v.x      v8, zero
+        vmv.s.x      v0, zero
+1:
+        vsetvli      t0, a2, e64, m8, tu, ma
+        vle64.v      v16, (a0)
+        sub          a2, a2, t0
+        vle64.v      v24, (a1)
+        sh3add       a0, t0, a0
+        vfmacc.vv    v8, v16, v24
+        sh3add       a1, t0, a1
+        bnez         a2, 1b
+
+        vsetvli      t0, zero, e64, m8, ta, ma
+        vfredusum.vs v0, v8, v0
+        vfmv.f.s     fa0, v0
+NOHWD   fmv.x.w      a0, fa0
+        ret
+endfunc
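
Reading note on the series: the RVV loop in patch 4/4 keeps one partial sum per vector lane and only reduces them at the end, so it adds the products in a different order than the sequential C fallback from patch 1/4. That is presumably why patch 3/4 compares the two results with an absolute tolerance instead of requiring bit-exact output. The following is a rough C model of that behaviour, not FFmpeg code: `scalarproduct_ref`, `scalarproduct_lanes`, `near_abs_eps` and `MAX_LANES` are hypothetical names made up for this sketch, and the `lanes` parameter merely stands in for the hardware vector length.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LANES 64 /* hypothetical bound, only for this model */

/* Sequential sum, same shape as the C fallback added in patch 1/4. */
static double scalarproduct_ref(const double *v1, const double *v2, size_t len)
{
    double p = 0.0;

    for (size_t i = 0; i < len; i++)
        p += v1[i] * v2[i];

    return p;
}

/* Strip-mined model: `lanes` partial sums, reduced once at the end. */
static double scalarproduct_lanes(const double *v1, const double *v2,
                                  size_t len, size_t lanes)
{
    double acc[MAX_LANES] = { 0.0 };
    double p = 0.0;

    assert(lanes > 0 && lanes <= MAX_LANES);

    while (len > 0) {
        /* Like vsetvli: process at most `lanes` elements per iteration. */
        size_t vl = len < lanes ? len : lanes;

        /* Lanes at index >= vl keep their old value, which mirrors the
         * tail-undisturbed ("tu") policy used inside the assembly loop.
         * Note: vfmacc.vv is a fused multiply-add; plain C may round twice. */
        for (size_t i = 0; i < vl; i++)
            acc[i] += v1[i] * v2[i];

        v1  += vl;
        v2  += vl;
        len -= vl;
    }

    /* Final reduction over all lanes, like vfredusum.vs at full length. */
    for (size_t i = 0; i < lanes; i++)
        p += acc[i];

    return p;
}

/* Absolute-tolerance comparison, in the spirit of the checkasm test. */
static int near_abs_eps(double a, double b, double eps)
{
    double d = a - b;

    if (d < 0.0)
        d = -d;

    return d < eps;
}
```

With small integer inputs both orderings give the same value; with arbitrary data the two sums can differ in the last bits, which is the case the tolerance check is meant to absorb.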