From patchwork Thu May 11 17:13:19 2023
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 41582
From: Paul B Mahol
Date: Thu, 11 May 2023 19:13:19 +0200
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] swresample: misc improvements

Attached.

From 5a8ab5b948423e6cde7b59df0d21f38dc0235155 Mon Sep 17 00:00:00 2001
From: Paul B Mahol
Date: Thu, 11 May 2023 01:11:42 +0200
Subject: [PATCH 1/2] swresample/x86: add float<->double paths

Signed-off-by: Paul B Mahol
---
 libswresample/x86/audio_convert.asm    | 25 +++++++++++++++++++++++++
 libswresample/x86/audio_convert_init.c |  8 ++++++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/libswresample/x86/audio_convert.asm b/libswresample/x86/audio_convert.asm
index ad65008e23..82eda3758e 100644
--- a/libswresample/x86/audio_convert.asm
+++ b/libswresample/x86/audio_convert.asm
@@ -540,6 +540,26 @@ pack_8ch_%2_to_%1_u_int %+ SUFFIX:
     punpckhwd m1, m4
 %endmacro
 
+%macro FLOAT_TO_DOUBLE_N 6
+    shufps %3, %1, %1, q3232
+    shufps %4, %2, %2, q3232
+    cvtps2pd %1, %1
+    cvtps2pd %2, %2
+    cvtps2pd %3, %3
+    cvtps2pd %4, %4
+    SWAP 1,2
+%endmacro
+
+%macro DOUBLE_TO_FLOAT_N 6
+    cvtpd2ps %1, %1
+    cvtpd2ps %2, %2
+    cvtpd2ps %3, %3
+    cvtpd2ps %4, %4
+    shufps %1, %2, q1010
+    shufps %3, %4, q1010
+    SWAP 1,2
+%endmacro
+
 %macro INT32_TO_INT16_N 6
     psrad m0, 16
     psrad m1, 16
@@ -648,6 +668,11 @@ CONV float, int16, a, 2, 1, INT16_TO_FLOAT_N, INT16_TO_FLOAT_INIT
 CONV int16, float, u, 1, 2, FLOAT_TO_INT16_N, FLOAT_TO_INT16_INIT
 CONV int16, float, a, 1, 2, FLOAT_TO_INT16_N, FLOAT_TO_INT16_INIT
 
+CONV double, float, u, 3, 2, FLOAT_TO_DOUBLE_N, NOP_N
+CONV double, float, a, 3, 2, FLOAT_TO_DOUBLE_N, NOP_N
+CONV float, double, u, 2, 3, DOUBLE_TO_FLOAT_N, NOP_N
+CONV float, double, a, 2, 3, DOUBLE_TO_FLOAT_N, NOP_N
+
 PACK_2CH float, int32, u, 2, 2, INT32_TO_FLOAT_N, INT32_TO_FLOAT_INIT
 PACK_2CH float, int32, a, 2, 2, INT32_TO_FLOAT_N, INT32_TO_FLOAT_INIT
 PACK_2CH int32, float, u, 2, 2, FLOAT_TO_INT32_N, FLOAT_TO_INT32_INIT
diff --git a/libswresample/x86/audio_convert_init.c b/libswresample/x86/audio_convert_init.c
index f6d36f9ca6..e10b978c68 100644
--- a/libswresample/x86/audio_convert_init.c
+++ b/libswresample/x86/audio_convert_init.c
@@ -24,8 +24,8 @@
 #include "libswresample/audioconvert.h"
 
 #define PROTO(pre, in, out, cap) void ff ## pre ## in## _to_ ##out## _a_ ##cap(uint8_t **dst, const uint8_t **src, int len);
-#define PROTO2(pre, out, cap) PROTO(pre, int16, out, cap) PROTO(pre, int32, out, cap) PROTO(pre, float, out, cap)
-#define PROTO3(pre, cap) PROTO2(pre, int16, cap) PROTO2(pre, int32, cap) PROTO2(pre, float, cap)
+#define PROTO2(pre, out, cap) PROTO(pre, int16, out, cap) PROTO(pre, int32, out, cap) PROTO(pre, float, out, cap) PROTO(pre, double, out,cap)
+#define PROTO3(pre, cap) PROTO2(pre, int16, cap) PROTO2(pre, int32, cap) PROTO2(pre, float, cap) PROTO2(pre, double, cap)
 #define PROTO4(pre) PROTO3(pre, sse) PROTO3(pre, sse2) PROTO3(pre, ssse3) PROTO3(pre, sse4) PROTO3(pre, avx) PROTO3(pre, avx2)
 PROTO4(_)
 PROTO4(_pack_2ch_)
@@ -72,6 +72,10 @@ MULTI_CAPS_FUNC(SSE2, sse2)
             ac->simd_f = ff_float_to_int32_a_sse2;
         if( out_fmt == AV_SAMPLE_FMT_S16 && in_fmt == AV_SAMPLE_FMT_FLT || out_fmt == AV_SAMPLE_FMT_S16P && in_fmt == AV_SAMPLE_FMT_FLTP)
             ac->simd_f = ff_float_to_int16_a_sse2;
+        if( out_fmt == AV_SAMPLE_FMT_DBL && in_fmt == AV_SAMPLE_FMT_FLT || out_fmt == AV_SAMPLE_FMT_DBLP && in_fmt == AV_SAMPLE_FMT_FLTP)
+            ac->simd_f = ff_float_to_double_a_sse2;
+        if( out_fmt == AV_SAMPLE_FMT_FLT && in_fmt == AV_SAMPLE_FMT_DBL || out_fmt == AV_SAMPLE_FMT_FLTP && in_fmt == AV_SAMPLE_FMT_DBLP)
+            ac->simd_f = ff_double_to_float_a_sse2;
 
         if(channels == 2) {
             if( out_fmt == AV_SAMPLE_FMT_FLT && in_fmt == AV_SAMPLE_FMT_FLTP || out_fmt == AV_SAMPLE_FMT_S32 && in_fmt == AV_SAMPLE_FMT_S32P)
-- 
2.39.1
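
Note for reviewers: the cvtps2pd / cvtpd2ps steps used by FLOAT_TO_DOUBLE_N and DOUBLE_TO_FLOAT_N above can be sketched with C intrinsics. This is only an illustration under stated assumptions, not the code path the patch installs: the function names float_to_double_sketch and double_to_float_sketch are hypothetical, the signatures take plain typed pointers rather than the uint8_t ** arguments of the PROTO() prototypes, and the sketch uses movehl/movelh where the asm uses shufps. It falls back to a scalar loop when SSE2 is not available.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (not part of libswresample): widen len floats to doubles. */
static void float_to_double_sketch(double *dst, const float *src, int len)
{
    int i = 0;
    assert(len >= 0);
#if defined(__SSE2__)
    for (; i + 4 <= len; i += 4) {
        __m128 v  = _mm_loadu_ps(src + i);            /* f0 f1 f2 f3 */
        __m128 hi = _mm_movehl_ps(v, v);              /* f2 f3 f2 f3 */
        _mm_storeu_pd(dst + i,     _mm_cvtps_pd(v));  /* cvtps2pd: d0 d1 */
        _mm_storeu_pd(dst + i + 2, _mm_cvtps_pd(hi)); /* cvtps2pd: d2 d3 */
    }
#endif
    for (; i < len; i++)                              /* scalar tail / fallback */
        dst[i] = src[i];
}

/* Hypothetical helper (not part of libswresample): narrow len doubles to floats. */
static void double_to_float_sketch(float *dst, const double *src, int len)
{
    int i = 0;
    assert(len >= 0);
#if defined(__SSE2__)
    for (; i + 4 <= len; i += 4) {
        __m128 lo = _mm_cvtpd_ps(_mm_loadu_pd(src + i));     /* f0 f1 0 0 */
        __m128 hi = _mm_cvtpd_ps(_mm_loadu_pd(src + i + 2)); /* f2 f3 0 0 */
        _mm_storeu_ps(dst + i, _mm_movelh_ps(lo, hi));       /* f0 f1 f2 f3 */
    }
#endif
    for (; i < len; i++)                              /* scalar tail / fallback */
        dst[i] = (float)src[i];
}
```

A scalar version like the tail loops here may also serve as a reference when comparing the output of the new SIMD paths against the C conversion.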