From patchwork Wed Nov 22 19:49:11 2023
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 44747
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 22 Nov 2023 16:49:11 -0300
Message-ID: <20231122194913.9856-1-jamrial@gmail.com>
X-Mailer: git-send-email 2.42.1
Subject: [FFmpeg-devel] [PATCH 1/3] x86/ac3dsp: reduce instruction count
 inside the float_to_fixed24 loop

Signed-off-by: James Almer
---
 libavcodec/x86/ac3dsp.asm | 46 +++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/libavcodec/x86/ac3dsp.asm b/libavcodec/x86/ac3dsp.asm
index a95d359d95..42c8310462 100644
--- a/libavcodec/x86/ac3dsp.asm
+++ b/libavcodec/x86/ac3dsp.asm
@@ -77,16 +77,20 @@ AC3_EXPONENT_MIN
 INIT_XMM sse2
 cglobal float_to_fixed24, 3, 3, 9, dst, src, len
     movaps     m0, [pf_1_24]
+    shl      lenq, 2
+    add      srcq, lenq
+    add      dstq, lenq
+    neg      lenq
 .loop:
-    movaps     m1, [srcq    ]
-    movaps     m2, [srcq+16 ]
-    movaps     m3, [srcq+32 ]
-    movaps     m4, [srcq+48 ]
+    movaps     m1, [srcq+lenq    ]
+    movaps     m2, [srcq+lenq+16 ]
+    movaps     m3, [srcq+lenq+32 ]
+    movaps     m4, [srcq+lenq+48 ]
 %ifdef m8
-    movaps     m5, [srcq+64 ]
-    movaps     m6, [srcq+80 ]
-    movaps     m7, [srcq+96 ]
-    movaps     m8, [srcq+112]
+    movaps     m5, [srcq+lenq+64 ]
+    movaps     m6, [srcq+lenq+80 ]
+    movaps     m7, [srcq+lenq+96 ]
+    movaps     m8, [srcq+lenq+112]
 %endif
     mulps      m1, m0
     mulps      m2, m0
@@ -108,24 +112,20 @@ cglobal float_to_fixed24, 3, 3, 9, dst, src, len
     cvtps2dq   m7, m7
     cvtps2dq   m8, m8
 %endif
-    movdqa  [dstq    ], m1
-    movdqa  [dstq+16 ], m2
-    movdqa  [dstq+32 ], m3
-    movdqa  [dstq+48 ], m4
+    movdqa  [dstq+lenq    ], m1
+    movdqa  [dstq+lenq+16 ], m2
+    movdqa  [dstq+lenq+32 ], m3
+    movdqa  [dstq+lenq+48 ], m4
 %ifdef m8
-    movdqa  [dstq+64 ], m5
-    movdqa  [dstq+80 ], m6
-    movdqa  [dstq+96 ], m7
-    movdqa  [dstq+112], m8
-    add      srcq, 128
-    add      dstq, 128
-    sub      lenq, 32
+    movdqa  [dstq+lenq+64 ], m5
+    movdqa  [dstq+lenq+80 ], m6
+    movdqa  [dstq+lenq+96 ], m7
+    movdqa  [dstq+lenq+112], m8
+    add      lenq, 128
 %else
-    add      srcq, 64
-    add      dstq, 64
-    sub      lenq, 16
+    add      lenq, 64
 %endif
-    ja .loop
+    jl .loop
     RET
 
 ;------------------------------------------------------------------------------
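
Note on the loop change: the commit message gives no rationale beyond the
subject line, so the following is only a scalar C sketch of the two loop
shapes in the diff, not the author's code. The function names, the SCALE
macro, and the assumption that pf_1_24 holds 2^24 are all illustrative.
Before the patch each pass updates srcq, dstq and lenq and then branches;
afterwards both pointers are moved to the end of their buffers once, and a
single negative offset counts up to zero, leaving one add and one branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed scale factor: 2^24, taken here to be the value held in pf_1_24. */
#define SCALE 16777216.0f

/* Shape of the loop before the patch: both pointers and the remaining
 * count are updated on every pass. */
static void convert_two_pointers(int32_t *dst, const float *src, size_t len)
{
    assert(dst && src);
    while (len > 0) {
        /* A plain cast truncates; cvtps2dq rounds according to MXCSR. */
        *dst++ = (int32_t)(*src++ * SCALE);
        len--;
    }
}

/* Shape of the loop after the patch: the pointers are advanced to the end
 * of the buffers once, and one negative index runs up towards zero. */
static void convert_negative_index(int32_t *dst, const float *src, size_t len)
{
    assert(dst && src);
    ptrdiff_t i = -(ptrdiff_t)len;  /* neg lenq (elements here, bytes in asm) */
    src += len;                     /* add srcq, lenq */
    dst += len;                     /* add dstq, lenq */
    for (; i < 0; i++)              /* add lenq, N followed by jl .loop */
        dst[i] = (int32_t)(src[i] * SCALE);
}
```

The sketch tests the index at the top of the loop so a zero length is
harmless; the assembly tests at the bottom and always runs at least one
block. The cast is also a simplification of the packed conversion, so the
two C functions agree with each other but may differ from the SIMD result
by rounding on inputs that are not exactly representable.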