From patchwork Fri Sep 29 11:36:23 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frank Plowman
X-Patchwork-Id: 44018
From: Frank Plowman
Date: Fri, 29 Sep 2023 11:36:23 +0000 (UTC)
Message-ID: <20230929113622.122769-1-post@frankplowman.com>
X-Mailer: git-send-email 2.41.0
To: ffmpeg-devel@ffmpeg.org
Subject: [FFmpeg-devel] [PATCH] x86inc: Add REPX macro to repeat instructions/operations
Cc: Frank Plowman

From: Henrik Gramner

When operating on large blocks of data it's common to repeatedly use an
instruction on multiple registers. Using the REPX macro makes it easy to
quickly write dense code to achieve this without having to explicitly
duplicate the same instruction over and over.

For example,

    REPX {paddw x, m4}, m0, m1, m2, m3
    REPX {mova [r0+16*x], m5}, 0, 1, 2, 3

will expand to

    paddw m0, m4
    paddw m1, m4
    paddw m2, m4
    paddw m3, m4
    mova [r0+16*0], m5
    mova [r0+16*1], m5
    mova [r0+16*2], m5
    mova [r0+16*3], m5

Commit taken from x264:
https://code.videolan.org/videolan/x264/-/commit/6d10612ab0007f8f60dd2399182efd696da3ffe4

Signed-off-by: Frank Plowman
---
 libavutil/x86/x86inc.asm | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 251ee797de..e099ee4b10 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -232,6 +232,16 @@ DECLARE_REG_TMP_SIZE 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
     %define gprsize 4
 %endif
 
+; Repeats an instruction/operation for multiple arguments.
+; Example usage: "REPX {psrlw x, 8}, m0, m1, m2, m3"
+%macro REPX 2-* ; operation, args
+    %xdefine %%f(x) %1
+    %rep %0 - 1
+        %rotate 1
+        %%f(%1)
+    %endrep
+%endmacro
+
 %macro PUSH 1
     push %1
     %ifidn rstk, rsp