From patchwork Wed Jun 15 05:33:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 36232
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 15 Jun 2022 07:33:30 +0200
X-Mailer: git-send-email 2.34.1
X-Microsoft-Original-Message-ID: <20220615053331.2168361-1-andreas.rheinhardt@outlook.com>
Subject: [FFmpeg-devel] [PATCH 1/2] avutil/cpu_internal: Fix check for SSE2SLOW
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Andreas Rheinhardt

For SSE2 and SSE3, there are four states that the two flags involved
(AV_CPU_FLAG_SSE[23] and AV_CPU_FLAG_SSE[23]SLOW) can convey.
When ordered from worst to best they are:
1. both flags unset (SSE[23] unavailable)
2. the slow flag set, the ordinary flag unset (this is designed for
cases where SSE2 is available, but so slow that MMX(EXT)/SSE code
is usually faster)
3. both flags set (SSE2 is available, but there might be scenarios
where MMX(EXT)/SSE code is faster)
4. the ordinary flag set, the slow flag unset (this is the normal case)

The ordinary macros for checking cpuflags return true in the last
two cases; the fast macros only return true for the last case.
Yet the macros to check for slow currently only return true
in case three. This seems unintended.
In fact, the only uses of the slow macros are all of the form
if (EXTERNAL_SSE2(cpu_flags) || EXTERNAL_SSE2_SLOW(cpu_flags))
where the check for EXTERNAL_SSE2_SLOW is completely redundant.
Even more importantly, it is not what was intended. Before
6369ba3c9cc74becfaad2a8882dff3dd3e7ae3c0, the checks passed
in cases 2 to 4. Said commit changed this to something that
only passes for the third case. Commits
7fb758cd8ed08e4a37f10e25003953d13c68b8cd and
c1913064e38cb338039f29c280a0dacc3fd1e451 restored the old behaviour,
yet merging 4efab89332ea39a77145e8b15562b981d9dbde68
(in commit ac774cfa571734c49c26e2d3387adccff8957ff8) broke this again
by changing it to what it is now.*

This commit changes the macros to make the slow macros check
whether a specific instruction is supported, even if slow.
This restores the intended meaning to all uses of the SLOW macros
and is generally more natural.

*: Libav only checks for EXTERNAL_SSE2_SLOW, i.e. for the third case
only.

Signed-off-by: Andreas Rheinhardt
---
 libavutil/cpu_internal.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h
index e207b2d480..650d47fc96 100644
--- a/libavutil/cpu_internal.h
+++ b/libavutil/cpu_internal.h
@@ -30,12 +30,15 @@
     (HAVE_ ## cpuext ## suffix && ((flags) & AV_CPU_FLAG_ ## cpuext) && \
      !((flags) & AV_CPU_FLAG_ ## slow_cpuext ## SLOW))
 
+#define CPUEXT_SUFFIX_SLOW(flags, suffix, cpuext) \
+    (HAVE_ ## cpuext ## suffix && \
+     ((flags) & (AV_CPU_FLAG_ ## cpuext | AV_CPU_FLAG_ ## cpuext ## SLOW)))
+
 #define CPUEXT_SUFFIX_SLOW2(flags, suffix, cpuext, slow_cpuext) \
     (HAVE_ ## cpuext ## suffix && ((flags) & AV_CPU_FLAG_ ## cpuext) && \
-     ((flags) & AV_CPU_FLAG_ ## slow_cpuext ## SLOW))
+     ((flags) & (AV_CPU_FLAG_ ## slow_cpuext | AV_CPU_FLAG_ ## slow_cpuext ## SLOW)))
 
 #define CPUEXT_SUFFIX_FAST(flags, suffix, cpuext) CPUEXT_SUFFIX_FAST2(flags, suffix, cpuext, cpuext)
-#define CPUEXT_SUFFIX_SLOW(flags, suffix, cpuext) CPUEXT_SUFFIX_SLOW2(flags, suffix, cpuext, cpuext)
 
 #define CPUEXT(flags, cpuext) CPUEXT_SUFFIX(flags, , cpuext)
 #define CPUEXT_FAST(flags, cpuext) CPUEXT_SUFFIX_FAST(flags, , cpuext)
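
Illustration (not part of the patch): the four flag states from the
commit message can be tabulated with a small standalone sketch. It is
not code from libavutil. The DEMO_* names and bit values are
hypothetical stand-ins for AV_CPU_FLAG_SSE2, AV_CPU_FLAG_SSE2SLOW and
the HAVE_* build switch, and the helper functions only mirror the logic
of the macros as they appear in the diff, under the assumption that the
build-time switch is enabled.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in bits for this sketch only; NOT the real AV_CPU_FLAG_* values. */
#define DEMO_FLAG_SSE2     0x1
#define DEMO_FLAG_SSE2SLOW 0x2
/* Stand-in for the build-time HAVE_* switch (assumed enabled here). */
#define DEMO_HAVE_SSE2     1

/* Ordinary check: passes when the ordinary flag is set (cases 3 and 4). */
static int demo_sse2(int flags)
{
    return DEMO_HAVE_SSE2 && (flags & DEMO_FLAG_SSE2);
}

/* Fast check: ordinary flag set and slow flag unset (case 4 only). */
static int demo_sse2_fast(int flags)
{
    return DEMO_HAVE_SSE2 && (flags & DEMO_FLAG_SSE2) &&
           !(flags & DEMO_FLAG_SSE2SLOW);
}

/* Slow check before the patch: both flags required (case 3 only). */
static int demo_sse2_slow_old(int flags)
{
    return DEMO_HAVE_SSE2 && (flags & DEMO_FLAG_SSE2) &&
           (flags & DEMO_FLAG_SSE2SLOW);
}

/* Slow check after the patch: either flag suffices (cases 2, 3 and 4). */
static int demo_sse2_slow_new(int flags)
{
    return DEMO_HAVE_SSE2 &&
           (flags & (DEMO_FLAG_SSE2 | DEMO_FLAG_SSE2SLOW));
}

/* Writes one row per case (1 to 4, worst to best) to out. */
static void demo_print_table(FILE *out)
{
    static const int cases[4] = {
        0,                                   /* case 1: both unset      */
        DEMO_FLAG_SSE2SLOW,                  /* case 2: only slow set   */
        DEMO_FLAG_SSE2 | DEMO_FLAG_SSE2SLOW, /* case 3: both set        */
        DEMO_FLAG_SSE2,                      /* case 4: only ordinary   */
    };

    assert(out != NULL);
    fprintf(out, "case ordinary fast slow_old slow_new\n");
    for (int i = 0; i < 4; i++)
        fprintf(out, "%4d %8d %4d %8d %8d\n", i + 1,
                demo_sse2(cases[i]), demo_sse2_fast(cases[i]),
                demo_sse2_slow_old(cases[i]), demo_sse2_slow_new(cases[i]));
}
```

Calling demo_print_table(stdout) from a main() prints the table. The
row for case 2 is the one where the old and new slow checks differ,
which is why a call site of the form "ordinary || slow" only gains
anything from the slow check after the change.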