From patchwork Thu Jun 9 23:54:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 36119
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 10 Jun 2022 01:54:57 +0200
X-Mailer: git-send-email 2.34.1
X-Microsoft-Original-Message-ID: <20220609235523.458689-15-andreas.rheinhardt@outlook.com>
Subject: [FFmpeg-devel] [PATCH 15/41] avcodec/x86/hevcdsp_init: Disable
 overridden functions on x64
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Andreas Rheinhardt

x64 always has MMX, MMXEXT, SSE and SSE2 and this means
that some functions for MMX, MMXEXT, SSE and 3dnow are always
overridden by other functions (unless one e.g. explicitly
disables SSE2). This commit therefore disables such hevcdsp
functions at compile-time.

Signed-off-by: Andreas Rheinhardt
---
FYI: There is a pre-existing stride/alignment bug in this code:
If one configures with --disable-sse3, one gets STRIDE_ALIGN 16.
Then the test fate-hevc-conformance-DBLK_A_MAIN10_VIXS_3 fails
when using SSE2; more exactly, if one comments out both
SAO_BAND_INIT(10, sse2); and SAO_EDGE_INIT(10, sse2);
in x86/hevcdsp_init.c, the test passes. It also passes if one hardcodes
STRIDE_ALIGN to 32 in avcodec_align_dimensions2().

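
For readers unfamiliar with the init pattern, here is a minimal sketch of
why the MMXEXT assignment is dead on x64. It is not FFmpeg code:
DemoDSPContext, demo_dsp_init, the DEMO_* macros and the demo_* functions
are all hypothetical stand-ins. It also assumes that the SSE2 branch later
assigns the same idct_dc[1] slot, which is what the commit message implies
but what the hunks in this patch do not show directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ARCH_X86_32; 0 models an x64 build. */
#ifndef DEMO_ARCH_X86_32
#define DEMO_ARCH_X86_32 0
#endif

/* Hypothetical CPU feature flags, not the real AV_CPU_FLAG_* values. */
#define DEMO_FLAG_MMXEXT (1 << 0)
#define DEMO_FLAG_SSE2   (1 << 1)

typedef const char *(*demo_idct_dc_fn)(void);

typedef struct DemoDSPContext {
    demo_idct_dc_fn idct_dc[2]; /* [0]: 4x4, [1]: 8x8 */
} DemoDSPContext;

static const char *demo_idct_4x4_dc_mmxext(void) { return "4x4 mmxext"; }
#if DEMO_ARCH_X86_32
/* Only built for 32-bit x86, where SSE2 may be absent at runtime. */
static const char *demo_idct_8x8_dc_mmxext(void) { return "8x8 mmxext"; }
#endif
static const char *demo_idct_8x8_dc_sse2(void) { return "8x8 sse2"; }

static void demo_dsp_init(DemoDSPContext *c, int cpu_flags)
{
    c->idct_dc[0] = NULL;
    c->idct_dc[1] = NULL;

    if (cpu_flags & DEMO_FLAG_MMXEXT) {
        c->idct_dc[0] = demo_idct_4x4_dc_mmxext;
#if DEMO_ARCH_X86_32
        c->idct_dc[1] = demo_idct_8x8_dc_mmxext;
#endif
    }
    if (cpu_flags & DEMO_FLAG_SSE2) {
        /* Assumed: the later, stronger ISA level reuses the same slot,
         * so any earlier MMXEXT assignment would be overwritten here. */
        c->idct_dc[1] = demo_idct_8x8_dc_sse2;
    }
}
```

With both flags set, idct_dc[1] ends up pointing at the SSE2 variant, so
on a target where SSE2 is always present the MMXEXT 8x8 function is never
reachable and can be left out of the build. Compiling the sketch with
-DDEMO_ARCH_X86_32=1 models the 32-bit case, where the MMXEXT variant is
still needed as a fallback.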
 libavcodec/x86/hevc_idct.asm  | 2 ++
 libavcodec/x86/hevcdsp_init.c | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/libavcodec/x86/hevc_idct.asm b/libavcodec/x86/hevc_idct.asm
index 1eb1973f27..eb44e06123 100644
--- a/libavcodec/x86/hevc_idct.asm
+++ b/libavcodec/x86/hevc_idct.asm
@@ -811,7 +811,9 @@ cglobal hevc_idct_32x32_%1, 1, 6, 16, 256, coeffs
 %macro INIT_IDCT_DC 1
 INIT_MMX mmxext
 IDCT_DC_NL 4, %1
+%if ARCH_X86_32
 IDCT_DC 8, 2, %1
+%endif
 
 INIT_XMM sse2
 IDCT_DC_NL 8, %1
diff --git a/libavcodec/x86/hevcdsp_init.c b/libavcodec/x86/hevcdsp_init.c
index 48f48a925f..b48661fe35 100644
--- a/libavcodec/x86/hevcdsp_init.c
+++ b/libavcodec/x86/hevcdsp_init.c
@@ -712,7 +712,9 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
     if (bit_depth == 8) {
         if (EXTERNAL_MMXEXT(cpu_flags)) {
             c->idct_dc[0] = ff_hevc_idct_4x4_dc_8_mmxext;
+#if ARCH_X86_32
             c->idct_dc[1] = ff_hevc_idct_8x8_dc_8_mmxext;
+#endif
             c->add_residual[0] = ff_hevc_add_residual_4_8_mmxext;
         }
 
@@ -889,7 +891,9 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
         if (EXTERNAL_MMXEXT(cpu_flags)) {
             c->add_residual[0] = ff_hevc_add_residual_4_10_mmxext;
             c->idct_dc[0] = ff_hevc_idct_4x4_dc_10_mmxext;
+#if ARCH_X86_32
             c->idct_dc[1] = ff_hevc_idct_8x8_dc_10_mmxext;
+#endif
         }
         if (EXTERNAL_SSE2(cpu_flags)) {
             c->hevc_v_loop_filter_chroma = ff_hevc_v_loop_filter_chroma_10_sse2;
@@ -1105,7 +1109,9 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
     } else if (bit_depth == 12) {
         if (EXTERNAL_MMXEXT(cpu_flags)) {
             c->idct_dc[0] = ff_hevc_idct_4x4_dc_12_mmxext;
+#if ARCH_X86_32
             c->idct_dc[1] = ff_hevc_idct_8x8_dc_12_mmxext;
+#endif
         }
         if (EXTERNAL_SSE2(cpu_flags)) {
             c->hevc_v_loop_filter_chroma = ff_hevc_v_loop_filter_chroma_12_sse2;