From patchwork Sun Feb 11 17:40:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Timo Rothenpieler
X-Patchwork-Id: 46184
From: Timo Rothenpieler
To: ffmpeg-devel@ffmpeg.org
Date:
Sun, 11 Feb 2024 18:40:55 +0100
Message-Id: <20240211174055.1659320-1-timo@rothenpieler.org>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH] avutil/mem: limit alignment to maximum simd align
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Timo Rothenpieler
Sender: "ffmpeg-devel"

FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs,
which then end up heap-allocated.
Declaring any variable in a struct, or tree of structs, to be 32 byte
aligned allows the compiler to safely assume that the entire struct
itself is also 32 byte aligned.

This might make the compiler emit code which straight up crashes or
misbehaves in other ways, and in at least one instance it is now
documented to actually do so (see ticket 10549 on trac).
The issue there is that an unrelated variable in SingleChannelElement is
declared to have an alignment of 32 bytes. So if the compiler does a copy
in decode_cpe() with AVX instructions, but ffmpeg is built with
--disable-avx, this results in a crash, since the memory is only 16 byte
aligned.

Mind you, even if the compiler does not emit AVX instructions, the code
is still invalid and could misbehave. It just happens not to. Declaring
any variable in a struct with a 32 byte alignment promises 32 byte
alignment of the whole struct to the compiler.

This patch limits the maximum alignment to the maximum possible SIMD
alignment according to configure.
While not perfect, it at the very least gets rid of a lot of UB, by
matching up the maximum DECLARE_ALIGNED value with the alignment of heap
allocations done by lavu.
---
 libavutil/mem.c          |  2 +-
 libavutil/mem_internal.h | 33 ++++++++++++++++++++++++++++-----
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/libavutil/mem.c b/libavutil/mem.c
index 36b8940a0c..62163b4cb3 100644
--- a/libavutil/mem.c
+++ b/libavutil/mem.c
@@ -62,7 +62,7 @@ void free(void *ptr);
 
 #endif /* MALLOC_PREFIX */
 
-#define ALIGN (HAVE_AVX512 ? 64 : (HAVE_AVX ? 32 : 16))
+#define ALIGN (HAVE_SIMD_ALIGN_64 ? 64 : (HAVE_SIMD_ALIGN_32 ? 32 : 16))
 
 /* NOTE: if you want to override these functions with your own
  * implementations (not recommended) you have to link libav* as
diff --git a/libavutil/mem_internal.h b/libavutil/mem_internal.h
index 2448c606f1..b1d89a0605 100644
--- a/libavutil/mem_internal.h
+++ b/libavutil/mem_internal.h
@@ -76,27 +76,50 @@
  */
 
 #if defined(__INTEL_COMPILER) && __INTEL_COMPILER < 1110 || defined(__SUNPRO_C)
-    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (n))) v
+    #define DECLARE_ALIGNED_T(n,t,v)    t __attribute__ ((aligned (n))) v
     #define DECLARE_ASM_ALIGNED(n,t,v)  t __attribute__ ((aligned (n))) v
     #define DECLARE_ASM_CONST(n,t,v)    const t __attribute__ ((aligned (n))) v
 #elif defined(__DJGPP__)
-    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (FFMIN(n, 16)))) v
+    #define DECLARE_ALIGNED_T(n,t,v)    t __attribute__ ((aligned (FFMIN(n, 16)))) v
     #define DECLARE_ASM_ALIGNED(n,t,v)  t av_used __attribute__ ((aligned (FFMIN(n, 16)))) v
     #define DECLARE_ASM_CONST(n,t,v)    static const t av_used __attribute__ ((aligned (FFMIN(n, 16)))) v
 #elif defined(__GNUC__) || defined(__clang__)
-    #define DECLARE_ALIGNED(n,t,v)      t __attribute__ ((aligned (n))) v
+    #define DECLARE_ALIGNED_T(n,t,v)    t __attribute__ ((aligned (n))) v
     #define DECLARE_ASM_ALIGNED(n,t,v)  t av_used __attribute__ ((aligned (n))) v
     #define DECLARE_ASM_CONST(n,t,v)    static const t av_used __attribute__ ((aligned (n))) v
 #elif defined(_MSC_VER)
-    #define DECLARE_ALIGNED(n,t,v)      __declspec(align(n)) t v
+    #define DECLARE_ALIGNED_T(n,t,v)    __declspec(align(n)) t v
     #define DECLARE_ASM_ALIGNED(n,t,v)  __declspec(align(n)) t v
     #define DECLARE_ASM_CONST(n,t,v)    __declspec(align(n)) static const t v
 #else
-    #define DECLARE_ALIGNED(n,t,v)      t v
+    #define DECLARE_ALIGNED_T(n,t,v)    t v
     #define DECLARE_ASM_ALIGNED(n,t,v)  t v
     #define DECLARE_ASM_CONST(n,t,v)    static const t v
 #endif
 
+#if HAVE_SIMD_ALIGN_64
+    #define ALIGN_64 64
+    #define ALIGN_32 32
+#elif HAVE_SIMD_ALIGN_32
+    #define ALIGN_64 32
+    #define ALIGN_32 32
+#else
+    #define ALIGN_64 16
+    #define ALIGN_32 16
+#endif
+
+#define DECLARE_ALIGNED(n,t,v) DECLARE_ALIGNED_V(n,t,v)
+
+// Macro needs to be double-wrapped in order to expand
+// possible other macros being passed for n.
+#define DECLARE_ALIGNED_V(n,t,v) DECLARE_ALIGNED_##n(t,v)
+
+#define DECLARE_ALIGNED_4(t,v)  DECLARE_ALIGNED_T(       4, t, v)
+#define DECLARE_ALIGNED_8(t,v)  DECLARE_ALIGNED_T(       8, t, v)
+#define DECLARE_ALIGNED_16(t,v) DECLARE_ALIGNED_T(      16, t, v)
+#define DECLARE_ALIGNED_32(t,v) DECLARE_ALIGNED_T(ALIGN_32, t, v)
+#define DECLARE_ALIGNED_64(t,v) DECLARE_ALIGNED_T(ALIGN_64, t, v)
+
 // Some broken preprocessors need a second expansion
 // to be forced to tokenize __VA_ARGS__
 #define E1(x) x