From patchwork Tue Sep 17 12:14:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?= X-Patchwork-Id: 51630 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a59:d32e:0:b0:48e:c0f8:d0de with SMTP id cf14csp210892vqb; Tue, 17 Sep 2024 05:14:31 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCWHdmUgH7nGG2bxoMJ7vNYr+Cq1sfT3ItEbjDNmLpCaFMOA/11fecS4Z8L5S0k11veGs1F7+7T/9RNQNz3wO2VW@gmail.com X-Google-Smtp-Source: AGHT+IFD72LFcniyevOpaJOl+jnsl8F3Gs9Vc0EJUW39n8klwb5mZnnunbqnwStrcp7Vgstt5qx+ X-Received: by 2002:a05:6512:3f03:b0:52c:cc38:592c with SMTP id 2adb3069b0e04-53678f616demr10147593e87.0.1726575271652; Tue, 17 Sep 2024 05:14:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1726575271; cv=none; d=google.com; s=arc-20240605; b=BN81aPB1ishkJMVez4AgMviutCE5BMfqS9qTr2VDj+Zyp/xDruxo4YJgWZwZiaOIMx exxuWXyHAJ5N+gz2fqwqI09+28c+LU6RND79zO/XO0VyRGedBlBY+MwZMLJQz5hSP2hy xPbuVYlsnKiNoiJjjgPxxwHHlQolOofD/gOJZnD5MKGnBERNOqJQbD+eHD5icu4gYh3o Hz2AedIy8xEYrsQjpINvB3z92WBwmQ0g8CI9roV4nu/HA2Tw52OQMTN3XkrHA3xcihso B3df8Pm+GgxrF+n9BqsuZvquOgAKCZgJTFxrkuCMGBM74t929u40NDFGNrm5WSK++qhE cbcw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:message-id:date:to:from :dkim-signature:delivered-to; bh=9rFrQUCXbaAJUG7Kpm3xrJlonD2Yc1vNwT65w9cHCFM=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=BBxC6rN+5+xNM806cHiLiSoQ/ZpTmqZgTGO9f+A2CwWe8iW3O6t/huwQ69cCjaQQnr vLT0fWDVgFPhc93S713HjAAe101B3Bhj9DKuQNneRUsvZpeOfr+lE7vGXsSu51qfegD2 Vr0yUlK5Z0jOCtoNDp8n8PG69zwB4wzrnJ7KpQLkjDIoF/uo/VAlJf4K/gtfkQNlXI3t fta36yEDmb5B6dxTvTIO0taNGxmP44F/iuitDfo8XqbtCArTbvOJPFt0SGSNDRV9gA+t xSQ4hSIM16QM75lnxA88Xk9Wkpq3aca7Jw/lV8ExqEEoL7q7+sT+nvy08Ay/PFW5KjXp c1uw==; dara=google.com ARC-Authentication-Results: i=1; 
mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b=izxWTyFX; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 4fb4d7f45d1cf-5c42bb512d7si5087545a12.76.2024.09.17.05.14.31; Tue, 17 Sep 2024 05:14:31 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b=izxWTyFX; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 1B7C968DAE3; Tue, 17 Sep 2024 15:14:28 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lj1-f174.google.com (mail-lj1-f174.google.com [209.85.208.174]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 219E568D78B for ; Tue, 17 Sep 2024 15:14:21 +0300 (EEST) Received: by mail-lj1-f174.google.com with SMTP id 38308e7fff4ca-2f75c56f16aso69247181fa.0 for ; Tue, 17 Sep 2024 05:14:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=martin-st.20230601.gappssmtp.com; s=20230601; t=1726575260; x=1727180060; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:from:to:cc:subject:date:message-id:reply-to; bh=ttDJ+iHj9RB4P/aplGETUtxKBPkhLw0MJE7H6maaO0Y=; b=izxWTyFXyNzAqhlJcBiYtkW4MwJV+TovRxrEUGehw0SSaENnCabMbKgAs3vMxrNutO 
d/OsFq0Ga7UeaTHgukaQFtV62h0cJVYA9dkzk+fFIMm8P+kQOmrIfJtSxmPU4IlaKR38 VVKOVsBQvzt8+Jr/xE2TGkt1aPuADnBXbL2EIZGKc2OhqameXE6lYjYp3pkil7g1Xwop TTcVW6C+4Voa6LSTD5LBlqYP87DImpjTJ9lYIbYcYHhF2o43Sd6fnty0rh42Ww2T7eeB chVOsqWXfd5/z67+GPrTteZKESktnQjeAPNlkAR0GB2tkVVoarFPDZv05L1+0szCTUN9 CJdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726575260; x=1727180060; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ttDJ+iHj9RB4P/aplGETUtxKBPkhLw0MJE7H6maaO0Y=; b=brXxE+vRx3n5LSQCwf9MUljnt7EuZfTtQnQldDQEBZhs/xIEHwswQi2TrY+6liplJu LpO9fTOgYVB9BYPPXotnxez1fa6wzy2AuqsYzPnZIPx/RGsrdquUgDccs6TUOGb8cwJB hTYEZIafi7XzmeRavQH85NAOoGog5b7gW5BqvOlc4u6f3OW9iLKtMH//zq4gc3MpGafE ui97tHIYoTZ05tepGLxtFTVF9qlnVGU3sBMwXa2bkRgb/Jxsd13GcR5PIyW6VKD/F2Fj +CPaBjX4MbxO/LVMBBxAxc3wgk/hYdIkPc+mjEns3t7Hv4it/PvaR2yMG9Or6hhUv5B2 QL5g== X-Gm-Message-State: AOJu0YyptegloGkP4mjMuwyI8RzRA1bjAHfuQQLM3RB31Z8T1BmOumDK gsFFfaTec0JcnQh79gGPA1zZIBrRZReaEoeV9g9ARb6sCAqskM802EATVJ46GkyU7Yq30tAD+4k USA== X-Received: by 2002:a05:6512:3e04:b0:52e:a68a:6076 with SMTP id 2adb3069b0e04-53678ff2e04mr8861373e87.49.1726575259769; Tue, 17 Sep 2024 05:14:19 -0700 (PDT) Received: from localhost (dsl-tkubng21-58c01c-243.dhcp.inet.fi. 
[88.192.28.243]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-536870b8cc0sm1164882e87.271.2024.09.17.05.14.19 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Sep 2024 05:14:19 -0700 (PDT) From: =?utf-8?q?Martin_Storsj=C3=B6?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 17 Sep 2024 15:14:14 +0300 Message-Id: <20240917121419.610349-1-martin@martin.st> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 1/5] aarch64: Detect I8MM on Windows via SVE-I8MM X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: xAmbYH1VEYFp There's no direct processor feature constant for I8MM alone, but there is a flag for SVE-I8MM (added in WinSDK 10.0.26100 and recent versions of mingw-w64). If SVE-I8MM is available, we can assume that I8MM is available. While HW supporting these features isn't yet commonly running Windows, this at least allows detecting and running the I8MM codepaths in Windows builds in Wine (possibly running in QEMU). --- libavutil/aarch64/cpu.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c index 7631d13de0..fe24b1da4d 100644 --- a/libavutil/aarch64/cpu.c +++ b/libavutil/aarch64/cpu.c @@ -112,6 +112,13 @@ static int detect_flags(void) #ifdef PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE if (IsProcessorFeaturePresent(PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE)) flags |= AV_CPU_FLAG_DOTPROD; +#endif +#ifdef PF_ARM_SVE_I8MM_INSTRUCTIONS_AVAILABLE + /* There's no PF_* flag that indicates whether plain I8MM is available + * or not. But if SVE_I8MM is available, that also implies that + * regular I8MM is available. 
*/ + if (IsProcessorFeaturePresent(PF_ARM_SVE_I8MM_INSTRUCTIONS_AVAILABLE)) + flags |= AV_CPU_FLAG_I8MM; #endif return flags; } From patchwork Tue Sep 17 12:14:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?= X-Patchwork-Id: 51631 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a59:d32e:0:b0:48e:c0f8:d0de with SMTP id cf14csp211013vqb; Tue, 17 Sep 2024 05:14:42 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCXKezA6fs/kxb4yxV0LPwDjFvwPKfyJPq9I8hABwUrpKigfuVftxfoqN8ItLVB0laxXs2Xh9WLmcx8MX6FnAv/0@gmail.com X-Google-Smtp-Source: AGHT+IEYsjPZxyjpnyrd48DdX8hK/0To/PQzvABPlcp62L8P6yBFLv0ttihgbLrk7XZUv9C5Eyr9 X-Received: by 2002:a05:651c:b2b:b0:2f1:563d:ec8a with SMTP id 38308e7fff4ca-2f787f4a407mr86768691fa.41.1726575282306; Tue, 17 Sep 2024 05:14:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1726575282; cv=none; d=google.com; s=arc-20240605; b=E6PARGbShreUbuMZaf/Hqw+MVHhZGVZoZoa3QJnsecTx5lP9Xvy0+RC9B2PzNHKncC oeNMMGCS8ZZYcfdCJUzESILj+rNOfmHn1/PUFdMg+CNXqcfNlTPs7TgPqnPkDwTfNuVp BxrDtmbUql9GGyQpw3BsiSyIWH0e3QkQJUQdTLMEpH1TSJaUVCyi8MPVZiby2NETU2KY eXFytPSNYdQsRbEnvT0MhrZqVKVqIoTChBTqwtr1+nRDnIEBGYlRP50+dTgzylwe3u1Y XkqHSEvg/J/cv4TQKuru4pYxbwp718DccLztxQkdie6A0mAoDgZ/WQ7mjPq8y9gVEKx2 NvrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=8TxagcKobMVr3cA3LbNLF/YaKhfbez6t6KkBgIqZdoI=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=JFYN65BS9tw4F8OV9YOlwelA/xTJw+5eXmA0Bd/sgTnz5oBAevGayIB8eeQvyxl741 B2O0LDSvDvakIdZL5X9XgsAct4Cz/4igasS8iFWB2lfScplDeULCF9Hs0Jx6DMHL6mGd kzzKDpU9wJSN7DG8XkKAliWunjBeKwucGyzfIR99JadAtGHLSodKuXUO9e0Pfhcp2wJV 
z/zallRGXQNdsPkNTXYo6l0OA/RoF4BNb7F3RrWzf25J3T4d7yqwT9R08Sl9N0WL9AC3 K+eVyNoQzXTvJVYKC7GXoZ7LZVcjU9HtFSiOZK3kj2xS/MBHhVVpt+EJC4ZLIutbUKoB XoXQ==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b=LnHedKOj; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 38308e7fff4ca-2f79d47ee3esi21007101fa.573.2024.09.17.05.14.41; Tue, 17 Sep 2024 05:14:42 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b=LnHedKOj; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 71A9F68D814; Tue, 17 Sep 2024 15:14:30 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f45.google.com (mail-lf1-f45.google.com [209.85.167.45]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C5E8B68D78B for ; Tue, 17 Sep 2024 15:14:21 +0300 (EEST) Received: by mail-lf1-f45.google.com with SMTP id 2adb3069b0e04-5356ab89665so6119078e87.1 for ; Tue, 17 Sep 2024 05:14:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=martin-st.20230601.gappssmtp.com; s=20230601; t=1726575261; x=1727180061; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=/D2bFNTQsKTEPYiPMHr2S4Wh5YoBtI+8ivI+zC/AU/Q=; b=LnHedKOjC67JEP2IPMbEyYK8W/Wx5jvqCxODF1ol7qOOhd1XalWpTQuGqpqxtqguFi 6hqNuZUCG5787FQgtIi/eOwy0ODgcUqb1vhXCLFZZ5g7w9J4/o9T570tVqDtxApzRwHp zP6fhH2upUjWBxW0BBQIhwIf7QCLUdDWpsGnlo8wU/22qQQKH3/miNP1tZGm8WxwINNJ xtlJbyRV2tzwIHK/UcOebBQgm9hwFEtMHAE917YtEYoDIt867wQg3kkzWtxd+Lu1qtk2 LlCTJnrZSvf0wEhRpBjfr7jSiDw/mpksKyJf4mTIlteIc/jsMwvzuhEkdDCeMmZhkc3U l6yQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726575261; x=1727180061; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/D2bFNTQsKTEPYiPMHr2S4Wh5YoBtI+8ivI+zC/AU/Q=; b=qfwBAjX8vByCQHRGSdKKqr8rXHRijQdior2p64OvEiaJodMdXxteTtvt8BGb8Gmthm 67s3C0PnsaRThT5aWn/np9mdcsSCOAV+yX6Zvxxvyrz5mu2+sqKjniVGdyK+sooMCPwQ NhWZSyhdJJLQvwQ1wXhx2ksj1qUYc8Ro0btpwrGaEVhyhBriiBFsy+cNT4AoP4rdEXhA bzgpF3ggolmzgOdg5FDqO4SSiFq2cKrjV+/7W9xDQ/ksW4GRqvqR33aPaIusKtIBc9Hp ic3icy4KAFs/ZhgkMJ2f78hzDLAYtLiIARLkWtMKik+iiiY9v/cOGagpSMlIds9UAlXN SgXg== X-Gm-Message-State: AOJu0YxTyCMH8hGtFfUbq3sReSVbNy83FcVFmXgC5Lo9FQI6I56xIpeb HmqYDphkEyrKmTH1KAN+EQhrcBrEVQ4RgYPjmBUYi7Ofo0qaAPkjI8xx3PWK9/uGwnFsvArNsQ7 2pQ== X-Received: by 2002:a05:6512:12c4:b0:52c:db0a:a550 with SMTP id 2adb3069b0e04-53678feb66dmr10387940e87.42.1726575260517; Tue, 17 Sep 2024 05:14:20 -0700 (PDT) Received: from localhost (dsl-tkubng21-58c01c-243.dhcp.inet.fi. 
[88.192.28.243]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-536870b8c09sm1198790e87.268.2024.09.17.05.14.20 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Sep 2024 05:14:20 -0700 (PDT) From: =?utf-8?q?Martin_Storsj=C3=B6?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 17 Sep 2024 15:14:15 +0300 Message-Id: <20240917121419.610349-2-martin@martin.st> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240917121419.610349-1-martin@martin.st> References: <20240917121419.610349-1-martin@martin.st> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 2/5] configure: Add detection of assembler support for SVE/SVE2 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: Teh22SOeKtEu It turns out that recent versions of MS armasm64 do support some SVE instructions, but not all of them. Test for one of the instructions that it currently doesn't support. --- Just as a disclaimer, I'm not currently actively planning on writing SVE/SVE2 optimizations. However, related projects such as x264 and dav1d do have a few functions using these extensions, so we might just as well add the framework support for these features in ffmpeg, as functions needing this support will come sooner or later anyway. In the related projects, there's no real use of longer vectors (as there's very little such HW available anyway), but SVE gives widening loads (used in a couple of places in x264) and 16 bit dot products (used in dav1d), which can be useful with 128 bit vectors. 
--- configure | 14 +++++++++++++- ffbuild/arch.mak | 2 ++ libavutil/aarch64/asm.S | 18 ++++++++++++++++++ 3 files changed, 33 insertions(+), 1 deletion(-) diff --git a/configure b/configure index da36419f2d..d05c4a5a51 100755 --- a/configure +++ b/configure @@ -466,6 +466,8 @@ Optimization options (experts only): --disable-neon disable NEON optimizations --disable-dotprod disable DOTPROD optimizations --disable-i8mm disable I8MM optimizations + --disable-sve disable SVE optimizations + --disable-sve2 disable SVE2 optimizations --disable-inline-asm disable use of inline assembly --disable-x86asm disable use of standalone x86 assembly --disable-mipsdsp disable MIPS DSP ASE R1 optimizations @@ -2163,6 +2165,8 @@ ARCH_EXT_LIST_ARM=" vfp vfpv3 setend + sve + sve2 " ARCH_EXT_LIST_MIPS=" @@ -2435,6 +2439,8 @@ TOOLCHAIN_FEATURES=" as_arch_directive as_archext_dotprod_directive as_archext_i8mm_directive + as_archext_sve_directive + as_archext_sve2_directive as_dn_directive as_fpu_directive as_func @@ -2755,6 +2761,8 @@ vfpv3_deps="vfp" setend_deps="arm" dotprod_deps="aarch64 neon" i8mm_deps="aarch64 neon" +sve_deps="aarch64 neon" +sve2_deps="aarch64 neon sve" map 'eval ${v}_inline_deps=inline_asm' $ARCH_EXT_LIST_ARM @@ -6223,9 +6231,11 @@ if enabled aarch64; then # internal assembler in clang 3.3 does not support this instruction enabled neon && check_insn neon 'ext v0.8B, v0.8B, v1.8B, #1' - archext_list="dotprod i8mm" + archext_list="dotprod i8mm sve sve2" enabled dotprod && check_archext_insn dotprod 'udot v0.4s, v0.16b, v0.16b' enabled i8mm && check_archext_insn i8mm 'usdot v0.4s, v0.16b, v0.16b' + enabled sve && check_archext_insn sve 'whilelt p0.s, x0, x1' + enabled sve2 && check_archext_insn sve2 'sqrdmulh z0.s, z0.s, z0.s' # Disable the main feature (e.g. HAVE_NEON) if neither inline nor external # assembly support the feature out of the box. 
Skip this for the features @@ -7913,6 +7923,8 @@ if enabled aarch64; then echo "NEON enabled ${neon-no}" echo "DOTPROD enabled ${dotprod-no}" echo "I8MM enabled ${i8mm-no}" + echo "SVE enabled ${sve-no}" + echo "SVE2 enabled ${sve2-no}" fi if enabled arm; then echo "ARMv5TE enabled ${armv5te-no}" diff --git a/ffbuild/arch.mak b/ffbuild/arch.mak index 3fc40e5e5d..af71aacfd2 100644 --- a/ffbuild/arch.mak +++ b/ffbuild/arch.mak @@ -3,6 +3,8 @@ OBJS-$(HAVE_ARMV6) += $(ARMV6-OBJS) $(ARMV6-OBJS-yes) OBJS-$(HAVE_ARMV8) += $(ARMV8-OBJS) $(ARMV8-OBJS-yes) OBJS-$(HAVE_VFP) += $(VFP-OBJS) $(VFP-OBJS-yes) OBJS-$(HAVE_NEON) += $(NEON-OBJS) $(NEON-OBJS-yes) +OBJS-$(HAVE_SVE) += $(SVE-OBJS) $(SVE-OBJS-yes) +OBJS-$(HAVE_SVE2) += $(SVE2-OBJS) $(SVE2-OBJS-yes) OBJS-$(HAVE_MIPSFPU) += $(MIPSFPU-OBJS) $(MIPSFPU-OBJS-yes) OBJS-$(HAVE_MIPSDSP) += $(MIPSDSP-OBJS) $(MIPSDSP-OBJS-yes) diff --git a/libavutil/aarch64/asm.S b/libavutil/aarch64/asm.S index 1840f9fb01..50ce7d4dfd 100644 --- a/libavutil/aarch64/asm.S +++ b/libavutil/aarch64/asm.S @@ -56,8 +56,26 @@ #define DISABLE_I8MM #endif +#if HAVE_AS_ARCHEXT_SVE_DIRECTIVE +#define ENABLE_SVE .arch_extension sve +#define DISABLE_SVE .arch_extension nosve +#else +#define ENABLE_SVE +#define DISABLE_SVE +#endif + +#if HAVE_AS_ARCHEXT_SVE2_DIRECTIVE +#define ENABLE_SVE2 .arch_extension sve2 +#define DISABLE_SVE2 .arch_extension nosve2 +#else +#define ENABLE_SVE2 +#define DISABLE_SVE2 +#endif + DISABLE_DOTPROD DISABLE_I8MM +DISABLE_SVE +DISABLE_SVE2 /* Support macros for From patchwork Tue Sep 17 12:14:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?= X-Patchwork-Id: 51632 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a59:d32e:0:b0:48e:c0f8:d0de with SMTP id cf14csp211113vqb; Tue, 17 Sep 2024 05:14:53 -0700 (PDT) X-Forwarded-Encrypted: i=2; 
AJvYcCVDHzRNKFRtp85vYeypgFE7czPrYUDDFyiHu/iZR0Nrc4YjpdVVGf0rmaCyIZMGDdUeShIc9T8M21PHNwPiGWfm@gmail.com X-Google-Smtp-Source: AGHT+IEb7ojyHSQHEz8vdmReRSKNFZ5Gv/Ks3p9pXemk32t+XqymDxBNfSNKHGvyzDg/P//dpHgz X-Received: by 2002:a05:651c:2228:b0:2f6:57b1:98b0 with SMTP id 38308e7fff4ca-2f787f4bee5mr107109601fa.42.1726575292897; Tue, 17 Sep 2024 05:14:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1726575292; cv=none; d=google.com; s=arc-20240605; b=Q39Yx5caTMgjUj1XN3MPIwuyu3d8hRDmeV1pXBJpt0zwPIPaKJoF43XFdigLxu3AQ+ SOXWkjCaFaDs98Pw1LlaHSTEeGPuv7m56e29FHdME4GyrA7gJZRA+K16+q8xGGXp8BnK WQdUVDLeGUN5FNBr038atgBi1blS2r3WYrFt1HT5tcq1RVKOyX7cKzzp7qrvEOYwgm9U 1YlsJEWTJ5DW6O7cwKgg6YqSc56YI4ld0IXPIlXmH1pqF4FBKJNE9WK+ZzF9w6zsR/G8 deZVTQnQHkMrniuwBJlLYsVxYrJQycANTLgvmZ/pievuZGgXw2J5wnMgbAhfaIh9xNKS jgcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=0iRCPA7I0Z0iUzhdmjdr4jc4blwr/HuMKDUSu2zw/O4=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=Xvvsf7FPgbWVvVqNyxmUThjz79uk0pAvAnbxsuDlGZg2wJ1tikjFHZzmVJXdA1GhPW QiDc7UpIy7idcVNBntLbEkIybqr29lRdLjiiATB5A312nxcGZO8/ybGQpO5umH6U4Jgo qk3n9LSJmdGVgDRKb12cjCzhjrDSolUXLb6knlApH2Vn/0qx3LhMmgVvWwLfTkVSRd+M dtcNyB6phkLh21jhgLoN/MVJz8IaVInNkuqHKg8nlECQQlHSa7gKlv1js9GEgvq0h9oR M5h7sJ9fuwlYG/lLZ4b04VwznD5EA1Yqap0F6aCe8c+c2hT4TD4r0xrSCqKw1J4MLrZ3 9Yww==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b=qLWyMU5q; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu 
(ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 38308e7fff4ca-2f79d2c7f61si22380761fa.110.2024.09.17.05.14.51; Tue, 17 Sep 2024 05:14:52 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b=qLWyMU5q; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id D08DF68DB08; Tue, 17 Sep 2024 15:14:31 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f42.google.com (mail-lf1-f42.google.com [209.85.167.42]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 4F7A968D78B for ; Tue, 17 Sep 2024 15:14:22 +0300 (EEST) Received: by mail-lf1-f42.google.com with SMTP id 2adb3069b0e04-53654e2ed93so6008427e87.0 for ; Tue, 17 Sep 2024 05:14:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=martin-st.20230601.gappssmtp.com; s=20230601; t=1726575261; x=1727180061; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=nqxW7xZL5jJxMJim6L87sJU1ZvRaBUvIvPgZCldjxZc=; b=qLWyMU5q4EhYDekorblBp6qZusF9UBlcq6J7gLVZ9nj/14JjNiUcKJmKNLjjODEbq4 0wE5At/dokJrZdKA5IGrgy1NEfSPDOYJDSz9lsyipwrHj3ex0NTXC5vvWs6eMgGYldPS e4+uF9OBxmJC1j5kodvXQeXP/mbRTX8lbCxGI/YoCZ1j6si/JA600Ob5K1hcOGRRvFiF uyp9LO1pxHqL3IRMMDCrDcHcOTNy/w+UHm0yP7xwEbcwxG1MLK7EmVp8GofNP/GMmQS8 ee/7s8lyYj43XFwAif++LDbmCWNenU6X1jf9LdnnbpwckgVxfhNzHsWEBLilZNyDzPNy Ux+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; 
t=1726575261; x=1727180061; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nqxW7xZL5jJxMJim6L87sJU1ZvRaBUvIvPgZCldjxZc=; b=NtSM//i7KPlCKmNsEGBCGXRAfbM1vT1U6bPwrmv2XMOoQUXM0OdWhGWJ4EyZd/cYpu btj2AyZ/AtJSQ/2+m66P2RNkkLL3y+Drj69F72J10p4wru0EOHDI2Du2ik+bahIg/W8Y ya9WaeDKrYHQGeYG793kHAoex78XfbOmv3BX2DiMSgrR0BU/sW3tneSQNbGx149CzTSS 7ySxlBusEhGxVoz2dFQKsKbQdJERDLxQZeAEuZl1vhmWKDkfbdBCXqivnRmu6e/MCfk7 98gHP3MzQGsnP5DR/BuiT7ikj1wp/SpksVQeXNG78JvQhK/IK5nxv3T3htgHriH065fO A4YA== X-Gm-Message-State: AOJu0YwH/ZSgzk39J4dZSsdhJrrjlXgo3gx91t8anLds0O3XrUwud8+l /4cI2RVfZqfTf2UU5QL0HYy0Y7V0SHGy60BkHon5y3ipHKfrIz2xNF0tFCqWnYZcPnwmvwhfqLn BnA== X-Received: by 2002:a05:6512:39ca:b0:535:ea75:e913 with SMTP id 2adb3069b0e04-53678fc86dbmr9951228e87.33.1726575261104; Tue, 17 Sep 2024 05:14:21 -0700 (PDT) Received: from localhost (dsl-tkubng21-58c01c-243.dhcp.inet.fi. [88.192.28.243]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-536870a4639sm1179787e87.186.2024.09.17.05.14.20 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Sep 2024 05:14:20 -0700 (PDT) From: =?utf-8?q?Martin_Storsj=C3=B6?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 17 Sep 2024 15:14:16 +0300 Message-Id: <20240917121419.610349-3-martin@martin.st> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240917121419.610349-1-martin@martin.st> References: <20240917121419.610349-1-martin@martin.st> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 3/5] aarch64: Add CPU feature flags for SVE and SVE2 X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: XqAsyvOrb4dQ Add code for detecting the feature on Linux and Windows. 
--- libavutil/aarch64/cpu.c | 20 ++++++++++++++++++++ libavutil/aarch64/cpu.h | 2 ++ libavutil/cpu.c | 2 ++ libavutil/cpu.h | 2 ++ libavutil/tests/cpu.c | 2 ++ tests/checkasm/checkasm.c | 2 ++ 6 files changed, 30 insertions(+) diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c index fe24b1da4d..e82c0f19ab 100644 --- a/libavutil/aarch64/cpu.c +++ b/libavutil/aarch64/cpu.c @@ -25,6 +25,8 @@ #include #define HWCAP_AARCH64_ASIMDDP (1 << 20) +#define HWCAP_AARCH64_SVE (1 << 22) +#define HWCAP2_AARCH64_SVE2 (1 << 1) #define HWCAP2_AARCH64_I8MM (1 << 13) static int detect_flags(void) @@ -36,6 +38,10 @@ static int detect_flags(void) if (hwcap & HWCAP_AARCH64_ASIMDDP) flags |= AV_CPU_FLAG_DOTPROD; + if (hwcap & HWCAP_AARCH64_SVE) + flags |= AV_CPU_FLAG_SVE; + if (hwcap2 & HWCAP2_AARCH64_SVE2) + flags |= AV_CPU_FLAG_SVE2; if (hwcap2 & HWCAP2_AARCH64_I8MM) flags |= AV_CPU_FLAG_I8MM; @@ -119,6 +125,14 @@ static int detect_flags(void) * regular I8MM is available. */ if (IsProcessorFeaturePresent(PF_ARM_SVE_I8MM_INSTRUCTIONS_AVAILABLE)) flags |= AV_CPU_FLAG_I8MM; +#endif +#ifdef PF_ARM_SVE_INSTRUCTIONS_AVAILABLE + if (IsProcessorFeaturePresent(PF_ARM_SVE_INSTRUCTIONS_AVAILABLE)) + flags |= AV_CPU_FLAG_SVE; +#endif +#ifdef PF_ARM_SVE2_INSTRUCTIONS_AVAILABLE + if (IsProcessorFeaturePresent(PF_ARM_SVE2_INSTRUCTIONS_AVAILABLE)) + flags |= AV_CPU_FLAG_SVE2; #endif return flags; } @@ -142,6 +156,12 @@ int ff_get_cpu_flags_aarch64(void) #ifdef __ARM_FEATURE_MATMUL_INT8 flags |= AV_CPU_FLAG_I8MM; #endif +#ifdef __ARM_FEATURE_SVE + flags |= AV_CPU_FLAG_SVE; +#endif +#ifdef __ARM_FEATURE_SVE2 + flags |= AV_CPU_FLAG_SVE2; +#endif flags |= detect_flags(); diff --git a/libavutil/aarch64/cpu.h b/libavutil/aarch64/cpu.h index 64d703be37..df7becca30 100644 --- a/libavutil/aarch64/cpu.h +++ b/libavutil/aarch64/cpu.h @@ -27,5 +27,7 @@ #define have_vfp(flags) CPUEXT(flags, VFP) #define have_dotprod(flags) CPUEXT(flags, DOTPROD) #define have_i8mm(flags) CPUEXT(flags, I8MM) +#define 
have_sve(flags) CPUEXT(flags, SVE) +#define have_sve2(flags) CPUEXT(flags, SVE2) #endif /* AVUTIL_AARCH64_CPU_H */ diff --git a/libavutil/cpu.c b/libavutil/cpu.c index df00bd541f..e16ebc0d38 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -180,6 +180,8 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) { "vfp", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VFP }, .unit = "flags" }, { "dotprod", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_DOTPROD }, .unit = "flags" }, { "i8mm", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_I8MM }, .unit = "flags" }, + { "sve", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_SVE }, .unit = "flags" }, + { "sve2", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_SVE2 }, .unit = "flags" }, #elif ARCH_MIPS { "mmi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MMI }, .unit = "flags" }, { "msa", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MSA }, .unit = "flags" }, diff --git a/libavutil/cpu.h b/libavutil/cpu.h index ba6c234e04..6b6e50f07a 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -72,6 +72,8 @@ #define AV_CPU_FLAG_VFP_VM (1 << 7) ///< VFPv2 vector mode, deprecated in ARMv7-A and unavailable in various CPUs implementations #define AV_CPU_FLAG_DOTPROD (1 << 8) #define AV_CPU_FLAG_I8MM (1 << 9) +#define AV_CPU_FLAG_SVE (1 <<10) +#define AV_CPU_FLAG_SVE2 (1 <<11) #define AV_CPU_FLAG_SETEND (1 <<16) #define AV_CPU_FLAG_MMI (1 << 0) diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c index 0a459c1d9e..679b538f0f 100644 --- a/libavutil/tests/cpu.c +++ b/libavutil/tests/cpu.c @@ -40,6 +40,8 @@ static const struct { { AV_CPU_FLAG_VFP, "vfp" }, { AV_CPU_FLAG_DOTPROD, "dotprod" }, { AV_CPU_FLAG_I8MM, "i8mm" }, + { AV_CPU_FLAG_SVE, "sve" }, + { AV_CPU_FLAG_SVE2, "sve2" }, #elif ARCH_ARM { AV_CPU_FLAG_ARMV5TE, "armv5te" }, { AV_CPU_FLAG_ARMV6, "armv6" }, diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index 73a998ae3a..c932e028a5 100644 --- a/tests/checkasm/checkasm.c +++ 
b/tests/checkasm/checkasm.c @@ -305,6 +305,8 @@ static const struct { { "NEON", "neon", AV_CPU_FLAG_NEON }, { "DOTPROD", "dotprod", AV_CPU_FLAG_DOTPROD }, { "I8MM", "i8mm", AV_CPU_FLAG_I8MM }, + { "SVE", "sve", AV_CPU_FLAG_SVE }, + { "SVE2", "sve2", AV_CPU_FLAG_SVE2 }, #elif ARCH_ARM { "ARMV5TE", "armv5te", AV_CPU_FLAG_ARMV5TE }, { "ARMV6", "armv6", AV_CPU_FLAG_ARMV6 }, From patchwork Tue Sep 17 12:14:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?= X-Patchwork-Id: 51633 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a59:d32e:0:b0:48e:c0f8:d0de with SMTP id cf14csp211255vqb; Tue, 17 Sep 2024 05:15:01 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCX+HU+HWCQpMXSTFR4NNtUgjSfqEkfuxpFXgpCdQPzA1Rf4sPW0uqQd/JHT8c3JOQeo+fdmBy8xRC6yHWKtXOYz@gmail.com X-Google-Smtp-Source: AGHT+IH5LoI8x/bQx/TS6AX/aOyjmcjbDT9DNBHNeV80cDKwxIW/+wH2zYIbwJ52eaB/BPKtqYcb X-Received: by 2002:a05:651c:2227:b0:2f7:5337:52ce with SMTP id 38308e7fff4ca-2f7919045d9mr95809691fa.11.1726575301530; Tue, 17 Sep 2024 05:15:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1726575301; cv=none; d=google.com; s=arc-20240605; b=NdQVAuAKDFzha8liy9bgmJAwFbLvAgFiYJHL3MAGwMH3If76A6RkOS0YhmVT2vnKXn DGpS1nw/PDb47ht//VinXQ7Pl1lGM1h1+bMrXsWhp7dT+hK48QUS5Cv0G3t47xgwP8q8 gj6Ns9TTO6hUIu1Qd9TKt2sOR8ad5FWpa6wcB9OYezcqqDRI0crOY7T1X3xVeoo60MU8 +YXXFWWQemz9exY7J9hlkbZtXVlpXyvXRKmSO4xrtk45z/Odjh+x9ogqF93PijYLn6OW z9f777GxXQDzf9d7bRIHy/AQCRAaSBt6HtTkEV/RRoEvz9+F/Ho+iI7Smubbnlvu6DUM 5skg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=Dj+zJo+i88iErtts+FVlPt8cEHNzdbXvtfyn9wxLKEI=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; 
b=Gw1776niRBXMiRgB1WlyulokJh1t1D7gTiXNCYw2/5vfrtftqpR9oGl572spU4coIx pOWIAv8v+zFQkxOEdJUqGxyuOeG2ANFuPeXpdhQGi6A2dTkF8AnynJitie3oW13RuNNG 3iwG/0MMejLV8a7B7fPm5wPZyoe96gnvvPRj8KpFJoIBk67J+U0SOD0TGBIOTYAbrHRs wbAY2NWsPg2fBfrWHAQgJkR20YaKlQ3Xbuj2TT4Tj9h3HgXUda7Y+aMVrGHzmZqy0bjQ IP/bVrHLuk5rlPz4alaZmW4PMBMdE47wSiMIEJOqRLN047XdlThkaNbbpm2bPoNJ05rj TwSg==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b="V8vM5j/W"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 38308e7fff4ca-2f79d35da99si21788991fa.216.2024.09.17.05.15.00; Tue, 17 Sep 2024 05:15:01 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b="V8vM5j/W"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0D4FA68DB24; Tue, 17 Sep 2024 15:14:33 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lj1-f182.google.com (mail-lj1-f182.google.com [209.85.208.182]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id EE22F68DAE3 for ; Tue, 17 Sep 2024 15:14:22 +0300 (EEST) Received: by mail-lj1-f182.google.com with SMTP id 38308e7fff4ca-2f029e9c9cfso45729041fa.2 for ; Tue, 17 Sep 2024 05:14:22 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=martin-st.20230601.gappssmtp.com; s=20230601; t=1726575262; x=1727180062; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=yLvHGR7jbqUYwAKPmUFdkjuEBjP0PmE/qvNomr8Z6g8=; b=V8vM5j/W/KWytPa9Eireoe1NAs3KP4Scbpi5/HyurltOW8YRqcmW9lydzpU28qF8UQ Lyzrj1WOLRrxC2/ugE97+qnE0CH/bDbf4kk6CvG8hs6wN5I9zd1BrMGHileH6hRB3bGU PMlOlyDMIr/Iu6/BOff0PsdL4Tuq4/CBy4S5cjMHNrvKkZm/fexGtY3m6rCHWUpK2Kl4 YkBvfVikfAvxcKRLDKtdRM2v+LrCNAe6wanwriiFsTwJ2PAiYawz/CWBFW0W+ttpYtwP b9Rae0DCxgQa0iEUomdIhnuWhvpUvbxvUR5eVyobz8/bBNYNQ0x1Ekyt8safPSUduexh 3thQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726575262; x=1727180062; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yLvHGR7jbqUYwAKPmUFdkjuEBjP0PmE/qvNomr8Z6g8=; b=fMsgsNA6u4HrLTXTSx7K6aR7V0my+OiG/bzRo7CC7gtmrHQLj1qc3OKu8jo5bjv4dc LTKUr76O4EnRoH5nt3OB94VaHm5Ry6Xi4rBPXZlChiNsg5k23GJVCBaBt7vmYoixI2AY iBzJSUtontuT5kneaoYPAkWrqIo8vA/ZOvPgc48pDgiUF60FM1UedMMwQozYrSH+ZfEe nedpjIOIMdGPqkMQET1A9C6ZHtql0+k7mTJblqqqa9rmt2bjih1a51cwoi3196DsrI9V aMiH4whxM42dpNhE207qd4NgJvVnOEBXtsP53M9NQyx3qrixEU7ruuLkq4/z5cmRiLOu H3AA== X-Gm-Message-State: AOJu0Yz2OeYoUQV/4/8QOYEbt16LrJO+4QojGApegARLvI04IzQ6uKXX X6ZRnR7QassF+r8lqkLG8QT9yxMEsNUDDEiWUzXDvKqELgFFy2VhcR/Bx/Ir98MyRjHolOd4H0C CNA== X-Received: by 2002:a05:651c:211b:b0:2f1:750d:53a7 with SMTP id 38308e7fff4ca-2f7918e4cabmr96838181fa.8.1726575261925; Tue, 17 Sep 2024 05:14:21 -0700 (PDT) Received: from localhost (dsl-tkubng21-58c01c-243.dhcp.inet.fi. 
 [88.192.28.243]) by smtp.gmail.com with ESMTPSA id 38308e7fff4ca-2f79d59d494sm10148981fa.140.2024.09.17.05.14.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Sep 2024 05:14:21 -0700 (PDT)
From: =?utf-8?q?Martin_Storsj=C3=B6?=
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 17 Sep 2024 15:14:17 +0300
Message-Id: <20240917121419.610349-4-martin@martin.st>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240917121419.610349-1-martin@martin.st>
References: <20240917121419.610349-1-martin@martin.st>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 4/5] aarch64: Print the SVE vector length in libavutil/tests/cpu.c
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: EEcm4ES+dtUW

This makes the SVE vector length more visible in test logs.
---
 libavutil/aarch64/Makefile  |  2 ++
 libavutil/aarch64/cpu.h     |  4 ++++
 libavutil/aarch64/cpu_sve.S | 29 +++++++++++++++++++++++++++++
 libavutil/tests/cpu.c       |  8 ++++++++
 4 files changed, 43 insertions(+)
 create mode 100644 libavutil/aarch64/cpu_sve.S

diff --git a/libavutil/aarch64/Makefile b/libavutil/aarch64/Makefile
index eba0151337..992e95e4df 100644
--- a/libavutil/aarch64/Makefile
+++ b/libavutil/aarch64/Makefile
@@ -4,3 +4,5 @@ OBJS += aarch64/cpu.o \
 
 NEON-OBJS += aarch64/float_dsp_neon.o \
              aarch64/tx_float_neon.o \
+
+SVE-OBJS += aarch64/cpu_sve.o \
diff --git a/libavutil/aarch64/cpu.h b/libavutil/aarch64/cpu.h
index df7becca30..a41b729659 100644
--- a/libavutil/aarch64/cpu.h
+++ b/libavutil/aarch64/cpu.h
@@ -30,4 +30,8 @@
 #define have_sve(flags) CPUEXT(flags, SVE)
 #define have_sve2(flags) CPUEXT(flags, SVE2)
 
+#if HAVE_SVE
+int ff_aarch64_sve_length(void);
+#endif
+
 #endif /* AVUTIL_AARCH64_CPU_H */
diff --git a/libavutil/aarch64/cpu_sve.S b/libavutil/aarch64/cpu_sve.S
new file mode 100644
index 0000000000..d216ed2c49
--- /dev/null
+++ b/libavutil/aarch64/cpu_sve.S
@@ -0,0 +1,29 @@
+/*
+ * Copyright (c) 2023 Martin Storsjo
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+#include "asm.S"
+
+ENABLE_SVE
+
+function ff_aarch64_sve_length, export=1
+        cntb            x0
+        ret
+endfunc
diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c
index 679b538f0f..abe2b057d7 100644
--- a/libavutil/tests/cpu.c
+++ b/libavutil/tests/cpu.c
@@ -23,6 +23,10 @@
 #include "libavutil/cpu.h"
 #include "libavutil/avstring.h"
 
+#if ARCH_AARCH64
+#include "libavutil/aarch64/cpu.h"
+#endif
+
 #if HAVE_UNISTD_H
 #include <unistd.h>
 #endif
@@ -161,6 +165,10 @@ int main(int argc, char **argv)
     print_cpu_flags(cpu_flags_raw, "raw");
     print_cpu_flags(cpu_flags_eff, "effective");
     printf("threads = %s (cpu_count = %d)\n", threads, cpu_count);
+#if ARCH_AARCH64 && HAVE_SVE
+    if (cpu_flags_raw & AV_CPU_FLAG_SVE)
+        printf("sve_vector_length = %d\n", 8 * ff_aarch64_sve_length());
+#endif
 
     return 0;
 }

From patchwork Tue Sep 17 12:14:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?=
X-Patchwork-Id: 51634
Delivered-To: ffmpegpatchwork2@gmail.com
Received: by 2002:a59:d32e:0:b0:48e:c0f8:d0de with SMTP id cf14csp211383vqb; Tue, 17 Sep 2024 05:15:11 -0700 (PDT)
X-Forwarded-Encrypted: i=2; AJvYcCXcB6qUU/7/areLBB7I15RoA+ujFDjKZnJDmAP8xunqlmUhk2l5CVhAmEur0K+ulbRalrefRN4SlVYr0p/2xDaY@gmail.com
X-Google-Smtp-Source: AGHT+IF3etG/ED/zDfCzzVjYEkoJacbZPpyCje3YhIX2WWrJc+1YqOId/XbyvwIhFcopPnTtYIGg
X-Received: by 2002:a05:6512:230a:b0:530:e1f1:8dc9 with SMTP id 2adb3069b0e04-53678fed1b2mr8958332e87.46.1726575310814; Tue, 17 Sep 2024 05:15:10 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1726575310; cv=none; d=google.com; s=arc-20240605; b=c1nOaCdvbP0MWGvhI6PLZaXrRlZSeXWxDUB8SVOrZ5hSIQbaFkpJbq37H8Ae0M6nb6 VU+cjxuJiX+qJxQzm2BRHf00bxep0SAuGOa6kcmdQCn9i8317tb/uvQ7AT4uK+cIOSV0
AI6CnzXjPJCfLkZwgWbEWu/77Qb9btXj0dEEu9ZemBteYH1/FontlKKfWWlWZe5IS+Jr l5SAPshvqqMl7PDqEOOQ2bxkTvIyMuBxWQZKLmQ7S+PLJ0AMYeKxxYSip1Ys5abJpHl7 7CQPjXAVkvJirEBEO9iJd1I2RSw8WU6zLQOv1bPxb0XFH7Xb8OzpUwspD6iyrD2qUSG7 VR4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=AtgScGnpXUiNrpJxnFYNz4O2n09sie6CtDub3uo4/Ww=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=foZkNZXVLww/Y2kHGdjHy9K51gb4ALzxDuZfma30mOqa9lJB29z2LNyBpwmZNz9SNv 7tmuVU0pKy/R8lHF0T+SCdJ9/ybHgMKtDVMNkP4iD/Jwu45j5FzICcgoJMAvFexuwFRx os5mkV3Cpq4Apelnevu1fQc9Y6NufHRIKCtRLGO6xmt+Vx17vh7ibzKyu1y1BRzR7XCN yWmLO95VYLXwuQtU5BbjxlkUVsUEOzjGYFF01Hx9tj8swbG+fTuswbCR4NHf5pd2cx/B m1F2d/M5U3ZG8HxrOj3NxLLB4Rj4SDZW3hcK6BTQU1FPKYkQtMVfeqg6nu9qASX/VSKv /3GA==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b=PYAasZrU; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id 2adb3069b0e04-536870a0c78si2457406e87.421.2024.09.17.05.15.10; Tue, 17 Sep 2024 05:15:10 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20230601.gappssmtp.com header.s=20230601 header.b=PYAasZrU; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dara=fail header.i=@gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 101FD68DB33; Tue, 17 Sep 2024 15:14:34 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lj1-f174.google.com (mail-lj1-f174.google.com [209.85.208.174]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 795C068DAD5 for ; Tue, 17 Sep 2024 15:14:23 +0300 (EEST) Received: by mail-lj1-f174.google.com with SMTP id 38308e7fff4ca-2f759688444so47296711fa.1 for ; Tue, 17 Sep 2024 05:14:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=martin-st.20230601.gappssmtp.com; s=20230601; t=1726575263; x=1727180063; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=mqFu9rjFkO7/KW1owz1C3vZ6sTLA6ccZFe+CZIagSvA=; b=PYAasZrUAzL9OKDoY1xFZ9HddXLrxGXuz7JyASohzaisRSGS2C8QkV0hcc999ilhqp hNVDPUA/2b42OTAqLSKSZn452Zo1ZbwW7z8PuYfgDui80rGe+aYj4Z8XIxShWDt+K/ss AEYbnN9/9AecqUWSgRJaYvQ6OPWur3Rysp8hHvwkYELYELSxMfVYEHjnU/SrwBG18N6u FSo4gNGc6JVkRQBj4SZR7evz6uJJeloSSHAfrIC55/Y8d4mhhblFhGqTnpkaqp3bet/I S3KeALmmDI+PRXh/F4C7oOBmOAoSvjvKdJ2T80D/qvF3ioq7a2ZwvPMMUbDsT87i5L/7 KbAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726575263; 
 x=1727180063; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mqFu9rjFkO7/KW1owz1C3vZ6sTLA6ccZFe+CZIagSvA=; b=X06P8k9bOUFguZr/IW7aa9DsFz/S2u7rOTlpAgBY2TbIu4gvaASWk9snK+NtyvSNSn ppM3k+csNUnAYuU2i/HKWkSPZWEdmCqNJj4gVroHUOfZakDeDSKVCt4sROD5Eh1tOfCi 8vwVBFr2XuYqyD/bhOipnQMZ07PAS+Gre5KMJPJZcxew+Qu0RxM9ZN9ywavF3HFCG/jh MbH8VV/Zj3MmuETw7fYCdE+kQ+x3RIawWiXHWFadbLldFAXuM3/YaZP6sKiuitUBY6U8 JMIq5Wr6E5W592H36NXHUYsCVDHPYv9bDbLuoChdtU6uU/wdHz0yeoP48te3Ces2pIm1 epQA==
X-Gm-Message-State: AOJu0Yxp8RhVXrwCtBwJLSnGOR0BTt/MXHkVJsxKcKN8EOwEP8Tqr9NF sE2evEl2zDyWgwp0GODqVSCCgncV8fuKHg7Fl3IbNazqKKsROUp3cOYW6v6xfe5F5E+JieEEV6o 1yw==
X-Received: by 2002:a05:651c:2129:b0:2f7:7ea4:2a1e with SMTP id 38308e7fff4ca-2f787f4460amr102858081fa.28.1726575262539; Tue, 17 Sep 2024 05:14:22 -0700 (PDT)
Received: from localhost (dsl-tkubng21-58c01c-243.dhcp.inet.fi. [88.192.28.243]) by smtp.gmail.com with ESMTPSA id 38308e7fff4ca-2f79d3267e3sm10103741fa.68.2024.09.17.05.14.22 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Sep 2024 05:14:22 -0700 (PDT)
From: =?utf-8?q?Martin_Storsj=C3=B6?=
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 17 Sep 2024 15:14:18 +0300
Message-Id: <20240917121419.610349-5-martin@martin.st>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240917121419.610349-1-martin@martin.st>
References: <20240917121419.610349-1-martin@martin.st>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 5/5] checkasm: Print the SVE vector length at startup
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: eMwKA98ecsvP

---
 tests/checkasm/checkasm.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index c932e028a5..c9d2b5faf1 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -94,6 +94,10 @@
 #define isatty(fd) 1
 #endif
 
+#if ARCH_AARCH64
+#include "libavutil/aarch64/cpu.h"
+#endif
+
 #if ARCH_ARM && HAVE_ARMV5TE_EXTERNAL
 #include "libavutil/arm/cpu.h"
 
@@ -917,6 +921,7 @@ int main(int argc, char *argv[])
 {
     unsigned int seed = av_get_random_seed();
     int i, ret = 0;
+    char arch_info_buf[50] = "";
 
 #ifdef _WIN32
 #if WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP)
@@ -981,7 +986,12 @@ int main(int argc, char *argv[])
         }
     }
 
-    fprintf(stderr, "checkasm: using random seed %u\n", seed);
+#if ARCH_AARCH64 && HAVE_SVE
+    if (have_sve(av_get_cpu_flags()))
+        snprintf(arch_info_buf, sizeof(arch_info_buf),
+                 "SVE %d bits, ", 8 * ff_aarch64_sve_length());
+#endif
+    fprintf(stderr, "checkasm: %susing random seed %u\n", arch_info_buf, seed);
     av_lfg_init(&checkasm_lfg, seed);
 
     if (state.bench_pattern)