From patchwork Tue May 30 12:30:39 2023
X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?=
X-Patchwork-Id: 41891
From: =?utf-8?q?Martin_Storsj=C3=B6?=
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 30 May 2023 15:30:39 +0300
Message-Id: <20230530123043.52940-1-martin@martin.st>
Subject: [FFmpeg-devel] [PATCH v2 1/5] configure: aarch64: Support assembling the dotprod and i8mm arch extensions

These are available since ARMv8.4-a and ARMv8.6-a respectively, but can
also be available optionally since ARMv8.2-a.

Check if ".arch armv8.2-a" and ".arch_extension {dotprod,i8mm}" are
supported, and check if the instructions can be assembled.

Current clang versions fail to support the dotprod and i8mm features in
the .arch_extension directive, but do support them if enabled with
-march=armv8.4-a on the command line. (Curiously, lowering the arch
level with ".arch armv8.2-a" doesn't make the extensions unavailable
if they were enabled with -march; if that changes, Clang should also
learn to support these extensions via .arch_extension for them to
remain usable here.)
---
Simplified the detection logic somewhat; check if ".arch armv8.2-a" and
".arch_extension {dotprod,i8mm}" are available, then check if the
instruction can be assembled.

This way, we check exactly the same thing as we are going to assemble
in the end, so there shouldn't be any risk of build breakage due to
testing and building subtly different things.
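To illustrate the "test exactly what we assemble" idea, here is a minimal C sketch that composes the assembler input such a probe would test. build_archext_probe() is a hypothetical helper written for this note, not something configure or FFmpeg provides. Passing NULL for a directive models an assembler that rejects it, in which case the instruction is tested without that directive.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of FFmpeg): build the text that a
 * configure-style probe would hand to the assembler.
 *
 * level   - argument for ".arch", or NULL if the directive is unsupported
 * feature - argument for ".arch_extension", or NULL if unsupported
 * insn    - the instruction whose assembly is being tested
 *
 * Returns the number of characters written (excluding the NUL), or -1 if
 * the instruction is empty or the buffer is too small.
 */
static int build_archext_probe(char *buf, size_t size, const char *level,
                               const char *feature, const char *insn)
{
    int n;

    if (!insn || strlen(insn) == 0)
        return -1;

    n = snprintf(buf, size, "%s%s%s%s%s%s%s\n",
                 level   ? ".arch "           : "",
                 level   ? level              : "",
                 level   ? "\n"               : "",
                 feature ? ".arch_extension " : "",
                 feature ? feature            : "",
                 feature ? "\n"               : "",
                 insn);

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Calling it with "armv8.2-a", "dotprod" and an assumed test instruction such as "udot v0.4s, v0.16b, v0.16b" yields the two directives followed by the instruction, one per line. Because the same directives are emitted for the probe and for the real build, the two cannot drift apart.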
---
 configure               | 81 ++++++++++++++++++++++++++++++++++++++++-
 libavutil/aarch64/asm.S | 11 ++++++
 2 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 495493aa0e..50eb27ba0e 100755
--- a/configure
+++ b/configure
@@ -454,6 +454,8 @@ Optimization options (experts only):
   --disable-armv6t2        disable armv6t2 optimizations
   --disable-vfp            disable VFP optimizations
   --disable-neon           disable NEON optimizations
+  --disable-dotprod        disable DOTPROD optimizations
+  --disable-i8mm           disable I8MM optimizations
   --disable-inline-asm     disable use of inline assembly
   --disable-x86asm         disable use of standalone x86 assembly
   --disable-mipsdsp        disable MIPS DSP ASE R1 optimizations
@@ -1154,6 +1156,43 @@ check_insn(){
     check_as ${1}_external "$2"
 }

+check_arch_level(){
+    log check_arch_level "$@"
+    level="$1"
+    check_as tested_arch_level ".arch $level"
+    enabled tested_arch_level && as_arch_level="$level"
+}
+
+check_archext_insn(){
+    log check_archext_insn "$@"
+    feature="$1"
+    instr="$2"
+    # Check if the assembly is accepted in inline assembly.
+    check_inline_asm ${feature}_inline "\"$instr\""
+    # We don't check if the instruction is supported out of the box by the
+    # external assembler (we don't try to set ${feature}_external) as we don't
+    # need to use these instructions in non-runtime detected codepaths.
+
+    disable $feature
+
+    enabled as_arch_directive && arch_directive=".arch $as_arch_level" || arch_directive=""
+
+    # Test if the assembler supports the .arch_extension $feature directive.
+    arch_extension_directive=".arch_extension $feature"
+    test_as <>$TMPH
+enabled aarch64 &&
+    echo "#define AS_ARCH_LEVEL $as_arch_level" >>$TMPH
+
 if enabled x86asm; then
     append config_files $TMPASM
     cat > $TMPASM <

X-Patchwork-Id: 41892
From: =?utf-8?q?Martin_Storsj=C3=B6?=
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 30 May 2023 15:30:40 +0300
Message-Id: <20230530123043.52940-2-martin@martin.st>
In-Reply-To: <20230530123043.52940-1-martin@martin.st>
References: <20230530123043.52940-1-martin@martin.st>
Subject: [FFmpeg-devel] [PATCH v2 2/5] aarch64: Add cpu flags for the dotprod and i8mm extensions

Mark these as available if the compiler has them enabled unconditionally.
---
Fixed the name of the __ARM_FEATURE define used for detecting i8mm.
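As a rough illustration of what the new flags are for, the C sketch below derives a compile-time baseline from the same __ARM_FEATURE_* macros and then picks a code path from a flags mask. The EXAMPLE_ names and example_pick_impl() are hypothetical and exist only in this note; the bit values are copied from the AV_CPU_FLAG_* additions in this patch. On a build where the compiler does not enable either extension by default, the baseline is simply 0.

```c
/* Bit values copied from the AV_CPU_FLAG_* additions in this patch.
 * The EXAMPLE_ prefix marks them as local to this sketch. */
#define EXAMPLE_FLAG_DOTPROD (1 << 8)
#define EXAMPLE_FLAG_I8MM    (1 << 9)

/* Flags the compiler guarantees at build time, mirroring the #ifdef checks. */
static int example_baseline_flags(void)
{
    int flags = 0;

#ifdef __ARM_FEATURE_DOTPROD
    flags |= EXAMPLE_FLAG_DOTPROD;
#endif
#ifdef __ARM_FEATURE_MATMUL_INT8
    flags |= EXAMPLE_FLAG_I8MM;
#endif

    return flags;
}

/* Hypothetical dispatcher: choose the most specific path the flags allow. */
static const char *example_pick_impl(int flags)
{
    if (flags & EXAMPLE_FLAG_I8MM)
        return "i8mm";
    if (flags & EXAMPLE_FLAG_DOTPROD)
        return "dotprod";
    return "neon";
}
```

A caller would pass the detected flags to example_pick_impl() once at init time and keep the result, so the check does not sit in a hot loop.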
---
 libavutil/aarch64/cpu.c   | 15 ++++++++++++---
 libavutil/aarch64/cpu.h   |  2 ++
 libavutil/cpu.c           |  2 ++
 libavutil/cpu.h           |  2 ++
 libavutil/tests/cpu.c     |  2 ++
 tests/checkasm/checkasm.c |  2 ++
 6 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
index cc641da576..0c76f5ad15 100644
--- a/libavutil/aarch64/cpu.c
+++ b/libavutil/aarch64/cpu.c
@@ -22,9 +22,18 @@

 int ff_get_cpu_flags_aarch64(void)
 {
-    return AV_CPU_FLAG_ARMV8 * HAVE_ARMV8 |
-           AV_CPU_FLAG_NEON  * HAVE_NEON  |
-           AV_CPU_FLAG_VFP   * HAVE_VFP;
+    int flags = AV_CPU_FLAG_ARMV8 * HAVE_ARMV8 |
+                AV_CPU_FLAG_NEON  * HAVE_NEON  |
+                AV_CPU_FLAG_VFP   * HAVE_VFP;
+
+#ifdef __ARM_FEATURE_DOTPROD
+    flags |= AV_CPU_FLAG_DOTPROD;
+#endif
+#ifdef __ARM_FEATURE_MATMUL_INT8
+    flags |= AV_CPU_FLAG_I8MM;
+#endif
+
+    return flags;
 }

 size_t ff_get_cpu_max_align_aarch64(void)
diff --git a/libavutil/aarch64/cpu.h b/libavutil/aarch64/cpu.h
index 2ee3f9323a..64d703be37 100644
--- a/libavutil/aarch64/cpu.h
+++ b/libavutil/aarch64/cpu.h
@@ -25,5 +25,7 @@
 #define have_armv8(flags) CPUEXT(flags, ARMV8)
 #define have_neon(flags) CPUEXT(flags, NEON)
 #define have_vfp(flags) CPUEXT(flags, VFP)
+#define have_dotprod(flags) CPUEXT(flags, DOTPROD)
+#define have_i8mm(flags) CPUEXT(flags, I8MM)

 #endif /* AVUTIL_AARCH64_CPU_H */
diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 2c5f7f4958..2ffc3986aa 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -174,6 +174,8 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
         { "armv8", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ARMV8 }, .unit = "flags" },
         { "neon", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_NEON }, .unit = "flags" },
         { "vfp", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VFP }, .unit = "flags" },
+        { "dotprod", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_DOTPROD }, .unit = "flags" },
+        { "i8mm", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_I8MM }, .unit = "flags" },
 #elif ARCH_MIPS
         { "mmi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MMI }, .unit = "flags" },
         { "msa", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MSA }, .unit = "flags" },
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 8fa5ea9199..da486f9c7a 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -69,6 +69,8 @@
 #define AV_CPU_FLAG_NEON (1 << 5)
 #define AV_CPU_FLAG_ARMV8 (1 << 6)
 #define AV_CPU_FLAG_VFP_VM (1 << 7) ///< VFPv2 vector mode, deprecated in ARMv7-A and unavailable in various CPUs implementations
+#define AV_CPU_FLAG_DOTPROD (1 << 8)
+#define AV_CPU_FLAG_I8MM (1 << 9)
 #define AV_CPU_FLAG_SETEND (1 <<16)

 #define AV_CPU_FLAG_MMI (1 << 0)
diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c
index dadadb31dc..a52637339d 100644
--- a/libavutil/tests/cpu.c
+++ b/libavutil/tests/cpu.c
@@ -38,6 +38,8 @@ static const struct {
     { AV_CPU_FLAG_ARMV8, "armv8" },
     { AV_CPU_FLAG_NEON, "neon" },
     { AV_CPU_FLAG_VFP, "vfp" },
+    { AV_CPU_FLAG_DOTPROD, "dotprod" },
+    { AV_CPU_FLAG_I8MM, "i8mm" },
 #elif ARCH_ARM
     { AV_CPU_FLAG_ARMV5TE, "armv5te" },
     { AV_CPU_FLAG_ARMV6, "armv6" },
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 7389ebaee9..4311a8ffcb 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -230,6 +230,8 @@ static const struct {
 #if ARCH_AARCH64
     { "ARMV8", "armv8", AV_CPU_FLAG_ARMV8 },
     { "NEON", "neon", AV_CPU_FLAG_NEON },
+    { "DOTPROD", "dotprod", AV_CPU_FLAG_DOTPROD },
+    { "I8MM", "i8mm", AV_CPU_FLAG_I8MM },
 #elif ARCH_ARM
     { "ARMV5TE", "armv5te", AV_CPU_FLAG_ARMV5TE },
     { "ARMV6", "armv6", AV_CPU_FLAG_ARMV6 },

From patchwork Tue May 30 12:30:41 2023
X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?=
X-Patchwork-Id: 41893
From: =?utf-8?q?Martin_Storsj=C3=B6?=
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 30 May 2023 15:30:41 +0300
Message-Id: <20230530123043.52940-3-martin@martin.st>
In-Reply-To: <20230530123043.52940-1-martin@martin.st>
References: <20230530123043.52940-1-martin@martin.st>
Subject: [FFmpeg-devel] [PATCH v2 3/5] aarch64: Add Linux runtime cpu feature detection using getauxval(AT_HWCAP)

Based partially on code by Janne Grunau.
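The ID register fallback in this patch comes down to extracting 4-bit fields: bits 47:44 of ID_AA64ISAR0_EL1 for dotprod and bits 55:52 of ID_AA64ISAR1_EL1 for i8mm. The C sketch below isolates that decoding as a pure function so it can be exercised on any host without the mrs instruction. The helper names and the EXAMPLE_ flag macros are hypothetical and only mirror the bit values used in this series.

```c
#include <stdint.h>

/* Bit values mirroring AV_CPU_FLAG_DOTPROD / AV_CPU_FLAG_I8MM; the
 * EXAMPLE_ prefix marks them as local to this sketch. */
#define EXAMPLE_FLAG_DOTPROD (1 << 8)
#define EXAMPLE_FLAG_I8MM    (1 << 9)

/* Extract the 4-bit ID register field that starts at bit 'shift'. */
static unsigned example_id_reg_field(uint64_t reg, unsigned shift)
{
    return (unsigned)((reg >> shift) & 0xf);
}

/*
 * Hypothetical helper: map raw ID_AA64ISAR0_EL1 / ID_AA64ISAR1_EL1 values
 * to feature flags, using the same shifts and the same "== 0x1" comparison
 * as the patch.
 */
static int example_flags_from_id_regs(uint64_t isar0, uint64_t isar1)
{
    int flags = 0;

    if (example_id_reg_field(isar0, 44) == 0x1)
        flags |= EXAMPLE_FLAG_DOTPROD;
    if (example_id_reg_field(isar1, 52) == 0x1)
        flags |= EXAMPLE_FLAG_I8MM;

    return flags;
}
```

On real hardware the two arguments would come from the mrs reads, which Linux only permits from userspace when HWCAP_CPUID is reported.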
--- Updated to use both the direct HWCAP* macros and HWCAP_CPUID. A not unreasonably old distribution like Ubuntu 20.04 does have HWCAP_CPUID but not HWCAP2_I8MM in the distribution provided headers. Alternatively I guess we could carry our own fallback hardcoded values for the HWCAP* values we use and skip HWCAP_CPUID. --- configure | 2 ++ libavutil/aarch64/cpu.c | 63 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+) diff --git a/configure b/configure index 50eb27ba0e..b39de74de5 100755 --- a/configure +++ b/configure @@ -2209,6 +2209,7 @@ HAVE_LIST_PUB=" HEADERS_LIST=" arpa_inet_h + asm_hwcap_h asm_types_h cdio_paranoia_h cdio_paranoia_paranoia_h @@ -6432,6 +6433,7 @@ check_headers io.h enabled libdrm && check_headers linux/dma-buf.h +check_headers asm/hwcap.h check_headers linux/perf_event.h check_headers libcrystalhd/libcrystalhd_if.h check_headers malloc.h diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c index 0c76f5ad15..4563959ffd 100644 --- a/libavutil/aarch64/cpu.c +++ b/libavutil/aarch64/cpu.c @@ -20,6 +20,67 @@ #include "libavutil/cpu_internal.h" #include "config.h" +#if (defined(__linux__) || defined(__ANDROID__)) && HAVE_GETAUXVAL && HAVE_ASM_HWCAP_H +#include +#include +#include + +#define get_cpu_feature_reg(reg, val) \ + __asm__("mrs %0, " #reg : "=r" (val)) + +static int detect_flags(void) +{ + int flags = 0; + unsigned long hwcap, hwcap2; + + // Check for support using direct individual HWCAPs + hwcap = getauxval(AT_HWCAP); +#ifdef HWCAP_ASIMDDP + if (hwcap & HWCAP_ASIMDDP) + flags |= AV_CPU_FLAG_DOTPROD; +#endif + +#ifdef AT_HWCAP2 + hwcap2 = getauxval(AT_HWCAP2); +#ifdef HWCAP2_I8MM + if (hwcap2 & HWCAP2_I8MM) + flags |= AV_CPU_FLAG_I8MM; +#endif +#endif + + // Silence warnings if none of the hwcaps to check are known. + (void)hwcap; + (void)hwcap2; + +#if defined(HWCAP_CPUID) + // The HWCAP_* defines for individual extensions may become available late, as + // they require updates to userland headers. 
As a fallback, see if we can access + // the CPUID registers (trapped via the kernel). + // See https://www.kernel.org/doc/html/latest/arm64/cpu-feature-registers.html + if (hwcap & HWCAP_CPUID) { + uint64_t tmp; + + get_cpu_feature_reg(ID_AA64ISAR0_EL1, tmp); + if (((tmp >> 44) & 0xf) == 0x1) + flags |= AV_CPU_FLAG_DOTPROD; + get_cpu_feature_reg(ID_AA64ISAR1_EL1, tmp); + if (((tmp >> 52) & 0xf) == 0x1) + flags |= AV_CPU_FLAG_I8MM; + } +#endif + + return flags; +} + +#else + +static int detect_flags(void) +{ + return 0; +} + +#endif + int ff_get_cpu_flags_aarch64(void) { int flags = AV_CPU_FLAG_ARMV8 * HAVE_ARMV8 | @@ -33,6 +94,8 @@ int ff_get_cpu_flags_aarch64(void) flags |= AV_CPU_FLAG_I8MM; #endif + flags |= detect_flags(); + return flags; } From patchwork Tue May 30 12:30:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?= X-Patchwork-Id: 41894 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp2427198pzb; Tue, 30 May 2023 05:31:28 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4fsRCMxG2Sa8Letmk5ufi2gO2PRw0m2p+0o5/NELmmJWdtaQ7K4/NIMrB87pG+Qbipju1n X-Received: by 2002:a2e:86d1:0:b0:2ad:8ffe:5f37 with SMTP id n17-20020a2e86d1000000b002ad8ffe5f37mr741186ljj.47.1685449887847; Tue, 30 May 2023 05:31:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685449887; cv=none; d=google.com; s=arc-20160816; b=wYiGRuA47gCN81y745ewZzsc5WlVVzfRwy5sMUPEGlQ0TBKNEA1mxZlET2s/FDekJH I2oBmXGrXWMpTIVYeuLSQUJuNxzJmnQJVe1qbUR3zh9qB0991wGPfqM5+lc8gXHghETT KAfOr90X+vEIl4F6mlFRPaIi5XheecgpNc1sv2FEQTRqLoO/hgOMdxFsD/IOO19ra8Z3 jSM2HRa+57z0cFOXv8e9SM6LiWRbRWbGfivPzXUJ1/0/c6/AqT6GjSXWgmzNwq9z5Pf1 nI31vnTVKh7gliiEsJrXYJSvKEM+W0oNixxVJPm3ndwghmOh4F6Xh6WlYcCWcd/VZrMQ gk6A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe 
:list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=RErd4lbGyqWIHVtkedPBl0bj450EMy44taoFZWyvnF4=; b=RLQ6gqHPYYrLT5lQEhogeJ/625AOMFrJCFB8RgDO/D+ulby7FQ/rAyr1/enbLXhhNN 7MGkZUImUG6BZybS9g3Ox0/GyIwVa8/ZA0eu1WbxLp2g9rxfEpfbnBCJYJO4qDgYCiiS CUEvZH30tObUpPcom8gVms8AAXz54OOVDVTeNoIP7MV9AMSdbJN0q07NqReJFP/uvmuj SbQ4Q8oWPcWQi/t4C8lPjiWOcrZRGZQtZAXLWyvi8LZTrHS5TLBWOH1ueobzJMDv7mZt 7mXS7so1iSK+6tPywIorNV4hCKkYBQQiufN95KtwUNvK7JiKAAjjAn/p0ND1cJPVMV5M jBeA== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20221208.gappssmtp.com header.s=20221208 header.b=ml2ohhJY; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id s21-20020a056402165500b0051059216f7dsi2147257edx.551.2023.05.30.05.31.24; Tue, 30 May 2023 05:31:27 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20221208.gappssmtp.com header.s=20221208 header.b=ml2ohhJY; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 45F4068C1BD; Tue, 30 May 2023 15:30:55 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f53.google.com (mail-lf1-f53.google.com [209.85.167.53]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 3C8C368C1B8 for ; Tue, 30 May 2023 15:30:47 +0300 
(EEST) Received: by mail-lf1-f53.google.com with SMTP id 2adb3069b0e04-4f4d6aee530so4623487e87.2 for ; Tue, 30 May 2023 05:30:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=martin-st.20221208.gappssmtp.com; s=20221208; t=1685449846; x=1688041846; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=O2+IbrXx/1Ou37DYeQWMEhuDCu+cgYvekvMOMqKtGoM=; b=ml2ohhJY21jv8WOhSxMs2exb0xuWlC1yZQaVHltx17EDGuqXVp3Db+m8m/NzulwMu1 yFZX9Lb1oLkK/274/ohs6TghyOtl6A7+DS1BoLHqttw0crinR8/0hpiKQu3n9DBm1kBL etIViYzUAldydkxpc47y1PcNwZSrJCiEqJ0VYnbPmZvwiGqb45g6ZOdTioSEv7qs3NGa b8YhC8WEZrs1F7jX6V9wnwxxhs2lu3Vqik1af27UY4r7Q4wb7xOCMCkKsDPOCDd5i8jI 8s17irQ1CJj2WImTJvKhwXo4axYm3d+QgZJd5J7p1ZnbRMccIWINLKXHw2ojTl4UFhjX CgLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685449846; x=1688041846; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=O2+IbrXx/1Ou37DYeQWMEhuDCu+cgYvekvMOMqKtGoM=; b=Xkz0YmmXIJwnLhZ903ZuqD4kNbe7jCNIL/Y02fd3q+t/vL2LRlt7tv8pW45oXOyBvU TURhjq3FccsQKnxlR1WWX19IAmPcbhPGhBR7795ON1F4W7UfxA69uumBVDD0C0Os8tUn mf5dnm6AVNuP2TFwjf/xFtOchFSCwuNAlwPb5IK3PcOvBNIw8jOnK94YOctemducBlLm tjsHvlhaYKU2W/Dq5k0aEkM93VvwybuLnnzYlVhxobcrk1zS1Ir2nKkMJ3Y37JL+cicO s6zdk2VjQz3qotaj9B8k5IAiW8ejWzf0/oKag1rvHCtoUs05HFv02PSE4RDlcyr/8dbK P5lw== X-Gm-Message-State: AC+VfDxV9BUN9ZpjniXosMH5Uoff8mkqM1UEfR5/Wsk0BXZ3rktoaecV GVbwfINA1dLRo/7BlJ0L9d20tve51GPTN/ExGP/lew== X-Received: by 2002:a2e:9f01:0:b0:2a7:98b2:923b with SMTP id u1-20020a2e9f01000000b002a798b2923bmr668675ljk.0.1685449846547; Tue, 30 May 2023 05:30:46 -0700 (PDT) Received: from localhost (dsl-tkubng21-58c01c-243.dhcp.inet.fi. 
[88.192.28.243]) by smtp.gmail.com with ESMTPSA id y26-20020a2e321a000000b002a77e01c3a1sm2836760ljy.22.2023.05.30.05.30.46 for (version=TLS1 cipher=AES128-SHA bits=128/128); Tue, 30 May 2023 05:30:46 -0700 (PDT) From: =?utf-8?q?Martin_Storsj=C3=B6?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 30 May 2023 15:30:42 +0300 Message-Id: <20230530123043.52940-4-martin@martin.st> X-Mailer: git-send-email 2.37.1 (Apple Git-137.1) In-Reply-To: <20230530123043.52940-1-martin@martin.st> References: <20230530123043.52940-1-martin@martin.st> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v2 4/5] aarch64: Add Apple runtime detection of dotprod and i8mm using sysctl X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: YPxeqJ/NBnKZ

For now, there's not much value in this since Clang doesn't support
enabling the dotprod or i8mm features with either .arch_extension
or .arch (they have to be enabled by the base arch flags passed to
the compiler). But it may be supported in the future.
---
 configure               |  2 ++
 libavutil/aarch64/cpu.c | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/configure b/configure
index b39de74de5..001287c169 100755
--- a/configure
+++ b/configure
@@ -2348,6 +2348,7 @@ SYSTEM_FUNCS="
     strerror_r
     sysconf
     sysctl
+    sysctlbyname
     usleep
     UTGetOSTypeFromString
     VirtualAlloc
@@ -6394,6 +6395,7 @@ check_func_headers mach/mach_time.h mach_absolute_time
 check_func_headers stdlib.h getenv
 check_func_headers sys/stat.h lstat
 check_func_headers sys/auxv.h getauxval
+check_func_headers sys/sysctl.h sysctlbyname
 
 check_func_headers windows.h GetModuleHandle
 check_func_headers windows.h GetProcessAffinityMask
diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
index 4563959ffd..ffb00f6dd2 100644
--- a/libavutil/aarch64/cpu.c
+++ b/libavutil/aarch64/cpu.c
@@ -72,6 +72,28 @@ static int detect_flags(void)
     return flags;
 }
 
+#elif defined(__APPLE__) && HAVE_SYSCTLBYNAME
+#include <sys/sysctl.h>
+
+static int detect_flags(void)
+{
+    uint32_t value = 0;
+    size_t size;
+    int flags = 0;
+
+    size = sizeof(value);
+    if (!sysctlbyname("hw.optional.arm.FEAT_DotProd", &value, &size, NULL, 0)) {
+        if (value)
+            flags |= AV_CPU_FLAG_DOTPROD;
+    }
+    size = sizeof(value);
+    if (!sysctlbyname("hw.optional.arm.FEAT_I8MM", &value, &size, NULL, 0)) {
+        if (value)
+            flags |= AV_CPU_FLAG_I8MM;
+    }
+    return flags;
+}
+
 #else
 
 static int detect_flags(void)

From patchwork Tue May 30 12:30:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Martin_Storsj=C3=B6?= X-Patchwork-Id: 41895 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:c51c:b0:10c:5e6f:955f with SMTP id gm28csp2427292pzb; Tue, 30 May 2023 05:31:35 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7cZW84RSXKOiCQ98Zha9CIEhHh+Y8N3NMwGWbUOFGfEg/R+7NkBq6sAVdaQMZL8FhCfsuV X-Received: by 2002:a2e:a0d1:0:b0:2a8:b286:8272 with SMTP id f17-20020a2ea0d1000000b002a8b2868272mr666721ljm.15.1685449894952; Tue, 30 May
2023 05:31:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685449894; cv=none; d=google.com; s=arc-20160816; b=XNu1esXrhAdJuUOf2q1K3S38Nb+qarbPJCI9Zwh6OLQId2/4GQnGojcJCzdtXhfhrG WUJ0j4se+IjzEROTiqr7deZBpNXO4qx4tvBd2B3FipN6wg0DORC94ZtufHQvLjQNtxRV TczEnzjJ5bdUqEWMPtI7SbkqtpndNdYemfzNNNItEUsQ0BZ86ApBOvZzXODNYSO/zEWh jXl4iH4XSBSyh0SRKmvTZd9L3fUizByCeBTiZHZE1oWsh7/a1mfQ07Odg4KlroUhF7Cs sLRhKOMJUPz4lj+RJERpVXdhUCMO0F3n3pzSPprRiRd5J1ijsDOf21Lr44EbCjao78rN WprA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=acpfYj/vBEcfShP+ualFupLsC8ZRS6mnB+TuWMmohtU=; b=HNzFmRLEQkk4AQRmf9niB4WMe6xCg6/uKONGjgnYMO4HEf+Dlg5gfWg2pLY3Dc5lFS l1LTR1DMDynd7+6edCdp96VzQganPo5Z+4Wr8MD0TR3v0aXIVHOGmlHyjlUa8bjntMD1 /vXToWe5uaxBHDyqBgtkF22QnkPTR1NqQhR0ovFyVxuQFPyEAHK+IZkhZDuT0vBX8+AB TvYPcm8lhbGigDOlD99/B6eldRuPWPLwM1JevYM76n3DfoXXekkvrRj+gNLidteAHCER nDW6Si0miraoQQqebTYG0spQFrOGMFo8b6pZBIC1BblmHpoGlGq8UgeM3lQ90kbsOdhN RdnA== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20221208.gappssmtp.com header.s=20221208 header.b=sYiXJKxr; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id d4-20020a056402516400b00514957aabc8si4493410ede.493.2023.05.30.05.31.34; Tue, 30 May 2023 05:31:34 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@martin-st.20221208.gappssmtp.com header.s=20221208 header.b=sYiXJKxr; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2500468C227; Tue, 30 May 2023 15:30:56 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f53.google.com (mail-lf1-f53.google.com [209.85.167.53]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 102B268C16A for ; Tue, 30 May 2023 15:30:48 +0300 (EEST) Received: by mail-lf1-f53.google.com with SMTP id 2adb3069b0e04-4f3b4ed6fdeso4616528e87.3 for ; Tue, 30 May 2023 05:30:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=martin-st.20221208.gappssmtp.com; s=20221208; t=1685449847; x=1688041847; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=p6u9VoIp6uKAm1YK+b9wPM9c0YeBndAJnRmlmYxICgw=; b=sYiXJKxrgOM+MpZA0xPtfLrc0MDQ8JGkm5jfuaMipyNxX12VsI/wAr88U2wEwylTy3 weBOj4WYjgPsme3MJ7JB+iomaUbHqmIF+HNpkPuDm6O+j3EnbqqN4eDQrFGbpTxTOn+j fV8oo0AQsdg1angIxHSBFo5NymjAryilPD0ARTNSzP18Zvfo8ilzNDuSkm7CBHX6GE/i 39RT4arTMeh+9Bfh8iOElaIBHPGsLHkNP0JAbPcRXsL/S6A5d63hCYvxevYVTVV/inIn CtvT4fYGXFNtuwHx3tVPPYdZN3jBKhfHwaeYP56LDB6Ghs4RFpZwEXPZLqxuZoKldZR7 5wgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685449847; x=1688041847; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=p6u9VoIp6uKAm1YK+b9wPM9c0YeBndAJnRmlmYxICgw=; b=EHVWjY91bvcx+CPJouOZ97xD6x25IN4wQhab3MBSZK94RhQEk4H8U/87gToClXQUdb hCjcXLe5bDzUncPkQ7op9DwIaeUbCdSMvLnZ6c/aIjpaofS+TwlfBZ7XS5mvw9Ii9tJJ te0HEMw5rteteKSO4xnbw0LAmV5nvMqG0VYAAO9l5KPVmv7hpj2k2XKt/judX2KTKdN3 TprZlYEP88wkNBuEsVkQhKbM4sZn2LnfZOl3b/Ctnjbh4lTxzo+1Mi8ixAKZgDiOcFHc IQAUqm4oe4PSAdMtx93Vo1Elz4ZVdqYIn2CHG8fqg++e/nZtD/Iwjfjt6lOy1wtrZcb1 /Ypg== X-Gm-Message-State: AC+VfDyhwmA5ZPnCJGSXwXidab58H904sOs4NdGS0g9G0VIRkQSZSy95 fHoM6OgzauPNh1ujz/BDQrEE3ej71QYg8zrPSb5oVA== X-Received: by 2002:ac2:5e8d:0:b0:4f2:4caa:cc67 with SMTP id b13-20020ac25e8d000000b004f24caacc67mr673830lfq.40.1685449847458; Tue, 30 May 2023 05:30:47 -0700 (PDT) Received: from localhost (dsl-tkubng21-58c01c-243.dhcp.inet.fi. [88.192.28.243]) by smtp.gmail.com with ESMTPSA id q20-20020ac25294000000b004f3aee3aae2sm328763lfm.140.2023.05.30.05.30.47 for (version=TLS1 cipher=AES128-SHA bits=128/128); Tue, 30 May 2023 05:30:47 -0700 (PDT) From: =?utf-8?q?Martin_Storsj=C3=B6?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 30 May 2023 15:30:43 +0300 Message-Id: <20230530123043.52940-5-martin@martin.st> X-Mailer: git-send-email 2.37.1 (Apple Git-137.1) In-Reply-To: <20230530123043.52940-1-martin@martin.st> References: <20230530123043.52940-1-martin@martin.st> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v2 5/5] aarch64: Add Windows runtime detection of the dotprod instructions X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: xOjRmtRbna92 For Windows, there's no publicly defined constant for checking for the i8mm extension 
yet.
---
 libavutil/aarch64/cpu.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
index ffb00f6dd2..4b97530240 100644
--- a/libavutil/aarch64/cpu.c
+++ b/libavutil/aarch64/cpu.c
@@ -94,6 +94,16 @@ static int detect_flags(void)
     return flags;
 }
 
+#elif defined(_WIN32)
+#include <windows.h>
+
+static int detect_flags(void)
+{
+    int flags = 0;
+    if (IsProcessorFeaturePresent(PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE))
+        flags |= AV_CPU_FLAG_DOTPROD;
+    return flags;
+}
 #else
 
 static int detect_flags(void)