From patchwork Tue May 30 12:30:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Storsjö
X-Patchwork-Id: 41892
From: Martin Storsjö
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 30 May 2023 15:30:40 +0300
Message-Id: <20230530123043.52940-2-martin@martin.st>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230530123043.52940-1-martin@martin.st>
References: <20230530123043.52940-1-martin@martin.st>
Subject: [FFmpeg-devel] [PATCH v2 2/5] aarch64: Add cpu flags for the dotprod and i8mm extensions

Mark these flags as available if the extensions are unconditionally
available to the compiler.
---
Fixed the name of the __ARM_FEATURE define used for detecting i8mm.
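As a rough illustration of the compile-time check this patch relies on, the sketch below tests the two feature macros named in the patch and ORs the matching bit into a flag word. The `SKETCH_` constants and `sketch_cpu_flags()` are hypothetical stand-ins rather than FFmpeg names; only the macro names and the bit values (1 << 8, 1 << 9) are taken from the patch itself. No runtime detection is attempted, so a build that does not target these extensions simply reports neither.

```c
/* Stand-in flag bits. The values match the ones this patch adds,
 * but the SKETCH_ names are hypothetical. */
#define SKETCH_CPU_FLAG_DOTPROD (1 << 8)
#define SKETCH_CPU_FLAG_I8MM    (1 << 9)

/* Returns the extensions the compiler may use unconditionally for the
 * current build target. This is decided entirely at compile time. */
static int sketch_cpu_flags(void)
{
    int flags = 0;

#ifdef __ARM_FEATURE_DOTPROD
    /* Defined by the compiler when the target enables dotprod. */
    flags |= SKETCH_CPU_FLAG_DOTPROD;
#endif
#ifdef __ARM_FEATURE_MATMUL_INT8
    /* Defined by the compiler when the target enables i8mm. */
    flags |= SKETCH_CPU_FLAG_I8MM;
#endif

    return flags;
}
```

On a non-AArch64 host, or with a baseline target such as plain armv8-a, the function returns 0. Building with a target that enables the extensions (for example something like `-march=armv8.2-a+dotprod+i8mm` with GCC or Clang) should make both bits appear.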
---
 libavutil/aarch64/cpu.c   | 15 ++++++++++++---
 libavutil/aarch64/cpu.h   |  2 ++
 libavutil/cpu.c           |  2 ++
 libavutil/cpu.h           |  2 ++
 libavutil/tests/cpu.c     |  2 ++
 tests/checkasm/checkasm.c |  2 ++
 6 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
index cc641da576..0c76f5ad15 100644
--- a/libavutil/aarch64/cpu.c
+++ b/libavutil/aarch64/cpu.c
@@ -22,9 +22,18 @@
 
 int ff_get_cpu_flags_aarch64(void)
 {
-    return AV_CPU_FLAG_ARMV8 * HAVE_ARMV8 |
-           AV_CPU_FLAG_NEON  * HAVE_NEON  |
-           AV_CPU_FLAG_VFP   * HAVE_VFP;
+    int flags = AV_CPU_FLAG_ARMV8 * HAVE_ARMV8 |
+                AV_CPU_FLAG_NEON  * HAVE_NEON  |
+                AV_CPU_FLAG_VFP   * HAVE_VFP;
+
+#ifdef __ARM_FEATURE_DOTPROD
+    flags |= AV_CPU_FLAG_DOTPROD;
+#endif
+#ifdef __ARM_FEATURE_MATMUL_INT8
+    flags |= AV_CPU_FLAG_I8MM;
+#endif
+
+    return flags;
 }
 
 size_t ff_get_cpu_max_align_aarch64(void)
diff --git a/libavutil/aarch64/cpu.h b/libavutil/aarch64/cpu.h
index 2ee3f9323a..64d703be37 100644
--- a/libavutil/aarch64/cpu.h
+++ b/libavutil/aarch64/cpu.h
@@ -25,5 +25,7 @@
 #define have_armv8(flags) CPUEXT(flags, ARMV8)
 #define have_neon(flags) CPUEXT(flags, NEON)
 #define have_vfp(flags) CPUEXT(flags, VFP)
+#define have_dotprod(flags) CPUEXT(flags, DOTPROD)
+#define have_i8mm(flags) CPUEXT(flags, I8MM)
 
 #endif /* AVUTIL_AARCH64_CPU_H */
diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 2c5f7f4958..2ffc3986aa 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -174,6 +174,8 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
         { "armv8", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ARMV8 }, .unit = "flags" },
         { "neon", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_NEON }, .unit = "flags" },
         { "vfp", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VFP }, .unit = "flags" },
+        { "dotprod", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_DOTPROD }, .unit = "flags" },
+        { "i8mm", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_I8MM }, .unit = "flags" },
 #elif ARCH_MIPS
         { "mmi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MMI }, .unit = "flags" },
         { "msa", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MSA }, .unit = "flags" },
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index 8fa5ea9199..da486f9c7a 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -69,6 +69,8 @@
 #define AV_CPU_FLAG_NEON (1 << 5)
 #define AV_CPU_FLAG_ARMV8 (1 << 6)
 #define AV_CPU_FLAG_VFP_VM (1 << 7) ///< VFPv2 vector mode, deprecated in ARMv7-A and unavailable in various CPUs implementations
+#define AV_CPU_FLAG_DOTPROD (1 << 8)
+#define AV_CPU_FLAG_I8MM (1 << 9)
 #define AV_CPU_FLAG_SETEND (1 <<16)
 
 #define AV_CPU_FLAG_MMI (1 << 0)
diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c
index dadadb31dc..a52637339d 100644
--- a/libavutil/tests/cpu.c
+++ b/libavutil/tests/cpu.c
@@ -38,6 +38,8 @@ static const struct {
     { AV_CPU_FLAG_ARMV8, "armv8" },
     { AV_CPU_FLAG_NEON, "neon" },
     { AV_CPU_FLAG_VFP, "vfp" },
+    { AV_CPU_FLAG_DOTPROD, "dotprod" },
+    { AV_CPU_FLAG_I8MM, "i8mm" },
 #elif ARCH_ARM
     { AV_CPU_FLAG_ARMV5TE, "armv5te" },
     { AV_CPU_FLAG_ARMV6, "armv6" },
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 7389ebaee9..4311a8ffcb 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -230,6 +230,8 @@ static const struct {
 #if ARCH_AARCH64
     { "ARMV8", "armv8", AV_CPU_FLAG_ARMV8 },
     { "NEON", "neon", AV_CPU_FLAG_NEON },
+    { "DOTPROD", "dotprod", AV_CPU_FLAG_DOTPROD },
+    { "I8MM", "i8mm", AV_CPU_FLAG_I8MM },
 #elif ARCH_ARM
     { "ARMV5TE", "armv5te", AV_CPU_FLAG_ARMV5TE },
     { "ARMV6", "armv6", AV_CPU_FLAG_ARMV6 },
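
The table entries this patch adds in libavutil/cpu.c, libavutil/tests/cpu.c and tests/checkasm/checkasm.c all pair a flag bit with a lowercase name. A minimal standalone sketch of that pairing might look like the following. Every `ex_`/`EX_` identifier is hypothetical and the lookup helpers are a simplification for illustration, not how `av_parse_cpu_caps()` is implemented; only the two names and bit values come from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two flag bits added by this patch. */
#define EX_CPU_FLAG_DOTPROD (1 << 8)
#define EX_CPU_FLAG_I8MM    (1 << 9)

/* One row per flag, in the spirit of the name tables the patch extends. */
static const struct {
    int         flag;
    const char *name;
} ex_cpu_flag_tab[] = {
    { EX_CPU_FLAG_DOTPROD, "dotprod" },
    { EX_CPU_FLAG_I8MM,    "i8mm"    },
};

#define EX_TAB_LEN (sizeof(ex_cpu_flag_tab) / sizeof(ex_cpu_flag_tab[0]))

/* Map a flag name to its bit. Returns 0 for an unknown name. */
static int ex_cpu_flag_from_name(const char *name)
{
    for (size_t i = 0; i < EX_TAB_LEN; i++)
        if (!strcmp(ex_cpu_flag_tab[i].name, name))
            return ex_cpu_flag_tab[i].flag;
    return 0;
}

/* Map a single flag bit back to its name. Returns NULL if not listed. */
static const char *ex_cpu_flag_name(int flag)
{
    for (size_t i = 0; i < EX_TAB_LEN; i++)
        if (ex_cpu_flag_tab[i].flag == flag)
            return ex_cpu_flag_tab[i].name;
    return NULL;
}
```

Keeping the names in one table means a new extension needs a single added row per table, which is the shape of the changes in this patch.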