From patchwork Tue Sep 17 12:14:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Storsjö
X-Patchwork-Id: 51632
From: Martin Storsjö
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 17 Sep 2024 15:14:16 +0300
Message-Id: <20240917121419.610349-3-martin@martin.st>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240917121419.610349-1-martin@martin.st>
References: <20240917121419.610349-1-martin@martin.st>
Subject: [FFmpeg-devel] [PATCH 3/5] aarch64: Add CPU feature flags for SVE and SVE2

Add code for detecting the feature on Linux and Windows.
---
 libavutil/aarch64/cpu.c   | 20 ++++++++++++++++++++
 libavutil/aarch64/cpu.h   |  2 ++
 libavutil/cpu.c           |  2 ++
 libavutil/cpu.h           |  2 ++
 libavutil/tests/cpu.c     |  2 ++
 tests/checkasm/checkasm.c |  2 ++
 6 files changed, 30 insertions(+)

diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
index fe24b1da4d..e82c0f19ab 100644
--- a/libavutil/aarch64/cpu.c
+++ b/libavutil/aarch64/cpu.c
@@ -25,6 +25,8 @@
 #include
 
 #define HWCAP_AARCH64_ASIMDDP (1 << 20)
+#define HWCAP_AARCH64_SVE     (1 << 22)
+#define HWCAP2_AARCH64_SVE2   (1 << 1)
 #define HWCAP2_AARCH64_I8MM   (1 << 13)
 
 static int detect_flags(void)
@@ -36,6 +38,10 @@ static int detect_flags(void)
 
     if (hwcap & HWCAP_AARCH64_ASIMDDP)
         flags |= AV_CPU_FLAG_DOTPROD;
+    if (hwcap & HWCAP_AARCH64_SVE)
+        flags |= AV_CPU_FLAG_SVE;
+    if (hwcap2 & HWCAP2_AARCH64_SVE2)
+        flags |= AV_CPU_FLAG_SVE2;
     if (hwcap2 & HWCAP2_AARCH64_I8MM)
         flags |= AV_CPU_FLAG_I8MM;
 
@@ -119,6 +125,14 @@ static int detect_flags(void)
      * regular I8MM is available. */
     if (IsProcessorFeaturePresent(PF_ARM_SVE_I8MM_INSTRUCTIONS_AVAILABLE))
         flags |= AV_CPU_FLAG_I8MM;
+#endif
+#ifdef PF_ARM_SVE_INSTRUCTIONS_AVAILABLE
+    if (IsProcessorFeaturePresent(PF_ARM_SVE_INSTRUCTIONS_AVAILABLE))
+        flags |= AV_CPU_FLAG_SVE;
+#endif
+#ifdef PF_ARM_SVE2_INSTRUCTIONS_AVAILABLE
+    if (IsProcessorFeaturePresent(PF_ARM_SVE2_INSTRUCTIONS_AVAILABLE))
+        flags |= AV_CPU_FLAG_SVE2;
 #endif
     return flags;
 }
@@ -142,6 +156,12 @@ int ff_get_cpu_flags_aarch64(void)
 #ifdef __ARM_FEATURE_MATMUL_INT8
     flags |= AV_CPU_FLAG_I8MM;
 #endif
+#ifdef __ARM_FEATURE_SVE
+    flags |= AV_CPU_FLAG_SVE;
+#endif
+#ifdef __ARM_FEATURE_SVE2
+    flags |= AV_CPU_FLAG_SVE2;
+#endif
 
     flags |= detect_flags();
 
diff --git a/libavutil/aarch64/cpu.h b/libavutil/aarch64/cpu.h
index 64d703be37..df7becca30 100644
--- a/libavutil/aarch64/cpu.h
+++ b/libavutil/aarch64/cpu.h
@@ -27,5 +27,7 @@
 #define have_vfp(flags)     CPUEXT(flags, VFP)
 #define have_dotprod(flags) CPUEXT(flags, DOTPROD)
 #define have_i8mm(flags)    CPUEXT(flags, I8MM)
+#define have_sve(flags)     CPUEXT(flags, SVE)
+#define have_sve2(flags)    CPUEXT(flags, SVE2)
 
 #endif /* AVUTIL_AARCH64_CPU_H */
diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index df00bd541f..e16ebc0d38 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -180,6 +180,8 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
         { "vfp",      NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VFP     }, .unit = "flags" },
         { "dotprod",  NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_DOTPROD }, .unit = "flags" },
         { "i8mm",     NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_I8MM    }, .unit = "flags" },
+        { "sve",      NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_SVE     }, .unit = "flags" },
+        { "sve2",     NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_SVE2    }, .unit = "flags" },
 #elif ARCH_MIPS
         { "mmi",      NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MMI     }, .unit = "flags" },
         { "msa",      NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MSA     }, .unit = "flags" },
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index ba6c234e04..6b6e50f07a 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -72,6 +72,8 @@
 #define AV_CPU_FLAG_VFP_VM       (1 << 7) ///< VFPv2 vector mode, deprecated in ARMv7-A and unavailable in various CPUs implementations
 #define AV_CPU_FLAG_DOTPROD      (1 << 8)
 #define AV_CPU_FLAG_I8MM         (1 << 9)
+#define AV_CPU_FLAG_SVE          (1 <<10)
+#define AV_CPU_FLAG_SVE2         (1 <<11)
 #define AV_CPU_FLAG_SETEND       (1 <<16)
 
 #define AV_CPU_FLAG_MMI          (1 << 0)
diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c
index 0a459c1d9e..679b538f0f 100644
--- a/libavutil/tests/cpu.c
+++ b/libavutil/tests/cpu.c
@@ -40,6 +40,8 @@ static const struct {
     { AV_CPU_FLAG_VFP,       "vfp"        },
     { AV_CPU_FLAG_DOTPROD,   "dotprod"    },
     { AV_CPU_FLAG_I8MM,      "i8mm"       },
+    { AV_CPU_FLAG_SVE,       "sve"        },
+    { AV_CPU_FLAG_SVE2,      "sve2"       },
 #elif ARCH_ARM
     { AV_CPU_FLAG_ARMV5TE,   "armv5te"    },
     { AV_CPU_FLAG_ARMV6,     "armv6"      },
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 73a998ae3a..c932e028a5 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -305,6 +305,8 @@ static const struct {
     { "NEON",     "neon",     AV_CPU_FLAG_NEON },
     { "DOTPROD",  "dotprod",  AV_CPU_FLAG_DOTPROD },
     { "I8MM",     "i8mm",     AV_CPU_FLAG_I8MM },
+    { "SVE",      "sve",      AV_CPU_FLAG_SVE },
+    { "SVE2",     "sve2",     AV_CPU_FLAG_SVE2 },
 #elif ARCH_ARM
     { "ARMV5TE",  "armv5te",  AV_CPU_FLAG_ARMV5TE },
     { "ARMV6",    "armv6",    AV_CPU_FLAG_ARMV6 },
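
Reviewer note, not part of the patch: the Linux half of the detection boils down to mapping two hwcap words onto the new flag bits. The sketch below illustrates that mapping in isolation so it can be checked without an SVE-capable machine. The helper name sketch_flags_from_hwcap() is hypothetical and does not exist in FFmpeg; the HWCAP_* and AV_CPU_FLAG_* values are copied from the diff and redefined here only so the snippet compiles on its own.

```c
/* Bit positions copied from the patch; redefined locally so this
 * sketch compiles standalone, without the libavutil headers. */
#define HWCAP_AARCH64_ASIMDDP (1 << 20)
#define HWCAP_AARCH64_SVE     (1 << 22)
#define HWCAP2_AARCH64_SVE2   (1 << 1)
#define HWCAP2_AARCH64_I8MM   (1 << 13)

#define AV_CPU_FLAG_DOTPROD   (1 << 8)
#define AV_CPU_FLAG_I8MM      (1 << 9)
#define AV_CPU_FLAG_SVE       (1 << 10)
#define AV_CPU_FLAG_SVE2      (1 << 11)

/* Hypothetical helper (not an FFmpeg function): translate the two
 * hwcap words into AV_CPU_FLAG_* bits, following the same checks the
 * patch adds to detect_flags(). */
static int sketch_flags_from_hwcap(unsigned long hwcap, unsigned long hwcap2)
{
    int flags = 0;

    if (hwcap & HWCAP_AARCH64_ASIMDDP)
        flags |= AV_CPU_FLAG_DOTPROD;
    if (hwcap & HWCAP_AARCH64_SVE)      /* SVE is reported in the first word */
        flags |= AV_CPU_FLAG_SVE;
    if (hwcap2 & HWCAP2_AARCH64_SVE2)   /* SVE2 is reported in the second word */
        flags |= AV_CPU_FLAG_SVE2;
    if (hwcap2 & HWCAP2_AARCH64_I8MM)
        flags |= AV_CPU_FLAG_I8MM;

    return flags;
}
```

On Linux the two arguments would typically come from getauxval(AT_HWCAP) and getauxval(AT_HWCAP2) in <sys/auxv.h>. Note that SVE and SVE2 are tested independently of each other, so a caller that needs SVE2 code paths should check the SVE2 flag itself rather than infer it from SVE.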