From patchwork Mon Jun 14 11:14:06 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Kelly
X-Patchwork-Id: 28264
Date: Mon, 14 Jun 2021 13:14:06 +0200
Message-Id: <20210614111407.1897690-1-alankelly@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
From: Alan Kelly
To: ffmpeg-devel@ffmpeg.org
Cc: Alan Kelly
Subject: [FFmpeg-devel] [PATCH 1/2] libavutil/cpu: Adds av_cpu_has_fast_gather to detect cpus with avx fast gather instruction
List-Id: FFmpeg development discussions and patches

Broadwell and later have fast gather instructions.
---
 This is so that the avx2 version of ff_hscale8to15X which uses gather
 instructions is only selected on machines where it will actually be
 faster.
 libavutil/cpu.c          |  6 ++++++
 libavutil/cpu.h          |  6 ++++++
 libavutil/cpu_internal.h |  1 +
 libavutil/x86/cpu.c      | 18 ++++++++++++++++++
 4 files changed, 31 insertions(+)

diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 8960415d00..0a723eeb7a 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -49,6 +49,12 @@
 
 static atomic_int cpu_flags = ATOMIC_VAR_INIT(-1);
 
+int av_cpu_has_fast_gather(void){
+    if (ARCH_X86)
+        return ff_cpu_has_fast_gather();
+    return 0;
+}
+
 static int get_cpu_flags(void)
 {
     if (ARCH_MIPS)
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index b555422dae..faf3a221f4 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -72,6 +72,7 @@
 #define AV_CPU_FLAG_MMI          (1 << 0)
 #define AV_CPU_FLAG_MSA          (1 << 1)
 
+int av_cpu_has_fast_gather(void);
 /**
  * Return the flags which specify extensions supported by the CPU.
  * The returned value is affected by av_force_cpu_flags() if that was used
@@ -107,6 +108,11 @@ int av_cpu_count(void);
  * av_set_cpu_flags_mask(), then this function will behave as if AVX is not
  * present.
  */
+
+/**
+ * Returns true if the cpu has fast gather instructions.
+ * Broadwell and later cpus have fast gather
+ */
 size_t av_cpu_max_align(void);
 
 #endif /* AVUTIL_CPU_H */
diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h
index 889764320b..92525df0c1 100644
--- a/libavutil/cpu_internal.h
+++ b/libavutil/cpu_internal.h
@@ -46,6 +46,7 @@ int ff_get_cpu_flags_aarch64(void);
 int ff_get_cpu_flags_arm(void);
 int ff_get_cpu_flags_ppc(void);
 int ff_get_cpu_flags_x86(void);
+int ff_cpu_has_fast_gather(void);
 
 size_t ff_get_cpu_max_align_mips(void);
 size_t ff_get_cpu_max_align_aarch64(void);
diff --git a/libavutil/x86/cpu.c b/libavutil/x86/cpu.c
index bcd41a50a2..9724e0017b 100644
--- a/libavutil/x86/cpu.c
+++ b/libavutil/x86/cpu.c
@@ -270,3 +270,21 @@ size_t ff_get_cpu_max_align_x86(void)
 
     return 8;
 }
+
+int ff_cpu_has_fast_gather(void){
+    int eax, ebx, ecx;
+    int max_std_level, std_caps = 0;
+    int family = 0, model = 0;
+    cpuid(0, max_std_level, ebx, ecx, std_caps);
+
+    if (max_std_level >= 1) {
+        cpuid(1, eax, ebx, ecx, std_caps);
+        family = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
+        model = ((eax >> 4) & 0xf) + ((eax >> 12) & 0xf0);
+        // Broadwell and later
+        if(family == 6 && model >= 70){
+            return 1;
+        }
+    }
+    return 0;
+}
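
For reviewers who want to check the family/model arithmetic in ff_cpu_has_fast_gather() without running cpuid, here is a minimal standalone sketch of the same bit manipulation. The helper names (sig_family, sig_model, sig_has_fast_gather, sig_selfcheck) are hypothetical and not part of the patch, the cutoff is passed in as a parameter instead of being hard-coded, and the two EAX signature values in the self-check are illustrative inputs chosen for this example, not values taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in the patch): display family from CPUID leaf 1 EAX.
 * Base family is bits 11:8, extended family is bits 27:20; the patch adds them. */
static int sig_family(uint32_t eax)
{
    return (int)(((eax >> 8) & 0xf) + ((eax >> 20) & 0xff));
}

/* Hypothetical helper (not in the patch): display model from CPUID leaf 1 EAX.
 * Base model is bits 7:4; extended model is bits 19:16, which (eax >> 12) & 0xf0
 * moves into the high nibble of the result. */
static int sig_model(uint32_t eax)
{
    return (int)(((eax >> 4) & 0xf) + ((eax >> 12) & 0xf0));
}

/* Same predicate shape as the patch (family 6 and model at or above a cutoff),
 * with the cutoff supplied by the caller; the patch uses 70. */
static int sig_has_fast_gather(uint32_t eax, int min_model)
{
    return sig_family(eax) == 6 && sig_model(eax) >= min_model;
}

/* Small self-check on two illustrative signatures (assumed example inputs). */
static void sig_selfcheck(void)
{
    /* 0x000506E3: family 6, model 0x5E (94) */
    assert(sig_family(0x000506E3u) == 6);
    assert(sig_model(0x000506E3u) == 0x5E);
    assert(sig_has_fast_gather(0x000506E3u, 70));

    /* 0x000306C3: family 6, model 0x3C (60), below the cutoff of 70 */
    assert(sig_model(0x000306C3u) == 0x3C);
    assert(!sig_has_fast_gather(0x000306C3u, 70));
}
```

Keeping the decoding in pure functions of EAX makes the cutoff easy to test against a table of known signatures. That matters because Intel model numbers are not strictly ordered by microarchitecture, so a single ">= model" threshold is only an approximation of "Broadwell and later".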