From patchwork Sat Jun 8 11:37:15 2024
X-Patchwork-Submitter: Rémi Denis-Courmont
X-Patchwork-Id: 49699
From: Rémi Denis-Courmont
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 8 Jun 2024 14:37:15 +0300
Message-ID: <20240608113717.1677043-3-remi@remlab.net>
In-Reply-To: <20240608113717.1677043-1-remi@remlab.net>
References: <20240608113717.1677043-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 3/4] lavu/riscv: use Zbb CPOP/CPOPW at run-time

           Zbb static    Zbb dynamic   I baseline
popcount   1.336129286   3.469067758   20.146362909
popcountl  1.336322291   3.340292968   20.224829821
(seconds for 1 billion iterations on a SiFive-U74 core)
---
 libavutil/riscv/intmath.h | 73 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 69 insertions(+), 4 deletions(-)

diff --git a/libavutil/riscv/intmath.h b/libavutil/riscv/intmath.h
index ae9ee7775b..1f0afbc81d 100644
--- a/libavutil/riscv/intmath.h
+++ b/libavutil/riscv/intmath.h
@@ -1,4 +1,6 @@
 /*
+ * Copyright © 2022-2024 Rémi Denis-Courmont.
+ *
  * This file is part of FFmpeg.
  *
  * FFmpeg is free software; you can redistribute it and/or
@@ -23,6 +25,7 @@
 
 #include "config.h"
 #include "libavutil/attributes.h"
+#include "libavutil/riscv/cpu.h"
 
 /*
  * The compiler is forced to sign-extend the result anyhow, so it is faster to
@@ -70,12 +73,74 @@ static av_always_inline av_const int av_clip_intp2_rvi(int a, int p)
 }
 
 #if defined (__GNUC__) || defined (__clang__)
-#define av_popcount __builtin_popcount
-#if (__riscv_xlen >= 64)
-#define av_popcount64 __builtin_popcountl
+static inline av_const int av_popcount_rv(unsigned int x)
+{
+#if HAVE_RV && !defined(__riscv_zbb)
+    if (!__builtin_constant_p(x) &&
+        __builtin_expect(ff_rv_zbb_support(), true)) {
+        int y;
+
+        __asm__ (
+            ".option push\n"
+            ".option arch, +zbb\n"
+#if __riscv_xlen >= 64
+            "cpopw %0, %1\n"
 #else
-#define av_popcount64 __builtin_popcountll
+            "cpop %0, %1\n"
+#endif
+            ".option pop" : "=r" (y) : "r" (x));
+        if (y > 32)
+            __builtin_unreachable();
+        return y;
+    }
+#endif
+    return __builtin_popcount(x);
+}
+#define av_popcount av_popcount_rv
+
+static inline av_const int av_popcount64_rv(uint64_t x)
+{
+#if HAVE_RV && !defined(__riscv_zbb) && __riscv_xlen >= 64
+    if (!__builtin_constant_p(x) &&
+        __builtin_expect(ff_rv_zbb_support(), true)) {
+        int y;
+
+        __asm__ (
+            ".option push\n"
+            ".option arch, +zbb\n"
+            "cpop %0, %1\n"
+            ".option pop" : "=r" (y) : "r" (x));
+        if (y > 64)
+            __builtin_unreachable();
+        return y;
+    }
 #endif
+    return __builtin_popcountl(x);
+}
+#define av_popcount64 av_popcount64_rv
+
+static inline av_const int av_parity_rv(unsigned int x)
+{
+#if HAVE_RV && !defined(__riscv_zbb)
+    if (!__builtin_constant_p(x) &&
+        __builtin_expect(ff_rv_zbb_support(), true)) {
+        int y;
+
+        __asm__ (
+            ".option push\n"
+            ".option arch, +zbb\n"
+#if __riscv_xlen >= 64
+            "cpopw %0, %1\n"
+#else
+            "cpop %0, %1\n"
+#endif
+            ".option pop" : "=r" (y) : "r" (x));
+        return y & 1;
+    }
+#endif
+    return __builtin_parity(x);
+}
+#define av_parity av_parity_rv
 #endif
 
 #endif /* AVUTIL_RISCV_INTMATH_H */