From patchwork Sat May 11 15:51:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 48724
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 11 May 2024 18:51:41 +0300
Message-ID: <20240511155142.59542-1-remi@remlab.net>
X-Mailer: git-send-email 2.43.0
Subject: [FFmpeg-devel] [PATCH 1/2] lavu/riscv: CPU flag for fast misaligned accesses

---
 libavutil/cpu.c           | 1 +
 libavutil/cpu.h           | 1 +
 libavutil/riscv/cpu.c     | 3 +++
 libavutil/tests/cpu.c     | 3 ++-
 tests/checkasm/checkasm.c | 1 +
 5 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/libavutil/cpu.c b/libavutil/cpu.c
index 396eeb38d6..9ac2f01c20 100644
--- a/libavutil/cpu.c
+++ b/libavutil/cpu.c
@@ -193,6 +193,7 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
         { "zba", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVB_ADDR }, .unit = "flags" },
         { "zbb", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RVB_BASIC }, .unit = "flags" },
         { "zvbb", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_ZVBB }, .unit = "flags" },
+        { "misaligned", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_RV_MISALIGNED }, .unit = "flags" },
 #endif
         { NULL },
     };
diff --git a/libavutil/cpu.h b/libavutil/cpu.h
index cc19828d4b..a25901433e 100644
--- a/libavutil/cpu.h
+++ b/libavutil/cpu.h
@@ -91,6 +91,7 @@
 #define AV_CPU_FLAG_RVB_BASIC (1 << 7) ///< Basic bit-manipulations
 #define AV_CPU_FLAG_RVB_ADDR (1 << 8) ///< Address bit-manipulations
 #define AV_CPU_FLAG_RV_ZVBB (1 << 9) ///< Vector basic bit-manipulations
+#define AV_CPU_FLAG_RV_MISALIGNED (1 <<10) ///< Fast misaligned accesses
 
 /**
  * Return the flags which specify extensions supported by the CPU.
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
index 6755f0df69..1fe1a397c4 100644
--- a/libavutil/riscv/cpu.c
+++ b/libavutil/riscv/cpu.c
@@ -52,6 +52,7 @@ int ff_get_cpu_flags_riscv(void)
     struct riscv_hwprobe pairs[] = {
         { RISCV_HWPROBE_KEY_BASE_BEHAVIOR, 0 },
         { RISCV_HWPROBE_KEY_IMA_EXT_0, 0 },
+        { RISCV_HWPROBE_KEY_CPUPERF_0, 0 },
     };
 
     if (__riscv_hwprobe(pairs, FF_ARRAY_ELEMS(pairs), 0, NULL, 0) == 0) {
@@ -76,6 +77,8 @@ int ff_get_cpu_flags_riscv(void)
         if (pairs[1].value & RISCV_HWPROBE_EXT_ZVBB)
             ret |= AV_CPU_FLAG_RV_ZVBB;
 #endif
+        if (pairs[2].value & RISCV_HWPROBE_MISALIGNED_FAST)
+            ret |= AV_CPU_FLAG_RV_MISALIGNED;
     } else
 #endif
 #if HAVE_GETAUXVAL
diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c
index 10e620963b..02b98682e3 100644
--- a/libavutil/tests/cpu.c
+++ b/libavutil/tests/cpu.c
@@ -94,7 +94,8 @@ static const struct {
     { AV_CPU_FLAG_RVV_F32, "zve32f" },
     { AV_CPU_FLAG_RVV_I64, "zve64x" },
     { AV_CPU_FLAG_RVV_F64, "zve64d" },
-    { AV_CPU_FLAG_RV_ZVBB, "zvbb" },
+    { AV_CPU_FLAG_RV_ZVBB, "zvbb" },
+    { AV_CPU_FLAG_RV_MISALIGNED, "misaligned" },
 #endif
     { 0 }
 };
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 04f94f9d09..c6dc0cfa77 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -286,6 +286,7 @@ static const struct {
     { "RVVi64", "rvv_i64", AV_CPU_FLAG_RVV_I64 },
     { "RVVf64", "rvv_f64", AV_CPU_FLAG_RVV_F64 },
     { "RV_Zvbb", "rv_zvbb", AV_CPU_FLAG_RV_ZVBB },
+    { "misaligned", "misaligned", AV_CPU_FLAG_RV_MISALIGNED },
 #elif ARCH_MIPS
     { "MMI", "mmi", AV_CPU_FLAG_MMI },
     { "MSA", "msa", AV_CPU_FLAG_MSA },

From patchwork Sat May 11 15:51:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
X-Patchwork-Id: 48725
From: =?utf-8?q?R=C3=A9mi_Denis-Courmont?=
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 11 May 2024 18:51:42 +0300
Message-ID: <20240511155142.59542-2-remi@remlab.net>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240511155142.59542-1-remi@remlab.net>
References: <20240511155142.59542-1-remi@remlab.net>
Subject: [FFmpeg-devel] [PATCH 2/2] lavc/vp8dsp: restrict RVI optimisations

The RVI optimisations are actually awfully slow if the CPU does not
support misaligned accesses natively, so only use them if misaligned
accesses are fast.
---
 libavcodec/riscv/vp8dsp_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
index dc3e087f01..fe4fa5b867 100644
--- a/libavcodec/riscv/vp8dsp_init.c
+++ b/libavcodec/riscv/vp8dsp_init.c
@@ -45,7 +45,7 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
 {
 #if HAVE_RV
     int flags = av_get_cpu_flags();
-    if (flags & AV_CPU_FLAG_RVI) {
+    if (flags & AV_CPU_FLAG_RV_MISALIGNED) {
 #if __riscv_xlen >= 64
         c->put_vp8_epel_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvi;
         c->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvi;
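
For readers following the series, here is a minimal standalone sketch of the
idea: derive a "fast misaligned" CPU flag from the hwprobe CPUPERF_0 value,
then gate the RVI pixel-copy functions on that flag rather than on RVI alone.
It does not use libavutil or the kernel headers. Every EX_/ex_ name is a
hypothetical stand-in, and the numeric values of the misaligned-access states
are an assumption based on my reading of Linux's <asm/hwprobe.h>, where the
state appears to be a multi-bit field (mask 7, FAST == 3) rather than a single
bit. Under that assumption, comparing the masked value for equality is the
conservative test, which is what the sketch does.

```c
#include <stdint.h>

/* Assumed encoding of the misaligned-access field in the CPUPERF_0 value.
 * These are local stand-ins, not the real kernel macros. */
#define EX_HWPROBE_MISALIGNED_UNKNOWN     0
#define EX_HWPROBE_MISALIGNED_EMULATED    1
#define EX_HWPROBE_MISALIGNED_SLOW        2
#define EX_HWPROBE_MISALIGNED_FAST        3
#define EX_HWPROBE_MISALIGNED_UNSUPPORTED 4
#define EX_HWPROBE_MISALIGNED_MASK        7

/* Local stand-ins for the libavutil CPU flags used in this series. */
#define EX_CPU_FLAG_RVI           (1 << 0)
#define EX_CPU_FLAG_RV_MISALIGNED (1 << 10)

/* Hypothetical helper: turn a probed CPUPERF_0 value into CPU flags.
 * The field is masked and compared for equality, so EMULATED (1) and
 * SLOW (2) do not count as fast even though they share bits with FAST. */
static int ex_flags_from_cpuperf(uint64_t value)
{
    int flags = EX_CPU_FLAG_RVI; /* base integer ISA, always assumed here */

    if ((value & EX_HWPROBE_MISALIGNED_MASK) == EX_HWPROBE_MISALIGNED_FAST)
        flags |= EX_CPU_FLAG_RV_MISALIGNED;
    return flags;
}

/* Toy "DSP functions" that only report which implementation was picked. */
typedef const char *(*ex_put_pixels_fn)(void);

static const char *ex_put_pixels_c(void)   { return "c"; }
static const char *ex_put_pixels_rvi(void) { return "rvi"; }

/* Hypothetical init routine mirroring the gating in patch 2/2: the RVI
 * version is selected only when misaligned accesses are flagged as fast. */
static ex_put_pixels_fn ex_select_put_pixels(int flags)
{
    if (flags & EX_CPU_FLAG_RV_MISALIGNED)
        return ex_put_pixels_rvi;
    return ex_put_pixels_c;
}
```

With these definitions, ex_select_put_pixels(ex_flags_from_cpuperf(v)) picks
the RVI function only for v == EX_HWPROBE_MISALIGNED_FAST and falls back to
the C function for every other state, including "unknown".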