From patchwork Wed Apr 28 19:50:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Josh Dekker X-Patchwork-Id: 27465 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a11:4023:0:0:0:0 with SMTP id ky35csp755876pxb; Wed, 28 Apr 2021 12:51:04 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy3KmpcBnDYX/69BlLPvZzN5efEUgOJYO4+RNFxlOjDtm8JnOul/cf0sBAHNTRA7m0kAAP3 X-Received: by 2002:aa7:d513:: with SMTP id y19mr4377531edq.9.1619639464609; Wed, 28 Apr 2021 12:51:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1619639464; cv=none; d=google.com; s=arc-20160816; b=fT5HxAf+DZQ6Hy0JrxrVsgP/wFcIqnlndmJTo8Uqtj8XnluK5iwSjWSNruqDCjvNa8 br0l1jA9lnflTeDryOBbDeMwDTAyETzN1Wt1grTovYBX122sTDixpqvLGYR7M7ONM2LV c1pSJHIACDjOPG7eEFDdwQOoA9miN3sQhF8GKh/ozaZ42+/rnjq7VlDBgiwDmHQj2LPk JHuky83yba9OM+dIQQEdzDmHfTObzwjqQGeDw7wCbb3enEG4DBB+Dz8RrqlLLNYbgK3l iAr+GuaQzzZt4V0nQlvShN/pwLIC3Ruy3LFl75N6qv4u0zkvemwypTtej1iU6EXwcRVl 5Ziw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:message-id:date:to:from :dkim-signature:dkim-signature:delivered-to; bh=6jvcmRklNBICzOG4iA7j46oxK99xhlY5UYPgXAE3JZg=; b=A5KFx89+G53Lg11T3m7a9cfeqDJPEMJ+J4UfFHkrcFHV3Xx5sr9+sOneLl28p8LWKX 0JC8GUQZ24+UVaPLP3rw1r6BmoANoVlZYS9HcNdMiTAIXKHtFjQ5P+mP/rIwJWG3RaqW dbKlP4MjTatSGM/JMSXys2MWqsN6ZqrZJURHKUR1uXy/emF76KgLd8i/Xf+7xYPrOOv2 VPAVKFeVEiR1KZ4T+AVjvT71ZkbireqFX3j3hMKP5UV0nILlDAyMAy44FBC8FGIBn1qc OlzaTyhxc6yXLbw74Aa8a/qlN2h21ltPXdRF0HXTUpWcBYvJQeWQverNFJNiN9X6fbP+ gwYQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@itanimul.li header.s=fm1 header.b=Hm7xrYIE; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm2 header.b=qYUjgmna; spf=pass (google.com: domain of 
ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id m18si711164edv.153.2021.04.28.12.51.03; Wed, 28 Apr 2021 12:51:04 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@itanimul.li header.s=fm1 header.b=Hm7xrYIE; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm2 header.b=qYUjgmna; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 01557689903; Wed, 28 Apr 2021 22:50:59 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from wout5-smtp.messagingengine.com (wout5-smtp.messagingengine.com [64.147.123.21]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 826156891E7 for ; Wed, 28 Apr 2021 22:50:53 +0300 (EEST) Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailout.west.internal (Postfix) with ESMTP id B549BED4; Wed, 28 Apr 2021 15:50:50 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Wed, 28 Apr 2021 15:50:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=itanimul.li; h= from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; s=fm1; bh=rwYG4va34XVfO6/0wsMG7A1beZ 6RkXu8mpgo5TdCMeo=; b=Hm7xrYIE7ZTNLK9v4Nhi3WlEAFanm3+zn6hxkcHtvY 82WPyCVg30wOVHmVq1CpY6x5Kw7CxHRUrF9TmXR4CCnkdYmZlCm2uBnp/0qNn4JO qzj+oWbvmmE0qG3Fc5asOUZSGWyQo2SHhZdyWZ8xryM6zNHUNQKAcZtpdwjk5SNU 
fmnFQrVvhEkidFlb8VMXK4utDejeBaTYeD4/Z1esye3vMFMG7JG1BqXZHDkicI4m v4DWBfJ5QF41D04Qwwpqoea8zHFAHebpTBmxa0YQRpmUUQDzGe7SyXWzNIntQdG2 YIQYSSVk24KXDlNOpZf9FdVfpBIeZ5MU2qpor+8UtGlA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :message-id:mime-version:subject:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; bh=rwYG4va34XVfO6/0w sMG7A1beZ6RkXu8mpgo5TdCMeo=; b=qYUjgmnamnR9eX2oDRvnLG7juY9rS0xL3 8s+j/Sgfb24iidGJLzL53qwv6YjCiMJshtyr5ZpK7n0Cqknq0Mr4zwzPyJ18d6lt 5Qc07Av5li48BFAIsCBYfWVTRibVqX/uBh3JLyXsq+n/Iqqyp1jdNZMixqn9NsBi V7u9ORBihYeET+6/cBbOTmBa98y4Lnx99sJgIytCBihlRBplrwbjiNOEzJ2Ns9hP 0eVogQu6G/xrhkpdZxAstvytQS450xH61Ex6M7poFXljbYxHg54zSs+So2bKlpVn 6R/t4KJ6oq6AtUIdzQlbptEVOdfOs3uU670PQqiLs4WJxMv/cpdBg== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvddvvddguddulecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffoggfgsedtkeertd ertddtnecuhfhrohhmpeflohhshhcuffgvkhhkvghruceojhhoshhhsehithgrnhhimhhu lhdrlhhiqeenucggtffrrghtthgvrhhnpeekgfejgfejteegvddvvdeiieejvdeigfdvgf ffudejffffffejgfelkefhuefgveenucfkphepkeekrddufedtrdegkedrudejtdenucev lhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehjohhshhesih htrghnihhmuhhlrdhlih X-ME-Proxy: Received: from computer.fritz.box (mue-88-130-48-170.dsl.tropolys.de [88.130.48.170]) by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 28 Apr 2021 15:50:48 -0400 (EDT) From: Josh Dekker To: ffmpeg-devel@ffmpeg.org Date: Wed, 28 Apr 2021 21:50:24 +0200 Message-Id: <20210428195028.80000-1-josh@itanimul.li> X-Mailer: git-send-email 2.30.1 (Apple Git-130) MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 0/2] ARM64 HEVC QPEL/EPEL X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Lynne Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: R6xCel1H7Sbk

This is a patch originally submitted in 2017 (author/date info left intact). At the time it didn't get much attention, I assume due to its sheer size. I have split out only its QPEL/EPEL parts, rebased them, and cleaned up the patches as much as is reasonable for a 9001-line diff. I also have SAO band (non-working) and 32x32 IDCT (working, but honestly in a worse state than these patches).

This patch gives a large overall speedup, roughly 30% in my testing. The only problems are that (as previously stated) 1) it's a lot of code, as the original author didn't make use of macros, and 2) it's only 8-bit. I will be writing 10-bit assembly, and whilst I do that I will clean up/macro-ify the current 8-bit assembly, though there is still lots to be done. Our current IDCTs for HEVC aren't great either; I had a 40% speedup on the 16x16 one in testing. The assembly is far from 'done', but we're getting closer, slowly at least.

There were some suggestions for smaller improvements in the previous reviews, and I have not applied those. The first course of action is to refactor the code so that it is possible to work on it without going insane. I think it's fine to use it whilst I'm working on refactoring it, due to the large speedup: the code-weight in the binary should be relatively similar even after that anyway.

Also, I have updated the kperf patch as per Lynne's request.

--
Josh