From patchwork Fri Jul 15 08:02:23 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hubert Mazur
X-Patchwork-Id: 34766
From: Hubert Mazur
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 15 Jul 2022 10:02:23 +0200
Message-Id: <20220715080228.686736-1-hum@semihalf.com>
X-Mailer: git-send-email 2.34.1
Subject: [FFmpeg-devel] [PATCH 0/5] Add neon implementation for me_cmp functions
Cc: gjb@semihalf.com, upstream@semihalf.com, jswinney@amazon.com, Hubert Mazur, martin@martin.st, mw@semihalf.com, spop@amazon.com

Add arm64 NEON implementations of the following motion estimation
(me_cmp) functions: sse16, sse4, pix_abs16_y2, sse8 and pix_abs8.
All functions were tested and benchmarked on AWS Graviton 3 instances.

Hubert Mazur (5):
  lavc/aarch64: Add neon implementation for sse16
  lavc/aarch64: Add neon implementation for sse4
  lavc/aarch64: Add neon implementation for pix_abs16_y2
  lavc/aarch64: Add neon implementation for sse8
  lavc/aarch64: Add neon implementation for pix_abs8

 libavcodec/aarch64/me_cmp_init_aarch64.c |  17 ++
 libavcodec/aarch64/me_cmp_neon.S         | 346 +++++++++++++++++++++++
 2 files changed, 363 insertions(+)
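
For readers unfamiliar with the me_cmp functions named above, here is a minimal scalar sketch of the three kinds of block metric involved, as I understand the C reference versions in libavcodec/me_cmp.c. It is not taken from this series. The helper names (sse_ref, pix_abs_ref, pix_abs_y2_ref) and the explicit width parameter are hypothetical and used only for illustration; the real functions have a fixed width (4, 8 or 16) and take a context pointer as their first argument.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: sum of squared errors over a w x h block.
 * sse16, sse8 and sse4 correspond to w = 16, 8 and 4. */
int sse_ref(const uint8_t *pix1, const uint8_t *pix2,
            ptrdiff_t stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int d = pix1[x] - pix2[x];
            sum += d * d;
        }
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}

/* Hypothetical helper: sum of absolute differences over a w x h block.
 * pix_abs8 corresponds to w = 8. */
int pix_abs_ref(const uint8_t *pix1, const uint8_t *pix2,
                ptrdiff_t stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(pix1[x] - pix2[x]);
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}

/* Hypothetical helper: SAD against the rounded average of each pix2 row
 * and the row below it (vertical half-pel position).
 * pix_abs16_y2 corresponds to w = 16. Reads h + 1 rows of pix2. */
int pix_abs_y2_ref(const uint8_t *pix1, const uint8_t *pix2,
                   ptrdiff_t stride, int w, int h)
{
    const uint8_t *pix3 = pix2 + stride;
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int avg = (pix2[x] + pix3[x] + 1) >> 1;
            sum += abs(pix1[x] - avg);
        }
        pix1 += stride;
        pix2 += stride;
        pix3 += stride;
    }
    return sum;
}
```

A NEON version computes the same sums but handles a whole row (or several rows) per iteration with vector loads and widening accumulate instructions, so a scalar reference like this is mainly useful for checking results.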