From patchwork Thu Jun 29 17:57:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Cox
X-Patchwork-Id: 42312
Received: from sucnaath.outer.uphall.net (cpc1-cmbg20-2-0-cust759.5-4.cable.virginm.net.
 [86.21.218.248]) by smtp.gmail.com with ESMTPSA id f26-20020a7bcd1a000000b003fbba5f21b6sm2041541wmj.28.2023.06.29.10.58.20
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Thu, 29 Jun 2023 10:58:20 -0700 (PDT)
From: John Cox
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 29 Jun 2023 17:57:16 +0000
Message-Id: <20230629175729.224383-3-jc@kynesim.co.uk>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230629175729.224383-1-jc@kynesim.co.uk>
References: <20230629175729.224383-1-jc@kynesim.co.uk>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH 02/15] avfilter/vf_bwdif: Add common macros and consts for aarch64 neon
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: thomas.mundt@hr.de, John Cox

Add macros for dual scalar half->single multiply and accumulate
Add macro for shift, saturate and shorten single to byte
Add filter constants

Signed-off-by: John Cox
---
 libavfilter/aarch64/vf_bwdif_neon.S | 46 +++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/libavfilter/aarch64/vf_bwdif_neon.S b/libavfilter/aarch64/vf_bwdif_neon.S
index 639ab22998..a8f0ed525a 100644
--- a/libavfilter/aarch64/vf_bwdif_neon.S
+++ b/libavfilter/aarch64/vf_bwdif_neon.S
@@ -23,3 +23,49 @@
 
 #include "libavutil/aarch64/asm.S"
 
+.macro SQSHRUNN b, s0, s1, s2, s3, n
+        sqshrun         \s0\().4h, \s0\().4s, #\n - 8
+        sqshrun2        \s0\().8h, \s1\().4s, #\n - 8
+        sqshrun         \s1\().4h, \s2\().4s, #\n - 8
+        sqshrun2        \s1\().8h, \s3\().4s, #\n - 8
+        uzp2            \b\().16b, \s0\().16b, \s1\().16b
+.endm
+
+.macro SMULL4K a0, a1, a2, a3, s0, s1, k
+        smull           \a0\().4s, \s0\().4h, \k
+        smull2          \a1\().4s, \s0\().8h, \k
+        smull           \a2\().4s, \s1\().4h, \k
+        smull2          \a3\().4s, \s1\().8h, \k
+.endm
+
+.macro UMULL4K a0, a1, a2, a3, s0, s1, k
+        umull           \a0\().4s, \s0\().4h, \k
+        umull2          \a1\().4s, \s0\().8h, \k
+        umull           \a2\().4s, \s1\().4h, \k
+        umull2          \a3\().4s, \s1\().8h, \k
+.endm
+
+.macro UMLAL4K a0, a1, a2, a3, s0, s1, k
+        umlal           \a0\().4s, \s0\().4h, \k
+        umlal2          \a1\().4s, \s0\().8h, \k
+        umlal           \a2\().4s, \s1\().4h, \k
+        umlal2          \a3\().4s, \s1\().8h, \k
+.endm
+
+.macro UMLSL4K a0, a1, a2, a3, s0, s1, k
+        umlsl           \a0\().4s, \s0\().4h, \k
+        umlsl2          \a1\().4s, \s0\().8h, \k
+        umlsl           \a2\().4s, \s1\().4h, \k
+        umlsl2          \a3\().4s, \s1\().8h, \k
+.endm
+
+// static const uint16_t coef_lf[2] = { 4309, 213 };
+// static const uint16_t coef_hf[3] = { 5570, 3801, 1016 };
+// static const uint16_t coef_sp[2] = { 5077, 981 };
+
+        .align 16
+coeffs:
+        .hword          4309 * 4, 213 * 4               // lf[0]*4 = v0.h[0]
+        .hword          5570, 3801, 1016, -3801         // hf[0] = v0.h[2], -hf[1] = v0.h[5]
+        .hword          5077, 981                       // sp[0] = v0.h[6]
+
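
For readers checking the macros above against scalar code, here is a small C sketch of what SQSHRUNN appears to compute per lane, and of the halfword layout of the coeffs table. It is an illustrative model only, not code from this series: the names sqshrunn_lane, clip_shift and coeffs_model are hypothetical, and the reading of the macro (a truncating, saturating narrow by n - 8 followed by uzp2 keeping the high byte of each halfword) is an assumption drawn from the instruction sequence in the patch.

```c
#include <stdint.h>

/* Hypothetical scalar model of one output byte of SQSHRUNN.
 * Assumed reading: sqshrun narrows a signed 32-bit lane to an unsigned
 * 16-bit lane with a truncating right shift by (n - 8) and saturation;
 * uzp2 on .16b then keeps the odd bytes, i.e. the high byte of each
 * halfword, which amounts to a further shift by 8.
 * Valid for 9 <= n <= 24, the immediate range sqshrun accepts here. */
static uint8_t sqshrunn_lane(int32_t x, unsigned n)
{
    uint32_t shifted;
    uint16_t narrowed;

    if (x < 0)
        return 0;                      /* negative input saturates to 0 */
    shifted  = (uint32_t)x >> (n - 8); /* truncating shift, no rounding */
    narrowed = shifted > 0xFFFF ? 0xFFFF : (uint16_t)shifted;
    return (uint8_t)(narrowed >> 8);   /* high byte of the halfword */
}

/* Reference form: shift right by n, then clip to 0..255. */
static uint8_t clip_shift(int32_t x, unsigned n)
{
    uint32_t v;

    if (x < 0)
        return 0;
    v = (uint32_t)x >> n;
    return v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical C view of the eight halfwords at the coeffs label, in the
 * order the patch's comments give for v0.h[0] .. v0.h[7]. The lf values
 * are pre-multiplied by 4 and hf[1] is also stored negated, as in the
 * .hword lines. */
static const uint16_t coeffs_model[8] = {
    4309 * 4, 213 * 4,                  /* lf[0]*4, lf[1]*4 */
    5570, 3801, 1016, (uint16_t)-3801,  /* hf[0], hf[1], hf[2], -hf[1] */
    5077, 981                           /* sp[0], sp[1] */
};
```

Under these assumptions sqshrunn_lane(x, n) and clip_shift(x, n) agree for every x, which suggests the two-step narrow is a way to get a clipped byte without a separate clamp. The shift count used by the filter itself is not stated in this patch, so any value of n tried with the sketch is a test input, not the filter's.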