From patchwork Wed Mar 22 00:07:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "J. Dekker"
X-Patchwork-Id: 40776
From: "J. Dekker"
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 22 Mar 2023 01:07:08 +0100
Message-Id: <20230322000710.47513-1-jdek@itanimul.li>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <61cbba0-956c-86ff-340-26a23453e0d@martin.st>
References: <61cbba0-956c-86ff-340-26a23453e0d@martin.st>
Subject: [FFmpeg-devel] [PATCH 1/3] lavc/aarch64: add clip N macro
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

Signed-off-by: J. Dekker
---
 libavcodec/aarch64/hevcdsp_idct_neon.S | 19 +++++--------------
 libavcodec/aarch64/neon.S              | 11 +++++++++++
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/libavcodec/aarch64/hevcdsp_idct_neon.S b/libavcodec/aarch64/hevcdsp_idct_neon.S
index 467cb0f48a..3e59dd20bb 100644
--- a/libavcodec/aarch64/hevcdsp_idct_neon.S
+++ b/libavcodec/aarch64/hevcdsp_idct_neon.S
@@ -5,7 +5,7 @@
  *
  * Ported from arm/hevcdsp_idct_neon.S by
  * Copyright (c) 2020 Reimar Döffinger
- * Copyright (c) 2020 J. Dekker
+ * Copyright (c) 2023 J. Dekker
  *
  * This file is part of FFmpeg.
  *
@@ -38,13 +38,6 @@ const trans, align=4
         .short 31, 22, 13, 4
 endconst
 
-.macro clip2 in1, in2, min, max
-        smax            \in1, \in1, \min
-        smax            \in2, \in2, \min
-        smin            \in1, \in1, \max
-        smin            \in2, \in2, \max
-.endm
-
 function ff_hevc_add_residual_4x4_8_neon, export=1
         ld1             {v0.8h-v1.8h}, [x1]
         ld1             {v2.s}[0], [x0], x2
@@ -182,7 +175,7 @@ function hevc_add_residual_4x4_16_neon, export=0
         ld1             {v3.d}[1], [x12], x2
         movi            v4.8h, #0
         sqadd           v1.8h, v1.8h, v3.8h
-        clip2           v0.8h, v1.8h, v4.8h, v21.8h
+        clip            v4.8h, v21.8h, v0.8h, v1.8h
         st1             {v0.d}[0], [x0], x2
         st1             {v0.d}[1], [x0], x2
         st1             {v1.d}[0], [x0], x2
@@ -201,7 +194,7 @@ function hevc_add_residual_8x8_16_neon, export=0
         sqadd           v0.8h, v0.8h, v2.8h
         ld1             {v3.8h}, [x12]
         sqadd           v1.8h, v1.8h, v3.8h
-        clip2           v0.8h, v1.8h, v4.8h, v21.8h
+        clip            v4.8h, v21.8h, v0.8h, v1.8h
         st1             {v0.8h}, [x0], x2
         st1             {v1.8h}, [x12], x2
         bne             1b
@@ -221,8 +214,7 @@ function hevc_add_residual_16x16_16_neon, export=0
         sqadd           v1.8h, v1.8h, v17.8h
         sqadd           v2.8h, v2.8h, v18.8h
         sqadd           v3.8h, v3.8h, v19.8h
-        clip2           v0.8h, v1.8h, v20.8h, v21.8h
-        clip2           v2.8h, v3.8h, v20.8h, v21.8h
+        clip            v20.8h, v21.8h, v0.8h, v1.8h, v2.8h, v3.8h
         st1             {v0.8h-v1.8h}, [x0], x2
         st1             {v2.8h-v3.8h}, [x12], x2
         bne             1b
@@ -239,8 +231,7 @@ function hevc_add_residual_32x32_16_neon, export=0
         sqadd           v1.8h, v1.8h, v17.8h
         sqadd           v2.8h, v2.8h, v18.8h
         sqadd           v3.8h, v3.8h, v19.8h
-        clip2           v0.8h, v1.8h, v20.8h, v21.8h
-        clip2           v2.8h, v3.8h, v20.8h, v21.8h
+        clip            v20.8h, v21.8h, v0.8h, v1.8h, v2.8h, v3.8h
         st1             {v0.8h-v3.8h}, [x0], x2
         bne             1b
         ret
diff --git a/libavcodec/aarch64/neon.S b/libavcodec/aarch64/neon.S
index 1ad32c359d..bc105e4861 100644
--- a/libavcodec/aarch64/neon.S
+++ b/libavcodec/aarch64/neon.S
@@ -1,6 +1,8 @@
 /*
  * This file is part of FFmpeg.
  *
+ * Copyright (c) 2023 J. Dekker
+ *
  * FFmpeg is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public
  * License as published by the Free Software Foundation; either
@@ -16,6 +18,15 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
+.macro clip min, max, regs:vararg
+.irp x, \regs
+        smax            \x, \x, \min
+.endr
+.irp x, \regs
+        smin            \x, \x, \max
+.endr
+.endm
+
 .macro transpose_8x8B r0, r1, r2, r3, r4, r5, r6, r7, r8, r9
         trn1            \r8\().8B, \r0\().8B, \r1\().8B
         trn2            \r9\().8B, \r0\().8B, \r1\().8B