From patchwork Fri Jul 23 05:53:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 29028 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a5d:965a:0:0:0:0:0 with SMTP id d26csp1118883ios; Thu, 22 Jul 2021 22:54:46 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwI2Xh28uf7UMAx8I/DSwdKzELN+GgHXmZtTxgc80MXbWw4cSP7bh7sW7q7Lqg882B1LsS9 X-Received: by 2002:a17:906:bc8b:: with SMTP id lv11mr3131293ejb.331.1627019685869; Thu, 22 Jul 2021 22:54:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1627019685; cv=none; d=google.com; s=arc-20160816; b=XGg2rCoAemRR0se6UYi2CB5fWzWJMUDoM71rgtjCwTBWD4dIPn2GaQ6jgf5OHoMYkH FKRuSsw8wxLvN30lo+g7G7X8o0Nw2ErfylPiLBL55VQknCF6XEg19Pp6Yc9N0ZWSnucc wwjV0SGnWkYEUt5WH9WOOfkJo9jpRIly9hJlUY0ukquFq+4qhd8cb/ttbVDBi6j4iFn0 cqM/nZL0pdNV2OxZTCZrbmz1ikH82lEbo9YHO/yDP+CMMWITPNWW/zsPesmXyDpCQ5Bj G05hOlX0SDDWeou0HTH3UuWinqKBIZZ8eWeXA46h7hMYCQEGjaZM3/ak6OpVmdepujkH CR4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:dkim-signature:delivered-to; bh=q2B42h/HSiiI4jm86WGil28NAxaSi2J6YVqmSc56mog=; b=aEgAHJCgcmroOIfxw+zpRmcCIWAeRfrTm7bPf9WsWbEc6F8M9TcKaVjqVqbD0JH2lj plP4EKl7TusNdAIbwJxwMUHpnal0tY/tUuGCWjOcr6Do7dXS0vs1FVfGHQuGCA3RCDBY nq5tCalmmoYCHP0tu+38xkiT0s6L4yIRMDdkNhc3RnM7nw1pBO+o9/2HOAnk0tXgY61Y HMGsBFIGQIc0VUDWTiSzzFsZOSxws1aCnNvWK6eDvmHYYUlQcjKkFn59jg0rPUecDxQs cfmFuuCPc8yOPCaFtsoCCF2UG5HnCmRJvWQXBtdWzOcxbOsXfUG5XW+bZRy8k6ghDaxk Rw7A== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=fm2 header.b=sPTn0foS; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm3 header.b=nRaOP3kN; spf=pass 
(google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id 5si36126965ejq.621.2021.07.22.22.54.45; Thu, 22 Jul 2021 22:54:45 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=fm2 header.b=sPTn0foS; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm3 header.b=nRaOP3kN; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 3A60F68AB1D; Fri, 23 Jul 2021 08:54:35 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id BA4F368A789 for ; Fri, 23 Jul 2021 08:54:28 +0300 (EEST) Received: from compute6.internal (compute6.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id D3BFF5C00F2; Fri, 23 Jul 2021 01:54:27 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Fri, 23 Jul 2021 01:54:27 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=flygoat.com; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm2; bh=zu/9CU005Wylv 2Gzwa5rOGeizAjRGiy8/MaO0WfpN1g=; b=sPTn0foS+zOqWIUmKtFlXUnhMs01N gSxzeQOD9KjPUj5DicYKPyGLCMQnjdINhGavyE3IW8JkPbYH3nKDRmqheTpNv046 dGWYeeTT4n4dRt2Kkt/52W5aCB2r2cINPB5De6qQLxDLcJN1RL6a+D3aK7zqSviv 
nJah1i8a6GilRyqzjH1vnTpebimhHgZm/c19M3LUA4GUfUfy5VeHm8mQPRmwo5Sc b0bSnWriNEVtoulVkJzdeUftbxmyy8yVnR5UT5wiS+1OLn0jVE1jALXJfjhFQiUj 26GAKa/ECoJ0lKqWo7AwaahEZTbYiMpGSmGYzIOB1wAjveb4MMsZP9Vkg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=zu/9CU005Wylv2Gzwa5rOGeizAjRGiy8/MaO0WfpN1g=; b=nRaOP3kN 09Qf37lamLNNxI+HyV+9Vh0NWOzYjhAOY4Mt5IkQ2bQ9thIoW+eZUW25+pgubtFy mXe+UYitQY60Vx5kAP85NWFPdVRZzdZxYZjY5fps346J06Z01k4DvOMi99qF5j7f sSX/6udm/eHEKvu9YjL2y0OL445vLEeGL7ayxIGJsXm0Vt9Wg8gdXKGTGGLbCFaZ F6PgJglxasoZ5AR5NOqdLHGuhT90tc37L58oBatv4aSanXhAAyKPVnpgO0VHxcol mOlYf1wFZQvS4fJ/O67H+K0m6yus5R12zcGqKjaSk+rsR1R2A2id6WdXRP4NTnle cGArOvanFa7FrA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvtddrfeejgdeljecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeflihgrgihunhcujggrnhhguceojhhirgiguhhnrdihrghnghes fhhlhihgohgrthdrtghomheqnecuggftrfgrthhtvghrnhepjeeihffgteelkeelffduke dtheevudejvdegkeekjeefhffhhfetudetgfdtffeunecuvehluhhsthgvrhfuihiivgep tdenucfrrghrrghmpehmrghilhhfrhhomhepjhhirgiguhhnrdihrghnghesfhhlhihgoh grthdrtghomh X-ME-Proxy: Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 23 Jul 2021 01:54:25 -0400 (EDT) From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Fri, 23 Jul 2021 13:53:41 +0800 Message-Id: <20210723055344.21961-2-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210723055344.21961-1-jiaxun.yang@flygoat.com> References: <20210723055344.21961-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 1/4] avutil/mips: Use MMI_{L, S}QC1 macro in {SAVE, RECOVER}_REG X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: 
list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: yinshiyou-hf@loongson.cn, Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: JmlkEylMNBaM {BACKUP,RECOVER}_REG become usable on Loongson2 again. Also add a comment explaining why the registers are saved by hand instead of through the compiler's clobber list. Signed-off-by: Jiaxun Yang Reviewed-by: Shiyou Yin --- libavutil/mips/mmiutils.h | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/libavutil/mips/mmiutils.h b/libavutil/mips/mmiutils.h index 6a82caa908..41715c6490 100644 --- a/libavutil/mips/mmiutils.h +++ b/libavutil/mips/mmiutils.h @@ -204,25 +204,27 @@ #endif /* HAVE_LOONGSON2 */ /** - * backup register + * Backup saved registers + * We're not using compiler's clobber list as it's not smart enough + * to take advantage of quad word load/store. */ #define BACKUP_REG \ LOCAL_ALIGNED_16(double, temp_backup_reg, [8]); \ if (_MIPS_SIM == _ABI64) \ __asm__ volatile ( \ - "gssqc1 $f25, $f24, 0x00(%[temp]) \n\t" \ - "gssqc1 $f27, $f26, 0x10(%[temp]) \n\t" \ - "gssqc1 $f29, $f28, 0x20(%[temp]) \n\t" \ - "gssqc1 $f31, $f30, 0x30(%[temp]) \n\t" \ + MMI_SQC1($f25, $f24, %[temp], 0x00) \ + MMI_SQC1($f27, $f26, %[temp], 0x10) \ + MMI_SQC1($f29, $f28, %[temp], 0x20) \ + MMI_SQC1($f31, $f30, %[temp], 0x30) \ : \ : [temp]"r"(temp_backup_reg) \ : "memory" \ ); \ else \ __asm__ volatile ( \ - "gssqc1 $f22, $f20, 0x00(%[temp]) \n\t" \ - "gssqc1 $f26, $f24, 0x10(%[temp]) \n\t" \ - "gssqc1 $f30, $f28, 0x20(%[temp]) \n\t" \ + MMI_SQC1($f22, $f20, %[temp], 0x00) \ + MMI_SQC1($f26, $f24, %[temp], 0x10) \ + MMI_SQC1($f30, $f28, %[temp], 0x20) \ : \ : [temp]"r"(temp_backup_reg) \ : "memory" \ @@ -234,19 +236,19 @@ #define RECOVER_REG \ if (_MIPS_SIM == _ABI64) \ __asm__ volatile ( \ - "gslqc1 $f25, $f24, 0x00(%[temp]) \n\t" \ - "gslqc1 $f27, $f26, 0x10(%[temp]) \n\t" \ - "gslqc1 $f29, $f28, 
0x20(%[temp]) \n\t" \ - "gslqc1 $f31, $f30, 0x30(%[temp]) \n\t" \ + MMI_LQC1($f25, $f24, %[temp], 0x00) \ + MMI_LQC1($f27, $f26, %[temp], 0x10) \ + MMI_LQC1($f29, $f28, %[temp], 0x20) \ + MMI_LQC1($f31, $f30, %[temp], 0x30) \ : \ : [temp]"r"(temp_backup_reg) \ : "memory" \ ); \ else \ __asm__ volatile ( \ - "gslqc1 $f22, $f20, 0x00(%[temp]) \n\t" \ - "gslqc1 $f26, $f24, 0x10(%[temp]) \n\t" \ - "gslqc1 $f30, $f28, 0x20(%[temp]) \n\t" \ + MMI_LQC1($f22, $f20, %[temp], 0x00) \ + MMI_LQC1($f26, $f24, %[temp], 0x10) \ + MMI_LQC1($f30, $f28, %[temp], 0x20) \ : \ : [temp]"r"(temp_backup_reg) \ : "memory" \ From patchwork Fri Jul 23 05:53:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 29025 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a5d:965a:0:0:0:0:0 with SMTP id d26csp1118965ios; Thu, 22 Jul 2021 22:54:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwEN0gmlNZ5jXU2mUc9u1qrAzwTv3gA70aUR6xm5ZbMpIL8xynB45Hv4JLSXmT6ojeiCqdT X-Received: by 2002:a05:6402:b8f:: with SMTP id cf15mr3618035edb.286.1627019696345; Thu, 22 Jul 2021 22:54:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1627019696; cv=none; d=google.com; s=arc-20160816; b=AmYpvE7wFPCdLtuPpLtNoTaITzVsq2TH+3DyPwVAKhxy+OWOtAM1+VCDJ2wvij35jO 7+3gX+JzCHNC/PFRhaRu/o1xC6AacA2cV4LwSpRngOaTanPDhF05WxF3nDdAcem6EBPb XNzFElCWU1wGQRYFtg9UUA33zdrIhKI7oMMlLi3M00OIFr028Vk71bDclo0g0ePP1j1C LVY3j5PhTHFnoKti1MkcnvJ5fs6m88tZnP5teO03qV+JDjwleSzaAp68NsWOFCQuV9MO 3mslWEPYFfSznrtuQfGq2wa0e8yRBJUb1qA5PL8nQvAXXyG6vtrYB5ks1TVqrYnHe63A WFFg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:dkim-signature:delivered-to; bh=XEeHMXxQm78Gidy+wrd1FftwAyB+SYLPuN0IOK3j/FA=; 
b=ky+0lKnbMCCVkujHnjFiB5gfZyvEHlsyw0X/MrBSBlIK92t5l+CKPYEaJE0dOqSse8 zH+JyL8MBLx5aYmivU7gZeaxcMiXmPY7HTPckXMp5ZHvR4JyVP+g1E0njIMHoBNMTpoK P3iDzDh0tVhC0d0zvB5k6MSbuwl5MrTsxJjJBCG/VqeyZ/vFS2oMVEmbURYGcZLgFm1w QBjcnMteYmys+dwyHKLWyDDrHiEcvNpiUE7Ar4HyqWq4bJgfEpPUJsP8QOGg2ADdgOLY n0AUq1jFolHv7xEuFc8umpPFMNNfO3upb+3kRImn/a/qgDz/6rEMp6llHG7pOlK4aSIt jQCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=fm2 header.b=CoRTpg++; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm3 header.b=lrBQp+jF; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id b5si1452061ejg.662.2021.07.22.22.54.55; Thu, 22 Jul 2021 22:54:56 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=fm2 header.b=CoRTpg++; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm3 header.b=lrBQp+jF; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 30CAD68ACA5; Fri, 23 Jul 2021 08:54:38 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 7A37568AAA6 for ; Fri, 23 Jul 2021 08:54:31 +0300 (EEST) Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by 
mailout.nyi.internal (Postfix) with ESMTP id 93CA45C00E9; Fri, 23 Jul 2021 01:54:30 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute2.internal (MEProxy); Fri, 23 Jul 2021 01:54:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=flygoat.com; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm2; bh=Ek8IA2E+8KrOT jouQFVLMa6dtBz2BwGNZptAHJyd+TE=; b=CoRTpg++ihzBIuDD3wVNpiWQ1ElzE upaHMsGwlgdLXqT6CoYuOJRxmlzm9s3+M4b/mWy2Z9+425U3GdGDv197INVM49gR F09gaCXr3H+CNTkPlZkvTfVriF58rmU1iL+7EnoqllphvuDp1aqriGqV6eWD387B 2eBCNxPhC4DdDgX+x1TfLBUC32wREuYPhsuKo/RO5iSfvAVGp+QyGXzjv8Ktdy6B trguXljhuKPfS710K/yKZ59Xv2L5AGdM/ji18Wk9FwiE6G+siwFB/jpPLVwPR7V2 DT5iO+/r54k4xI4jSBAHQxtccjKaquK3z7xzoUpHdw10EMZLQ57N6xAKw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=Ek8IA2E+8KrOTjouQFVLMa6dtBz2BwGNZptAHJyd+TE=; b=lrBQp+jF KpKS3rGXF7bBwmAMeYT9n4L1lVpdh2OwVsHB1Ea6khxhqbOwH61dV4M/XRKptg8X /gdBDMwtZyyfe4dIzmUSm+/KUHzfztjqPK62LfANmiCRmAdH4cYJxFKvO1DlnomG ug2l2XWupGpLJpgJbFS5ALuRQRbR8m2bjT4cESdX2oS/WdiTMoCxBeqUjNqxiuCH 5MTy8ekLu1/J/uyZTga+k9AwWNAN1hLuckzSabqCUym8YsLfM4rhqWxA7lkCxo44 gyNxZbB+qIuiqyvTqlTahuxLtBXWhI7kzMGbLCeRVEcnHU9GI+5VD7Rb50P3Zm/t MzWdPIcmekJO3A== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvtddrfeejgdeliecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeflihgrgihunhcujggrnhhguceojhhirgiguhhnrdihrghnghes fhhlhihgohgrthdrtghomheqnecuggftrfgrthhtvghrnhepjeeihffgteelkeelffduke dtheevudejvdegkeekjeefhffhhfetudetgfdtffeunecuvehluhhsthgvrhfuihiivgep tdenucfrrghrrghmpehmrghilhhfrhhomhepjhhirgiguhhnrdihrghnghesfhhlhihgoh grthdrtghomh 
X-ME-Proxy: Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 23 Jul 2021 01:54:28 -0400 (EDT) From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Fri, 23 Jul 2021 13:53:42 +0800 Message-Id: <20210723055344.21961-3-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210723055344.21961-1-jiaxun.yang@flygoat.com> References: <20210723055344.21961-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 2/4] avcodec/mips: Use MMI marcos to replace Loongson3 instructions X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: yinshiyou-hf@loongson.cn, Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: xIzzfYlPbjSi Loongson3's extension instructions (prefixed with gs) are widely used in our MMI codebase. However, these instructions are not available on Loongson-2E/F, while MMI code should work on these processors. Previously we introduced the mmiutils macros to provide backward compatibility, but newly committed code didn't follow that. In this patch I revised the codebase and converted all these instructions into MMI macros to get Loongson2 supported again.
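As a rough illustration of the approach described above (a sketch, not the actual mmiutils.h definitions), one unaligned 64-bit load can be wrapped in a macro that expands to the gs-prefixed pair on Loongson3 and to a generic sequence elsewhere. SKETCH_HAVE_LOONGSON3, SKETCH_ULDC1 and the two helper functions are hypothetical names. The fallback sequence (uld into a scratch register named %[all64], then dmtc1) is an assumption about how such a fallback could look. The C below only builds and inspects the assembly text, so it compiles on any host:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical switch for this sketch only; a real build would rely on
 * its own configure-time feature macros. */
#ifndef SKETCH_HAVE_LOONGSON3
#define SKETCH_HAVE_LOONGSON3 0
#endif

#if SKETCH_HAVE_LOONGSON3
/* Loongson3: gs-prefixed load-left/load-right pair into an FPU register. */
#define SKETCH_ULDC1(fp, addr, bias)                 \
    "gsldlc1 " #fp ", 7+" #bias "(" #addr ") \n\t"   \
    "gsldrc1 " #fp ", " #bias "(" #addr ") \n\t"
#else
/* Assumed fallback: unaligned load into a scratch GPR operand named
 * %[all64], then a move into the FPU register. */
#define SKETCH_ULDC1(fp, addr, bias)                 \
    "uld %[all64], " #bias "(" #addr ") \n\t"        \
    "dmtc1 %[all64], " #fp " \n\t"
#endif

/* Returns the assembly text that one call site would paste into asm(). */
const char *sketch_uldc1_text(void)
{
    return SKETCH_ULDC1(%[ftmp3], %[src], 0x00);
}

/* Reports whether the text still contains a gs-prefixed unaligned load. */
int sketch_uses_gs_load(const char *asm_text)
{
    return strstr(asm_text, "gsld") != NULL;
}

/* Prints the expansion so the two build variants can be compared. */
void sketch_print_expansion(FILE *out)
{
    fputs(sketch_uldc1_text(), out);
    fputc('\n', out);
}
```

Call sites then name the operation rather than the instruction. This would also explain why the conversions in this patch add DECLARE_VAR_ALL64 and RESTRICT_ASM_ALL64 wherever a macro may need a scratch register.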
Signed-off-by: Jiaxun Yang Reviewed-by: Shiyou Yin --- libavcodec/mips/h264chroma_mmi.c | 28 +++- libavcodec/mips/h264dsp_mmi.c | 8 +- libavcodec/mips/hevcdsp_mmi.c | 251 ++++++++++++------------------ libavcodec/mips/hpeldsp_mmi.c | 1 + libavcodec/mips/simple_idct_mmi.c | 49 +++--- libavcodec/mips/vp3dsp_idct_mmi.c | 11 +- libavcodec/mips/vp8dsp_mmi.c | 100 +++++------- libavcodec/mips/vp9_mc_mmi.c | 128 ++++++--------- 8 files changed, 247 insertions(+), 329 deletions(-) diff --git a/libavcodec/mips/h264chroma_mmi.c b/libavcodec/mips/h264chroma_mmi.c index cc2d7cb7e9..ec35c5a72e 100644 --- a/libavcodec/mips/h264chroma_mmi.c +++ b/libavcodec/mips/h264chroma_mmi.c @@ -31,6 +31,8 @@ void ff_put_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, { double ftmp[12]; union mmi_intfloat64 A, B, C, D, E; + DECLARE_VAR_ALL64; + A.i = 64; if (!(x || y)) { @@ -57,7 +59,8 @@ void ff_put_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, MMI_SDC1(%[ftmp3], %[dst], 0x00) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" "bnez %[h], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [dst]"+&r"(dst), [src]"+&r"(src), [h]"+&r"(h) @@ -151,7 +154,8 @@ void ff_put_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, MMI_SDC1(%[ftmp3], %[dst], 0x00) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" "bnez %[h], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), @@ -201,7 +205,8 @@ void ff_put_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, MMI_SDC1(%[ftmp1], %[dst], 0x00) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" "bnez %[h], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 
+ [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), @@ -268,7 +273,8 @@ void ff_put_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, MMI_SDC1(%[ftmp2], %[dst], 0x00) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" "bnez %[h], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), @@ -288,6 +294,8 @@ void ff_avg_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, { double ftmp[10]; union mmi_intfloat64 A, B, C, D, E; + DECLARE_VAR_ALL64; + A.i = 64; if(!(x || y)){ @@ -310,7 +318,8 @@ void ff_avg_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, PTR_ADDU "%[dst], %[dst], %[stride] \n\t" "addi %[h], %[h], -0x02 \n\t" "bnez %[h], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [dst]"+&r"(dst), [src]"+&r"(src), [h]"+&r"(h) @@ -373,7 +382,8 @@ void ff_avg_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, MMI_SDC1(%[ftmp1], %[dst], 0x00) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" "bnez %[h], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), @@ -423,7 +433,8 @@ void ff_avg_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, MMI_SDC1(%[ftmp1], %[dst], 0x00) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" "bnez %[h], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [ftmp0]"=&f"(ftmp[0]), 
[ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), @@ -471,7 +482,8 @@ void ff_avg_h264_chroma_mc8_mmi(uint8_t *dst, uint8_t *src, ptrdiff_t stride, MMI_SDC1(%[ftmp1], %[dst], 0x00) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" "bnez %[h], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), diff --git a/libavcodec/mips/h264dsp_mmi.c b/libavcodec/mips/h264dsp_mmi.c index 6e77995523..b5ab07c863 100644 --- a/libavcodec/mips/h264dsp_mmi.c +++ b/libavcodec/mips/h264dsp_mmi.c @@ -40,8 +40,8 @@ void ff_h264_add_pixels4_8_mmi(uint8_t *dst, int16_t *src, int stride) MMI_LDC1(%[ftmp3], %[src], 0x10) MMI_LDC1(%[ftmp4], %[src], 0x18) /* memset(src, 0, 32); */ - "gssqc1 %[ftmp0], %[ftmp0], 0x00(%[src]) \n\t" - "gssqc1 %[ftmp0], %[ftmp0], 0x10(%[src]) \n\t" + MMI_SQC1(%[ftmp0], %[ftmp0], %[src], 0x00) + MMI_SQC1(%[ftmp0], %[ftmp0], %[src], 0x10) MMI_ULWC1(%[ftmp5], %[dst0], 0x00) MMI_ULWC1(%[ftmp6], %[dst1], 0x00) MMI_ULWC1(%[ftmp7], %[dst2], 0x00) @@ -90,8 +90,8 @@ void ff_h264_idct_add_8_mmi(uint8_t *dst, int16_t *block, int stride) MMI_LDC1(%[ftmp3], %[block], 0x18) /* memset(block, 0, 32) */ "pxor %[ftmp4], %[ftmp4], %[ftmp4] \n\t" - "gssqc1 %[ftmp4], %[ftmp4], 0x00(%[block]) \n\t" - "gssqc1 %[ftmp4], %[ftmp4], 0x10(%[block]) \n\t" + MMI_SQC1(%[ftmp4], %[ftmp4], %[block], 0x00) + MMI_SQC1(%[ftmp4], %[ftmp4], %[block], 0x10) "dli %[tmp0], 0x01 \n\t" "mtc1 %[tmp0], %[ftmp8] \n\t" "dli %[tmp0], 0x06 \n\t" diff --git a/libavcodec/mips/hevcdsp_mmi.c b/libavcodec/mips/hevcdsp_mmi.c index 87fc2555a4..6583bef5da 100644 --- a/libavcodec/mips/hevcdsp_mmi.c +++ b/libavcodec/mips/hevcdsp_mmi.c @@ -35,6 +35,7 @@ void ff_hevc_put_hevc_qpel_h##w##_8_mmi(int16_t *dst, 
uint8_t *_src, \ double ftmp[15]; \ uint64_t rtmp[1]; \ const int8_t *filter = ff_hevc_qpel_filters[mx - 1]; \ + DECLARE_VAR_ALL64; \ \ x = x_step; \ y = height; \ @@ -50,14 +51,10 @@ void ff_hevc_put_hevc_qpel_h##w##_8_mmi(int16_t *dst, uint8_t *_src, \ \ "1: \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[src]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[src]) \n\t" \ - "gsldlc1 %[ftmp4], 0x08(%[src]) \n\t" \ - "gsldrc1 %[ftmp4], 0x01(%[src]) \n\t" \ - "gsldlc1 %[ftmp5], 0x09(%[src]) \n\t" \ - "gsldrc1 %[ftmp5], 0x02(%[src]) \n\t" \ - "gsldlc1 %[ftmp6], 0x0a(%[src]) \n\t" \ - "gsldrc1 %[ftmp6], 0x03(%[src]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[src], 0x00) \ + MMI_ULDC1(%[ftmp4], %[src], 0x01) \ + MMI_ULDC1(%[ftmp5], %[src], 0x02) \ + MMI_ULDC1(%[ftmp6], %[src], 0x03) \ "punpcklbh %[ftmp7], %[ftmp3], %[ftmp0] \n\t" \ "punpckhbh %[ftmp8], %[ftmp3], %[ftmp0] \n\t" \ "pmullh %[ftmp7], %[ftmp7], %[ftmp1] \n\t" \ @@ -83,8 +80,7 @@ void ff_hevc_put_hevc_qpel_h##w##_8_mmi(int16_t *dst, uint8_t *_src, \ "paddh %[ftmp3], %[ftmp3], %[ftmp4] \n\t" \ "paddh %[ftmp5], %[ftmp5], %[ftmp6] \n\t" \ "paddh %[ftmp3], %[ftmp3], %[ftmp5] \n\t" \ - "gssdlc1 %[ftmp3], 0x07(%[dst]) \n\t" \ - "gssdrc1 %[ftmp3], 0x00(%[dst]) \n\t" \ + MMI_USDC1(%[ftmp3], %[dst], 0x00) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src], %[src], 0x04 \n\t" \ @@ -98,7 +94,8 @@ void ff_hevc_put_hevc_qpel_h##w##_8_mmi(int16_t *dst, uint8_t *_src, \ PTR_ADDU "%[src], %[src], %[stride] \n\t" \ PTR_ADDIU "%[dst], %[dst], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -134,6 +131,7 @@ void ff_hevc_put_hevc_qpel_hv##w##_8_mmi(int16_t *dst, uint8_t *_src, \ int16_t *tmp = tmp_array; \ double ftmp[15]; \ uint64_t rtmp[1]; \ + DECLARE_VAR_ALL64; \ \ src -= 
(QPEL_EXTRA_BEFORE * srcstride + 3); \ filter = ff_hevc_qpel_filters[mx - 1]; \ @@ -151,14 +149,10 @@ void ff_hevc_put_hevc_qpel_hv##w##_8_mmi(int16_t *dst, uint8_t *_src, \ \ "1: \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[src]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[src]) \n\t" \ - "gsldlc1 %[ftmp4], 0x08(%[src]) \n\t" \ - "gsldrc1 %[ftmp4], 0x01(%[src]) \n\t" \ - "gsldlc1 %[ftmp5], 0x09(%[src]) \n\t" \ - "gsldrc1 %[ftmp5], 0x02(%[src]) \n\t" \ - "gsldlc1 %[ftmp6], 0x0a(%[src]) \n\t" \ - "gsldrc1 %[ftmp6], 0x03(%[src]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[src], 0x00) \ + MMI_ULDC1(%[ftmp4], %[src], 0x01) \ + MMI_ULDC1(%[ftmp5], %[src], 0x02) \ + MMI_ULDC1(%[ftmp6], %[src], 0x03) \ "punpcklbh %[ftmp7], %[ftmp3], %[ftmp0] \n\t" \ "punpckhbh %[ftmp8], %[ftmp3], %[ftmp0] \n\t" \ "pmullh %[ftmp7], %[ftmp7], %[ftmp1] \n\t" \ @@ -184,8 +178,7 @@ void ff_hevc_put_hevc_qpel_hv##w##_8_mmi(int16_t *dst, uint8_t *_src, \ "paddh %[ftmp3], %[ftmp3], %[ftmp4] \n\t" \ "paddh %[ftmp5], %[ftmp5], %[ftmp6] \n\t" \ "paddh %[ftmp3], %[ftmp3], %[ftmp5] \n\t" \ - "gssdlc1 %[ftmp3], 0x07(%[tmp]) \n\t" \ - "gssdrc1 %[ftmp3], 0x00(%[tmp]) \n\t" \ + MMI_USDC1(%[ftmp3], %[tmp], 0x00) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src], %[src], 0x04 \n\t" \ @@ -199,7 +192,8 @@ void ff_hevc_put_hevc_qpel_hv##w##_8_mmi(int16_t *dst, uint8_t *_src, \ PTR_ADDU "%[src], %[src], %[stride] \n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -228,29 +222,21 @@ void ff_hevc_put_hevc_qpel_hv##w##_8_mmi(int16_t *dst, uint8_t *_src, \ \ "1: \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" 
\ - "gsldlc1 %[ftmp4], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp4], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp4], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp5], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp5], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp5], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp6], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp6], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp6], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp7], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp7], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp7], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp8], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp8], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp8], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp9], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp9], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp9], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp10], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp10], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp10], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], -0x380 \n\t" \ TRANSPOSE_4H(%[ftmp3], %[ftmp4], %[ftmp5], %[ftmp6], \ %[ftmp11], %[ftmp12], %[ftmp13], %[ftmp14]) \ @@ -275,8 +261,7 @@ void ff_hevc_put_hevc_qpel_hv##w##_8_mmi(int16_t *dst, uint8_t *_src, \ "paddw %[ftmp5], %[ftmp5], %[ftmp6] \n\t" \ "psraw %[ftmp5], %[ftmp5], %[ftmp0] \n\t" \ "packsswh %[ftmp3], %[ftmp3], %[ftmp5] \n\t" \ - "gssdlc1 %[ftmp3], 0x07(%[dst]) \n\t" \ - "gssdrc1 %[ftmp3], 0x00(%[dst]) \n\t" \ + MMI_USDC1(%[ftmp3], %[dst], 0x00) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[dst], %[dst], 0x08 \n\t" \ @@ -290,7 +275,8 @@ void ff_hevc_put_hevc_qpel_hv##w##_8_mmi(int16_t *dst, uint8_t *_src, \ PTR_ADDIU "%[dst], %[dst], 0x80 \n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ 
[ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -333,6 +319,8 @@ void ff_hevc_put_hevc_qpel_bi_h##w##_8_mmi(uint8_t *_dst, \ uint64_t rtmp[1]; \ union av_intfloat64 shift; \ union av_intfloat64 offset; \ + DECLARE_VAR_ALL64; \ + DECLARE_VAR_LOW32; \ shift.i = 7; \ offset.i = 64; \ \ @@ -353,14 +341,10 @@ void ff_hevc_put_hevc_qpel_bi_h##w##_8_mmi(uint8_t *_dst, \ "1: \n\t" \ "li %[x], " #x_step " \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[src]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[src]) \n\t" \ - "gsldlc1 %[ftmp4], 0x08(%[src]) \n\t" \ - "gsldrc1 %[ftmp4], 0x01(%[src]) \n\t" \ - "gsldlc1 %[ftmp5], 0x09(%[src]) \n\t" \ - "gsldrc1 %[ftmp5], 0x02(%[src]) \n\t" \ - "gsldlc1 %[ftmp6], 0x0a(%[src]) \n\t" \ - "gsldrc1 %[ftmp6], 0x03(%[src]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[src], 0x00) \ + MMI_ULDC1(%[ftmp4], %[src], 0x01) \ + MMI_ULDC1(%[ftmp5], %[src], 0x02) \ + MMI_ULDC1(%[ftmp6], %[src], 0x03) \ "punpcklbh %[ftmp7], %[ftmp3], %[ftmp0] \n\t" \ "punpckhbh %[ftmp8], %[ftmp3], %[ftmp0] \n\t" \ "pmullh %[ftmp7], %[ftmp7], %[ftmp1] \n\t" \ @@ -387,8 +371,7 @@ void ff_hevc_put_hevc_qpel_bi_h##w##_8_mmi(uint8_t *_dst, \ "paddh %[ftmp5], %[ftmp5], %[ftmp6] \n\t" \ "paddh %[ftmp3], %[ftmp3], %[ftmp5] \n\t" \ "paddh %[ftmp3], %[ftmp3], %[offset] \n\t" \ - "gsldlc1 %[ftmp4], 0x07(%[src2]) \n\t" \ - "gsldrc1 %[ftmp4], 0x00(%[src2]) \n\t" \ + MMI_ULDC1(%[ftmp4], %[src2], 0x00) \ "li %[rtmp0], 0x10 \n\t" \ "dmtc1 %[rtmp0], %[ftmp8] \n\t" \ "punpcklhw %[ftmp5], %[ftmp0], %[ftmp3] \n\t" \ @@ -407,8 +390,7 @@ void ff_hevc_put_hevc_qpel_bi_h##w##_8_mmi(uint8_t *_dst, \ "pcmpgth %[ftmp7], %[ftmp5], %[ftmp0] \n\t" \ "pand %[ftmp3], %[ftmp5], %[ftmp7] \n\t" \ "packushb %[ftmp3], %[ftmp3], %[ftmp3] \n\t" \ - "gsswlc1 %[ftmp3], 0x03(%[dst]) \n\t" \ - "gsswrc1 %[ftmp3], 0x00(%[dst]) \n\t" \ + MMI_USWC1(%[ftmp3], %[dst], 0x00) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src], %[src], 0x04 
\n\t" \ @@ -424,7 +406,8 @@ void ff_hevc_put_hevc_qpel_bi_h##w##_8_mmi(uint8_t *_dst, \ PTR_ADDU "%[dst], %[dst], %[dst_stride] \n\t" \ PTR_ADDIU "%[src2], %[src2], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 RESTRICT_ASM_LOW32 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -469,6 +452,8 @@ void ff_hevc_put_hevc_qpel_bi_hv##w##_8_mmi(uint8_t *_dst, \ uint64_t rtmp[1]; \ union av_intfloat64 shift; \ union av_intfloat64 offset; \ + DECLARE_VAR_ALL64; \ + DECLARE_VAR_LOW32; \ shift.i = 7; \ offset.i = 64; \ \ @@ -488,14 +473,10 @@ void ff_hevc_put_hevc_qpel_bi_hv##w##_8_mmi(uint8_t *_dst, \ \ "1: \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[src]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[src]) \n\t" \ - "gsldlc1 %[ftmp4], 0x08(%[src]) \n\t" \ - "gsldrc1 %[ftmp4], 0x01(%[src]) \n\t" \ - "gsldlc1 %[ftmp5], 0x09(%[src]) \n\t" \ - "gsldrc1 %[ftmp5], 0x02(%[src]) \n\t" \ - "gsldlc1 %[ftmp6], 0x0a(%[src]) \n\t" \ - "gsldrc1 %[ftmp6], 0x03(%[src]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[src], 0x00) \ + MMI_ULDC1(%[ftmp4], %[src], 0x01) \ + MMI_ULDC1(%[ftmp5], %[src], 0x02) \ + MMI_ULDC1(%[ftmp6], %[src], 0x03) \ "punpcklbh %[ftmp7], %[ftmp3], %[ftmp0] \n\t" \ "punpckhbh %[ftmp8], %[ftmp3], %[ftmp0] \n\t" \ "pmullh %[ftmp7], %[ftmp7], %[ftmp1] \n\t" \ @@ -521,8 +502,7 @@ void ff_hevc_put_hevc_qpel_bi_hv##w##_8_mmi(uint8_t *_dst, \ "paddh %[ftmp3], %[ftmp3], %[ftmp4] \n\t" \ "paddh %[ftmp5], %[ftmp5], %[ftmp6] \n\t" \ "paddh %[ftmp3], %[ftmp3], %[ftmp5] \n\t" \ - "gssdlc1 %[ftmp3], 0x07(%[tmp]) \n\t" \ - "gssdrc1 %[ftmp3], 0x00(%[tmp]) \n\t" \ + MMI_USDC1(%[ftmp3], %[tmp], 0x00) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src], %[src], 0x04 \n\t" \ @@ -536,7 +516,8 @@ void ff_hevc_put_hevc_qpel_bi_hv##w##_8_mmi(uint8_t *_dst, \ PTR_ADDU "%[src], %[src], %[stride] 
\n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -567,29 +548,21 @@ void ff_hevc_put_hevc_qpel_bi_hv##w##_8_mmi(uint8_t *_dst, \ "1: \n\t" \ "li %[x], " #x_step " \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp4], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp4], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp4], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp5], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp5], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp5], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp6], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp6], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp6], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp7], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp7], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp7], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp8], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp8], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp8], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp9], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp9], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp9], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp10], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp10], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp10], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], -0x380 \n\t" \ TRANSPOSE_4H(%[ftmp3], %[ftmp4], %[ftmp5], %[ftmp6], \ %[ftmp11], %[ftmp12], %[ftmp13], %[ftmp14]) \ @@ -614,8 +587,7 @@ void ff_hevc_put_hevc_qpel_bi_hv##w##_8_mmi(uint8_t *_dst, \ "paddw %[ftmp5], %[ftmp5], %[ftmp6] 
\n\t" \ "psraw %[ftmp5], %[ftmp5], %[ftmp0] \n\t" \ "packsswh %[ftmp3], %[ftmp3], %[ftmp5] \n\t" \ - "gsldlc1 %[ftmp4], 0x07(%[src2]) \n\t" \ - "gsldrc1 %[ftmp4], 0x00(%[src2]) \n\t" \ + MMI_ULDC1(%[ftmp4], %[src2], 0x00) \ "pxor %[ftmp7], %[ftmp7], %[ftmp7] \n\t" \ "li %[rtmp0], 0x10 \n\t" \ "dmtc1 %[rtmp0], %[ftmp8] \n\t" \ @@ -637,8 +609,7 @@ void ff_hevc_put_hevc_qpel_bi_hv##w##_8_mmi(uint8_t *_dst, \ "pcmpgth %[ftmp7], %[ftmp5], %[ftmp7] \n\t" \ "pand %[ftmp3], %[ftmp5], %[ftmp7] \n\t" \ "packushb %[ftmp3], %[ftmp3], %[ftmp3] \n\t" \ - "gsswlc1 %[ftmp3], 0x03(%[dst]) \n\t" \ - "gsswrc1 %[ftmp3], 0x00(%[dst]) \n\t" \ + MMI_USWC1(%[ftmp3], %[dst], 0x00) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src2], %[src2], 0x08 \n\t" \ @@ -654,7 +625,8 @@ void ff_hevc_put_hevc_qpel_bi_hv##w##_8_mmi(uint8_t *_dst, \ PTR_ADDU "%[dst], %[dst], %[stride] \n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 RESTRICT_ASM_LOW32 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -700,6 +672,8 @@ void ff_hevc_put_hevc_epel_bi_hv##w##_8_mmi(uint8_t *_dst, \ uint64_t rtmp[1]; \ union av_intfloat64 shift; \ union av_intfloat64 offset; \ + DECLARE_VAR_ALL64; \ + DECLARE_VAR_LOW32; \ shift.i = 7; \ offset.i = 64; \ \ @@ -716,14 +690,10 @@ void ff_hevc_put_hevc_epel_bi_hv##w##_8_mmi(uint8_t *_dst, \ \ "1: \n\t" \ "2: \n\t" \ - "gslwlc1 %[ftmp2], 0x03(%[src]) \n\t" \ - "gslwrc1 %[ftmp2], 0x00(%[src]) \n\t" \ - "gslwlc1 %[ftmp3], 0x04(%[src]) \n\t" \ - "gslwrc1 %[ftmp3], 0x01(%[src]) \n\t" \ - "gslwlc1 %[ftmp4], 0x05(%[src]) \n\t" \ - "gslwrc1 %[ftmp4], 0x02(%[src]) \n\t" \ - "gslwlc1 %[ftmp5], 0x06(%[src]) \n\t" \ - "gslwrc1 %[ftmp5], 0x03(%[src]) \n\t" \ + MMI_ULWC1(%[ftmp2], %[src], 0x00) \ + MMI_ULWC1(%[ftmp3], %[src], 0x01) \ + 
MMI_ULWC1(%[ftmp4], %[src], 0x02) \ + MMI_ULWC1(%[ftmp5], %[src], 0x03) \ "punpcklbh %[ftmp2], %[ftmp2], %[ftmp0] \n\t" \ "pmullh %[ftmp2], %[ftmp2], %[ftmp1] \n\t" \ "punpcklbh %[ftmp3], %[ftmp3], %[ftmp0] \n\t" \ @@ -737,8 +707,7 @@ void ff_hevc_put_hevc_epel_bi_hv##w##_8_mmi(uint8_t *_dst, \ "paddh %[ftmp2], %[ftmp2], %[ftmp3] \n\t" \ "paddh %[ftmp4], %[ftmp4], %[ftmp5] \n\t" \ "paddh %[ftmp2], %[ftmp2], %[ftmp4] \n\t" \ - "gssdlc1 %[ftmp2], 0x07(%[tmp]) \n\t" \ - "gssdrc1 %[ftmp2], 0x00(%[tmp]) \n\t" \ + MMI_USDC1(%[ftmp2], %[tmp], 0x00) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src], %[src], 0x04 \n\t" \ @@ -752,7 +721,8 @@ void ff_hevc_put_hevc_epel_bi_hv##w##_8_mmi(uint8_t *_dst, \ PTR_ADDU "%[src], %[src], %[stride] \n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 RESTRICT_ASM_LOW32 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -782,17 +752,13 @@ void ff_hevc_put_hevc_epel_bi_hv##w##_8_mmi(uint8_t *_dst, \ "1: \n\t" \ "li %[x], " #x_step " \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp4], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp4], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp4], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp5], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp5], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp5], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp6], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp6], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp6], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], -0x180 \n\t" \ TRANSPOSE_4H(%[ftmp3], %[ftmp4], %[ftmp5], %[ftmp6], \ %[ftmp7], %[ftmp8], %[ftmp9], %[ftmp10]) \ @@ -807,8 +773,7 @@ void 
ff_hevc_put_hevc_epel_bi_hv##w##_8_mmi(uint8_t *_dst, \ "paddw %[ftmp5], %[ftmp5], %[ftmp6] \n\t" \ "psraw %[ftmp5], %[ftmp5], %[ftmp0] \n\t" \ "packsswh %[ftmp3], %[ftmp3], %[ftmp5] \n\t" \ - "gsldlc1 %[ftmp4], 0x07(%[src2]) \n\t" \ - "gsldrc1 %[ftmp4], 0x00(%[src2]) \n\t" \ + MMI_ULDC1(%[ftmp4], %[src2], 0x00) \ "li %[rtmp0], 0x10 \n\t" \ "dmtc1 %[rtmp0], %[ftmp8] \n\t" \ "punpcklhw %[ftmp5], %[ftmp2], %[ftmp3] \n\t" \ @@ -829,8 +794,7 @@ void ff_hevc_put_hevc_epel_bi_hv##w##_8_mmi(uint8_t *_dst, \ "pcmpgth %[ftmp7], %[ftmp5], %[ftmp2] \n\t" \ "pand %[ftmp3], %[ftmp5], %[ftmp7] \n\t" \ "packushb %[ftmp3], %[ftmp3], %[ftmp3] \n\t" \ - "gsswlc1 %[ftmp3], 0x03(%[dst]) \n\t" \ - "gsswrc1 %[ftmp3], 0x00(%[dst]) \n\t" \ + MMI_USWC1(%[ftmp3], %[dst], 0x0) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src2], %[src2], 0x08 \n\t" \ @@ -846,7 +810,8 @@ void ff_hevc_put_hevc_epel_bi_hv##w##_8_mmi(uint8_t *_dst, \ PTR_ADDU "%[dst], %[dst], %[stride] \n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_LOW32 RESTRICT_ASM_ALL64 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -884,6 +849,7 @@ void ff_hevc_put_hevc_pel_bi_pixels##w##_8_mmi(uint8_t *_dst, \ double ftmp[12]; \ uint64_t rtmp[1]; \ union av_intfloat64 shift; \ + DECLARE_VAR_ALL64; \ shift.i = 7; \ \ y = height; \ @@ -901,12 +867,9 @@ void ff_hevc_put_hevc_pel_bi_pixels##w##_8_mmi(uint8_t *_dst, \ \ "1: \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp5], 0x07(%[src]) \n\t" \ - "gsldrc1 %[ftmp5], 0x00(%[src]) \n\t" \ - "gsldlc1 %[ftmp2], 0x07(%[src2]) \n\t" \ - "gsldrc1 %[ftmp2], 0x00(%[src2]) \n\t" \ - "gsldlc1 %[ftmp3], 0x0f(%[src2]) \n\t" \ - "gsldrc1 %[ftmp3], 0x08(%[src2]) \n\t" \ + MMI_ULDC1(%[ftmp5], %[src], 0x00) \ + MMI_ULDC1(%[ftmp2], %[src2], 0x00) \ + MMI_ULDC1(%[ftmp3], %[src2], 
0x08) \ "punpcklbh %[ftmp4], %[ftmp5], %[ftmp0] \n\t" \ "punpckhbh %[ftmp5], %[ftmp5], %[ftmp0] \n\t" \ "psllh %[ftmp4], %[ftmp4], %[ftmp1] \n\t" \ @@ -940,8 +903,7 @@ void ff_hevc_put_hevc_pel_bi_pixels##w##_8_mmi(uint8_t *_dst, \ "pand %[ftmp2], %[ftmp2], %[ftmp3] \n\t" \ "pand %[ftmp4], %[ftmp4], %[ftmp5] \n\t" \ "packushb %[ftmp2], %[ftmp2], %[ftmp4] \n\t" \ - "gssdlc1 %[ftmp2], 0x07(%[dst]) \n\t" \ - "gssdrc1 %[ftmp2], 0x00(%[dst]) \n\t" \ + MMI_USDC1(%[ftmp2], %[dst], 0x0) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src], %[src], 0x08 \n\t" \ @@ -958,7 +920,8 @@ void ff_hevc_put_hevc_pel_bi_pixels##w##_8_mmi(uint8_t *_dst, \ PTR_ADDU "%[dst], %[dst], %[dststride] \n\t" \ PTR_ADDIU "%[src2], %[src2], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -1000,6 +963,8 @@ void ff_hevc_put_hevc_qpel_uni_hv##w##_8_mmi(uint8_t *_dst, \ uint64_t rtmp[1]; \ union av_intfloat64 shift; \ union av_intfloat64 offset; \ + DECLARE_VAR_ALL64; \ + DECLARE_VAR_LOW32; \ shift.i = 6; \ offset.i = 32; \ \ @@ -1019,14 +984,10 @@ void ff_hevc_put_hevc_qpel_uni_hv##w##_8_mmi(uint8_t *_dst, \ \ "1: \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[src]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[src]) \n\t" \ - "gsldlc1 %[ftmp4], 0x08(%[src]) \n\t" \ - "gsldrc1 %[ftmp4], 0x01(%[src]) \n\t" \ - "gsldlc1 %[ftmp5], 0x09(%[src]) \n\t" \ - "gsldrc1 %[ftmp5], 0x02(%[src]) \n\t" \ - "gsldlc1 %[ftmp6], 0x0a(%[src]) \n\t" \ - "gsldrc1 %[ftmp6], 0x03(%[src]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[src], 0x00) \ + MMI_ULDC1(%[ftmp4], %[src], 0x01) \ + MMI_ULDC1(%[ftmp5], %[src], 0x02) \ + MMI_ULDC1(%[ftmp6], %[src], 0x03) \ "punpcklbh %[ftmp7], %[ftmp3], %[ftmp0] \n\t" \ "punpckhbh %[ftmp8], %[ftmp3], %[ftmp0] \n\t" \ "pmullh %[ftmp7], %[ftmp7], 
%[ftmp1] \n\t" \ @@ -1052,8 +1013,7 @@ void ff_hevc_put_hevc_qpel_uni_hv##w##_8_mmi(uint8_t *_dst, \ "paddh %[ftmp3], %[ftmp3], %[ftmp4] \n\t" \ "paddh %[ftmp5], %[ftmp5], %[ftmp6] \n\t" \ "paddh %[ftmp3], %[ftmp3], %[ftmp5] \n\t" \ - "gssdlc1 %[ftmp3], 0x07(%[tmp]) \n\t" \ - "gssdrc1 %[ftmp3], 0x00(%[tmp]) \n\t" \ + MMI_USDC1(%[ftmp3], %[tmp], 0x0) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[src], %[src], 0x04 \n\t" \ @@ -1067,7 +1027,8 @@ void ff_hevc_put_hevc_qpel_uni_hv##w##_8_mmi(uint8_t *_dst, \ PTR_ADDU "%[src], %[src], %[stride] \n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ @@ -1099,29 +1060,21 @@ void ff_hevc_put_hevc_qpel_uni_hv##w##_8_mmi(uint8_t *_dst, \ "1: \n\t" \ "li %[x], " #x_step " \n\t" \ "2: \n\t" \ - "gsldlc1 %[ftmp3], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp3], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp3], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp4], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp4], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp4], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp5], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp5], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp5], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp6], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp6], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp6], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp7], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp7], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp7], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp8], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp8], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp8], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 
\n\t" \ - "gsldlc1 %[ftmp9], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp9], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp9], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ - "gsldlc1 %[ftmp10], 0x07(%[tmp]) \n\t" \ - "gsldrc1 %[ftmp10], 0x00(%[tmp]) \n\t" \ + MMI_ULDC1(%[ftmp10], %[tmp], 0x00) \ PTR_ADDIU "%[tmp], %[tmp], -0x380 \n\t" \ TRANSPOSE_4H(%[ftmp3], %[ftmp4], %[ftmp5], %[ftmp6], \ %[ftmp11], %[ftmp12], %[ftmp13], %[ftmp14]) \ @@ -1152,8 +1105,7 @@ void ff_hevc_put_hevc_qpel_uni_hv##w##_8_mmi(uint8_t *_dst, \ "pcmpgth %[ftmp7], %[ftmp3], %[ftmp7] \n\t" \ "pand %[ftmp3], %[ftmp3], %[ftmp7] \n\t" \ "packushb %[ftmp3], %[ftmp3], %[ftmp3] \n\t" \ - "gsswlc1 %[ftmp3], 0x03(%[dst]) \n\t" \ - "gsswrc1 %[ftmp3], 0x00(%[dst]) \n\t" \ + MMI_USWC1(%[ftmp3], %[dst], 0x00) \ \ "daddi %[x], %[x], -0x01 \n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x08 \n\t" \ @@ -1166,7 +1118,8 @@ void ff_hevc_put_hevc_qpel_uni_hv##w##_8_mmi(uint8_t *_dst, \ PTR_ADDU "%[dst], %[dst], %[stride] \n\t" \ PTR_ADDIU "%[tmp], %[tmp], 0x80 \n\t" \ "bnez %[y], 1b \n\t" \ - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ + : RESTRICT_ASM_ALL64 RESTRICT_ASM_LOW32 \ + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), \ [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), \ [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), \ [ftmp6]"=&f"(ftmp[6]), [ftmp7]"=&f"(ftmp[7]), \ diff --git a/libavcodec/mips/hpeldsp_mmi.c b/libavcodec/mips/hpeldsp_mmi.c index bf3e4636aa..8e9c0fa821 100644 --- a/libavcodec/mips/hpeldsp_mmi.c +++ b/libavcodec/mips/hpeldsp_mmi.c @@ -307,6 +307,7 @@ inline void ff_put_pixels4_l2_8_mmi(uint8_t *dst, const uint8_t *src1, double ftmp[4]; mips_reg addr[5]; DECLARE_VAR_LOW32; + DECLARE_VAR_ADDRT; __asm__ volatile ( "1: \n\t" diff --git a/libavcodec/mips/simple_idct_mmi.c b/libavcodec/mips/simple_idct_mmi.c index ad068a8251..4680520edc 100644 --- a/libavcodec/mips/simple_idct_mmi.c +++ b/libavcodec/mips/simple_idct_mmi.c @@ -56,6 +56,8 @@ DECLARE_ALIGNED(16, const int16_t, W_arr)[46] = { void 
ff_simple_idct_8_mmi(int16_t *block) { + DECLARE_VAR_ALL64; + BACKUP_REG __asm__ volatile ( @@ -142,20 +144,20 @@ void ff_simple_idct_8_mmi(int16_t *block) /* idctRowCondDC row0~8 */ /* load W */ - "gslqc1 $f19, $f18, 0x00(%[w_arr]) \n\t" - "gslqc1 $f21, $f20, 0x10(%[w_arr]) \n\t" - "gslqc1 $f23, $f22, 0x20(%[w_arr]) \n\t" - "gslqc1 $f25, $f24, 0x30(%[w_arr]) \n\t" - "gslqc1 $f17, $f16, 0x40(%[w_arr]) \n\t" + MMI_LQC1($f19, $f18, %[w_arr], 0x00) + MMI_LQC1($f21, $f20, %[w_arr], 0x10) + MMI_LQC1($f23, $f22, %[w_arr], 0x20) + MMI_LQC1($f25, $f24, %[w_arr], 0x30) + MMI_LQC1($f17, $f16, %[w_arr], 0x40) /* load source in block */ - "gslqc1 $f1, $f0, 0x00(%[block]) \n\t" - "gslqc1 $f3, $f2, 0x10(%[block]) \n\t" - "gslqc1 $f5, $f4, 0x20(%[block]) \n\t" - "gslqc1 $f7, $f6, 0x30(%[block]) \n\t" - "gslqc1 $f9, $f8, 0x40(%[block]) \n\t" - "gslqc1 $f11, $f10, 0x50(%[block]) \n\t" - "gslqc1 $f13, $f12, 0x60(%[block]) \n\t" - "gslqc1 $f15, $f14, 0x70(%[block]) \n\t" + MMI_LQC1($f1, $f0, %[block], 0x00) + MMI_LQC1($f3, $f2, %[block], 0x10) + MMI_LQC1($f5, $f4, %[block], 0x20) + MMI_LQC1($f7, $f6, %[block], 0x30) + MMI_LQC1($f9, $f8, %[block], 0x40) + MMI_LQC1($f11, $f10, %[block], 0x50) + MMI_LQC1($f13, $f12, %[block], 0x60) + MMI_LQC1($f15, $f14, %[block], 0x70) /* $9: mask ; $f17: ROW_SHIFT */ "dmfc1 $9, $f17 \n\t" @@ -253,8 +255,7 @@ void ff_simple_idct_8_mmi(int16_t *block) /* idctSparseCol col0~3 */ /* $f17: ff_p16_32; $f16: COL_SHIFT-16 */ - "gsldlc1 $f17, 0x57(%[w_arr]) \n\t" - "gsldrc1 $f17, 0x50(%[w_arr]) \n\t" + MMI_ULDC1($f17, %[w_arr], 0x50) "li $10, 4 \n\t" "dmtc1 $10, $f16 \n\t" "paddh $f0, $f0, $f17 \n\t" @@ -395,16 +396,16 @@ void ff_simple_idct_8_mmi(int16_t *block) "punpcklwd $f11, $f27, $f29 \n\t" "punpckhwd $f15, $f27, $f29 \n\t" /* Store */ - "gssqc1 $f1, $f0, 0x00(%[block]) \n\t" - "gssqc1 $f5, $f4, 0x10(%[block]) \n\t" - "gssqc1 $f9, $f8, 0x20(%[block]) \n\t" - "gssqc1 $f13, $f12, 0x30(%[block]) \n\t" - "gssqc1 $f3, $f2, 0x40(%[block]) \n\t" - "gssqc1 $f7, 
$f6, 0x50(%[block]) \n\t" - "gssqc1 $f11, $f10, 0x60(%[block]) \n\t" - "gssqc1 $f15, $f14, 0x70(%[block]) \n\t" + MMI_SQC1($f1, $f0, %[block], 0x00) + MMI_SQC1($f5, $f4, %[block], 0x10) + MMI_SQC1($f9, $f8, %[block], 0x20) + MMI_SQC1($f13, $f12, %[block], 0x30) + MMI_SQC1($f3, $f2, %[block], 0x40) + MMI_SQC1($f7, $f6, %[block], 0x50) + MMI_SQC1($f11, $f10, %[block], 0x60) + MMI_SQC1($f15, $f14, %[block], 0x70) - : [block]"+&r"(block) + : RESTRICT_ASM_ALL64 [block]"+&r"(block) : [w_arr]"r"(W_arr) : "memory" ); diff --git a/libavcodec/mips/vp3dsp_idct_mmi.c b/libavcodec/mips/vp3dsp_idct_mmi.c index 0d4cba19ec..f658affd36 100644 --- a/libavcodec/mips/vp3dsp_idct_mmi.c +++ b/libavcodec/mips/vp3dsp_idct_mmi.c @@ -722,6 +722,8 @@ void ff_put_no_rnd_pixels_l2_mmi(uint8_t *dst, const uint8_t *src1, if (h == 8) { double ftmp[6]; uint64_t tmp[2]; + DECLARE_VAR_ALL64; + __asm__ volatile ( "li %[tmp0], 0x08 \n\t" "li %[tmp1], 0xfefefefe \n\t" @@ -730,10 +732,8 @@ void ff_put_no_rnd_pixels_l2_mmi(uint8_t *dst, const uint8_t *src1, "li %[tmp1], 0x01 \n\t" "dmtc1 %[tmp1], %[ftmp5] \n\t" "1: \n\t" - "gsldlc1 %[ftmp1], 0x07(%[src1]) \n\t" - "gsldrc1 %[ftmp1], 0x00(%[src1]) \n\t" - "gsldlc1 %[ftmp2], 0x07(%[src2]) \n\t" - "gsldrc1 %[ftmp2], 0x00(%[src2]) \n\t" + MMI_ULDC1(%[ftmp1], %[src1], 0x0) + MMI_ULDC1(%[ftmp2], %[src2], 0x0) "pxor %[ftmp3], %[ftmp1], %[ftmp2] \n\t" "pand %[ftmp3], %[ftmp3], %[ftmp4] \n\t" "psrlw %[ftmp3], %[ftmp3], %[ftmp5] \n\t" @@ -745,7 +745,8 @@ void ff_put_no_rnd_pixels_l2_mmi(uint8_t *dst, const uint8_t *src1, PTR_ADDU "%[dst], %[dst], %[stride] \n\t" PTR_ADDIU "%[tmp0], %[tmp0], -0x01 \n\t" "bnez %[tmp0], 1b \n\t" - : [dst]"+&r"(dst), [src1]"+&r"(src1), [src2]"+&r"(src2), + : RESTRICT_ASM_ALL64 + [dst]"+&r"(dst), [src1]"+&r"(src1), [src2]"+&r"(src2), [ftmp1]"=&f"(ftmp[0]), [ftmp2]"=&f"(ftmp[1]), [ftmp3]"=&f"(ftmp[2]), [ftmp4]"=&f"(ftmp[3]), [ftmp5]"=&f"(ftmp[4]), [ftmp6]"=&f"(ftmp[5]), [tmp0]"=&r"(tmp[0]), [tmp1]"=&r"(tmp[1]) diff --git 
a/libavcodec/mips/vp8dsp_mmi.c b/libavcodec/mips/vp8dsp_mmi.c index 327eaf561e..bc774aa365 100644 --- a/libavcodec/mips/vp8dsp_mmi.c +++ b/libavcodec/mips/vp8dsp_mmi.c @@ -791,51 +791,40 @@ static av_always_inline void vp8_v_loop_filter8_mmi(uint8_t *dst, DECLARE_DOUBLE_1; DECLARE_DOUBLE_2; DECLARE_UINT32_T; + DECLARE_VAR_ALL64; + __asm__ volatile( /* Get data from dst */ - "gsldlc1 %[q0], 0x07(%[dst]) \n\t" - "gsldrc1 %[q0], 0x00(%[dst]) \n\t" + MMI_ULDC1(%[q0], %[dst], 0x0) PTR_SUBU "%[tmp0], %[dst], %[stride] \n\t" - "gsldlc1 %[p0], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[p0], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[p0], %[tmp0], 0x0) PTR_SUBU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[p1], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[p1], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[p1], %[tmp0], 0x0) PTR_SUBU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[p2], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[p2], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[p2], %[tmp0], 0x0) PTR_SUBU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[p3], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[p3], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[p3], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[dst], %[stride] \n\t" - "gsldlc1 %[q1], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[q1], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[q1], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[q2], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[q2], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[q2], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[q3], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[q3], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[q3], %[tmp0], 0x0) MMI_VP8_LOOP_FILTER /* Move to dst */ - "gssdlc1 %[q0], 0x07(%[dst]) \n\t" - "gssdrc1 %[q0], 0x00(%[dst]) \n\t" + MMI_USDC1(%[q0], %[dst], 0x0) PTR_SUBU "%[tmp0], %[dst], %[stride] \n\t" - "gssdlc1 %[p0], 0x07(%[tmp0]) \n\t" - "gssdrc1 %[p0], 0x00(%[tmp0]) \n\t" + MMI_USDC1(%[p0], %[tmp0], 0x0) PTR_SUBU "%[tmp0], %[tmp0], %[stride] \n\t" - "gssdlc1 %[p1], 0x07(%[tmp0]) \n\t" - "gssdrc1 %[p1], 0x00(%[tmp0]) \n\t" + MMI_USDC1(%[p1], %[tmp0], 0x0) 
PTR_SUBU "%[tmp0], %[tmp0], %[stride] \n\t" - "gssdlc1 %[p2], 0x07(%[tmp0]) \n\t" - "gssdrc1 %[p2], 0x00(%[tmp0]) \n\t" + MMI_USDC1(%[p2], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[dst], %[stride] \n\t" - "gssdlc1 %[q1], 0x07(%[tmp0]) \n\t" - "gssdrc1 %[q1], 0x00(%[tmp0]) \n\t" + MMI_USDC1(%[q1], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gssdlc1 %[q2], 0x07(%[tmp0]) \n\t" - "gssdrc1 %[q2], 0x00(%[tmp0]) \n\t" - : [p3]"=&f"(ftmp[0]), [p2]"=&f"(ftmp[1]), + MMI_USDC1(%[q2], %[tmp0], 0x0) + : RESTRICT_ASM_ALL64 + [p3]"=&f"(ftmp[0]), [p2]"=&f"(ftmp[1]), [p1]"=&f"(ftmp[2]), [p0]"=&f"(ftmp[3]), [q0]"=&f"(ftmp[4]), [q1]"=&f"(ftmp[5]), [q2]"=&f"(ftmp[6]), [q3]"=&f"(ftmp[7]), @@ -876,31 +865,25 @@ static av_always_inline void vp8_h_loop_filter8_mmi(uint8_t *dst, DECLARE_DOUBLE_1; DECLARE_DOUBLE_2; DECLARE_UINT32_T; + DECLARE_VAR_ALL64; + __asm__ volatile( /* Get data from dst */ - "gsldlc1 %[p3], 0x03(%[dst]) \n\t" - "gsldrc1 %[p3], -0x04(%[dst]) \n\t" + MMI_ULDC1(%[p3], %[dst], -0x04) PTR_ADDU "%[tmp0], %[dst], %[stride] \n\t" - "gsldlc1 %[p2], 0x03(%[tmp0]) \n\t" - "gsldrc1 %[p2], -0x04(%[tmp0]) \n\t" + MMI_ULDC1(%[p2], %[tmp0], -0x04) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[p1], 0x03(%[tmp0]) \n\t" - "gsldrc1 %[p1], -0x04(%[tmp0]) \n\t" + MMI_ULDC1(%[p1], %[tmp0], -0x04) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[p0], 0x03(%[tmp0]) \n\t" - "gsldrc1 %[p0], -0x04(%[tmp0]) \n\t" + MMI_ULDC1(%[p0], %[tmp0], -0x04) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[q0], 0x03(%[tmp0]) \n\t" - "gsldrc1 %[q0], -0x04(%[tmp0]) \n\t" + MMI_ULDC1(%[q0], %[tmp0], -0x04) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[q1], 0x03(%[tmp0]) \n\t" - "gsldrc1 %[q1], -0x04(%[tmp0]) \n\t" + MMI_ULDC1(%[q1], %[tmp0], -0x04) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[q2], 0x03(%[tmp0]) \n\t" - "gsldrc1 %[q2], -0x04(%[tmp0]) \n\t" + MMI_ULDC1(%[q2], %[tmp0], -0x04) PTR_ADDU "%[tmp0], %[tmp0], %[stride] \n\t" - "gsldlc1 %[q3], 
0x03(%[tmp0]) \n\t" - "gsldrc1 %[q3], -0x04(%[tmp0]) \n\t" + MMI_ULDC1(%[q3], %[tmp0], -0x04) /* Matrix transpose */ TRANSPOSE_8B(%[p3], %[p2], %[p1], %[p0], %[q0], %[q1], %[q2], %[q3], @@ -911,30 +894,23 @@ static av_always_inline void vp8_h_loop_filter8_mmi(uint8_t *dst, %[q0], %[q1], %[q2], %[q3], %[ftmp1], %[ftmp2], %[ftmp3], %[ftmp4]) /* Move to dst */ - "gssdlc1 %[p3], 0x03(%[dst]) \n\t" - "gssdrc1 %[p3], -0x04(%[dst]) \n\t" + MMI_USDC1(%[p3], %[dst], -0x04) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" - "gssdlc1 %[p2], 0x03(%[dst]) \n\t" - "gssdrc1 %[p2], -0x04(%[dst]) \n\t" + MMI_USDC1(%[p2], %[dst], -0x04) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" - "gssdlc1 %[p1], 0x03(%[dst]) \n\t" - "gssdrc1 %[p1], -0x04(%[dst]) \n\t" + MMI_USDC1(%[p1], %[dst], -0x04) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" - "gssdlc1 %[p0], 0x03(%[dst]) \n\t" - "gssdrc1 %[p0], -0x04(%[dst]) \n\t" + MMI_USDC1(%[p0], %[dst], -0x04) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" - "gssdlc1 %[q0], 0x03(%[dst]) \n\t" - "gssdrc1 %[q0], -0x04(%[dst]) \n\t" + MMI_USDC1(%[q0], %[dst], -0x04) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" - "gssdlc1 %[q1], 0x03(%[dst]) \n\t" - "gssdrc1 %[q1], -0x04(%[dst]) \n\t" + MMI_USDC1(%[q1], %[dst], -0x04) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" - "gssdlc1 %[q2], 0x03(%[dst]) \n\t" - "gssdrc1 %[q2], -0x04(%[dst]) \n\t" + MMI_USDC1(%[q2], %[dst], -0x04) PTR_ADDU "%[dst], %[dst], %[stride] \n\t" - "gssdlc1 %[q3], 0x03(%[dst]) \n\t" - "gssdrc1 %[q3], -0x04(%[dst]) \n\t" - : [p3]"=&f"(ftmp[0]), [p2]"=&f"(ftmp[1]), + MMI_USDC1(%[q3], %[dst], -0x04) + : RESTRICT_ASM_ALL64 + [p3]"=&f"(ftmp[0]), [p2]"=&f"(ftmp[1]), [p1]"=&f"(ftmp[2]), [p0]"=&f"(ftmp[3]), [q0]"=&f"(ftmp[4]), [q1]"=&f"(ftmp[5]), [q2]"=&f"(ftmp[6]), [q3]"=&f"(ftmp[7]), diff --git a/libavcodec/mips/vp9_mc_mmi.c b/libavcodec/mips/vp9_mc_mmi.c index 812f7a6994..495cac3d0b 100644 --- a/libavcodec/mips/vp9_mc_mmi.c +++ b/libavcodec/mips/vp9_mc_mmi.c @@ -77,29 +77,24 @@ static void convolve_horiz_mmi(const uint8_t 
*src, int32_t src_stride, { double ftmp[15]; uint32_t tmp[2]; + DECLARE_VAR_ALL64; src -= 3; src_stride -= w; dst_stride -= w; __asm__ volatile ( "move %[tmp1], %[width] \n\t" "pxor %[ftmp0], %[ftmp0], %[ftmp0] \n\t" - "gsldlc1 %[filter1], 0x07(%[filter]) \n\t" - "gsldrc1 %[filter1], 0x00(%[filter]) \n\t" - "gsldlc1 %[filter2], 0x0f(%[filter]) \n\t" - "gsldrc1 %[filter2], 0x08(%[filter]) \n\t" + MMI_ULDC1(%[filter1], %[filter], 0x00) + MMI_ULDC1(%[filter2], %[filter], 0x08) "li %[tmp0], 0x07 \n\t" "dmtc1 %[tmp0], %[ftmp13] \n\t" "punpcklwd %[ftmp13], %[ftmp13], %[ftmp13] \n\t" "1: \n\t" /* Get 8 data per row */ - "gsldlc1 %[ftmp5], 0x07(%[src]) \n\t" - "gsldrc1 %[ftmp5], 0x00(%[src]) \n\t" - "gsldlc1 %[ftmp7], 0x08(%[src]) \n\t" - "gsldrc1 %[ftmp7], 0x01(%[src]) \n\t" - "gsldlc1 %[ftmp9], 0x09(%[src]) \n\t" - "gsldrc1 %[ftmp9], 0x02(%[src]) \n\t" - "gsldlc1 %[ftmp11], 0x0A(%[src]) \n\t" - "gsldrc1 %[ftmp11], 0x03(%[src]) \n\t" + MMI_ULDC1(%[ftmp5], %[src], 0x00) + MMI_ULDC1(%[ftmp7], %[src], 0x01) + MMI_ULDC1(%[ftmp9], %[src], 0x02) + MMI_ULDC1(%[ftmp11], %[src], 0x03) "punpcklbh %[ftmp4], %[ftmp5], %[ftmp0] \n\t" "punpckhbh %[ftmp5], %[ftmp5], %[ftmp0] \n\t" "punpcklbh %[ftmp6], %[ftmp7], %[ftmp0] \n\t" @@ -127,7 +122,8 @@ static void convolve_horiz_mmi(const uint8_t *src, int32_t src_stride, PTR_ADDU "%[dst], %[dst], %[dst_stride] \n\t" PTR_ADDIU "%[height], %[height], -0x01 \n\t" "bnez %[height], 1b \n\t" - : [srcl]"=&f"(ftmp[0]), [srch]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [srcl]"=&f"(ftmp[0]), [srch]"=&f"(ftmp[1]), [filter1]"=&f"(ftmp[2]), [filter2]"=&f"(ftmp[3]), [ftmp0]"=&f"(ftmp[4]), [ftmp4]"=&f"(ftmp[5]), [ftmp5]"=&f"(ftmp[6]), [ftmp6]"=&f"(ftmp[7]), @@ -153,15 +149,14 @@ static void convolve_vert_mmi(const uint8_t *src, int32_t src_stride, double ftmp[17]; uint32_t tmp[1]; ptrdiff_t addr = src_stride; + DECLARE_VAR_ALL64; src_stride -= w; dst_stride -= w; __asm__ volatile ( "pxor %[ftmp0], %[ftmp0], %[ftmp0] \n\t" - "gsldlc1 %[ftmp4], 0x07(%[filter]) 
\n\t" - "gsldrc1 %[ftmp4], 0x00(%[filter]) \n\t" - "gsldlc1 %[ftmp5], 0x0f(%[filter]) \n\t" - "gsldrc1 %[ftmp5], 0x08(%[filter]) \n\t" + MMI_ULDC1(%[ftmp4], %[filter], 0x00) + MMI_ULDC1(%[ftmp5], %[filter], 0x08) "punpcklwd %[filter10], %[ftmp4], %[ftmp4] \n\t" "punpckhwd %[filter32], %[ftmp4], %[ftmp4] \n\t" "punpcklwd %[filter54], %[ftmp5], %[ftmp5] \n\t" @@ -171,29 +166,21 @@ static void convolve_vert_mmi(const uint8_t *src, int32_t src_stride, "punpcklwd %[ftmp13], %[ftmp13], %[ftmp13] \n\t" "1: \n\t" /* Get 8 data per column */ - "gsldlc1 %[ftmp4], 0x07(%[src]) \n\t" - "gsldrc1 %[ftmp4], 0x00(%[src]) \n\t" + MMI_ULDC1(%[ftmp4], %[src], 0x0) PTR_ADDU "%[tmp0], %[src], %[addr] \n\t" - "gsldlc1 %[ftmp5], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp5], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp5], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp6], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp6], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp6], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp7], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp7], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp7], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp8], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp8], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp8], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp9], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp9], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp9], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp10], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp10], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp10], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp11], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp11], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp11], %[tmp0], 0x0) "punpcklbh %[ftmp4], %[ftmp4], %[ftmp0] \n\t" "punpcklbh %[ftmp5], %[ftmp5], %[ftmp0] \n\t" "punpcklbh %[ftmp6], %[ftmp6], %[ftmp0] \n\t" @@ -221,7 +208,8 @@ static void convolve_vert_mmi(const uint8_t *src, int32_t 
src_stride, PTR_ADDU "%[dst], %[dst], %[dst_stride] \n\t" PTR_ADDIU "%[height], %[height], -0x01 \n\t" "bnez %[height], 1b \n\t" - : [srcl]"=&f"(ftmp[0]), [srch]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [srcl]"=&f"(ftmp[0]), [srch]"=&f"(ftmp[1]), [filter10]"=&f"(ftmp[2]), [filter32]"=&f"(ftmp[3]), [filter54]"=&f"(ftmp[4]), [filter76]"=&f"(ftmp[5]), [ftmp0]"=&f"(ftmp[6]), [ftmp4]"=&f"(ftmp[7]), @@ -247,6 +235,7 @@ static void convolve_avg_horiz_mmi(const uint8_t *src, int32_t src_stride, { double ftmp[15]; uint32_t tmp[2]; + DECLARE_VAR_ALL64; src -= 3; src_stride -= w; dst_stride -= w; @@ -254,23 +243,17 @@ static void convolve_avg_horiz_mmi(const uint8_t *src, int32_t src_stride, __asm__ volatile ( "move %[tmp1], %[width] \n\t" "pxor %[ftmp0], %[ftmp0], %[ftmp0] \n\t" - "gsldlc1 %[filter1], 0x07(%[filter]) \n\t" - "gsldrc1 %[filter1], 0x00(%[filter]) \n\t" - "gsldlc1 %[filter2], 0x0f(%[filter]) \n\t" - "gsldrc1 %[filter2], 0x08(%[filter]) \n\t" + MMI_ULDC1(%[filter1], %[filter], 0x00) + MMI_ULDC1(%[filter2], %[filter], 0x08) "li %[tmp0], 0x07 \n\t" "dmtc1 %[tmp0], %[ftmp13] \n\t" "punpcklwd %[ftmp13], %[ftmp13], %[ftmp13] \n\t" "1: \n\t" /* Get 8 data per row */ - "gsldlc1 %[ftmp5], 0x07(%[src]) \n\t" - "gsldrc1 %[ftmp5], 0x00(%[src]) \n\t" - "gsldlc1 %[ftmp7], 0x08(%[src]) \n\t" - "gsldrc1 %[ftmp7], 0x01(%[src]) \n\t" - "gsldlc1 %[ftmp9], 0x09(%[src]) \n\t" - "gsldrc1 %[ftmp9], 0x02(%[src]) \n\t" - "gsldlc1 %[ftmp11], 0x0A(%[src]) \n\t" - "gsldrc1 %[ftmp11], 0x03(%[src]) \n\t" + MMI_ULDC1(%[ftmp5], %[src], 0x00) + MMI_ULDC1(%[ftmp7], %[src], 0x01) + MMI_ULDC1(%[ftmp9], %[src], 0x02) + MMI_ULDC1(%[ftmp11], %[src], 0x03) "punpcklbh %[ftmp4], %[ftmp5], %[ftmp0] \n\t" "punpckhbh %[ftmp5], %[ftmp5], %[ftmp0] \n\t" "punpcklbh %[ftmp6], %[ftmp7], %[ftmp0] \n\t" @@ -289,8 +272,7 @@ static void convolve_avg_horiz_mmi(const uint8_t *src, int32_t src_stride, "packsswh %[srcl], %[srcl], %[srch] \n\t" "packushb %[ftmp12], %[srcl], %[ftmp0] \n\t" "punpcklbh %[ftmp12], 
%[ftmp12], %[ftmp0] \n\t" - "gsldlc1 %[ftmp4], 0x07(%[dst]) \n\t" - "gsldrc1 %[ftmp4], 0x00(%[dst]) \n\t" + MMI_ULDC1(%[ftmp4], %[dst], 0x0) "punpcklbh %[ftmp4], %[ftmp4], %[ftmp0] \n\t" "paddh %[ftmp12], %[ftmp12], %[ftmp4] \n\t" "li %[tmp0], 0x10001 \n\t" @@ -309,7 +291,8 @@ static void convolve_avg_horiz_mmi(const uint8_t *src, int32_t src_stride, PTR_ADDU "%[dst], %[dst], %[dst_stride] \n\t" PTR_ADDIU "%[height], %[height], -0x01 \n\t" "bnez %[height], 1b \n\t" - : [srcl]"=&f"(ftmp[0]), [srch]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [srcl]"=&f"(ftmp[0]), [srch]"=&f"(ftmp[1]), [filter1]"=&f"(ftmp[2]), [filter2]"=&f"(ftmp[3]), [ftmp0]"=&f"(ftmp[4]), [ftmp4]"=&f"(ftmp[5]), [ftmp5]"=&f"(ftmp[6]), [ftmp6]"=&f"(ftmp[7]), @@ -335,15 +318,14 @@ static void convolve_avg_vert_mmi(const uint8_t *src, int32_t src_stride, double ftmp[17]; uint32_t tmp[1]; ptrdiff_t addr = src_stride; + DECLARE_VAR_ALL64; src_stride -= w; dst_stride -= w; __asm__ volatile ( "pxor %[ftmp0], %[ftmp0], %[ftmp0] \n\t" - "gsldlc1 %[ftmp4], 0x07(%[filter]) \n\t" - "gsldrc1 %[ftmp4], 0x00(%[filter]) \n\t" - "gsldlc1 %[ftmp5], 0x0f(%[filter]) \n\t" - "gsldrc1 %[ftmp5], 0x08(%[filter]) \n\t" + MMI_ULDC1(%[ftmp4], %[filter], 0x00) + MMI_ULDC1(%[ftmp5], %[filter], 0x08) "punpcklwd %[filter10], %[ftmp4], %[ftmp4] \n\t" "punpckhwd %[filter32], %[ftmp4], %[ftmp4] \n\t" "punpcklwd %[filter54], %[ftmp5], %[ftmp5] \n\t" @@ -353,29 +335,21 @@ static void convolve_avg_vert_mmi(const uint8_t *src, int32_t src_stride, "punpcklwd %[ftmp13], %[ftmp13], %[ftmp13] \n\t" "1: \n\t" /* Get 8 data per column */ - "gsldlc1 %[ftmp4], 0x07(%[src]) \n\t" - "gsldrc1 %[ftmp4], 0x00(%[src]) \n\t" + MMI_ULDC1(%[ftmp4], %[src], 0x0) PTR_ADDU "%[tmp0], %[src], %[addr] \n\t" - "gsldlc1 %[ftmp5], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp5], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp5], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp6], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp6], 0x00(%[tmp0]) \n\t" + 
MMI_ULDC1(%[ftmp6], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp7], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp7], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp7], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp8], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp8], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp8], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp9], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp9], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp9], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp10], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp10], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp10], %[tmp0], 0x0) PTR_ADDU "%[tmp0], %[tmp0], %[addr] \n\t" - "gsldlc1 %[ftmp11], 0x07(%[tmp0]) \n\t" - "gsldrc1 %[ftmp11], 0x00(%[tmp0]) \n\t" + MMI_ULDC1(%[ftmp11], %[tmp0], 0x0) "punpcklbh %[ftmp4], %[ftmp4], %[ftmp0] \n\t" "punpcklbh %[ftmp5], %[ftmp5], %[ftmp0] \n\t" "punpcklbh %[ftmp6], %[ftmp6], %[ftmp0] \n\t" @@ -394,8 +368,7 @@ static void convolve_avg_vert_mmi(const uint8_t *src, int32_t src_stride, "packsswh %[srcl], %[srcl], %[srch] \n\t" "packushb %[ftmp12], %[srcl], %[ftmp0] \n\t" "punpcklbh %[ftmp12], %[ftmp12], %[ftmp0] \n\t" - "gsldlc1 %[ftmp4], 0x07(%[dst]) \n\t" - "gsldrc1 %[ftmp4], 0x00(%[dst]) \n\t" + MMI_ULDC1(%[ftmp4], %[dst], 0x00) "punpcklbh %[ftmp4], %[ftmp4], %[ftmp0] \n\t" "paddh %[ftmp12], %[ftmp12], %[ftmp4] \n\t" "li %[tmp0], 0x10001 \n\t" @@ -414,7 +387,8 @@ static void convolve_avg_vert_mmi(const uint8_t *src, int32_t src_stride, PTR_ADDU "%[dst], %[dst], %[dst_stride] \n\t" PTR_ADDIU "%[height], %[height], -0x01 \n\t" "bnez %[height], 1b \n\t" - : [srcl]"=&f"(ftmp[0]), [srch]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [srcl]"=&f"(ftmp[0]), [srch]"=&f"(ftmp[1]), [filter10]"=&f"(ftmp[2]), [filter32]"=&f"(ftmp[3]), [filter54]"=&f"(ftmp[4]), [filter76]"=&f"(ftmp[5]), [ftmp0]"=&f"(ftmp[6]), [ftmp4]"=&f"(ftmp[7]), @@ -439,6 +413,7 @@ static void convolve_avg_mmi(const uint8_t *src, int32_t 
src_stride, { double ftmp[4]; uint32_t tmp[2]; + DECLARE_VAR_ALL64; src_stride -= w; dst_stride -= w; @@ -449,10 +424,8 @@ static void convolve_avg_mmi(const uint8_t *src, int32_t src_stride, "dmtc1 %[tmp0], %[ftmp3] \n\t" "punpcklhw %[ftmp3], %[ftmp3], %[ftmp3] \n\t" "1: \n\t" - "gslwlc1 %[ftmp1], 0x07(%[src]) \n\t" - "gslwrc1 %[ftmp1], 0x00(%[src]) \n\t" - "gslwlc1 %[ftmp2], 0x07(%[dst]) \n\t" - "gslwrc1 %[ftmp2], 0x00(%[dst]) \n\t" + MMI_ULDC1(%[ftmp1], %[src], 0x00) + MMI_ULDC1(%[ftmp2], %[dst], 0x00) "punpcklbh %[ftmp1], %[ftmp1], %[ftmp0] \n\t" "punpcklbh %[ftmp2], %[ftmp2], %[ftmp0] \n\t" "paddh %[ftmp1], %[ftmp1], %[ftmp2] \n\t" @@ -469,7 +442,8 @@ static void convolve_avg_mmi(const uint8_t *src, int32_t src_stride, PTR_ADDU "%[src], %[src], %[src_stride] \n\t" PTR_ADDIU "%[height], %[height], -0x01 \n\t" "bnez %[height], 1b \n\t" - : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), + : RESTRICT_ASM_ALL64 + [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [tmp0]"=&r"(tmp[0]), [tmp1]"=&r"(tmp[1]), [src]"+&r"(src), [dst]"+&r"(dst), From patchwork Fri Jul 23 05:53:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 29024 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a5d:965a:0:0:0:0:0 with SMTP id d26csp1119077ios; Thu, 22 Jul 2021 22:55:07 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy9dG3jCrS8VWdCtxmJwCo2ndWlupqwUKbvyWh+EH76OylbkSgun4sXsTWspRHvEo/B/pUq X-Received: by 2002:a17:906:d182:: with SMTP id c2mr3316125ejz.111.1627019707134; Thu, 22 Jul 2021 22:55:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1627019707; cv=none; d=google.com; s=arc-20160816; b=QF/7u1+oy5LtswlVVlWCHDrbqruTGjqRX8ei59W7oPXGwwfios4UFDU6O6shKqEBP3 dq5oaY4ZCkGLH8vEiVHSIqSamStDK8EGOmpDIeaTc040LGPbjfjsFjjVt0YbBx5lTe9c kAbnED9gby39fAOSXcng6EgoNfyzj3ZT5ip5LdQpNYum0N8EKX0/QODXi0ES/aQtPs0Q 
aogLD23KB2f5hFGoRjoy2C/7DUPINhCM+x1LYmc/iIZ5bRDN8O0t9mLLKJ9JKcl/gHjQ 1L1Hd4+id2ydVXeFpoww1GuudTDYn0FViTZ6RsG4fTg9kIiTzOdSgupdNwFWS4+LcC6E 8kcw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:dkim-signature:delivered-to; bh=DWVfYxCJtXCeZll8sfC86kFN8/k+tbdY7qi2FUpbIs4=; b=Y8+b84RIdEGcO2E30BopaBrv/AOvYCjnTWdgYyK0YxBegEdUgisACT6Ssi5p+cjejW 6+Sa32Zgns96KcQhvtT6t42cNLkmovMgGr/hnZqJsXDkQokjku3jG3mmTrDRWqCUVlf4 TxRXYT3vFEwX0oUQOTOhx5jfd2I3RNQo74I5NNxE1to+bQCySRnsL6AQjpiwbLz3awEL tXw2XKMB+AEyM/i5a6YRul9r6xoJa58gq8fBMa3loxBv0PglgH0R9URuU2ig2R6C+usB iGBN0OZ1OEVOugWlxCCozdyhuukI0/XucA/iIWxX0BeEN6SuSLithp1d9RY5L049N9NW ZRlQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=fm2 header.b=VhclDEbe; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm3 header.b=v7kSvs5c; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id f18si32449338ejx.521.2021.07.22.22.55.06; Thu, 22 Jul 2021 22:55:07 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=fm2 header.b=VhclDEbe; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm3 header.b=v7kSvs5c; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2BBA468ACFD; Fri, 23 Jul 2021 08:54:40 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 8B0E868AC54 for ; Fri, 23 Jul 2021 08:54:33 +0300 (EEST) Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.nyi.internal (Postfix) with ESMTP id A6D1E5C00FD; Fri, 23 Jul 2021 01:54:32 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Fri, 23 Jul 2021 01:54:32 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=flygoat.com; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm2; bh=+tB+pA+d0qted B+Rz6240wwemMCw4SX5ucSN4xqYljU=; b=VhclDEbeR4zxG4BzXpjvbNeryntbp sTf94F/LTMxRnQ3jTUxdx4524Ii2NMbwQMe1BOSYTVowrKFuI0/FvIl1LtYBbM+k SGBNGobmbbxjlAfn4YGWXHdgtEktkQOSwMRuQaSkpaH7XmoJvpqKHFi/x20JCoXH zkGvGTrvl5nQqV1G9WA23erJfw763A6C4gbpOgo4/egr58ZVm+dMoXF8M6UHXbO8 vTWiytm62u3/yEqfszgqRpH1RkCSXMmPPcL9Ob/zrI0dGw/qarwiXHLCrDFUA4LE x3/xn9yrlKb+5HWx8pfGk2D7M8d0vpWQqYfmmYb7NnZStgL8POhL13MkQ== DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=+tB+pA+d0qtedB+Rz6240wwemMCw4SX5ucSN4xqYljU=; b=v7kSvs5c 2X6y6F/1HZIpkzHn3ad57agK256/7B+/7Ey6T5rZAO8KKlE/+TSvQrVVC8mTuVv7 d6Om0GsePxAZvnO9O1jtEjw0as6xN86wrJNtwUYCt4UJQKfja6be0l7ecsuD/aPM 632kC3uuc1PrgEX3HKXHbjlzzk+BiFN9IPqyZN6XeZSeCLygzG87BdA7dUo0wR2i qNH758xnd1uj8y2acccFyRvHeDemD/slxUho7aeK2h0jb8+vBXbaqJWCFR1Mucx2 VBoH2yCZLcHPHckz5dyP+Gsnai5Vkevt5FezTAxYGx0HoT1aGzcj41ual5fgsOab OzzmtL5DjANwcA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvtddrfeejgdeljecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeflihgrgihunhcujggrnhhguceojhhirgiguhhnrdihrghnghes fhhlhihgohgrthdrtghomheqnecuggftrfgrthhtvghrnhepjeeihffgteelkeelffduke dtheevudejvdegkeekjeefhffhhfetudetgfdtffeunecuvehluhhsthgvrhfuihiivgep tdenucfrrghrrghmpehmrghilhhfrhhomhepjhhirgiguhhnrdihrghnghesfhhlhihgoh grthdrtghomh X-ME-Proxy: Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 23 Jul 2021 01:54:30 -0400 (EDT) From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Fri, 23 Jul 2021 13:53:43 +0800 Message-Id: <20210723055344.21961-4-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210723055344.21961-1-jiaxun.yang@flygoat.com> References: <20210723055344.21961-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 3/4] avutil/mips: Use $at as MMI macro temporary register X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: yinshiyou-hf@loongson.cn, Jiaxun Yang 
Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: rTJr5TC1rkdP Some functions exceeded the limit of 30 inline assembly register operands when using the LOONGSON2 version of the MMI macros. We can avoid that by using $at, the register reserved for the assembler, as the temporary register. As none of the instructions used in these macros is a pseudo-instruction, it is safe to use $at here. Signed-off-by: Jiaxun Yang Reviewed-by: Shiyou Yin --- libavutil/mips/mmiutils.h | 108 +++++++++++++++++++++++--------------- 1 file changed, 66 insertions(+), 42 deletions(-) diff --git a/libavutil/mips/mmiutils.h b/libavutil/mips/mmiutils.h index 41715c6490..7991e14e84 100644 --- a/libavutil/mips/mmiutils.h +++ b/libavutil/mips/mmiutils.h @@ -29,74 +29,103 @@ #include "libavutil/mem_internal.h" #include "libavutil/mips/asmdefs.h" -#if HAVE_LOONGSON2 +/* + * These were used to define temporary registers for MMI macros + * however now we're using $at. They're theoretically unnecessary + * but just leave them here to avoid mess. 
+ */ +#define DECLARE_VAR_LOW32 +#define RESTRICT_ASM_LOW32 +#define DECLARE_VAR_ALL64 +#define RESTRICT_ASM_ALL64 +#define DECLARE_VAR_ADDRT +#define RESTRICT_ASM_ADDRT -#define DECLARE_VAR_LOW32 int32_t low32 -#define RESTRICT_ASM_LOW32 [low32]"=&r"(low32), -#define DECLARE_VAR_ALL64 int64_t all64 -#define RESTRICT_ASM_ALL64 [all64]"=&r"(all64), -#define DECLARE_VAR_ADDRT mips_reg addrt -#define RESTRICT_ASM_ADDRT [addrt]"=&r"(addrt), +#if HAVE_LOONGSON2 #define MMI_LWX(reg, addr, stride, bias) \ - PTR_ADDU "%[addrt], "#addr", "#stride" \n\t" \ - "lw "#reg", "#bias"(%[addrt]) \n\t" + ".set noat \n\t" \ + PTR_ADDU "$at, "#addr", "#stride" \n\t" \ + "lw "#reg", "#bias"($at) \n\t" \ + ".set at \n\t" #define MMI_SWX(reg, addr, stride, bias) \ - PTR_ADDU "%[addrt], "#addr", "#stride" \n\t" \ - "sw "#reg", "#bias"(%[addrt]) \n\t" + ".set noat \n\t" \ + PTR_ADDU "$at, "#addr", "#stride" \n\t" \ + "sw "#reg", "#bias"($at) \n\t" \ + ".set at \n\t" #define MMI_LDX(reg, addr, stride, bias) \ - PTR_ADDU "%[addrt], "#addr", "#stride" \n\t" \ - "ld "#reg", "#bias"(%[addrt]) \n\t" + ".set noat \n\t" \ + PTR_ADDU "$at, "#addr", "#stride" \n\t" \ + "ld "#reg", "#bias"($at) \n\t" \ + ".set at \n\t" #define MMI_SDX(reg, addr, stride, bias) \ - PTR_ADDU "%[addrt], "#addr", "#stride" \n\t" \ - "sd "#reg", "#bias"(%[addrt]) \n\t" + ".set noat \n\t" \ + PTR_ADDU "$at, "#addr", "#stride" \n\t" \ + "sd "#reg", "#bias"($at) \n\t" \ + ".set at \n\t" #define MMI_LWC1(fp, addr, bias) \ "lwc1 "#fp", "#bias"("#addr") \n\t" #define MMI_ULWC1(fp, addr, bias) \ - "ulw %[low32], "#bias"("#addr") \n\t" \ - "mtc1 %[low32], "#fp" \n\t" + ".set noat \n\t" \ + "ulw $at, "#bias"("#addr") \n\t" \ + "mtc1 $at, "#fp" \n\t" \ + ".set at \n\t" #define MMI_LWXC1(fp, addr, stride, bias) \ - PTR_ADDU "%[addrt], "#addr", "#stride" \n\t" \ - MMI_LWC1(fp, %[addrt], bias) + ".set noat \n\t" \ + PTR_ADDU "$at, "#addr", "#stride" \n\t" \ + MMI_LWC1(fp, $at, bias) \ + ".set at \n\t" #define MMI_SWC1(fp, addr, bias) \ 
"swc1 "#fp", "#bias"("#addr") \n\t" #define MMI_USWC1(fp, addr, bias) \ - "mfc1 %[low32], "#fp" \n\t" \ - "usw %[low32], "#bias"("#addr") \n\t" + ".set noat \n\t" \ + "mfc1 $at, "#fp" \n\t" \ + "usw $at, "#bias"("#addr") \n\t" \ + ".set at \n\t" #define MMI_SWXC1(fp, addr, stride, bias) \ - PTR_ADDU "%[addrt], "#addr", "#stride" \n\t" \ - MMI_SWC1(fp, %[addrt], bias) + ".set noat \n\t" \ + PTR_ADDU "$at, "#addr", "#stride" \n\t" \ + MMI_SWC1(fp, $at, bias) \ + ".set at \n\t" #define MMI_LDC1(fp, addr, bias) \ "ldc1 "#fp", "#bias"("#addr") \n\t" #define MMI_ULDC1(fp, addr, bias) \ - "uld %[all64], "#bias"("#addr") \n\t" \ - "dmtc1 %[all64], "#fp" \n\t" + ".set noat \n\t" \ + "uld $at, "#bias"("#addr") \n\t" \ + "dmtc1 $at, "#fp" \n\t" \ + ".set at \n\t" #define MMI_LDXC1(fp, addr, stride, bias) \ - PTR_ADDU "%[addrt], "#addr", "#stride" \n\t" \ - MMI_LDC1(fp, %[addrt], bias) + ".set noat \n\t" \ + PTR_ADDU "$at, "#addr", "#stride" \n\t" \ + MMI_LDC1(fp, $at, bias) \ + ".set at \n\t" #define MMI_SDC1(fp, addr, bias) \ "sdc1 "#fp", "#bias"("#addr") \n\t" #define MMI_USDC1(fp, addr, bias) \ - "dmfc1 %[all64], "#fp" \n\t" \ - "usd %[all64], "#bias"("#addr") \n\t" + ".set noat \n\t" \ + "dmfc1 $at, "#fp" \n\t" \ + "usd $at, "#bias"("#addr") \n\t" \ + ".set at \n\t" #define MMI_SDXC1(fp, addr, stride, bias) \ - PTR_ADDU "%[addrt], "#addr", "#stride" \n\t" \ - MMI_SDC1(fp, %[addrt], bias) + ".set noat \n\t" \ + PTR_ADDU "$at, "#addr", "#stride" \n\t" \ + MMI_SDC1(fp, $at, bias) \ + ".set at \n\t" #define MMI_LQ(reg1, reg2, addr, bias) \ "ld "#reg1", "#bias"("#addr") \n\t" \ @@ -116,11 +145,6 @@ #elif HAVE_LOONGSON3 /* !HAVE_LOONGSON2 */ -#define DECLARE_VAR_ALL64 -#define RESTRICT_ASM_ALL64 -#define DECLARE_VAR_ADDRT -#define RESTRICT_ASM_ADDRT - #define MMI_LWX(reg, addr, stride, bias) \ "gslwx "#reg", "#bias"("#addr", "#stride") \n\t" @@ -138,12 +162,12 @@ #if _MIPS_SIM == _ABIO32 /* workaround for 3A2000 gslwlc1 bug */ -#define DECLARE_VAR_LOW32 int32_t low32 -#define 
RESTRICT_ASM_LOW32 [low32]"=&r"(low32), - -#define MMI_ULWC1(fp, addr, bias) \ - "ulw %[low32], "#bias"("#addr") \n\t" \ - "mtc1 %[low32], "#fp" \n\t" +#define MMI_LWLRC1(fp, addr, bias, off) \ + ".set noat \n\t" \ + "lwl $at, "#bias"+"#off"("#addr") \n\t" \ + "lwr $at, "#bias"("#addr") \n\t" \ + "mtc1 $at, "#fp" \n\t" \ + ".set at \n\t" #else /* _MIPS_SIM != _ABIO32 */ From patchwork Fri Jul 23 05:53:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 29026 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a5d:965a:0:0:0:0:0 with SMTP id d26csp1119157ios; Thu, 22 Jul 2021 22:55:16 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxX4qG5PUl4fGDTJxFn4lRGStps1UUxRM+m/4DwFDKFo5uwEohML3Q8vslH9pmJ41/c81Sc X-Received: by 2002:a05:6402:2206:: with SMTP id cq6mr3664749edb.209.1627019716684; Thu, 22 Jul 2021 22:55:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1627019716; cv=none; d=google.com; s=arc-20160816; b=hDx2ZHTbgHQg++pycsDs470vInWFzI8vfqdM0vpElwF1HrzQECqoBLt3GSvdaL444U nbCScO9iqDMSkyPEDzbJVShBO7IZw5GdsdYHNDvYf/NC/C52WuIcMwvQqMqAYOvk0yW+ F1BOA/djwOoTt4axlTwZhEyrqqsaCtdaV/P2fV/p/7dtWUto7R4/57Iu37P+cvpY68OC 8BUt72osvwhNpYx9c+OXncfwc0MVADl1dHwCswiFjZ+Cnd65ejDRsquTtoD+KLOslHz8 2R95z7nfYWuCJoIPSR4jXSkyxMGjEF3NoqwF8dkz4knMoiRmeCJlaLvu6pl+LfvD05AK rZbg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:dkim-signature:delivered-to; bh=ELnuOVToijHWZQECnjs9iJX6P2moODLZu2crYzhIeuE=; b=toK7C8p64LfwwcNfmfVweA9za5vDew/CNKhp1O0bRD41vRmIOdCdXWyyTBj6GdtbkF Zf7uM/ibBkCTI7rgJCm801djFW8mD7EmqFHNtQZUdLuvPtTA78+5JHTVkAPYquCRISWI vug9lyv1u1v/wKyf6RrDdPV+Z8V5FQAJZnkj8TXGpmPFTjGzJjJXzzxEx89VTRSzgSTY 
zKWqi+7FhOo8ufrmNwbt/NmyzrbLyyQEf9IjecvZ9XTqpcRaxoGt6aJdQZSCIPhHqgQ4 CvnRAz2QSR0NxqHyRpk46VnYzLsb7wy0Bbw6+12lTZn67gclTF9OgwfnLMWUEuUPzvPL 231Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=fm2 header.b=R5p5rlVN; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm3 header.b="pOD/C7+j"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id a17si12981847ejg.120.2021.07.22.22.55.16; Thu, 22 Jul 2021 22:55:16 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=fm2 header.b=R5p5rlVN; dkim=neutral (body hash did not verify) header.i=@messagingengine.com header.s=fm3 header.b="pOD/C7+j"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4774468ADAB; Fri, 23 Jul 2021 08:54:42 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 940DE68AD0A for ; Fri, 23 Jul 2021 08:54:35 +0300 (EEST) Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.nyi.internal (Postfix) with ESMTP id B000D5C00EE; Fri, 23 Jul 2021 01:54:34 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Fri, 23 Jul 2021 01:54:34 -0400 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=flygoat.com; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm2; bh=dBhW9irJVK/K1 T0vwuZau8ieyTDhkMNBJeazMbAJx1I=; b=R5p5rlVN8S0jC9eyoHzJLcyAG/mbX kcNDSRBUO5pNHrnYltfr2E5cK0dq9tBiFo73Q+bOCOjbCsLypgqhiK8q+BJblfYl T9lyUYPnz8yTRe1oSx88q0lR2UwPIQaPSh4Tvyyc3KrBN+Eixnaub/1wIZfQl9Wi QvROECZcysVb3TnsE3bp91TLO3sR5/oz8nDmTNo6UhyS+vRr3/yCbfO3TfeJPjN3 22QU8yewORAHKpnRz17L7wSR2UEpIh+9nhtfHsx+UfnP3/g5EeJeoTf44vcgQQyx MQzP33Xl0y+2/vecCQIijjuR3ulF1gG6pAcyNMfZShaKUuFiqwkyNCkWQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=dBhW9irJVK/K1T0vwuZau8ieyTDhkMNBJeazMbAJx1I=; b=pOD/C7+j tGjK1fT8ByMSCpMxenOiNSXWViYXI8goGGjjsYEGL+QYqMpSyoCI9Z9QQEow0w1S y3m8y4I0QeUF3LnyLjWmSHIzYngzbYltdULKF2eckUYGbBJcselM8YirYkytI/2Z PetqbR/ZgMUZ81CMIe9tZYh/PEPUe5A068Fw03MkG2WDVd15LYCai2AnLHMr6pmj vY5bbvUxzImuXR/jabGIY2K0P05Ob8JG3Xy9Brl3NxkeJAhzaepuN60r9V34LfU/ 2W9UTzbs8GIwKN+Tk6WCAobMIr7pff6/BVE48LFMgKp5qm74oAB+V689EYwldvWu WVfVy6AKkv2LdA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvtddrfeejgdeljecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpeflihgrgihunhcujggrnhhguceojhhirgiguhhnrdihrghnghes fhhlhihgohgrthdrtghomheqnecuggftrfgrthhtvghrnhepjeeihffgteelkeelffduke dtheevudejvdegkeekjeefhffhhfetudetgfdtffeunecuvehluhhsthgvrhfuihiivgep tdenucfrrghrrghmpehmrghilhhfrhhomhepjhhirgiguhhnrdihrghnghesfhhlhihgoh grthdrtghomh X-ME-Proxy: Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 23 Jul 2021 01:54:32 -0400 (EDT) From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Fri, 23 Jul 2021 13:53:44 +0800 Message-Id: 
<20210723055344.21961-5-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210723055344.21961-1-jiaxun.yang@flygoat.com> References: <20210723055344.21961-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 4/4] avcodec/mips: cabac.h provide fallback for wsbh instruction X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: yinshiyou-hf@loongson.cn, Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: UU6lo4rGhR2f wsbh is only available on MIPS R2+. Provide a fallback for older processors. Signed-off-by: Jiaxun Yang Reviewed-by: Shiyou Yin --- libavcodec/mips/cabac.h | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/libavcodec/mips/cabac.h b/libavcodec/mips/cabac.h index f1e953dabe..39c308c7e0 100644 --- a/libavcodec/mips/cabac.h +++ b/libavcodec/mips/cabac.h @@ -77,7 +77,15 @@ static av_always_inline int get_cabac_inline_mips(CABACContext *c, "lhu %[tmp0], 0(%[c_bytestream]) \n\t" #else "lhu %[tmp0], 0(%[c_bytestream]) \n\t" +#if HAVE_MIPS32R2 || HAVE_MIPS64R2 "wsbh %[tmp0], %[tmp0] \n\t" +#else + "and %[tmp1], %[tmp0], 0xff00ff00 \n\t" + "srl %[tmp1], %[tmp1], 8 \n\t" + "and %[tmp0], %[tmp0], 0x00ff00ff \n\t" + "sll %[tmp0], %[tmp0], 8 \n\t" + "or %[tmp0], %[tmp0], %[tmp1] \n\t" +#endif #endif PTR_SLL "%[tmp0], %[tmp0], 0x01 \n\t" PTR_SUBU "%[tmp0], %[tmp0], %[cabac_mask] \n\t" @@ -125,7 +133,15 @@ static av_always_inline int get_cabac_bypass_mips(CABACContext *c) "lhu %[tmp1], 0(%[c_bytestream]) \n\t" #else "lhu %[tmp1], 0(%[c_bytestream]) \n\t" +#if HAVE_MIPS32R2 || HAVE_MIPS64R2 "wsbh %[tmp1], %[tmp1] \n\t" +#else + "and %[tmp0], %[tmp1], 0xff00ff00 \n\t" + "srl %[tmp0], %[tmp0], 8 \n\t" + "and %[tmp1], %[tmp1], 0x00ff00ff \n\t" + "sll %[tmp1], 
%[tmp1], 8 \n\t" + "or %[tmp1], %[tmp1], %[tmp0] \n\t" +#endif #endif PTR_SLL "%[tmp1], %[tmp1], 0x01 \n\t" PTR_SUBU "%[tmp1], %[tmp1], %[cabac_mask] \n\t" @@ -169,7 +185,15 @@ static av_always_inline int get_cabac_bypass_sign_mips(CABACContext *c, int val) "lhu %[tmp1], 0(%[c_bytestream]) \n\t" #else "lhu %[tmp1], 0(%[c_bytestream]) \n\t" +#if HAVE_MIPS32R2 || HAVE_MIPS64R2 "wsbh %[tmp1], %[tmp1] \n\t" +#else + "and %[tmp0], %[tmp1], 0xff00ff00 \n\t" + "srl %[tmp0], %[tmp0], 8 \n\t" + "and %[tmp1], %[tmp1], 0x00ff00ff \n\t" + "sll %[tmp1], %[tmp1], 8 \n\t" + "or %[tmp1], %[tmp1], %[tmp0] \n\t" +#endif #endif PTR_SLL "%[tmp1], %[tmp1], 0x01 \n\t" PTR_SUBU "%[tmp1], %[tmp1], %[cabac_mask] \n\t"
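For readers checking the wsbh fallback in patch 4/4, the and/srl/and/sll/or sequence can be modelled in plain C. This is only a sketch for illustration and is not part of the series: the function names wsbh_reference, wsbh_fallback and wsbh_selfcheck are hypothetical, and the model assumes a 32-bit register value, which matches the lhu-loaded operand used in cabac.h.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of wsbh: swap the two bytes inside each 16-bit half
 * of a 32-bit word. (Hypothetical helper, not FFmpeg code.) */
static uint32_t wsbh_reference(uint32_t x)
{
    uint32_t lo = x & 0xffffu;
    uint32_t hi = x >> 16;

    lo = ((lo << 8) | (lo >> 8)) & 0xffffu;
    hi = ((hi << 8) | (hi >> 8)) & 0xffffu;
    return (hi << 16) | lo;
}

/* Mirrors the five-instruction fallback from the patch, one statement
 * per instruction. (Hypothetical helper, not FFmpeg code.) */
static uint32_t wsbh_fallback(uint32_t x)
{
    uint32_t t = x & 0xff00ff00u; /* and  t, x, 0xff00ff00 */
    t >>= 8;                      /* srl  t, t, 8          */
    x &= 0x00ff00ffu;             /* and  x, x, 0x00ff00ff */
    x <<= 8;                      /* sll  x, x, 8          */
    return x | t;                 /* or   x, x, t          */
}

/* Compares both models on a few sample values; returns 1 on success. */
static int wsbh_selfcheck(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0x00001234u, 0x0000ff00u, 0xaabbccddu, 0xffffffffu
    };
    unsigned i;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(wsbh_fallback(samples[i]) == wsbh_reference(samples[i]));
    return 1;
}
```

Calling wsbh_selfcheck() from a small test program exercises both models. For the 16-bit values that lhu produces, the upper half is zero, so the fallback reduces to a plain byte swap of the low halfword.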