From patchwork Thu Jul 2 15:46:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 20780 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 591ED44B923 for ; Thu, 2 Jul 2020 18:47:07 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 43CC368AD9A; Thu, 2 Jul 2020 18:47:07 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from relay4.mymailcheap.com (relay4.mymailcheap.com [137.74.80.156]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 73B9F68A0BE for ; Thu, 2 Jul 2020 18:46:59 +0300 (EEST) Received: from filter2.mymailcheap.com (filter2.mymailcheap.com [91.134.140.82]) by relay4.mymailcheap.com (Postfix) with ESMTPS id 959313F1CF for ; Thu, 2 Jul 2020 17:46:57 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by filter2.mymailcheap.com (Postfix) with ESMTP id 63B542A4F0 for ; Thu, 2 Jul 2020 17:46:57 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=mymailcheap.com; s=default; t=1593704817; bh=WENfaguslmkx/sed9tpjYBmjFwSvSsMpcPLT6uOLDEE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HGIol5+xFFbAaq4G+hItxbpwK1L+7SemHZGJcp6rjcdXccX8f2peuQ82tiFTBKnB0 xUt2VRNgonIrm+kkXZ/cGPNrdnNUadOhhF3fPfJ56QqwJ6iTYjfFG5VaHtH2qXCEaY fZSTzyKEMHFovZF6U6L8zwMEqk2tNL3Va1zvU7uY= X-Virus-Scanned: Debian amavisd-new at filter2.mymailcheap.com Received: from filter2.mymailcheap.com ([127.0.0.1]) by localhost (filter2.mymailcheap.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EEo4bPzEHbU7 for ; Thu, 2 Jul 2020 17:46:55 +0200 (CEST) Received: from mail20.mymailcheap.com (mail20.mymailcheap.com [51.83.111.147]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by filter2.mymailcheap.com (Postfix) with ESMTPS for ; Thu, 2 Jul 2020 17:46:55 +0200 (CEST) Received: from [213.133.102.83] (ml.mymailcheap.com [213.133.102.83]) by mail20.mymailcheap.com (Postfix) with ESMTP id 6A349403B3; Thu, 2 Jul 2020 15:46:53 +0000 (UTC) Authentication-Results: mail20.mymailcheap.com; dkim=pass (1024-bit key; unprotected) header.d=flygoat.com header.i=@flygoat.com header.b="XNK5d6vx"; dkim-atps=neutral AI-Spam-Status: Not processed Received: from halation.202.net.flygoat.com (unknown [115.227.164.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail20.mymailcheap.com (Postfix) with ESMTPSA id 949084084C; Thu, 2 Jul 2020 15:46:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=flygoat.com; s=default; t=1593704795; bh=WENfaguslmkx/sed9tpjYBmjFwSvSsMpcPLT6uOLDEE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XNK5d6vxCQad6v+qdPoI7Am7AT3OUVcxr8savJ54AVty27mqb1SiJ306qkjlfgv7G X9xrgu/tOTtp+DytZn2oS8YaWbG4QBZWSPpg+iulzwgNp2cItau9w1T0UyzV81hCHl JGhIccyexsJ832FB6o8WLhBPGw9XaCqb/I2ESEtI= From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Thu, 2 Jul 2020 23:46:17 +0800 Message-Id: <20200702154622.21280-2-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200702154622.21280-1-jiaxun.yang@flygoat.com> References: <20200702154622.21280-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 6A349403B3 X-Spamd-Result: default: False [4.90 / 10.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; R_DKIM_ALLOW(0.00)[flygoat.com:s=default]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; R_SPF_SOFTFAIL(0.00)[~all]; ML_SERVERS(-3.10)[213.133.102.83]; DKIM_TRACE(0.00)[flygoat.com:+]; RCPT_COUNT_TWO(0.00)[2]; MID_CONTAINS_FROM(1.00)[]; RCVD_IN_DNSWL_NONE(0.00)[213.133.102.83:from]; DMARC_POLICY_ALLOW(0.00)[flygoat.com,none]; DMARC_POLICY_ALLOW_WITH_FAILURES(0.00)[]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:213.133.96.0/19, country:DE]; RCVD_COUNT_TWO(0.00)[2]; HFILTER_HELO_BAREIP(3.00)[213.133.102.83,1] X-Rspamd-Server: mail20.mymailcheap.com X-Spam: Yes Subject: [FFmpeg-devel] [PATCH v5 1/6] ffbuild: Refine MIPS handling X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" To enable runtime detection for MIPS, we need to refine ffbuild part to support buildding these feature together. Firstly, we fixed configure, let it probe native ability of toolchain to decide wether a feature can to be enabled, also clearly marked the conflictions between loongson2 & loongson3 and Release 6 & rest. Secondly, we compile MMI and MSA C sources with their own flags to ensure their flags won't pollute the whole program and generate illegal code. Signed-off-by: Jiaxun Yang --- v5: Minor fixes --- configure | 180 +++++++++++++++++++++++---------------- ffbuild/common.mak | 10 ++- libavcodec/mips/Makefile | 3 +- 3 files changed, 118 insertions(+), 75 deletions(-) diff --git a/configure b/configure index 7495f35faa..385a9e5f5f 100755 --- a/configure +++ b/configure @@ -2551,7 +2551,7 @@ mips64r6_deps="mips" mipsfpu_deps="mips" mipsdsp_deps="mips" mipsdspr2_deps="mips" -mmi_deps="mips" +mmi_deps_any="loongson2 loongson3" msa_deps="mipsfpu" msa2_deps="msa" @@ -5002,8 +5002,6 @@ elif enabled bfin; then elif enabled mips; then - cpuflags="-march=$cpu" - if [ "$cpu" != "generic" ]; then disable mips32r2 disable mips32r5 @@ -5012,19 +5010,61 @@ elif enabled mips; then disable mips64r6 disable loongson2 disable loongson3 + disable mipsdsp + disable mipsdspr2 + disable msa + disable mmi + + cpuflags="-march=$cpu" case $cpu in - 24kc|24kf*|24kec|34kc|1004kc|24kef*|34kf*|1004kf*|74kc|74kf) + # General ISA levels + mips1|mips3) + ;; + mips32r2) + enable msa enable mips32r2 - disable msa ;; - p5600|i6400|p6600) - disable mipsdsp - disable mipsdspr2 + mips32r5) + enable msa + enable mips32r2 + enable mips32r5 ;; - loongson*) - enable loongson2 + mips64r2|mips64r5) + enable msa + enable mmi + enable mips64r2 enable loongson3 + ;; + # Cores from MIPS(MTI) + 24kc) + disable mipsfpu + enable mips32r2 + ;; + 24kf*|24kec|34kc|74Kc|1004kc) + enable mips32r2 + ;; + 24kef*|34kf*|1004kf*) + enable mipsdsp + enable mips32r2 + ;; + p5600) + enable msa + enable mips32r2 + enable mips32r5 + check_cflags "-mtune=p5600" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" + ;; + i6400) + enable mips64r6 + check_cflags "-mtune=i6400 -mabi=64" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" && check_ldflags "-mabi=64" + ;; + p6600) + enable mips64r6 + check_cflags "-mtune=p6600 -mabi=64" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" && check_ldflags "-mabi=64" + ;; + # Cores from Loongson + loongson2e|loongson2f|loongson3*) + enable mmi enable local_aligned enable simd_align_16 enable fast_64bit @@ -5032,75 +5072,44 @@ elif enabled mips; then enable fast_cmov enable fast_unaligned disable aligned_stack - disable mipsdsp - disable mipsdspr2 # When gcc version less than 5.3.0, add -fno-expensive-optimizations flag. - if [ $cc == gcc ]; then - gcc_version=$(gcc -dumpversion) - if [ "$(echo "$gcc_version 5.3.0" | tr " " "\n" | sort -rV | head -n 1)" == "$gcc_version" ]; then - expensive_optimization_flag="" - else + if test "$cc_type" = "gcc"; then + case $gcc_basever in + 2|2.*|3.*|4.*|5.0|5.1|5.2) expensive_optimization_flag="-fno-expensive-optimizations" - fi + ;; + *) + expensive_optimization_flag="" + ;; + esac fi + case $cpu in loongson3*) + enable loongson3 + enable mips64r2 + enable msa cpuflags="-march=loongson3a -mhard-float $expensive_optimization_flag" ;; loongson2e) + enable loongson2 cpuflags="-march=loongson2e -mhard-float $expensive_optimization_flag" ;; loongson2f) + enable loongson2 cpuflags="-march=loongson2f -mhard-float $expensive_optimization_flag" ;; esac ;; *) - # Unknown CPU. Disable everything. - warn "unknown CPU. Disabling all MIPS optimizations." - disable mipsfpu - disable mipsdsp - disable mipsdspr2 - disable msa - disable mmi + warn "unknown MIPS CPU" ;; esac - case $cpu in - 24kc) - disable mipsfpu - disable mipsdsp - disable mipsdspr2 - ;; - 24kf*) - disable mipsdsp - disable mipsdspr2 - ;; - 24kec|34kc|1004kc) - disable mipsfpu - disable mipsdspr2 - ;; - 24kef*|34kf*|1004kf*) - disable mipsdspr2 - ;; - 74kc) - disable mipsfpu - ;; - p5600) - enable mips32r5 - check_cflags "-mtune=p5600" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" - ;; - i6400) - enable mips64r6 - check_cflags "-mtune=i6400 -mabi=64" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" && check_ldflags "-mabi=64" - ;; - p6600) - enable mips64r6 - check_cflags "-mtune=p6600 -mabi=64" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" && check_ldflags "-mabi=64" - ;; - esac else - # We do not disable anything. Is up to the user to disable the unwanted features. + disable mipsdsp + disable mipsdspr2 + # Disable DSP stuff for generic CPU, it can't be detected at runtime. warn 'generic cpu selected' fi @@ -5847,28 +5856,51 @@ EOF elif enabled mips; then - enabled loongson2 && check_inline_asm loongson2 '"dmult.g $8, $9, $10"' - enabled loongson3 && check_inline_asm loongson3 '"gsldxc1 $f0, 0($2, $3)"' - enabled mmi && check_inline_asm mmi '"punpcklhw $f0, $f0, $f0"' - # Enable minimum ISA based on selected options + # Check toolchain ISA level if enabled mips64; then - enabled mips64r6 && check_inline_asm_flags mips64r6 '"dlsa $0, $0, $0, 1"' '-mips64r6' - enabled mips64r2 && check_inline_asm_flags mips64r2 '"dext $0, $0, 0, 1"' '-mips64r2' - disabled mips64r6 && disabled mips64r2 && check_inline_asm_flags mips64r1 '"daddi $0, $0, 0"' '-mips64' + enabled mips64r6 && check_inline_asm mips64r6 '"dlsa $0, $0, $0, 1"' && + disable mips64r2 + + enabled mips64r2 && check_inline_asm mips64r2 '"dext $0, $0, 0, 1"' + + disable mips32r6 && disable mips32r5 && disable mips32r2 else - enabled mips32r6 && check_inline_asm_flags mips32r6 '"aui $0, $0, 0"' '-mips32r6' - enabled mips32r5 && check_inline_asm_flags mips32r5 '"eretnc"' '-mips32r5' - enabled mips32r2 && check_inline_asm_flags mips32r2 '"ext $0, $0, 0, 1"' '-mips32r2' - disabled mips32r6 && disabled mips32r5 && disabled mips32r2 && check_inline_asm_flags mips32r1 '"addi $0, $0, 0"' '-mips32' + enabled mips32r6 && check_inline_asm mips32r6 '"aui $0, $0, 0"' && + disable mips32r5 && disable mips32r2 + + enabled mips32r5 && check_inline_asm mips32r5 '"eretnc"' + enabled mips32r2 && check_inline_asm mips32r2 '"ext $0, $0, 0, 1"' + + disable mips64r6 && disable mips64r5 && disable mips64r2 fi - enabled mipsfpu && check_inline_asm_flags mipsfpu '"cvt.d.l $f0, $f2"' '-mhard-float' + enabled mipsfpu && check_inline_asm mipsfpu '"cvt.d.l $f0, $f2"' enabled mipsfpu && (enabled mips32r5 || enabled mips32r6 || enabled mips64r6) && check_inline_asm_flags mipsfpu '"cvt.d.l $f0, $f1"' '-mfp64' - enabled mipsfpu && enabled msa && check_inline_asm_flags msa '"addvi.b $w0, $w1, 1"' '-mmsa' && check_headers msa.h || disable msa + enabled mipsdsp && check_inline_asm_flags mipsdsp '"addu.qb $t0, $t1, $t2"' '-mdsp' enabled mipsdspr2 && check_inline_asm_flags mipsdspr2 '"absq_s.qb $t0, $t1"' '-mdspr2' - enabled msa && enabled msa2 && check_inline_asm_flags msa2 '"nxbits.any.b $w0, $w0"' '-mmsa2' && check_headers msa2.h || disable msa2 + + # MSA and MSA2 can be detected at runtime so we supply extra flags here + enabled mipsfpu && enabled msa && check_inline_asm msa '"addvi.b $w0, $w1, 1"' '-mmsa' && append MSAFLAGS '-mmsa' + enabled msa && enabled msa2 && check_inline_asm msa2 '"nxbits.any.b $w0, $w0"' '-mmsa2' && append MSAFLAGS '-mmsa2' + + # loongson2 have no switch cflag so we can only probe toolchain ability + enabled loongson2 && check_inline_asm loongson2 '"dmult.g $8, $9, $10"' + if enabled loongson2 ; then + disable loongson3 + fi + + # loongson3 is paired with MMI + enabled mips64r2 && enabled loongson3 && check_inline_asm loongson3 '"gsldxc1 $f0, 0($2, $3)"' '-mloongson-ext' && append MMIFLAGS '-mloongson-ext' + + # MMI must come together with loongson2 or loongson3 + if disabled loongson2 && disabled loongson3; then + disable mmi + fi + + # MMI can be detected at runtime too + enabled mmi && check_inline_asm mmi '"punpcklhw $f0, $f0, $f0"' '-mloongson-mmi' && append MMIFLAGS '-mloongson-mmi' if enabled bigendian && enabled msa; then disable msa @@ -7443,6 +7475,8 @@ LDSOFLAGS=$LDSOFLAGS SHFLAGS=$(echo $($ldflags_filter $SHFLAGS)) ASMSTRIPFLAGS=$ASMSTRIPFLAGS X86ASMFLAGS=$X86ASMFLAGS +MSAFLAGS=$MSAFLAGS +MMIFLAGS=$MMIFLAGS BUILDSUF=$build_suffix PROGSSUF=$progs_suffix FULLNAME=$FULLNAME diff --git a/ffbuild/common.mak b/ffbuild/common.mak index a60d27c9bd..6b95a17fbb 100644 --- a/ffbuild/common.mak +++ b/ffbuild/common.mak @@ -44,7 +44,7 @@ LDFLAGS := $(ALLFFLIBS:%=$(LD_PATH)lib%) $(LDFLAGS) define COMPILE $(call $(1)DEP,$(1)) - $($(1)) $($(1)FLAGS) $($(1)_DEPFLAGS) $($(1)_C) $($(1)_O) $(patsubst $(SRC_PATH)/%,$(SRC_LINK)/%,$<) + $($(1)) $($(1)FLAGS) $($(1)_DEPFLAGS) $($(1)_C) $($(1)_O) $($(2)) $(patsubst $(SRC_PATH)/%,$(SRC_LINK)/%,$<) endef COMPILE_C = $(call COMPILE,CC) @@ -54,6 +54,14 @@ COMPILE_M = $(call COMPILE,OBJCC) COMPILE_X86ASM = $(call COMPILE,X86ASM) COMPILE_HOSTC = $(call COMPILE,HOSTCC) COMPILE_NVCC = $(call COMPILE,NVCC) +COMPILE_MMI = $(call COMPILE,CC,MMIFLAGS) +COMPILE_MSA = $(call COMPILE,CC,MSAFLAGS) + +%_mmi.o: %_mmi.c + $(COMPILE_MMI) + +%_msa.o: %_msa.c + $(COMPILE_MSA) %.o: %.c $(COMPILE_C) diff --git a/libavcodec/mips/Makefile b/libavcodec/mips/Makefile index b4993f6e76..2be4d9b8a2 100644 --- a/libavcodec/mips/Makefile +++ b/libavcodec/mips/Makefile @@ -71,6 +71,8 @@ MSA-OBJS-$(CONFIG_IDCTDSP) += mips/idctdsp_msa.o \ MSA-OBJS-$(CONFIG_MPEGVIDEO) += mips/mpegvideo_msa.o MSA-OBJS-$(CONFIG_MPEGVIDEOENC) += mips/mpegvideoencdsp_msa.o MSA-OBJS-$(CONFIG_ME_CMP) += mips/me_cmp_msa.o +MSA-OBJS-$(CONFIG_VC1_DECODER) += mips/vc1dsp_msa.o + MMI-OBJS += mips/constants.o MMI-OBJS-$(CONFIG_H264DSP) += mips/h264dsp_mmi.o MMI-OBJS-$(CONFIG_H264CHROMA) += mips/h264chroma_mmi.o @@ -89,4 +91,3 @@ MMI-OBJS-$(CONFIG_WMV2DSP) += mips/wmv2dsp_mmi.o MMI-OBJS-$(CONFIG_HEVC_DECODER) += mips/hevcdsp_mmi.o MMI-OBJS-$(CONFIG_VP3DSP) += mips/vp3dsp_idct_mmi.o MMI-OBJS-$(CONFIG_VP9_DECODER) += mips/vp9_mc_mmi.o -MSA-OBJS-$(CONFIG_VC1_DECODER) += mips/vc1dsp_msa.o From patchwork Thu Jul 2 15:46:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 20779 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 6D69844B923 for ; Thu, 2 Jul 2020 18:47:06 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 5048D68AEEA; Thu, 2 Jul 2020 18:47:06 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from relay1.mymailcheap.com (relay1.mymailcheap.com [144.217.248.102]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 021FB68A0BE for ; Thu, 2 Jul 2020 18:46:59 +0300 (EEST) Received: from filter1.mymailcheap.com (filter1.mymailcheap.com [149.56.130.247]) by relay1.mymailcheap.com (Postfix) with ESMTPS id 695AA3F157 for ; Thu, 2 Jul 2020 11:46:57 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by filter1.mymailcheap.com (Postfix) with ESMTP id 520112A3A7 for ; Thu, 2 Jul 2020 11:46:57 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=mymailcheap.com; s=default; t=1593704817; bh=8F2XiNA02qK3TBerFnwypjgMRwAjUuo1fn0cURL81cg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KyyXTBfGWTzVVC4ZTYSVJ/+VLFhaKqzN7MFNmwkazR/ERnGSJvLCqDZ3W80z+NqVv WHa8gNkhkc7jjmeuJ/hAT/YvQUyQwZw3JHxmKTtVs1GVjV7WgrLRqFEEO+XM+xHRun 1OfWkId4rSwXlXB6BZ3V3bO0mvPUZ3vvyZXI9Aro= X-Virus-Scanned: Debian amavisd-new at filter1.mymailcheap.com Received: from filter1.mymailcheap.com ([127.0.0.1]) by localhost (filter1.mymailcheap.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id n2eewdA3EN-c for ; Thu, 2 Jul 2020 11:46:56 -0400 (EDT) Received: from mail20.mymailcheap.com (mail20.mymailcheap.com [51.83.111.147]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by filter1.mymailcheap.com (Postfix) with ESMTPS for ; Thu, 2 Jul 2020 11:46:56 -0400 (EDT) Received: from [213.133.102.83] (ml.mymailcheap.com [213.133.102.83]) by mail20.mymailcheap.com (Postfix) with ESMTP id E5C3C4084C; Thu, 2 Jul 2020 15:46:53 +0000 (UTC) Authentication-Results: mail20.mymailcheap.com; dkim=pass (1024-bit key; unprotected) header.d=flygoat.com header.i=@flygoat.com header.b="WSk+TZVk"; dkim-atps=neutral AI-Spam-Status: Not processed Received: from halation.202.net.flygoat.com (unknown [115.227.164.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail20.mymailcheap.com (Postfix) with ESMTPSA id 0F11D403B3; Thu, 2 Jul 2020 15:46:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=flygoat.com; s=default; t=1593704797; bh=8F2XiNA02qK3TBerFnwypjgMRwAjUuo1fn0cURL81cg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WSk+TZVkjWVIy7NpyeTYidRhDII3445TuT68ik8nBC3odJ8c/vW09M8w0ylprgs3j hyzKXwFSG3BPRzzv71f4Yb3pUhol6QnNgSYm7KDLWNRMlhqyi/du80Md+ZU+ghwZtH RQ5fuK5yHXDoEihAF4Uv+jMqO6amuWlODc7/Q3w4= From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Thu, 2 Jul 2020 23:46:18 +0800 Message-Id: <20200702154622.21280-3-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200702154622.21280-1-jiaxun.yang@flygoat.com> References: <20200702154622.21280-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: E5C3C4084C X-Spamd-Result: default: False [4.90 / 10.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; R_DKIM_ALLOW(0.00)[flygoat.com:s=default]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; R_SPF_SOFTFAIL(0.00)[~all]; ML_SERVERS(-3.10)[213.133.102.83]; DKIM_TRACE(0.00)[flygoat.com:+]; RCPT_COUNT_TWO(0.00)[2]; MID_CONTAINS_FROM(1.00)[]; RCVD_IN_DNSWL_NONE(0.00)[213.133.102.83:from]; DMARC_POLICY_ALLOW(0.00)[flygoat.com,none]; DMARC_POLICY_ALLOW_WITH_FAILURES(0.00)[]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:213.133.96.0/19, country:DE]; RCVD_COUNT_TWO(0.00)[2]; HFILTER_HELO_BAREIP(3.00)[213.133.102.83,1] X-Rspamd-Server: mail20.mymailcheap.com X-Spam: Yes Subject: [FFmpeg-devel] [PATCH v5 2/6] libavutils: Add parse_r helper for MIPS X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" That helper grab from kernel code can allow us to inline newer instructions (not implemented by the assembler) in a elegant manner. Signed-off-by: Jiaxun Yang --- libavutil/mips/asmdefs.h | 42 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/libavutil/mips/asmdefs.h b/libavutil/mips/asmdefs.h index 748119918a..7e0e4cf575 100644 --- a/libavutil/mips/asmdefs.h +++ b/libavutil/mips/asmdefs.h @@ -55,4 +55,46 @@ # define PTR_SLL "sll " #endif +/* + * parse_r var, r - Helper assembler macro for parsing register names. + * + * This converts the register name in $n form provided in \r to the + * corresponding register number, which is assigned to the variable \var. It is + * needed to allow explicit encoding of instructions in inline assembly where + * registers are chosen by the compiler in $n form, allowing us to avoid using + * fixed register numbers. + * + * It also allows newer instructions (not implemented by the assembler) to be + * transparently implemented using assembler macros, instead of needing separate + * cases depending on toolchain support. + * + * Simple usage example: + * __asm__ __volatile__("parse_r __rt, %0\n\t" + * ".insn\n\t" + * "# di %0\n\t" + * ".word (0x41606000 | (__rt << 16))" + * : "=r" (status); + */ + +/* Match an individual register number and assign to \var */ +#define _IFC_REG(n) \ + ".ifc \\r, $" #n "\n\t" \ + "\\var = " #n "\n\t" \ + ".endif\n\t" + +__asm__(".macro parse_r var r\n\t" + "\\var = -1\n\t" + _IFC_REG(0) _IFC_REG(1) _IFC_REG(2) _IFC_REG(3) + _IFC_REG(4) _IFC_REG(5) _IFC_REG(6) _IFC_REG(7) + _IFC_REG(8) _IFC_REG(9) _IFC_REG(10) _IFC_REG(11) + _IFC_REG(12) _IFC_REG(13) _IFC_REG(14) _IFC_REG(15) + _IFC_REG(16) _IFC_REG(17) _IFC_REG(18) _IFC_REG(19) + _IFC_REG(20) _IFC_REG(21) _IFC_REG(22) _IFC_REG(23) + _IFC_REG(24) _IFC_REG(25) _IFC_REG(26) _IFC_REG(27) + _IFC_REG(28) _IFC_REG(29) _IFC_REG(30) _IFC_REG(31) + ".iflt \\var\n\t" + ".error \"Unable to parse register name \\r\"\n\t" + ".endif\n\t" + ".endm"); + #endif /* AVCODEC_MIPS_ASMDEFS_H */ From patchwork Thu Jul 2 15:46:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 20781 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 03AD544B923 for ; Thu, 2 Jul 2020 18:47:08 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E643368AF27; Thu, 2 Jul 2020 18:47:07 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from relay1.mymailcheap.com (relay1.mymailcheap.com [144.217.248.102]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 0309A68A0BE for ; Thu, 2 Jul 2020 18:47:00 +0300 (EEST) Received: from filter1.mymailcheap.com (filter1.mymailcheap.com [149.56.130.247]) by relay1.mymailcheap.com (Postfix) with ESMTPS id 9B9443F1C5 for ; Thu, 2 Jul 2020 11:46:58 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by filter1.mymailcheap.com (Postfix) with ESMTP id 83D992A3A7 for ; Thu, 2 Jul 2020 11:46:58 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=mymailcheap.com; s=default; t=1593704818; bh=fS5K1ZfU1d3mys9Wm8ttULEmhqZf3K1d4YVG85EJ0Yc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FUw3vbNbdvibyxmeOn6gZSZwzPgHcGlbIdhwDgUJy+h82MG468G8o8lHLADRRmCfZ ueoS78w/yMngV+fnmeN0qapuYrCSoLmeIWDtWb5rlId7B09THF44OCAN3d4n1FJiAD nHke/1KLbbkLO3FqL0BgK8VHvxgKx9bWuIEQiIMQ= X-Virus-Scanned: Debian amavisd-new at filter1.mymailcheap.com Received: from filter1.mymailcheap.com ([127.0.0.1]) by localhost (filter1.mymailcheap.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Fhzyu2WUowpI for ; Thu, 2 Jul 2020 11:46:57 -0400 (EDT) Received: from mail20.mymailcheap.com (mail20.mymailcheap.com [51.83.111.147]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by filter1.mymailcheap.com (Postfix) with ESMTPS for ; Thu, 2 Jul 2020 11:46:56 -0400 (EDT) Received: from [213.133.102.83] (ml.mymailcheap.com [213.133.102.83]) by mail20.mymailcheap.com (Postfix) with ESMTP id 03C6A4085D; Thu, 2 Jul 2020 15:46:54 +0000 (UTC) Authentication-Results: mail20.mymailcheap.com; dkim=pass (1024-bit key; unprotected) header.d=flygoat.com header.i=@flygoat.com header.b="RhcWHe3n"; dkim-atps=neutral AI-Spam-Status: Not processed Received: from halation.202.net.flygoat.com (unknown [115.227.164.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail20.mymailcheap.com (Postfix) with ESMTPSA id 9448C403B3; Thu, 2 Jul 2020 15:46:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=flygoat.com; s=default; t=1593704798; bh=fS5K1ZfU1d3mys9Wm8ttULEmhqZf3K1d4YVG85EJ0Yc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RhcWHe3ndODuuSjx21HzD0QavVcAPc5dBHDEt6XNHxjIVla3J96i0dk9ITgBnG/bo cQRxVCe0oAGFnMXyC7VsejBxICTn9Jw9tRGVkUuLasP6LZoNj3RWdh5x/aVquvmsCc bLl5RnXnB+BipjeEtiGvEY3chL8W5VPtg+lmtdmc= From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Thu, 2 Jul 2020 23:46:19 +0800 Message-Id: <20200702154622.21280-4-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200702154622.21280-1-jiaxun.yang@flygoat.com> References: <20200702154622.21280-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 03C6A4085D X-Spamd-Result: default: False [4.90 / 10.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; R_DKIM_ALLOW(0.00)[flygoat.com:s=default]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; R_SPF_SOFTFAIL(0.00)[~all]; ML_SERVERS(-3.10)[213.133.102.83]; DKIM_TRACE(0.00)[flygoat.com:+]; RCPT_COUNT_TWO(0.00)[2]; MID_CONTAINS_FROM(1.00)[]; RCVD_IN_DNSWL_NONE(0.00)[213.133.102.83:from]; DMARC_POLICY_ALLOW(0.00)[flygoat.com,none]; DMARC_POLICY_ALLOW_WITH_FAILURES(0.00)[]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:213.133.96.0/19, country:DE]; RCVD_COUNT_TWO(0.00)[2]; HFILTER_HELO_BAREIP(3.00)[213.133.102.83,1] X-Rspamd-Server: mail20.mymailcheap.com X-Spam: Yes Subject: [FFmpeg-devel] [PATCH v5 3/6] libavutil: Detect MMI and MSA flags for MIPS X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Add MMI & MSA runtime detection for MIPS. Basically there are two code pathes. For systems that natively support CPUCFG instruction or kernel emulated that instruction, we'll sense this feature from HWCAP and report the flags according to values grab from CPUCFG. For systems that have no CPUCFG (or not export it in HWCAP), we'll parse /proc/cpuinfo instead. Signed-off-by: Jiaxun Yang --- v5: Fix a stupid typo --- libavutil/cpu.c | 10 +++ libavutil/cpu.h | 3 + libavutil/cpu_internal.h | 2 + libavutil/mips/Makefile | 2 +- libavutil/mips/cpu.c | 134 ++++++++++++++++++++++++++++++++++++++ libavutil/mips/cpu.h | 28 ++++++++ libavutil/tests/cpu.c | 3 + tests/checkasm/checkasm.c | 3 + 8 files changed, 184 insertions(+), 1 deletion(-) create mode 100644 libavutil/mips/cpu.c create mode 100644 libavutil/mips/cpu.h diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 6548cc3042..52f6b9a3bf 100644 --- a/libavutil/cpu.c +++ b/libavutil/cpu.c @@ -51,6 +51,8 @@ static atomic_int cpu_flags = ATOMIC_VAR_INIT(-1); static int get_cpu_flags(void) { + if (ARCH_MIPS) + return ff_get_cpu_flags_mips(); if (ARCH_AARCH64) return ff_get_cpu_flags_aarch64(); if (ARCH_ARM) @@ -169,6 +171,9 @@ int av_parse_cpu_flags(const char *s) { "armv8", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ARMV8 }, .unit = "flags" }, { "neon", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_NEON }, .unit = "flags" }, { "vfp", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VFP }, .unit = "flags" }, +#elif ARCH_MIPS + { "mmi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MMI }, .unit = "flags" }, + { "msa", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MSA }, .unit = "flags" }, #endif { NULL }, }; @@ -250,6 +255,9 @@ int av_parse_cpu_caps(unsigned *flags, const char *s) { "armv8", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_ARMV8 }, .unit = "flags" }, { "neon", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_NEON }, .unit = "flags" }, { "vfp", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_VFP }, .unit = "flags" }, +#elif ARCH_MIPS + { "mmi", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MMI }, .unit = "flags" }, + { "msa", NULL, 0, AV_OPT_TYPE_CONST, { .i64 = AV_CPU_FLAG_MSA }, .unit = "flags" }, #endif { NULL }, }; @@ -308,6 +316,8 @@ int av_cpu_count(void) size_t av_cpu_max_align(void) { + if (ARCH_MIPS) + return ff_get_cpu_max_align_mips(); if (ARCH_AARCH64) return ff_get_cpu_max_align_aarch64(); if (ARCH_ARM) diff --git a/libavutil/cpu.h b/libavutil/cpu.h index 8bb9eb606b..83099dd969 100644 --- a/libavutil/cpu.h +++ b/libavutil/cpu.h @@ -71,6 +71,9 @@ #define AV_CPU_FLAG_VFP_VM (1 << 7) ///< VFPv2 vector mode, deprecated in ARMv7-A and unavailable in various CPUs implementations #define AV_CPU_FLAG_SETEND (1 <<16) +#define AV_CPU_FLAG_MMI (1 << 0) +#define AV_CPU_FLAG_MSA (1 << 1) + /** * Return the flags which specify extensions supported by the CPU. * The returned value is affected by av_force_cpu_flags() if that was used diff --git a/libavutil/cpu_internal.h b/libavutil/cpu_internal.h index 37122d1c5f..889764320b 100644 --- a/libavutil/cpu_internal.h +++ b/libavutil/cpu_internal.h @@ -41,11 +41,13 @@ #define CPUEXT_FAST(flags, cpuext) CPUEXT_SUFFIX_FAST(flags, , cpuext) #define CPUEXT_SLOW(flags, cpuext) CPUEXT_SUFFIX_SLOW(flags, , cpuext) +int ff_get_cpu_flags_mips(void); int ff_get_cpu_flags_aarch64(void); int ff_get_cpu_flags_arm(void); int ff_get_cpu_flags_ppc(void); int ff_get_cpu_flags_x86(void); +size_t ff_get_cpu_max_align_mips(void); size_t ff_get_cpu_max_align_aarch64(void); size_t ff_get_cpu_max_align_arm(void); size_t ff_get_cpu_max_align_ppc(void); diff --git a/libavutil/mips/Makefile b/libavutil/mips/Makefile index dbfa5aa341..5f8c9b64e9 100644 --- a/libavutil/mips/Makefile +++ b/libavutil/mips/Makefile @@ -1 +1 @@ -OBJS += mips/float_dsp_mips.o +OBJS += mips/float_dsp_mips.o mips/cpu.o diff --git a/libavutil/mips/cpu.c b/libavutil/mips/cpu.c new file mode 100644 index 0000000000..6b9d721939 --- /dev/null +++ b/libavutil/mips/cpu.c @@ -0,0 +1,134 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" +#include "config.h" +#if defined __linux__ || defined __ANDROID__ +#include +#include +#include +#include +#include "asmdefs.h" +#include "libavutil/avstring.h" +#endif + +#if defined __linux__ || defined __ANDROID__ + +#define HWCAP_LOONGSON_CPUCFG (1 << 14) + +static int cpucfg_available(void) +{ + return getauxval(AT_HWCAP) & HWCAP_LOONGSON_CPUCFG; +} + +/* Most toolchains have no CPUCFG support yet */ +static uint32_t read_cpucfg(uint32_t reg) +{ + uint32_t __res; + + __asm__ __volatile__( + "parse_r __res,%0\n\t" + "parse_r reg,%1\n\t" + ".insn \n\t" + ".word (0xc8080118 | (reg << 21) | (__res << 11))\n\t" + :"=r"(__res) + :"r"(reg) + : + ); + return __res; +} + +#define LOONGSON_CFG1 0x1 + +#define LOONGSON_CFG1_MMI (1 << 4) +#define LOONGSON_CFG1_MSA1 (1 << 5) + +static int cpu_flags_cpucfg(void) +{ + int flags = 0; + uint32_t cfg1 = read_cpucfg(LOONGSON_CFG1); + + if (cfg1 & LOONGSON_CFG1_MMI) + flags |= AV_CPU_FLAG_MMI; + + if (cfg1 & LOONGSON_CFG1_MSA1) + flags |= AV_CPU_FLAG_MSA; + + return flags; +} + +static int cpu_flags_cpuinfo(void) +{ + FILE *f = fopen("/proc/cpuinfo", "r"); + char buf[200]; + int flags = 0; + + if (!f) + return -1; + + while (fgets(buf, sizeof(buf), f)) { + /* Legacy kernel may not export MMI in ASEs implemented */ + if (av_strstart(buf, "cpu model", NULL)) { + if (strstr(buf, "Loongson-3 ")) + flags |= AV_CPU_FLAG_MMI; + } + + if (av_strstart(buf, "ASEs implemented", NULL)) { + if (strstr(buf, " loongson-mmi")) + flags |= AV_CPU_FLAG_MMI; + if (strstr(buf, " msa")) + flags |= AV_CPU_FLAG_MSA; + + break; + } + } + fclose(f); + return flags; +} +#endif + +int ff_get_cpu_flags_mips(void) +{ +#if defined __linux__ || defined __ANDROID__ + if (cpucfg_available()) + return cpu_flags_cpucfg(); + else + return cpu_flags_cpuinfo(); +#else + /* Assume no SIMD ASE supported */ + return 0; +#endif +} + +size_t ff_get_cpu_max_align_mips(void) +{ + int flags = av_get_cpu_flags(); + + if (flags & AV_CPU_FLAG_MSA) + return 16; + + /* + * MMI itself is 64-bit but quad word load & store + * needs 128-bit align. + */ + if (flags & AV_CPU_FLAG_MMI) + return 16; + + return 8; +} diff --git a/libavutil/mips/cpu.h b/libavutil/mips/cpu.h new file mode 100644 index 0000000000..615dc49759 --- /dev/null +++ b/libavutil/mips/cpu.h @@ -0,0 +1,28 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_MIPS_CPU_H +#define AVUTIL_MIPS_CPU_H + +#include "libavutil/cpu.h" +#include "libavutil/cpu_internal.h" + +#define have_mmi(flags) CPUEXT(flags, MMI) +#define have_msa(flags) CPUEXT(flags, MSA) + +#endif /* AVUTIL_MIPS_CPU_H */ diff --git a/libavutil/tests/cpu.c b/libavutil/tests/cpu.c index ce45b715a0..c853371fb3 100644 --- a/libavutil/tests/cpu.c +++ b/libavutil/tests/cpu.c @@ -49,6 +49,9 @@ static const struct { { AV_CPU_FLAG_SETEND, "setend" }, #elif ARCH_PPC { AV_CPU_FLAG_ALTIVEC, "altivec" }, +#elif ARCH_MIPS + { AV_CPU_FLAG_MMI, "mmi" }, + { AV_CPU_FLAG_MSA, "msa" }, #elif ARCH_X86 { AV_CPU_FLAG_MMX, "mmx" }, { AV_CPU_FLAG_MMXEXT, "mmxext" }, diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c index 899f68bb32..b3ac76c325 100644 --- a/tests/checkasm/checkasm.c +++ b/tests/checkasm/checkasm.c @@ -213,6 +213,9 @@ static const struct { { "ALTIVEC", "altivec", AV_CPU_FLAG_ALTIVEC }, { "VSX", "vsx", AV_CPU_FLAG_VSX }, { "POWER8", "power8", AV_CPU_FLAG_POWER8 }, +#elif ARCH_MIPS + { "MMI", "mmi", AV_CPU_FLAG_MMI }, + { "MSA", "msa", AV_CPU_FLAG_MSA }, #elif ARCH_X86 { "MMX", "mmx", AV_CPU_FLAG_MMX|AV_CPU_FLAG_CMOV }, { "MMXEXT", "mmxext", AV_CPU_FLAG_MMXEXT }, From patchwork Thu Jul 2 15:46:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 20804 Delivered-To: andriy.gelman@gmail.com Received: by 2002:a25:80ca:0:0:0:0:0 with SMTP id c10csp1582501ybm; Thu, 2 Jul 2020 08:48:25 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxTIdnWHsfTmoPhWGxSMdlj4rJfJdeq20QQh7zBZRln1hIsGSY9D0HR1xNCRS0FVSWee7Vx X-Received: by 2002:a1c:48c5:: with SMTP id v188mr31263467wma.58.1593704904793; Thu, 02 Jul 2020 08:48:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593704904; cv=none; d=google.com; s=arc-20160816; b=gK/MGfSLwYWj7kqV07uCepSHOJW7/5efC8OxvWuj+6ToTVqcGfMTE/I+g4Ev6VHjSh QJ39VLIqY86+/cSjcVeo4fYzvMjINSbziV8wuFCu+nf/zmd9iCOY41ioItob4nfCiHU4 moFHnpnGq/6Dy/WBTd2MzvkEslXsKq9/vIjdQPm4wiT+gmRZv1dFy2dTWNkxyOw8I4rt 4HoL23aPtxcl4ldrshny621yvAicBvEGRYB7hUUTBw3aWjyRTk6nHexizQPf/cj2ce+w kuFQ/Iprg2Bz6kqouirXMmJ4PvKZM+2uMh5nVCMM238sET3SDGvlvo/BmItiBlERYXul nbNg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:ai-spam-status :dkim-signature:delivered-to; bh=8KpuCf8yKaAJNZXFmMdF0dCdstYFQVtaDHGnAW4VJJY=; b=nkgzmQTS+1nyAXY61NzuUCN0iA+r9q+rkIo3e0S/vNU5xucNqMQZ4J4YZqR6wVPzlT YCPJRGkXHj79qQlB/QbMuub+4oNojlbEk6MxBMliTa8iqTDSqX56YhCog3Du/MN6aZUg LuDZjILN7ILvWJhjOAAG85092mmna/e5BT22ijpCtHXZyBwrWq6lAj6onFVo9Mr/cH3v Wmv49sHRYiX+c5lKUK6iZ+ESq73wv/lK0n+8kahcKW+fP6TXOj4E8pSBx9qoWNCVoSYn lmGAco4lrF1TKBbodF8lV0Y0gSgG6LkzV12IRTFBkkdNEBFDUG3MIxeFCeKq/yXec8Hx T0bw== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@mymailcheap.com header.s=default header.b=WQ3+S6jv; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=default header.b=hTHfWUpG; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=flygoat.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id s2si8757787wrm.172.2020.07.02.08.48.24; Thu, 02 Jul 2020 08:48:24 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@mymailcheap.com header.s=default header.b=WQ3+S6jv; dkim=neutral (body hash did not verify) header.i=@flygoat.com header.s=default header.b=hTHfWUpG; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=flygoat.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 10CB368B1AE; Thu, 2 Jul 2020 18:48:04 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from relay1.mymailcheap.com (relay1.mymailcheap.com [144.217.248.102]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 05A9068B162 for ; Thu, 2 Jul 2020 18:48:01 +0300 (EEST) Received: from filter2.mymailcheap.com (filter2.mymailcheap.com [91.134.140.82]) by relay1.mymailcheap.com (Postfix) with ESMTPS id 77D723F157 for ; Thu, 2 Jul 2020 11:48:00 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by filter2.mymailcheap.com (Postfix) with ESMTP id A59652A4F0 for ; Thu, 2 Jul 2020 17:47:59 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=mymailcheap.com; s=default; t=1593704879; bh=MM8So4rMJN04tB+vxhzJQGj0EW1O8QEMFBWB7ADDpLc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WQ3+S6jvxwK8FERzosr8NBxK+CCcB8g9GC89KPjfOa8D+FknmZHwH4KlCwQJZ41CO Lv/MbBSQqPPO2lfimwKzATpEcNH9nG1ydwTvoGLBYa4+JsYr0b7InvOiaZO1PQqp8o N5Q0L9A7b1+B/zRWcrMeEyRC8cskKAd4WAu3nZZE= X-Virus-Scanned: Debian amavisd-new at filter2.mymailcheap.com Received: from filter2.mymailcheap.com ([127.0.0.1]) by localhost (filter2.mymailcheap.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id jdTckojpURQu for ; Thu, 2 Jul 2020 17:47:53 +0200 (CEST) Received: from mail20.mymailcheap.com (mail20.mymailcheap.com [51.83.111.147]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by filter2.mymailcheap.com (Postfix) with ESMTPS for ; Thu, 2 Jul 2020 17:47:52 +0200 (CEST) Received: from [148.251.23.173] (ml.mymailcheap.com [148.251.23.173]) by mail20.mymailcheap.com (Postfix) with ESMTP id DE8114085C; Thu, 2 Jul 2020 15:47:50 +0000 (UTC) Authentication-Results: mail20.mymailcheap.com; dkim=pass (1024-bit key; unprotected) header.d=flygoat.com header.i=@flygoat.com header.b="hTHfWUpG"; dkim-atps=neutral AI-Spam-Status: Not processed Received: from halation.202.net.flygoat.com (unknown [115.227.164.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail20.mymailcheap.com (Postfix) with ESMTPSA id 23129403B3; Thu, 2 Jul 2020 15:46:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=flygoat.com; s=default; t=1593704804; bh=MM8So4rMJN04tB+vxhzJQGj0EW1O8QEMFBWB7ADDpLc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hTHfWUpGYVty1/+ieqSiHv0pNHI2mLD1gz5BR8/wWCczD9NQPYeN0Ij+SJe5iXsQQ 4JjCbJ4KmW1dyVKnvu7+ELArtMtjzWQfme3/PHFiYWJJHuNgc+UsoeO6U8xwatnTKT zNkYM0qDPMOLtZ4+Ks4gYAnw6spzC5C67ExuTsGM= From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Thu, 2 Jul 2020 23:46:20 +0800 Message-Id: <20200702154622.21280-5-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200702154622.21280-1-jiaxun.yang@flygoat.com> References: <20200702154622.21280-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: DE8114085C X-Spamd-Result: default: False [4.90 / 10.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_DKIM_ALLOW(0.00)[flygoat.com:s=default]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; R_SPF_SOFTFAIL(0.00)[~all:c]; ML_SERVERS(-3.10)[148.251.23.173]; DKIM_TRACE(0.00)[flygoat.com:+]; RCPT_COUNT_TWO(0.00)[2]; MID_CONTAINS_FROM(1.00)[]; DMARC_POLICY_ALLOW(0.00)[flygoat.com,none]; DMARC_POLICY_ALLOW_WITH_FAILURES(0.00)[]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:148.251.0.0/16, country:DE]; RCVD_COUNT_TWO(0.00)[2]; HFILTER_HELO_BAREIP(3.00)[148.251.23.173,1] X-Rspamd-Server: mail20.mymailcheap.com X-Spam: Yes Subject: [FFmpeg-devel] [PATCH v5 4/6] libavcodec: Enable runtime detection for MIPS MMI & MSA X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: spnyphfnkA6S Content-Length: 257999 Apply optimized functions according to cpuflags. MSA is usually put after MMI as it's generally faster than MMI. Signed-off-by: Jiaxun Yang --- v5: Refine if, also make h264chroma as a exception that MMI faster than MSA --- libavcodec/mips/blockdsp_init_mips.c | 40 +- libavcodec/mips/cabac.h | 2 +- libavcodec/mips/h263dsp_init_mips.c | 18 +- libavcodec/mips/h264chroma_init_mips.c | 55 +- libavcodec/mips/h264dsp_init_mips.c | 225 +++-- libavcodec/mips/h264pred_init_mips.c | 207 ++-- libavcodec/mips/h264qpel_init_mips.c | 412 ++++---- libavcodec/mips/hevcdsp_init_mips.c | 992 ++++++++++---------- libavcodec/mips/hevcpred_init_mips.c | 40 +- libavcodec/mips/hpeldsp_init_mips.c | 180 ++-- libavcodec/mips/idctdsp_init_mips.c | 74 +- libavcodec/mips/me_cmp_init_mips.c | 50 +- libavcodec/mips/mpegvideo_init_mips.c | 48 +- libavcodec/mips/mpegvideoencdsp_init_mips.c | 21 +- libavcodec/mips/pixblockdsp_init_mips.c | 63 +- libavcodec/mips/qpeldsp_init_mips.c | 270 +++--- libavcodec/mips/vc1dsp_init_mips.c | 186 ++-- libavcodec/mips/videodsp_init.c | 10 +- libavcodec/mips/vp3dsp_init_mips.c | 44 +- libavcodec/mips/vp8dsp_init_mips.c | 240 +++-- libavcodec/mips/vp9dsp_init_mips.c | 16 +- libavcodec/mips/wmv2dsp_init_mips.c | 18 +- libavcodec/mips/wmv2dsp_mips.h | 4 +- libavcodec/mips/wmv2dsp_mmi.c | 4 +- libavcodec/mips/xvididct_init_mips.c | 31 +- 25 files changed, 1543 insertions(+), 1707 deletions(-) diff --git a/libavcodec/mips/blockdsp_init_mips.c b/libavcodec/mips/blockdsp_init_mips.c index 55ac1c3e99..c6964fa74e 100644 --- a/libavcodec/mips/blockdsp_init_mips.c +++ b/libavcodec/mips/blockdsp_init_mips.c @@ -19,36 +19,26 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "blockdsp_mips.h" -#if HAVE_MSA -static av_cold void blockdsp_init_msa(BlockDSPContext *c) +void ff_blockdsp_init_mips(BlockDSPContext *c) { - c->clear_block = ff_clear_block_msa; - c->clear_blocks = ff_clear_blocks_msa; + int cpu_flags = av_get_cpu_flags(); - c->fill_block_tab[0] = ff_fill_block16_msa; - c->fill_block_tab[1] = ff_fill_block8_msa; -} -#endif // #if HAVE_MSA + if (have_mmi(cpu_flags)) { + c->clear_block = ff_clear_block_mmi; + c->clear_blocks = ff_clear_blocks_mmi; -#if HAVE_MMI -static av_cold void blockdsp_init_mmi(BlockDSPContext *c) -{ - c->clear_block = ff_clear_block_mmi; - c->clear_blocks = ff_clear_blocks_mmi; + c->fill_block_tab[0] = ff_fill_block16_mmi; + c->fill_block_tab[1] = ff_fill_block8_mmi; + } - c->fill_block_tab[0] = ff_fill_block16_mmi; - c->fill_block_tab[1] = ff_fill_block8_mmi; -} -#endif /* HAVE_MMI */ + if (have_msa(cpu_flags)) { + c->clear_block = ff_clear_block_msa; + c->clear_blocks = ff_clear_blocks_msa; -void ff_blockdsp_init_mips(BlockDSPContext *c) -{ -#if HAVE_MMI - blockdsp_init_mmi(c); -#endif /* HAVE_MMI */ -#if HAVE_MSA - blockdsp_init_msa(c); -#endif // #if HAVE_MSA + c->fill_block_tab[0] = ff_fill_block16_msa; + c->fill_block_tab[1] = ff_fill_block8_msa; + } } diff --git a/libavcodec/mips/cabac.h b/libavcodec/mips/cabac.h index 03b5010edc..c595915eda 100644 --- a/libavcodec/mips/cabac.h +++ b/libavcodec/mips/cabac.h @@ -25,7 +25,7 @@ #define AVCODEC_MIPS_CABAC_H #include "libavcodec/cabac.h" -#include "libavutil/mips/mmiutils.h" +#include "libavutil/mips/asmdefs.h" #include "config.h" #define get_cabac_inline get_cabac_inline_mips diff --git a/libavcodec/mips/h263dsp_init_mips.c b/libavcodec/mips/h263dsp_init_mips.c index 09bd93707d..a73eb12d87 100644 --- a/libavcodec/mips/h263dsp_init_mips.c +++ b/libavcodec/mips/h263dsp_init_mips.c @@ -18,19 +18,15 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "h263dsp_mips.h" -#if HAVE_MSA -static av_cold void h263dsp_init_msa(H263DSPContext *c) -{ - c->h263_h_loop_filter = ff_h263_h_loop_filter_msa; - c->h263_v_loop_filter = ff_h263_v_loop_filter_msa; -} -#endif // #if HAVE_MSA - av_cold void ff_h263dsp_init_mips(H263DSPContext *c) { -#if HAVE_MSA - h263dsp_init_msa(c); -#endif // #if HAVE_MSA + int cpu_flags = av_get_cpu_flags(); + + if (have_msa(cpu_flags)){ + c->h263_h_loop_filter = ff_h263_h_loop_filter_msa; + c->h263_v_loop_filter = ff_h263_v_loop_filter_msa; + } } diff --git a/libavcodec/mips/h264chroma_init_mips.c b/libavcodec/mips/h264chroma_init_mips.c index ae817e47ae..6bb19d3ddd 100644 --- a/libavcodec/mips/h264chroma_init_mips.c +++ b/libavcodec/mips/h264chroma_init_mips.c @@ -19,45 +19,34 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "h264chroma_mips.h" -#if HAVE_MSA -static av_cold void h264chroma_init_msa(H264ChromaContext *c, int bit_depth) -{ - const int high_bit_depth = bit_depth > 8; - - if (!high_bit_depth) { - c->put_h264_chroma_pixels_tab[0] = ff_put_h264_chroma_mc8_msa; - c->put_h264_chroma_pixels_tab[1] = ff_put_h264_chroma_mc4_msa; - c->put_h264_chroma_pixels_tab[2] = ff_put_h264_chroma_mc2_msa; - - c->avg_h264_chroma_pixels_tab[0] = ff_avg_h264_chroma_mc8_msa; - c->avg_h264_chroma_pixels_tab[1] = ff_avg_h264_chroma_mc4_msa; - c->avg_h264_chroma_pixels_tab[2] = ff_avg_h264_chroma_mc2_msa; - } -} -#endif // #if HAVE_MSA -#if HAVE_MMI -static av_cold void h264chroma_init_mmi(H264ChromaContext *c, int bit_depth) +av_cold void ff_h264chroma_init_mips(H264ChromaContext *c, int bit_depth) { + int cpu_flags = av_get_cpu_flags(); int high_bit_depth = bit_depth > 8; - if (!high_bit_depth) { - c->put_h264_chroma_pixels_tab[0] = ff_put_h264_chroma_mc8_mmi; - c->avg_h264_chroma_pixels_tab[0] = ff_avg_h264_chroma_mc8_mmi; - c->put_h264_chroma_pixels_tab[1] = ff_put_h264_chroma_mc4_mmi; - c->avg_h264_chroma_pixels_tab[1] = ff_avg_h264_chroma_mc4_mmi; + /* MMI apears to be faster than MSA here */ + if (have_msa(cpu_flags)) { + if (!high_bit_depth) { + c->put_h264_chroma_pixels_tab[0] = ff_put_h264_chroma_mc8_msa; + c->put_h264_chroma_pixels_tab[1] = ff_put_h264_chroma_mc4_msa; + c->put_h264_chroma_pixels_tab[2] = ff_put_h264_chroma_mc2_msa; + + c->avg_h264_chroma_pixels_tab[0] = ff_avg_h264_chroma_mc8_msa; + c->avg_h264_chroma_pixels_tab[1] = ff_avg_h264_chroma_mc4_msa; + c->avg_h264_chroma_pixels_tab[2] = ff_avg_h264_chroma_mc2_msa; + } } -} -#endif /* HAVE_MMI */ -av_cold void ff_h264chroma_init_mips(H264ChromaContext *c, int bit_depth) -{ -#if HAVE_MMI - h264chroma_init_mmi(c, bit_depth); -#endif /* HAVE_MMI */ -#if HAVE_MSA - h264chroma_init_msa(c, bit_depth); -#endif // #if HAVE_MSA + if (have_mmi(cpu_flags)) { + if (!high_bit_depth) { + c->put_h264_chroma_pixels_tab[0] = ff_put_h264_chroma_mc8_mmi; + c->avg_h264_chroma_pixels_tab[0] = ff_avg_h264_chroma_mc8_mmi; + c->put_h264_chroma_pixels_tab[1] = ff_put_h264_chroma_mc4_mmi; + c->avg_h264_chroma_pixels_tab[1] = ff_avg_h264_chroma_mc4_mmi; + } + } } diff --git a/libavcodec/mips/h264dsp_init_mips.c b/libavcodec/mips/h264dsp_init_mips.c index dc08a25800..9cd05e0f2f 100644 --- a/libavcodec/mips/h264dsp_init_mips.c +++ b/libavcodec/mips/h264dsp_init_mips.c @@ -19,129 +19,116 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "h264dsp_mips.h" -#if HAVE_MSA -static av_cold void h264dsp_init_msa(H264DSPContext *c, - const int bit_depth, - const int chroma_format_idc) -{ - if (8 == bit_depth) { - c->h264_v_loop_filter_luma = ff_h264_v_lpf_luma_inter_msa; - c->h264_h_loop_filter_luma = ff_h264_h_lpf_luma_inter_msa; - c->h264_h_loop_filter_luma_mbaff = - ff_h264_h_loop_filter_luma_mbaff_msa; - c->h264_v_loop_filter_luma_intra = ff_h264_v_lpf_luma_intra_msa; - c->h264_h_loop_filter_luma_intra = ff_h264_h_lpf_luma_intra_msa; - c->h264_h_loop_filter_luma_mbaff_intra = - ff_h264_h_loop_filter_luma_mbaff_intra_msa; - c->h264_v_loop_filter_chroma = ff_h264_v_lpf_chroma_inter_msa; - - if (chroma_format_idc <= 1) - c->h264_h_loop_filter_chroma = ff_h264_h_lpf_chroma_inter_msa; - else - c->h264_h_loop_filter_chroma = - ff_h264_h_loop_filter_chroma422_msa; - - if (chroma_format_idc > 1) - c->h264_h_loop_filter_chroma_mbaff = - ff_h264_h_loop_filter_chroma422_mbaff_msa; - - c->h264_v_loop_filter_chroma_intra = - ff_h264_v_lpf_chroma_intra_msa; - - if (chroma_format_idc <= 1) - c->h264_h_loop_filter_chroma_intra = - ff_h264_h_lpf_chroma_intra_msa; - - /* Weighted MC */ - c->weight_h264_pixels_tab[0] = ff_weight_h264_pixels16_8_msa; - c->weight_h264_pixels_tab[1] = ff_weight_h264_pixels8_8_msa; - c->weight_h264_pixels_tab[2] = ff_weight_h264_pixels4_8_msa; - - c->biweight_h264_pixels_tab[0] = ff_biweight_h264_pixels16_8_msa; - c->biweight_h264_pixels_tab[1] = ff_biweight_h264_pixels8_8_msa; - c->biweight_h264_pixels_tab[2] = ff_biweight_h264_pixels4_8_msa; - - c->h264_idct_add = ff_h264_idct_add_msa; - c->h264_idct8_add = ff_h264_idct8_addblk_msa; - c->h264_idct_dc_add = ff_h264_idct4x4_addblk_dc_msa; - c->h264_idct8_dc_add = ff_h264_idct8_dc_addblk_msa; - c->h264_idct_add16 = ff_h264_idct_add16_msa; - c->h264_idct8_add4 = ff_h264_idct8_add4_msa; - - if (chroma_format_idc <= 1) - c->h264_idct_add8 = ff_h264_idct_add8_msa; - else - c->h264_idct_add8 = ff_h264_idct_add8_422_msa; - - c->h264_idct_add16intra = ff_h264_idct_add16_intra_msa; - c->h264_luma_dc_dequant_idct = ff_h264_deq_idct_luma_dc_msa; - } // if (8 == bit_depth) -} -#endif // #if HAVE_MSA - -#if HAVE_MMI -static av_cold void h264dsp_init_mmi(H264DSPContext * c, const int bit_depth, - const int chroma_format_idc) +av_cold void ff_h264dsp_init_mips(H264DSPContext *c, const int bit_depth, + const int chroma_format_idc) { - if (bit_depth == 8) { - c->h264_add_pixels4_clear = ff_h264_add_pixels4_8_mmi; - c->h264_idct_add = ff_h264_idct_add_8_mmi; - c->h264_idct8_add = ff_h264_idct8_add_8_mmi; - c->h264_idct_dc_add = ff_h264_idct_dc_add_8_mmi; - c->h264_idct8_dc_add = ff_h264_idct8_dc_add_8_mmi; - c->h264_idct_add16 = ff_h264_idct_add16_8_mmi; - c->h264_idct_add16intra = ff_h264_idct_add16intra_8_mmi; - c->h264_idct8_add4 = ff_h264_idct8_add4_8_mmi; - - if (chroma_format_idc <= 1) - c->h264_idct_add8 = ff_h264_idct_add8_8_mmi; - else - c->h264_idct_add8 = ff_h264_idct_add8_422_8_mmi; - - c->h264_luma_dc_dequant_idct = ff_h264_luma_dc_dequant_idct_8_mmi; - - if (chroma_format_idc <= 1) - c->h264_chroma_dc_dequant_idct = - ff_h264_chroma_dc_dequant_idct_8_mmi; - else - c->h264_chroma_dc_dequant_idct = - ff_h264_chroma422_dc_dequant_idct_8_mmi; - - c->weight_h264_pixels_tab[0] = ff_h264_weight_pixels16_8_mmi; - c->weight_h264_pixels_tab[1] = ff_h264_weight_pixels8_8_mmi; - c->weight_h264_pixels_tab[2] = ff_h264_weight_pixels4_8_mmi; - - c->biweight_h264_pixels_tab[0] = ff_h264_biweight_pixels16_8_mmi; - c->biweight_h264_pixels_tab[1] = ff_h264_biweight_pixels8_8_mmi; - c->biweight_h264_pixels_tab[2] = ff_h264_biweight_pixels4_8_mmi; - - c->h264_v_loop_filter_chroma = ff_deblock_v_chroma_8_mmi; - c->h264_v_loop_filter_chroma_intra = ff_deblock_v_chroma_intra_8_mmi; - - if (chroma_format_idc <= 1) { - c->h264_h_loop_filter_chroma = - ff_deblock_h_chroma_8_mmi; - c->h264_h_loop_filter_chroma_intra = - ff_deblock_h_chroma_intra_8_mmi; + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + if (bit_depth == 8) { + c->h264_add_pixels4_clear = ff_h264_add_pixels4_8_mmi; + c->h264_idct_add = ff_h264_idct_add_8_mmi; + c->h264_idct8_add = ff_h264_idct8_add_8_mmi; + c->h264_idct_dc_add = ff_h264_idct_dc_add_8_mmi; + c->h264_idct8_dc_add = ff_h264_idct8_dc_add_8_mmi; + c->h264_idct_add16 = ff_h264_idct_add16_8_mmi; + c->h264_idct_add16intra = ff_h264_idct_add16intra_8_mmi; + c->h264_idct8_add4 = ff_h264_idct8_add4_8_mmi; + + if (chroma_format_idc <= 1) + c->h264_idct_add8 = ff_h264_idct_add8_8_mmi; + else + c->h264_idct_add8 = ff_h264_idct_add8_422_8_mmi; + + c->h264_luma_dc_dequant_idct = ff_h264_luma_dc_dequant_idct_8_mmi; + + if (chroma_format_idc <= 1) + c->h264_chroma_dc_dequant_idct = + ff_h264_chroma_dc_dequant_idct_8_mmi; + else + c->h264_chroma_dc_dequant_idct = + ff_h264_chroma422_dc_dequant_idct_8_mmi; + + c->weight_h264_pixels_tab[0] = ff_h264_weight_pixels16_8_mmi; + c->weight_h264_pixels_tab[1] = ff_h264_weight_pixels8_8_mmi; + c->weight_h264_pixels_tab[2] = ff_h264_weight_pixels4_8_mmi; + + c->biweight_h264_pixels_tab[0] = ff_h264_biweight_pixels16_8_mmi; + c->biweight_h264_pixels_tab[1] = ff_h264_biweight_pixels8_8_mmi; + c->biweight_h264_pixels_tab[2] = ff_h264_biweight_pixels4_8_mmi; + + c->h264_v_loop_filter_chroma = ff_deblock_v_chroma_8_mmi; + c->h264_v_loop_filter_chroma_intra = ff_deblock_v_chroma_intra_8_mmi; + + if (chroma_format_idc <= 1) { + c->h264_h_loop_filter_chroma = + ff_deblock_h_chroma_8_mmi; + c->h264_h_loop_filter_chroma_intra = + ff_deblock_h_chroma_intra_8_mmi; + } + + c->h264_v_loop_filter_luma = ff_deblock_v_luma_8_mmi; + c->h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_mmi; + c->h264_h_loop_filter_luma = ff_deblock_h_luma_8_mmi; + c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_mmi; } - - c->h264_v_loop_filter_luma = ff_deblock_v_luma_8_mmi; - c->h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_mmi; - c->h264_h_loop_filter_luma = ff_deblock_h_luma_8_mmi; - c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_mmi; } -} -#endif /* HAVE_MMI */ -av_cold void ff_h264dsp_init_mips(H264DSPContext *c, const int bit_depth, - const int chroma_format_idc) -{ -#if HAVE_MMI - h264dsp_init_mmi(c, bit_depth, chroma_format_idc); -#endif /* HAVE_MMI */ -#if HAVE_MSA - h264dsp_init_msa(c, bit_depth, chroma_format_idc); -#endif // #if HAVE_MSA + if (have_msa(cpu_flags)) { + if (bit_depth == 8) { + c->h264_v_loop_filter_luma = ff_h264_v_lpf_luma_inter_msa; + c->h264_h_loop_filter_luma = ff_h264_h_lpf_luma_inter_msa; + c->h264_h_loop_filter_luma_mbaff = + ff_h264_h_loop_filter_luma_mbaff_msa; + c->h264_v_loop_filter_luma_intra = ff_h264_v_lpf_luma_intra_msa; + c->h264_h_loop_filter_luma_intra = ff_h264_h_lpf_luma_intra_msa; + c->h264_h_loop_filter_luma_mbaff_intra = + ff_h264_h_loop_filter_luma_mbaff_intra_msa; + c->h264_v_loop_filter_chroma = ff_h264_v_lpf_chroma_inter_msa; + + if (chroma_format_idc <= 1) + c->h264_h_loop_filter_chroma = ff_h264_h_lpf_chroma_inter_msa; + else + c->h264_h_loop_filter_chroma = + ff_h264_h_loop_filter_chroma422_msa; + + if (chroma_format_idc > 1) + c->h264_h_loop_filter_chroma_mbaff = + ff_h264_h_loop_filter_chroma422_mbaff_msa; + + c->h264_v_loop_filter_chroma_intra = + ff_h264_v_lpf_chroma_intra_msa; + + if (chroma_format_idc <= 1) + c->h264_h_loop_filter_chroma_intra = + ff_h264_h_lpf_chroma_intra_msa; + + /* Weighted MC */ + c->weight_h264_pixels_tab[0] = ff_weight_h264_pixels16_8_msa; + c->weight_h264_pixels_tab[1] = ff_weight_h264_pixels8_8_msa; + c->weight_h264_pixels_tab[2] = ff_weight_h264_pixels4_8_msa; + + c->biweight_h264_pixels_tab[0] = ff_biweight_h264_pixels16_8_msa; + c->biweight_h264_pixels_tab[1] = ff_biweight_h264_pixels8_8_msa; + c->biweight_h264_pixels_tab[2] = ff_biweight_h264_pixels4_8_msa; + + c->h264_idct_add = ff_h264_idct_add_msa; + c->h264_idct8_add = ff_h264_idct8_addblk_msa; + c->h264_idct_dc_add = ff_h264_idct4x4_addblk_dc_msa; + c->h264_idct8_dc_add = ff_h264_idct8_dc_addblk_msa; + c->h264_idct_add16 = ff_h264_idct_add16_msa; + c->h264_idct8_add4 = ff_h264_idct8_add4_msa; + + if (chroma_format_idc <= 1) + c->h264_idct_add8 = ff_h264_idct_add8_msa; + else + c->h264_idct_add8 = ff_h264_idct_add8_422_msa; + + c->h264_idct_add16intra = ff_h264_idct_add16_intra_msa; + c->h264_luma_dc_dequant_idct = ff_h264_deq_idct_luma_dc_msa; + } + } } diff --git a/libavcodec/mips/h264pred_init_mips.c b/libavcodec/mips/h264pred_init_mips.c index e537ad8bd4..0fd9bb737a 100644 --- a/libavcodec/mips/h264pred_init_mips.c +++ b/libavcodec/mips/h264pred_init_mips.c @@ -19,134 +19,121 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "config.h" #include "h264dsp_mips.h" #include "h264pred_mips.h" -#if HAVE_MSA -static av_cold void h264_pred_init_msa(H264PredContext *h, int codec_id, - const int bit_depth, - const int chroma_format_idc) +av_cold void ff_h264_pred_init_mips(H264PredContext *h, int codec_id, + int bit_depth, + const int chroma_format_idc) { - if (8 == bit_depth) { - if (chroma_format_idc == 1) { - h->pred8x8[VERT_PRED8x8] = ff_h264_intra_pred_vert_8x8_msa; - h->pred8x8[HOR_PRED8x8] = ff_h264_intra_pred_horiz_8x8_msa; - } + int cpu_flags = av_get_cpu_flags(); - if (codec_id != AV_CODEC_ID_VP7 && codec_id != AV_CODEC_ID_VP8) { + if (have_mmi(cpu_flags)) { + if (bit_depth == 8) { if (chroma_format_idc == 1) { - h->pred8x8[PLANE_PRED8x8] = ff_h264_intra_predict_plane_8x8_msa; - } - } - if (codec_id != AV_CODEC_ID_RV40 && codec_id != AV_CODEC_ID_VP7 - && codec_id != AV_CODEC_ID_VP8) { - if (chroma_format_idc == 1) { - h->pred8x8[DC_PRED8x8] = ff_h264_intra_predict_dc_4blk_8x8_msa; - h->pred8x8[LEFT_DC_PRED8x8] = - ff_h264_intra_predict_hor_dc_8x8_msa; - h->pred8x8[TOP_DC_PRED8x8] = - ff_h264_intra_predict_vert_dc_8x8_msa; - h->pred8x8[ALZHEIMER_DC_L0T_PRED8x8] = - ff_h264_intra_predict_mad_cow_dc_l0t_8x8_msa; - h->pred8x8[ALZHEIMER_DC_0LT_PRED8x8] = - ff_h264_intra_predict_mad_cow_dc_0lt_8x8_msa; - h->pred8x8[ALZHEIMER_DC_L00_PRED8x8] = - ff_h264_intra_predict_mad_cow_dc_l00_8x8_msa; - h->pred8x8[ALZHEIMER_DC_0L0_PRED8x8] = - ff_h264_intra_predict_mad_cow_dc_0l0_8x8_msa; - } - } else { - if (codec_id == AV_CODEC_ID_VP7 || codec_id == AV_CODEC_ID_VP8) { - h->pred8x8[7] = ff_vp8_pred8x8_127_dc_8_msa; - h->pred8x8[8] = ff_vp8_pred8x8_129_dc_8_msa; + h->pred8x8 [VERT_PRED8x8 ] = ff_pred8x8_vertical_8_mmi; + h->pred8x8 [HOR_PRED8x8 ] = ff_pred8x8_horizontal_8_mmi; + } else { + h->pred8x8 [VERT_PRED8x8 ] = ff_pred8x16_vertical_8_mmi; + h->pred8x8 [HOR_PRED8x8 ] = ff_pred8x16_horizontal_8_mmi; } - } - if (chroma_format_idc == 1) { - h->pred8x8[DC_128_PRED8x8] = ff_h264_intra_pred_dc_128_8x8_msa; - } + h->pred16x16[DC_PRED8x8 ] = ff_pred16x16_dc_8_mmi; + h->pred16x16[VERT_PRED8x8 ] = ff_pred16x16_vertical_8_mmi; + h->pred16x16[HOR_PRED8x8 ] = ff_pred16x16_horizontal_8_mmi; + h->pred8x8l [TOP_DC_PRED ] = ff_pred8x8l_top_dc_8_mmi; + h->pred8x8l [DC_PRED ] = ff_pred8x8l_dc_8_mmi; - h->pred16x16[DC_PRED8x8] = ff_h264_intra_pred_dc_16x16_msa; - h->pred16x16[VERT_PRED8x8] = ff_h264_intra_pred_vert_16x16_msa; - h->pred16x16[HOR_PRED8x8] = ff_h264_intra_pred_horiz_16x16_msa; + #if ARCH_MIPS64 + switch (codec_id) { + case AV_CODEC_ID_SVQ3: + h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_plane_svq3_8_mmi; + break; + case AV_CODEC_ID_RV40: + h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_plane_rv40_8_mmi; + break; + case AV_CODEC_ID_VP7: + case AV_CODEC_ID_VP8: + break; + default: + h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_plane_h264_8_mmi; + break; + } + #endif - switch (codec_id) { - case AV_CODEC_ID_SVQ3: - case AV_CODEC_ID_RV40: - break; - case AV_CODEC_ID_VP7: - case AV_CODEC_ID_VP8: - h->pred16x16[7] = ff_vp8_pred16x16_127_dc_8_msa; - h->pred16x16[8] = ff_vp8_pred16x16_129_dc_8_msa; - break; - default: - h->pred16x16[PLANE_PRED8x8] = - ff_h264_intra_predict_plane_16x16_msa; - break; + if (codec_id == AV_CODEC_ID_SVQ3 || codec_id == AV_CODEC_ID_H264) { + if (chroma_format_idc == 1) { + h->pred8x8[TOP_DC_PRED8x8 ] = ff_pred8x8_top_dc_8_mmi; + h->pred8x8[DC_PRED8x8 ] = ff_pred8x8_dc_8_mmi; + } + } } - - h->pred16x16[LEFT_DC_PRED8x8] = ff_h264_intra_pred_dc_left_16x16_msa; - h->pred16x16[TOP_DC_PRED8x8] = ff_h264_intra_pred_dc_top_16x16_msa; - h->pred16x16[DC_128_PRED8x8] = ff_h264_intra_pred_dc_128_16x16_msa; } -} -#endif // #if HAVE_MSA - -#if HAVE_MMI -static av_cold void h264_pred_init_mmi(H264PredContext *h, int codec_id, - const int bit_depth, const int chroma_format_idc) -{ - if (bit_depth == 8) { - if (chroma_format_idc == 1) { - h->pred8x8 [VERT_PRED8x8 ] = ff_pred8x8_vertical_8_mmi; - h->pred8x8 [HOR_PRED8x8 ] = ff_pred8x8_horizontal_8_mmi; - } else { - h->pred8x8 [VERT_PRED8x8 ] = ff_pred8x16_vertical_8_mmi; - h->pred8x8 [HOR_PRED8x8 ] = ff_pred8x16_horizontal_8_mmi; - } - h->pred16x16[DC_PRED8x8 ] = ff_pred16x16_dc_8_mmi; - h->pred16x16[VERT_PRED8x8 ] = ff_pred16x16_vertical_8_mmi; - h->pred16x16[HOR_PRED8x8 ] = ff_pred16x16_horizontal_8_mmi; - h->pred8x8l [TOP_DC_PRED ] = ff_pred8x8l_top_dc_8_mmi; - h->pred8x8l [DC_PRED ] = ff_pred8x8l_dc_8_mmi; + if (have_msa(cpu_flags)) { + if (8 == bit_depth) { + if (chroma_format_idc == 1) { + h->pred8x8[VERT_PRED8x8] = ff_h264_intra_pred_vert_8x8_msa; + h->pred8x8[HOR_PRED8x8] = ff_h264_intra_pred_horiz_8x8_msa; + } -#if ARCH_MIPS64 - switch (codec_id) { - case AV_CODEC_ID_SVQ3: - h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_plane_svq3_8_mmi; - break; - case AV_CODEC_ID_RV40: - h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_plane_rv40_8_mmi; - break; - case AV_CODEC_ID_VP7: - case AV_CODEC_ID_VP8: - break; - default: - h->pred16x16[PLANE_PRED8x8 ] = ff_pred16x16_plane_h264_8_mmi; - break; - } -#endif + if (codec_id != AV_CODEC_ID_VP7 && codec_id != AV_CODEC_ID_VP8) { + if (chroma_format_idc == 1) { + h->pred8x8[PLANE_PRED8x8] = ff_h264_intra_predict_plane_8x8_msa; + } + } + if (codec_id != AV_CODEC_ID_RV40 && codec_id != AV_CODEC_ID_VP7 + && codec_id != AV_CODEC_ID_VP8) { + if (chroma_format_idc == 1) { + h->pred8x8[DC_PRED8x8] = ff_h264_intra_predict_dc_4blk_8x8_msa; + h->pred8x8[LEFT_DC_PRED8x8] = + ff_h264_intra_predict_hor_dc_8x8_msa; + h->pred8x8[TOP_DC_PRED8x8] = + ff_h264_intra_predict_vert_dc_8x8_msa; + h->pred8x8[ALZHEIMER_DC_L0T_PRED8x8] = + ff_h264_intra_predict_mad_cow_dc_l0t_8x8_msa; + h->pred8x8[ALZHEIMER_DC_0LT_PRED8x8] = + ff_h264_intra_predict_mad_cow_dc_0lt_8x8_msa; + h->pred8x8[ALZHEIMER_DC_L00_PRED8x8] = + ff_h264_intra_predict_mad_cow_dc_l00_8x8_msa; + h->pred8x8[ALZHEIMER_DC_0L0_PRED8x8] = + ff_h264_intra_predict_mad_cow_dc_0l0_8x8_msa; + } + } else { + if (codec_id == AV_CODEC_ID_VP7 || codec_id == AV_CODEC_ID_VP8) { + h->pred8x8[7] = ff_vp8_pred8x8_127_dc_8_msa; + h->pred8x8[8] = ff_vp8_pred8x8_129_dc_8_msa; + } + } - if (codec_id == AV_CODEC_ID_SVQ3 || codec_id == AV_CODEC_ID_H264) { if (chroma_format_idc == 1) { - h->pred8x8[TOP_DC_PRED8x8 ] = ff_pred8x8_top_dc_8_mmi; - h->pred8x8[DC_PRED8x8 ] = ff_pred8x8_dc_8_mmi; + h->pred8x8[DC_128_PRED8x8] = ff_h264_intra_pred_dc_128_8x8_msa; + } + + h->pred16x16[DC_PRED8x8] = ff_h264_intra_pred_dc_16x16_msa; + h->pred16x16[VERT_PRED8x8] = ff_h264_intra_pred_vert_16x16_msa; + h->pred16x16[HOR_PRED8x8] = ff_h264_intra_pred_horiz_16x16_msa; + + switch (codec_id) { + case AV_CODEC_ID_SVQ3: + case AV_CODEC_ID_RV40: + break; + case AV_CODEC_ID_VP7: + case AV_CODEC_ID_VP8: + h->pred16x16[7] = ff_vp8_pred16x16_127_dc_8_msa; + h->pred16x16[8] = ff_vp8_pred16x16_129_dc_8_msa; + break; + default: + h->pred16x16[PLANE_PRED8x8] = + ff_h264_intra_predict_plane_16x16_msa; + break; } + + h->pred16x16[LEFT_DC_PRED8x8] = ff_h264_intra_pred_dc_left_16x16_msa; + h->pred16x16[TOP_DC_PRED8x8] = ff_h264_intra_pred_dc_top_16x16_msa; + h->pred16x16[DC_128_PRED8x8] = ff_h264_intra_pred_dc_128_16x16_msa; } } } -#endif /* HAVE_MMI */ - -av_cold void ff_h264_pred_init_mips(H264PredContext *h, int codec_id, - int bit_depth, - const int chroma_format_idc) -{ -#if HAVE_MMI - h264_pred_init_mmi(h, codec_id, bit_depth, chroma_format_idc); -#endif /* HAVE_MMI */ -#if HAVE_MSA - h264_pred_init_msa(h, codec_id, bit_depth, chroma_format_idc); -#endif // #if HAVE_MSA -} diff --git a/libavcodec/mips/h264qpel_init_mips.c b/libavcodec/mips/h264qpel_init_mips.c index 33bae3093a..d896a4841b 100644 --- a/libavcodec/mips/h264qpel_init_mips.c +++ b/libavcodec/mips/h264qpel_init_mips.c @@ -19,231 +19,221 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "h264dsp_mips.h" -#if HAVE_MSA -static av_cold void h264qpel_init_msa(H264QpelContext *c, int bit_depth) +av_cold void ff_h264qpel_init_mips(H264QpelContext *c, int bit_depth) { - if (8 == bit_depth) { - c->put_h264_qpel_pixels_tab[0][0] = ff_put_h264_qpel16_mc00_msa; - c->put_h264_qpel_pixels_tab[0][1] = ff_put_h264_qpel16_mc10_msa; - c->put_h264_qpel_pixels_tab[0][2] = ff_put_h264_qpel16_mc20_msa; - c->put_h264_qpel_pixels_tab[0][3] = ff_put_h264_qpel16_mc30_msa; - c->put_h264_qpel_pixels_tab[0][4] = ff_put_h264_qpel16_mc01_msa; - c->put_h264_qpel_pixels_tab[0][5] = ff_put_h264_qpel16_mc11_msa; - c->put_h264_qpel_pixels_tab[0][6] = ff_put_h264_qpel16_mc21_msa; - c->put_h264_qpel_pixels_tab[0][7] = ff_put_h264_qpel16_mc31_msa; - c->put_h264_qpel_pixels_tab[0][8] = ff_put_h264_qpel16_mc02_msa; - c->put_h264_qpel_pixels_tab[0][9] = ff_put_h264_qpel16_mc12_msa; - c->put_h264_qpel_pixels_tab[0][10] = ff_put_h264_qpel16_mc22_msa; - c->put_h264_qpel_pixels_tab[0][11] = ff_put_h264_qpel16_mc32_msa; - c->put_h264_qpel_pixels_tab[0][12] = ff_put_h264_qpel16_mc03_msa; - c->put_h264_qpel_pixels_tab[0][13] = ff_put_h264_qpel16_mc13_msa; - c->put_h264_qpel_pixels_tab[0][14] = ff_put_h264_qpel16_mc23_msa; - c->put_h264_qpel_pixels_tab[0][15] = ff_put_h264_qpel16_mc33_msa; + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + if (bit_depth == 8) { + c->put_h264_qpel_pixels_tab[0][0] = ff_put_h264_qpel16_mc00_mmi; + c->put_h264_qpel_pixels_tab[0][1] = ff_put_h264_qpel16_mc10_mmi; + c->put_h264_qpel_pixels_tab[0][2] = ff_put_h264_qpel16_mc20_mmi; + c->put_h264_qpel_pixels_tab[0][3] = ff_put_h264_qpel16_mc30_mmi; + c->put_h264_qpel_pixels_tab[0][4] = ff_put_h264_qpel16_mc01_mmi; + c->put_h264_qpel_pixels_tab[0][5] = ff_put_h264_qpel16_mc11_mmi; + c->put_h264_qpel_pixels_tab[0][6] = ff_put_h264_qpel16_mc21_mmi; + c->put_h264_qpel_pixels_tab[0][7] = ff_put_h264_qpel16_mc31_mmi; + c->put_h264_qpel_pixels_tab[0][8] = ff_put_h264_qpel16_mc02_mmi; + c->put_h264_qpel_pixels_tab[0][9] = ff_put_h264_qpel16_mc12_mmi; + c->put_h264_qpel_pixels_tab[0][10] = ff_put_h264_qpel16_mc22_mmi; + c->put_h264_qpel_pixels_tab[0][11] = ff_put_h264_qpel16_mc32_mmi; + c->put_h264_qpel_pixels_tab[0][12] = ff_put_h264_qpel16_mc03_mmi; + c->put_h264_qpel_pixels_tab[0][13] = ff_put_h264_qpel16_mc13_mmi; + c->put_h264_qpel_pixels_tab[0][14] = ff_put_h264_qpel16_mc23_mmi; + c->put_h264_qpel_pixels_tab[0][15] = ff_put_h264_qpel16_mc33_mmi; - c->put_h264_qpel_pixels_tab[1][0] = ff_put_h264_qpel8_mc00_msa; - c->put_h264_qpel_pixels_tab[1][1] = ff_put_h264_qpel8_mc10_msa; - c->put_h264_qpel_pixels_tab[1][2] = ff_put_h264_qpel8_mc20_msa; - c->put_h264_qpel_pixels_tab[1][3] = ff_put_h264_qpel8_mc30_msa; - c->put_h264_qpel_pixels_tab[1][4] = ff_put_h264_qpel8_mc01_msa; - c->put_h264_qpel_pixels_tab[1][5] = ff_put_h264_qpel8_mc11_msa; - c->put_h264_qpel_pixels_tab[1][6] = ff_put_h264_qpel8_mc21_msa; - c->put_h264_qpel_pixels_tab[1][7] = ff_put_h264_qpel8_mc31_msa; - c->put_h264_qpel_pixels_tab[1][8] = ff_put_h264_qpel8_mc02_msa; - c->put_h264_qpel_pixels_tab[1][9] = ff_put_h264_qpel8_mc12_msa; - c->put_h264_qpel_pixels_tab[1][10] = ff_put_h264_qpel8_mc22_msa; - c->put_h264_qpel_pixels_tab[1][11] = ff_put_h264_qpel8_mc32_msa; - c->put_h264_qpel_pixels_tab[1][12] = ff_put_h264_qpel8_mc03_msa; - c->put_h264_qpel_pixels_tab[1][13] = ff_put_h264_qpel8_mc13_msa; - c->put_h264_qpel_pixels_tab[1][14] = ff_put_h264_qpel8_mc23_msa; - c->put_h264_qpel_pixels_tab[1][15] = ff_put_h264_qpel8_mc33_msa; + c->put_h264_qpel_pixels_tab[1][0] = ff_put_h264_qpel8_mc00_mmi; + c->put_h264_qpel_pixels_tab[1][1] = ff_put_h264_qpel8_mc10_mmi; + c->put_h264_qpel_pixels_tab[1][2] = ff_put_h264_qpel8_mc20_mmi; + c->put_h264_qpel_pixels_tab[1][3] = ff_put_h264_qpel8_mc30_mmi; + c->put_h264_qpel_pixels_tab[1][4] = ff_put_h264_qpel8_mc01_mmi; + c->put_h264_qpel_pixels_tab[1][5] = ff_put_h264_qpel8_mc11_mmi; + c->put_h264_qpel_pixels_tab[1][6] = ff_put_h264_qpel8_mc21_mmi; + c->put_h264_qpel_pixels_tab[1][7] = ff_put_h264_qpel8_mc31_mmi; + c->put_h264_qpel_pixels_tab[1][8] = ff_put_h264_qpel8_mc02_mmi; + c->put_h264_qpel_pixels_tab[1][9] = ff_put_h264_qpel8_mc12_mmi; + c->put_h264_qpel_pixels_tab[1][10] = ff_put_h264_qpel8_mc22_mmi; + c->put_h264_qpel_pixels_tab[1][11] = ff_put_h264_qpel8_mc32_mmi; + c->put_h264_qpel_pixels_tab[1][12] = ff_put_h264_qpel8_mc03_mmi; + c->put_h264_qpel_pixels_tab[1][13] = ff_put_h264_qpel8_mc13_mmi; + c->put_h264_qpel_pixels_tab[1][14] = ff_put_h264_qpel8_mc23_mmi; + c->put_h264_qpel_pixels_tab[1][15] = ff_put_h264_qpel8_mc33_mmi; - c->put_h264_qpel_pixels_tab[2][1] = ff_put_h264_qpel4_mc10_msa; - c->put_h264_qpel_pixels_tab[2][2] = ff_put_h264_qpel4_mc20_msa; - c->put_h264_qpel_pixels_tab[2][3] = ff_put_h264_qpel4_mc30_msa; - c->put_h264_qpel_pixels_tab[2][4] = ff_put_h264_qpel4_mc01_msa; - c->put_h264_qpel_pixels_tab[2][5] = ff_put_h264_qpel4_mc11_msa; - c->put_h264_qpel_pixels_tab[2][6] = ff_put_h264_qpel4_mc21_msa; - c->put_h264_qpel_pixels_tab[2][7] = ff_put_h264_qpel4_mc31_msa; - c->put_h264_qpel_pixels_tab[2][8] = ff_put_h264_qpel4_mc02_msa; - c->put_h264_qpel_pixels_tab[2][9] = ff_put_h264_qpel4_mc12_msa; - c->put_h264_qpel_pixels_tab[2][10] = ff_put_h264_qpel4_mc22_msa; - c->put_h264_qpel_pixels_tab[2][11] = ff_put_h264_qpel4_mc32_msa; - c->put_h264_qpel_pixels_tab[2][12] = ff_put_h264_qpel4_mc03_msa; - c->put_h264_qpel_pixels_tab[2][13] = ff_put_h264_qpel4_mc13_msa; - c->put_h264_qpel_pixels_tab[2][14] = ff_put_h264_qpel4_mc23_msa; - c->put_h264_qpel_pixels_tab[2][15] = ff_put_h264_qpel4_mc33_msa; + c->put_h264_qpel_pixels_tab[2][0] = ff_put_h264_qpel4_mc00_mmi; + c->put_h264_qpel_pixels_tab[2][1] = ff_put_h264_qpel4_mc10_mmi; + c->put_h264_qpel_pixels_tab[2][2] = ff_put_h264_qpel4_mc20_mmi; + c->put_h264_qpel_pixels_tab[2][3] = ff_put_h264_qpel4_mc30_mmi; + c->put_h264_qpel_pixels_tab[2][4] = ff_put_h264_qpel4_mc01_mmi; + c->put_h264_qpel_pixels_tab[2][5] = ff_put_h264_qpel4_mc11_mmi; + c->put_h264_qpel_pixels_tab[2][6] = ff_put_h264_qpel4_mc21_mmi; + c->put_h264_qpel_pixels_tab[2][7] = ff_put_h264_qpel4_mc31_mmi; + c->put_h264_qpel_pixels_tab[2][8] = ff_put_h264_qpel4_mc02_mmi; + c->put_h264_qpel_pixels_tab[2][9] = ff_put_h264_qpel4_mc12_mmi; + c->put_h264_qpel_pixels_tab[2][10] = ff_put_h264_qpel4_mc22_mmi; + c->put_h264_qpel_pixels_tab[2][11] = ff_put_h264_qpel4_mc32_mmi; + c->put_h264_qpel_pixels_tab[2][12] = ff_put_h264_qpel4_mc03_mmi; + c->put_h264_qpel_pixels_tab[2][13] = ff_put_h264_qpel4_mc13_mmi; + c->put_h264_qpel_pixels_tab[2][14] = ff_put_h264_qpel4_mc23_mmi; + c->put_h264_qpel_pixels_tab[2][15] = ff_put_h264_qpel4_mc33_mmi; - c->avg_h264_qpel_pixels_tab[0][0] = ff_avg_h264_qpel16_mc00_msa; - c->avg_h264_qpel_pixels_tab[0][1] = ff_avg_h264_qpel16_mc10_msa; - c->avg_h264_qpel_pixels_tab[0][2] = ff_avg_h264_qpel16_mc20_msa; - c->avg_h264_qpel_pixels_tab[0][3] = ff_avg_h264_qpel16_mc30_msa; - c->avg_h264_qpel_pixels_tab[0][4] = ff_avg_h264_qpel16_mc01_msa; - c->avg_h264_qpel_pixels_tab[0][5] = ff_avg_h264_qpel16_mc11_msa; - c->avg_h264_qpel_pixels_tab[0][6] = ff_avg_h264_qpel16_mc21_msa; - c->avg_h264_qpel_pixels_tab[0][7] = ff_avg_h264_qpel16_mc31_msa; - c->avg_h264_qpel_pixels_tab[0][8] = ff_avg_h264_qpel16_mc02_msa; - c->avg_h264_qpel_pixels_tab[0][9] = ff_avg_h264_qpel16_mc12_msa; - c->avg_h264_qpel_pixels_tab[0][10] = ff_avg_h264_qpel16_mc22_msa; - c->avg_h264_qpel_pixels_tab[0][11] = ff_avg_h264_qpel16_mc32_msa; - c->avg_h264_qpel_pixels_tab[0][12] = ff_avg_h264_qpel16_mc03_msa; - c->avg_h264_qpel_pixels_tab[0][13] = ff_avg_h264_qpel16_mc13_msa; - c->avg_h264_qpel_pixels_tab[0][14] = ff_avg_h264_qpel16_mc23_msa; - c->avg_h264_qpel_pixels_tab[0][15] = ff_avg_h264_qpel16_mc33_msa; + c->avg_h264_qpel_pixels_tab[0][0] = ff_avg_h264_qpel16_mc00_mmi; + c->avg_h264_qpel_pixels_tab[0][1] = ff_avg_h264_qpel16_mc10_mmi; + c->avg_h264_qpel_pixels_tab[0][2] = ff_avg_h264_qpel16_mc20_mmi; + c->avg_h264_qpel_pixels_tab[0][3] = ff_avg_h264_qpel16_mc30_mmi; + c->avg_h264_qpel_pixels_tab[0][4] = ff_avg_h264_qpel16_mc01_mmi; + c->avg_h264_qpel_pixels_tab[0][5] = ff_avg_h264_qpel16_mc11_mmi; + c->avg_h264_qpel_pixels_tab[0][6] = ff_avg_h264_qpel16_mc21_mmi; + c->avg_h264_qpel_pixels_tab[0][7] = ff_avg_h264_qpel16_mc31_mmi; + c->avg_h264_qpel_pixels_tab[0][8] = ff_avg_h264_qpel16_mc02_mmi; + c->avg_h264_qpel_pixels_tab[0][9] = ff_avg_h264_qpel16_mc12_mmi; + c->avg_h264_qpel_pixels_tab[0][10] = ff_avg_h264_qpel16_mc22_mmi; + c->avg_h264_qpel_pixels_tab[0][11] = ff_avg_h264_qpel16_mc32_mmi; + c->avg_h264_qpel_pixels_tab[0][12] = ff_avg_h264_qpel16_mc03_mmi; + c->avg_h264_qpel_pixels_tab[0][13] = ff_avg_h264_qpel16_mc13_mmi; + c->avg_h264_qpel_pixels_tab[0][14] = ff_avg_h264_qpel16_mc23_mmi; + c->avg_h264_qpel_pixels_tab[0][15] = ff_avg_h264_qpel16_mc33_mmi; - c->avg_h264_qpel_pixels_tab[1][0] = ff_avg_h264_qpel8_mc00_msa; - c->avg_h264_qpel_pixels_tab[1][1] = ff_avg_h264_qpel8_mc10_msa; - c->avg_h264_qpel_pixels_tab[1][2] = ff_avg_h264_qpel8_mc20_msa; - c->avg_h264_qpel_pixels_tab[1][3] = ff_avg_h264_qpel8_mc30_msa; - c->avg_h264_qpel_pixels_tab[1][4] = ff_avg_h264_qpel8_mc01_msa; - c->avg_h264_qpel_pixels_tab[1][5] = ff_avg_h264_qpel8_mc11_msa; - c->avg_h264_qpel_pixels_tab[1][6] = ff_avg_h264_qpel8_mc21_msa; - c->avg_h264_qpel_pixels_tab[1][7] = ff_avg_h264_qpel8_mc31_msa; - c->avg_h264_qpel_pixels_tab[1][8] = ff_avg_h264_qpel8_mc02_msa; - c->avg_h264_qpel_pixels_tab[1][9] = ff_avg_h264_qpel8_mc12_msa; - c->avg_h264_qpel_pixels_tab[1][10] = ff_avg_h264_qpel8_mc22_msa; - c->avg_h264_qpel_pixels_tab[1][11] = ff_avg_h264_qpel8_mc32_msa; - c->avg_h264_qpel_pixels_tab[1][12] = ff_avg_h264_qpel8_mc03_msa; - c->avg_h264_qpel_pixels_tab[1][13] = ff_avg_h264_qpel8_mc13_msa; - c->avg_h264_qpel_pixels_tab[1][14] = ff_avg_h264_qpel8_mc23_msa; - c->avg_h264_qpel_pixels_tab[1][15] = ff_avg_h264_qpel8_mc33_msa; + c->avg_h264_qpel_pixels_tab[1][0] = ff_avg_h264_qpel8_mc00_mmi; + c->avg_h264_qpel_pixels_tab[1][1] = ff_avg_h264_qpel8_mc10_mmi; + c->avg_h264_qpel_pixels_tab[1][2] = ff_avg_h264_qpel8_mc20_mmi; + c->avg_h264_qpel_pixels_tab[1][3] = ff_avg_h264_qpel8_mc30_mmi; + c->avg_h264_qpel_pixels_tab[1][4] = ff_avg_h264_qpel8_mc01_mmi; + c->avg_h264_qpel_pixels_tab[1][5] = ff_avg_h264_qpel8_mc11_mmi; + c->avg_h264_qpel_pixels_tab[1][6] = ff_avg_h264_qpel8_mc21_mmi; + c->avg_h264_qpel_pixels_tab[1][7] = ff_avg_h264_qpel8_mc31_mmi; + c->avg_h264_qpel_pixels_tab[1][8] = ff_avg_h264_qpel8_mc02_mmi; + c->avg_h264_qpel_pixels_tab[1][9] = ff_avg_h264_qpel8_mc12_mmi; + c->avg_h264_qpel_pixels_tab[1][10] = ff_avg_h264_qpel8_mc22_mmi; + c->avg_h264_qpel_pixels_tab[1][11] = ff_avg_h264_qpel8_mc32_mmi; + c->avg_h264_qpel_pixels_tab[1][12] = ff_avg_h264_qpel8_mc03_mmi; + c->avg_h264_qpel_pixels_tab[1][13] = ff_avg_h264_qpel8_mc13_mmi; + c->avg_h264_qpel_pixels_tab[1][14] = ff_avg_h264_qpel8_mc23_mmi; + c->avg_h264_qpel_pixels_tab[1][15] = ff_avg_h264_qpel8_mc33_mmi; - c->avg_h264_qpel_pixels_tab[2][0] = ff_avg_h264_qpel4_mc00_msa; - c->avg_h264_qpel_pixels_tab[2][1] = ff_avg_h264_qpel4_mc10_msa; - c->avg_h264_qpel_pixels_tab[2][2] = ff_avg_h264_qpel4_mc20_msa; - c->avg_h264_qpel_pixels_tab[2][3] = ff_avg_h264_qpel4_mc30_msa; - c->avg_h264_qpel_pixels_tab[2][4] = ff_avg_h264_qpel4_mc01_msa; - c->avg_h264_qpel_pixels_tab[2][5] = ff_avg_h264_qpel4_mc11_msa; - c->avg_h264_qpel_pixels_tab[2][6] = ff_avg_h264_qpel4_mc21_msa; - c->avg_h264_qpel_pixels_tab[2][7] = ff_avg_h264_qpel4_mc31_msa; - c->avg_h264_qpel_pixels_tab[2][8] = ff_avg_h264_qpel4_mc02_msa; - c->avg_h264_qpel_pixels_tab[2][9] = ff_avg_h264_qpel4_mc12_msa; - c->avg_h264_qpel_pixels_tab[2][10] = ff_avg_h264_qpel4_mc22_msa; - c->avg_h264_qpel_pixels_tab[2][11] = ff_avg_h264_qpel4_mc32_msa; - c->avg_h264_qpel_pixels_tab[2][12] = ff_avg_h264_qpel4_mc03_msa; - c->avg_h264_qpel_pixels_tab[2][13] = ff_avg_h264_qpel4_mc13_msa; - c->avg_h264_qpel_pixels_tab[2][14] = ff_avg_h264_qpel4_mc23_msa; - c->avg_h264_qpel_pixels_tab[2][15] = ff_avg_h264_qpel4_mc33_msa; + c->avg_h264_qpel_pixels_tab[2][0] = ff_avg_h264_qpel4_mc00_mmi; + c->avg_h264_qpel_pixels_tab[2][1] = ff_avg_h264_qpel4_mc10_mmi; + c->avg_h264_qpel_pixels_tab[2][2] = ff_avg_h264_qpel4_mc20_mmi; + c->avg_h264_qpel_pixels_tab[2][3] = ff_avg_h264_qpel4_mc30_mmi; + c->avg_h264_qpel_pixels_tab[2][4] = ff_avg_h264_qpel4_mc01_mmi; + c->avg_h264_qpel_pixels_tab[2][5] = ff_avg_h264_qpel4_mc11_mmi; + c->avg_h264_qpel_pixels_tab[2][6] = ff_avg_h264_qpel4_mc21_mmi; + c->avg_h264_qpel_pixels_tab[2][7] = ff_avg_h264_qpel4_mc31_mmi; + c->avg_h264_qpel_pixels_tab[2][8] = ff_avg_h264_qpel4_mc02_mmi; + c->avg_h264_qpel_pixels_tab[2][9] = ff_avg_h264_qpel4_mc12_mmi; + c->avg_h264_qpel_pixels_tab[2][10] = ff_avg_h264_qpel4_mc22_mmi; + c->avg_h264_qpel_pixels_tab[2][11] = ff_avg_h264_qpel4_mc32_mmi; + c->avg_h264_qpel_pixels_tab[2][12] = ff_avg_h264_qpel4_mc03_mmi; + c->avg_h264_qpel_pixels_tab[2][13] = ff_avg_h264_qpel4_mc13_mmi; + c->avg_h264_qpel_pixels_tab[2][14] = ff_avg_h264_qpel4_mc23_mmi; + c->avg_h264_qpel_pixels_tab[2][15] = ff_avg_h264_qpel4_mc33_mmi; + } } -} -#endif // #if HAVE_MSA -#if HAVE_MMI -static av_cold void h264qpel_init_mmi(H264QpelContext *c, int bit_depth) -{ - if (8 == bit_depth) { - c->put_h264_qpel_pixels_tab[0][0] = ff_put_h264_qpel16_mc00_mmi; - c->put_h264_qpel_pixels_tab[0][1] = ff_put_h264_qpel16_mc10_mmi; - c->put_h264_qpel_pixels_tab[0][2] = ff_put_h264_qpel16_mc20_mmi; - c->put_h264_qpel_pixels_tab[0][3] = ff_put_h264_qpel16_mc30_mmi; - c->put_h264_qpel_pixels_tab[0][4] = ff_put_h264_qpel16_mc01_mmi; - c->put_h264_qpel_pixels_tab[0][5] = ff_put_h264_qpel16_mc11_mmi; - c->put_h264_qpel_pixels_tab[0][6] = ff_put_h264_qpel16_mc21_mmi; - c->put_h264_qpel_pixels_tab[0][7] = ff_put_h264_qpel16_mc31_mmi; - c->put_h264_qpel_pixels_tab[0][8] = ff_put_h264_qpel16_mc02_mmi; - c->put_h264_qpel_pixels_tab[0][9] = ff_put_h264_qpel16_mc12_mmi; - c->put_h264_qpel_pixels_tab[0][10] = ff_put_h264_qpel16_mc22_mmi; - c->put_h264_qpel_pixels_tab[0][11] = ff_put_h264_qpel16_mc32_mmi; - c->put_h264_qpel_pixels_tab[0][12] = ff_put_h264_qpel16_mc03_mmi; - c->put_h264_qpel_pixels_tab[0][13] = ff_put_h264_qpel16_mc13_mmi; - c->put_h264_qpel_pixels_tab[0][14] = ff_put_h264_qpel16_mc23_mmi; - c->put_h264_qpel_pixels_tab[0][15] = ff_put_h264_qpel16_mc33_mmi; + if (have_msa(cpu_flags)) { + if (bit_depth == 8) { + c->put_h264_qpel_pixels_tab[0][0] = ff_put_h264_qpel16_mc00_msa; + c->put_h264_qpel_pixels_tab[0][1] = ff_put_h264_qpel16_mc10_msa; + c->put_h264_qpel_pixels_tab[0][2] = ff_put_h264_qpel16_mc20_msa; + c->put_h264_qpel_pixels_tab[0][3] = ff_put_h264_qpel16_mc30_msa; + c->put_h264_qpel_pixels_tab[0][4] = ff_put_h264_qpel16_mc01_msa; + c->put_h264_qpel_pixels_tab[0][5] = ff_put_h264_qpel16_mc11_msa; + c->put_h264_qpel_pixels_tab[0][6] = ff_put_h264_qpel16_mc21_msa; + c->put_h264_qpel_pixels_tab[0][7] = ff_put_h264_qpel16_mc31_msa; + c->put_h264_qpel_pixels_tab[0][8] = ff_put_h264_qpel16_mc02_msa; + c->put_h264_qpel_pixels_tab[0][9] = ff_put_h264_qpel16_mc12_msa; + c->put_h264_qpel_pixels_tab[0][10] = ff_put_h264_qpel16_mc22_msa; + c->put_h264_qpel_pixels_tab[0][11] = ff_put_h264_qpel16_mc32_msa; + c->put_h264_qpel_pixels_tab[0][12] = ff_put_h264_qpel16_mc03_msa; + c->put_h264_qpel_pixels_tab[0][13] = ff_put_h264_qpel16_mc13_msa; + c->put_h264_qpel_pixels_tab[0][14] = ff_put_h264_qpel16_mc23_msa; + c->put_h264_qpel_pixels_tab[0][15] = ff_put_h264_qpel16_mc33_msa; - c->put_h264_qpel_pixels_tab[1][0] = ff_put_h264_qpel8_mc00_mmi; - c->put_h264_qpel_pixels_tab[1][1] = ff_put_h264_qpel8_mc10_mmi; - c->put_h264_qpel_pixels_tab[1][2] = ff_put_h264_qpel8_mc20_mmi; - c->put_h264_qpel_pixels_tab[1][3] = ff_put_h264_qpel8_mc30_mmi; - c->put_h264_qpel_pixels_tab[1][4] = ff_put_h264_qpel8_mc01_mmi; - c->put_h264_qpel_pixels_tab[1][5] = ff_put_h264_qpel8_mc11_mmi; - c->put_h264_qpel_pixels_tab[1][6] = ff_put_h264_qpel8_mc21_mmi; - c->put_h264_qpel_pixels_tab[1][7] = ff_put_h264_qpel8_mc31_mmi; - c->put_h264_qpel_pixels_tab[1][8] = ff_put_h264_qpel8_mc02_mmi; - c->put_h264_qpel_pixels_tab[1][9] = ff_put_h264_qpel8_mc12_mmi; - c->put_h264_qpel_pixels_tab[1][10] = ff_put_h264_qpel8_mc22_mmi; - c->put_h264_qpel_pixels_tab[1][11] = ff_put_h264_qpel8_mc32_mmi; - c->put_h264_qpel_pixels_tab[1][12] = ff_put_h264_qpel8_mc03_mmi; - c->put_h264_qpel_pixels_tab[1][13] = ff_put_h264_qpel8_mc13_mmi; - c->put_h264_qpel_pixels_tab[1][14] = ff_put_h264_qpel8_mc23_mmi; - c->put_h264_qpel_pixels_tab[1][15] = ff_put_h264_qpel8_mc33_mmi; + c->put_h264_qpel_pixels_tab[1][0] = ff_put_h264_qpel8_mc00_msa; + c->put_h264_qpel_pixels_tab[1][1] = ff_put_h264_qpel8_mc10_msa; + c->put_h264_qpel_pixels_tab[1][2] = ff_put_h264_qpel8_mc20_msa; + c->put_h264_qpel_pixels_tab[1][3] = ff_put_h264_qpel8_mc30_msa; + c->put_h264_qpel_pixels_tab[1][4] = ff_put_h264_qpel8_mc01_msa; + c->put_h264_qpel_pixels_tab[1][5] = ff_put_h264_qpel8_mc11_msa; + c->put_h264_qpel_pixels_tab[1][6] = ff_put_h264_qpel8_mc21_msa; + c->put_h264_qpel_pixels_tab[1][7] = ff_put_h264_qpel8_mc31_msa; + c->put_h264_qpel_pixels_tab[1][8] = ff_put_h264_qpel8_mc02_msa; + c->put_h264_qpel_pixels_tab[1][9] = ff_put_h264_qpel8_mc12_msa; + c->put_h264_qpel_pixels_tab[1][10] = ff_put_h264_qpel8_mc22_msa; + c->put_h264_qpel_pixels_tab[1][11] = ff_put_h264_qpel8_mc32_msa; + c->put_h264_qpel_pixels_tab[1][12] = ff_put_h264_qpel8_mc03_msa; + c->put_h264_qpel_pixels_tab[1][13] = ff_put_h264_qpel8_mc13_msa; + c->put_h264_qpel_pixels_tab[1][14] = ff_put_h264_qpel8_mc23_msa; + c->put_h264_qpel_pixels_tab[1][15] = ff_put_h264_qpel8_mc33_msa; - c->put_h264_qpel_pixels_tab[2][0] = ff_put_h264_qpel4_mc00_mmi; - c->put_h264_qpel_pixels_tab[2][1] = ff_put_h264_qpel4_mc10_mmi; - c->put_h264_qpel_pixels_tab[2][2] = ff_put_h264_qpel4_mc20_mmi; - c->put_h264_qpel_pixels_tab[2][3] = ff_put_h264_qpel4_mc30_mmi; - c->put_h264_qpel_pixels_tab[2][4] = ff_put_h264_qpel4_mc01_mmi; - c->put_h264_qpel_pixels_tab[2][5] = ff_put_h264_qpel4_mc11_mmi; - c->put_h264_qpel_pixels_tab[2][6] = ff_put_h264_qpel4_mc21_mmi; - c->put_h264_qpel_pixels_tab[2][7] = ff_put_h264_qpel4_mc31_mmi; - c->put_h264_qpel_pixels_tab[2][8] = ff_put_h264_qpel4_mc02_mmi; - c->put_h264_qpel_pixels_tab[2][9] = ff_put_h264_qpel4_mc12_mmi; - c->put_h264_qpel_pixels_tab[2][10] = ff_put_h264_qpel4_mc22_mmi; - c->put_h264_qpel_pixels_tab[2][11] = ff_put_h264_qpel4_mc32_mmi; - c->put_h264_qpel_pixels_tab[2][12] = ff_put_h264_qpel4_mc03_mmi; - c->put_h264_qpel_pixels_tab[2][13] = ff_put_h264_qpel4_mc13_mmi; - c->put_h264_qpel_pixels_tab[2][14] = ff_put_h264_qpel4_mc23_mmi; - c->put_h264_qpel_pixels_tab[2][15] = ff_put_h264_qpel4_mc33_mmi; + c->put_h264_qpel_pixels_tab[2][1] = ff_put_h264_qpel4_mc10_msa; + c->put_h264_qpel_pixels_tab[2][2] = ff_put_h264_qpel4_mc20_msa; + c->put_h264_qpel_pixels_tab[2][3] = ff_put_h264_qpel4_mc30_msa; + c->put_h264_qpel_pixels_tab[2][4] = ff_put_h264_qpel4_mc01_msa; + c->put_h264_qpel_pixels_tab[2][5] = ff_put_h264_qpel4_mc11_msa; + c->put_h264_qpel_pixels_tab[2][6] = ff_put_h264_qpel4_mc21_msa; + c->put_h264_qpel_pixels_tab[2][7] = ff_put_h264_qpel4_mc31_msa; + c->put_h264_qpel_pixels_tab[2][8] = ff_put_h264_qpel4_mc02_msa; + c->put_h264_qpel_pixels_tab[2][9] = ff_put_h264_qpel4_mc12_msa; + c->put_h264_qpel_pixels_tab[2][10] = ff_put_h264_qpel4_mc22_msa; + c->put_h264_qpel_pixels_tab[2][11] = ff_put_h264_qpel4_mc32_msa; + c->put_h264_qpel_pixels_tab[2][12] = ff_put_h264_qpel4_mc03_msa; + c->put_h264_qpel_pixels_tab[2][13] = ff_put_h264_qpel4_mc13_msa; + c->put_h264_qpel_pixels_tab[2][14] = ff_put_h264_qpel4_mc23_msa; + c->put_h264_qpel_pixels_tab[2][15] = ff_put_h264_qpel4_mc33_msa; - c->avg_h264_qpel_pixels_tab[0][0] = ff_avg_h264_qpel16_mc00_mmi; - c->avg_h264_qpel_pixels_tab[0][1] = ff_avg_h264_qpel16_mc10_mmi; - c->avg_h264_qpel_pixels_tab[0][2] = ff_avg_h264_qpel16_mc20_mmi; - c->avg_h264_qpel_pixels_tab[0][3] = ff_avg_h264_qpel16_mc30_mmi; - c->avg_h264_qpel_pixels_tab[0][4] = ff_avg_h264_qpel16_mc01_mmi; - c->avg_h264_qpel_pixels_tab[0][5] = ff_avg_h264_qpel16_mc11_mmi; - c->avg_h264_qpel_pixels_tab[0][6] = ff_avg_h264_qpel16_mc21_mmi; - c->avg_h264_qpel_pixels_tab[0][7] = ff_avg_h264_qpel16_mc31_mmi; - c->avg_h264_qpel_pixels_tab[0][8] = ff_avg_h264_qpel16_mc02_mmi; - c->avg_h264_qpel_pixels_tab[0][9] = ff_avg_h264_qpel16_mc12_mmi; - c->avg_h264_qpel_pixels_tab[0][10] = ff_avg_h264_qpel16_mc22_mmi; - c->avg_h264_qpel_pixels_tab[0][11] = ff_avg_h264_qpel16_mc32_mmi; - c->avg_h264_qpel_pixels_tab[0][12] = ff_avg_h264_qpel16_mc03_mmi; - c->avg_h264_qpel_pixels_tab[0][13] = ff_avg_h264_qpel16_mc13_mmi; - c->avg_h264_qpel_pixels_tab[0][14] = ff_avg_h264_qpel16_mc23_mmi; - c->avg_h264_qpel_pixels_tab[0][15] = ff_avg_h264_qpel16_mc33_mmi; + c->avg_h264_qpel_pixels_tab[0][0] = ff_avg_h264_qpel16_mc00_msa; + c->avg_h264_qpel_pixels_tab[0][1] = ff_avg_h264_qpel16_mc10_msa; + c->avg_h264_qpel_pixels_tab[0][2] = ff_avg_h264_qpel16_mc20_msa; + c->avg_h264_qpel_pixels_tab[0][3] = ff_avg_h264_qpel16_mc30_msa; + c->avg_h264_qpel_pixels_tab[0][4] = ff_avg_h264_qpel16_mc01_msa; + c->avg_h264_qpel_pixels_tab[0][5] = ff_avg_h264_qpel16_mc11_msa; + c->avg_h264_qpel_pixels_tab[0][6] = ff_avg_h264_qpel16_mc21_msa; + c->avg_h264_qpel_pixels_tab[0][7] = ff_avg_h264_qpel16_mc31_msa; + c->avg_h264_qpel_pixels_tab[0][8] = ff_avg_h264_qpel16_mc02_msa; + c->avg_h264_qpel_pixels_tab[0][9] = ff_avg_h264_qpel16_mc12_msa; + c->avg_h264_qpel_pixels_tab[0][10] = ff_avg_h264_qpel16_mc22_msa; + c->avg_h264_qpel_pixels_tab[0][11] = ff_avg_h264_qpel16_mc32_msa; + c->avg_h264_qpel_pixels_tab[0][12] = ff_avg_h264_qpel16_mc03_msa; + c->avg_h264_qpel_pixels_tab[0][13] = ff_avg_h264_qpel16_mc13_msa; + c->avg_h264_qpel_pixels_tab[0][14] = ff_avg_h264_qpel16_mc23_msa; + c->avg_h264_qpel_pixels_tab[0][15] = ff_avg_h264_qpel16_mc33_msa; - c->avg_h264_qpel_pixels_tab[1][0] = ff_avg_h264_qpel8_mc00_mmi; - c->avg_h264_qpel_pixels_tab[1][1] = ff_avg_h264_qpel8_mc10_mmi; - c->avg_h264_qpel_pixels_tab[1][2] = ff_avg_h264_qpel8_mc20_mmi; - c->avg_h264_qpel_pixels_tab[1][3] = ff_avg_h264_qpel8_mc30_mmi; - c->avg_h264_qpel_pixels_tab[1][4] = ff_avg_h264_qpel8_mc01_mmi; - c->avg_h264_qpel_pixels_tab[1][5] = ff_avg_h264_qpel8_mc11_mmi; - c->avg_h264_qpel_pixels_tab[1][6] = ff_avg_h264_qpel8_mc21_mmi; - c->avg_h264_qpel_pixels_tab[1][7] = ff_avg_h264_qpel8_mc31_mmi; - c->avg_h264_qpel_pixels_tab[1][8] = ff_avg_h264_qpel8_mc02_mmi; - c->avg_h264_qpel_pixels_tab[1][9] = ff_avg_h264_qpel8_mc12_mmi; - c->avg_h264_qpel_pixels_tab[1][10] = ff_avg_h264_qpel8_mc22_mmi; - c->avg_h264_qpel_pixels_tab[1][11] = ff_avg_h264_qpel8_mc32_mmi; - c->avg_h264_qpel_pixels_tab[1][12] = ff_avg_h264_qpel8_mc03_mmi; - c->avg_h264_qpel_pixels_tab[1][13] = ff_avg_h264_qpel8_mc13_mmi; - c->avg_h264_qpel_pixels_tab[1][14] = ff_avg_h264_qpel8_mc23_mmi; - c->avg_h264_qpel_pixels_tab[1][15] = ff_avg_h264_qpel8_mc33_mmi; + c->avg_h264_qpel_pixels_tab[1][0] = ff_avg_h264_qpel8_mc00_msa; + c->avg_h264_qpel_pixels_tab[1][1] = ff_avg_h264_qpel8_mc10_msa; + c->avg_h264_qpel_pixels_tab[1][2] = ff_avg_h264_qpel8_mc20_msa; + c->avg_h264_qpel_pixels_tab[1][3] = ff_avg_h264_qpel8_mc30_msa; + c->avg_h264_qpel_pixels_tab[1][4] = ff_avg_h264_qpel8_mc01_msa; + c->avg_h264_qpel_pixels_tab[1][5] = ff_avg_h264_qpel8_mc11_msa; + c->avg_h264_qpel_pixels_tab[1][6] = ff_avg_h264_qpel8_mc21_msa; + c->avg_h264_qpel_pixels_tab[1][7] = ff_avg_h264_qpel8_mc31_msa; + c->avg_h264_qpel_pixels_tab[1][8] = ff_avg_h264_qpel8_mc02_msa; + c->avg_h264_qpel_pixels_tab[1][9] = ff_avg_h264_qpel8_mc12_msa; + c->avg_h264_qpel_pixels_tab[1][10] = ff_avg_h264_qpel8_mc22_msa; + c->avg_h264_qpel_pixels_tab[1][11] = ff_avg_h264_qpel8_mc32_msa; + c->avg_h264_qpel_pixels_tab[1][12] = ff_avg_h264_qpel8_mc03_msa; + c->avg_h264_qpel_pixels_tab[1][13] = ff_avg_h264_qpel8_mc13_msa; + c->avg_h264_qpel_pixels_tab[1][14] = ff_avg_h264_qpel8_mc23_msa; + c->avg_h264_qpel_pixels_tab[1][15] = ff_avg_h264_qpel8_mc33_msa; - c->avg_h264_qpel_pixels_tab[2][0] = ff_avg_h264_qpel4_mc00_mmi; - c->avg_h264_qpel_pixels_tab[2][1] = ff_avg_h264_qpel4_mc10_mmi; - c->avg_h264_qpel_pixels_tab[2][2] = ff_avg_h264_qpel4_mc20_mmi; - c->avg_h264_qpel_pixels_tab[2][3] = ff_avg_h264_qpel4_mc30_mmi; - c->avg_h264_qpel_pixels_tab[2][4] = ff_avg_h264_qpel4_mc01_mmi; - c->avg_h264_qpel_pixels_tab[2][5] = ff_avg_h264_qpel4_mc11_mmi; - c->avg_h264_qpel_pixels_tab[2][6] = ff_avg_h264_qpel4_mc21_mmi; - c->avg_h264_qpel_pixels_tab[2][7] = ff_avg_h264_qpel4_mc31_mmi; - c->avg_h264_qpel_pixels_tab[2][8] = ff_avg_h264_qpel4_mc02_mmi; - c->avg_h264_qpel_pixels_tab[2][9] = ff_avg_h264_qpel4_mc12_mmi; - c->avg_h264_qpel_pixels_tab[2][10] = ff_avg_h264_qpel4_mc22_mmi; - c->avg_h264_qpel_pixels_tab[2][11] = ff_avg_h264_qpel4_mc32_mmi; - c->avg_h264_qpel_pixels_tab[2][12] = ff_avg_h264_qpel4_mc03_mmi; - c->avg_h264_qpel_pixels_tab[2][13] = ff_avg_h264_qpel4_mc13_mmi; - c->avg_h264_qpel_pixels_tab[2][14] = ff_avg_h264_qpel4_mc23_mmi; - c->avg_h264_qpel_pixels_tab[2][15] = ff_avg_h264_qpel4_mc33_mmi; + c->avg_h264_qpel_pixels_tab[2][0] = ff_avg_h264_qpel4_mc00_msa; + c->avg_h264_qpel_pixels_tab[2][1] = ff_avg_h264_qpel4_mc10_msa; + c->avg_h264_qpel_pixels_tab[2][2] = ff_avg_h264_qpel4_mc20_msa; + c->avg_h264_qpel_pixels_tab[2][3] = ff_avg_h264_qpel4_mc30_msa; + c->avg_h264_qpel_pixels_tab[2][4] = ff_avg_h264_qpel4_mc01_msa; + c->avg_h264_qpel_pixels_tab[2][5] = ff_avg_h264_qpel4_mc11_msa; + c->avg_h264_qpel_pixels_tab[2][6] = ff_avg_h264_qpel4_mc21_msa; + c->avg_h264_qpel_pixels_tab[2][7] = ff_avg_h264_qpel4_mc31_msa; + c->avg_h264_qpel_pixels_tab[2][8] = ff_avg_h264_qpel4_mc02_msa; + c->avg_h264_qpel_pixels_tab[2][9] = ff_avg_h264_qpel4_mc12_msa; + c->avg_h264_qpel_pixels_tab[2][10] = ff_avg_h264_qpel4_mc22_msa; + c->avg_h264_qpel_pixels_tab[2][11] = ff_avg_h264_qpel4_mc32_msa; + c->avg_h264_qpel_pixels_tab[2][12] = ff_avg_h264_qpel4_mc03_msa; + c->avg_h264_qpel_pixels_tab[2][13] = ff_avg_h264_qpel4_mc13_msa; + c->avg_h264_qpel_pixels_tab[2][14] = ff_avg_h264_qpel4_mc23_msa; + c->avg_h264_qpel_pixels_tab[2][15] = ff_avg_h264_qpel4_mc33_msa; + } } } -#endif /* HAVE_MMI */ - -av_cold void ff_h264qpel_init_mips(H264QpelContext *c, int bit_depth) -{ -#if HAVE_MMI - h264qpel_init_mmi(c, bit_depth); -#endif /* HAVE_MMI */ -#if HAVE_MSA - h264qpel_init_msa(c, bit_depth); -#endif // #if HAVE_MSA -} diff --git a/libavcodec/mips/hevcdsp_init_mips.c b/libavcodec/mips/hevcdsp_init_mips.c index 88337f462e..eb261e5adf 100644 --- a/libavcodec/mips/hevcdsp_init_mips.c +++ b/libavcodec/mips/hevcdsp_init_mips.c @@ -18,512 +18,500 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "libavcodec/mips/hevcdsp_mips.h" -#if HAVE_MMI -static av_cold void hevc_dsp_init_mmi(HEVCDSPContext *c, - const int bit_depth) +void ff_hevc_dsp_init_mips(HEVCDSPContext *c, const int bit_depth) { - if (8 == bit_depth) { - c->put_hevc_qpel[1][0][1] = ff_hevc_put_hevc_qpel_h4_8_mmi; - c->put_hevc_qpel[3][0][1] = ff_hevc_put_hevc_qpel_h8_8_mmi; - c->put_hevc_qpel[4][0][1] = ff_hevc_put_hevc_qpel_h12_8_mmi; - c->put_hevc_qpel[5][0][1] = ff_hevc_put_hevc_qpel_h16_8_mmi; - c->put_hevc_qpel[6][0][1] = ff_hevc_put_hevc_qpel_h24_8_mmi; - c->put_hevc_qpel[7][0][1] = ff_hevc_put_hevc_qpel_h32_8_mmi; - c->put_hevc_qpel[8][0][1] = ff_hevc_put_hevc_qpel_h48_8_mmi; - c->put_hevc_qpel[9][0][1] = ff_hevc_put_hevc_qpel_h64_8_mmi; - - c->put_hevc_qpel[1][1][1] = ff_hevc_put_hevc_qpel_hv4_8_mmi; - c->put_hevc_qpel[3][1][1] = ff_hevc_put_hevc_qpel_hv8_8_mmi; - c->put_hevc_qpel[4][1][1] = ff_hevc_put_hevc_qpel_hv12_8_mmi; - c->put_hevc_qpel[5][1][1] = ff_hevc_put_hevc_qpel_hv16_8_mmi; - c->put_hevc_qpel[6][1][1] = ff_hevc_put_hevc_qpel_hv24_8_mmi; - c->put_hevc_qpel[7][1][1] = ff_hevc_put_hevc_qpel_hv32_8_mmi; - c->put_hevc_qpel[8][1][1] = ff_hevc_put_hevc_qpel_hv48_8_mmi; - c->put_hevc_qpel[9][1][1] = ff_hevc_put_hevc_qpel_hv64_8_mmi; - - c->put_hevc_qpel_bi[1][0][1] = ff_hevc_put_hevc_qpel_bi_h4_8_mmi; - c->put_hevc_qpel_bi[3][0][1] = ff_hevc_put_hevc_qpel_bi_h8_8_mmi; - c->put_hevc_qpel_bi[4][0][1] = ff_hevc_put_hevc_qpel_bi_h12_8_mmi; - c->put_hevc_qpel_bi[5][0][1] = ff_hevc_put_hevc_qpel_bi_h16_8_mmi; - c->put_hevc_qpel_bi[6][0][1] = ff_hevc_put_hevc_qpel_bi_h24_8_mmi; - c->put_hevc_qpel_bi[7][0][1] = ff_hevc_put_hevc_qpel_bi_h32_8_mmi; - c->put_hevc_qpel_bi[8][0][1] = ff_hevc_put_hevc_qpel_bi_h48_8_mmi; - c->put_hevc_qpel_bi[9][0][1] = ff_hevc_put_hevc_qpel_bi_h64_8_mmi; - - c->put_hevc_qpel_bi[1][1][1] = ff_hevc_put_hevc_qpel_bi_hv4_8_mmi; - c->put_hevc_qpel_bi[3][1][1] = ff_hevc_put_hevc_qpel_bi_hv8_8_mmi; - c->put_hevc_qpel_bi[4][1][1] = ff_hevc_put_hevc_qpel_bi_hv12_8_mmi; - c->put_hevc_qpel_bi[5][1][1] = ff_hevc_put_hevc_qpel_bi_hv16_8_mmi; - c->put_hevc_qpel_bi[6][1][1] = ff_hevc_put_hevc_qpel_bi_hv24_8_mmi; - c->put_hevc_qpel_bi[7][1][1] = ff_hevc_put_hevc_qpel_bi_hv32_8_mmi; - c->put_hevc_qpel_bi[8][1][1] = ff_hevc_put_hevc_qpel_bi_hv48_8_mmi; - c->put_hevc_qpel_bi[9][1][1] = ff_hevc_put_hevc_qpel_bi_hv64_8_mmi; - - c->put_hevc_qpel_bi[3][0][0] = ff_hevc_put_hevc_pel_bi_pixels8_8_mmi; - c->put_hevc_qpel_bi[5][0][0] = ff_hevc_put_hevc_pel_bi_pixels16_8_mmi; - c->put_hevc_qpel_bi[6][0][0] = ff_hevc_put_hevc_pel_bi_pixels24_8_mmi; - c->put_hevc_qpel_bi[7][0][0] = ff_hevc_put_hevc_pel_bi_pixels32_8_mmi; - c->put_hevc_qpel_bi[8][0][0] = ff_hevc_put_hevc_pel_bi_pixels48_8_mmi; - c->put_hevc_qpel_bi[9][0][0] = ff_hevc_put_hevc_pel_bi_pixels64_8_mmi; - - c->put_hevc_epel_bi[3][0][0] = ff_hevc_put_hevc_pel_bi_pixels8_8_mmi; - c->put_hevc_epel_bi[5][0][0] = ff_hevc_put_hevc_pel_bi_pixels16_8_mmi; - c->put_hevc_epel_bi[6][0][0] = ff_hevc_put_hevc_pel_bi_pixels24_8_mmi; - c->put_hevc_epel_bi[7][0][0] = ff_hevc_put_hevc_pel_bi_pixels32_8_mmi; - - c->put_hevc_epel_bi[1][1][1] = ff_hevc_put_hevc_epel_bi_hv4_8_mmi; - c->put_hevc_epel_bi[3][1][1] = ff_hevc_put_hevc_epel_bi_hv8_8_mmi; - c->put_hevc_epel_bi[4][1][1] = ff_hevc_put_hevc_epel_bi_hv12_8_mmi; - c->put_hevc_epel_bi[5][1][1] = ff_hevc_put_hevc_epel_bi_hv16_8_mmi; - c->put_hevc_epel_bi[6][1][1] = ff_hevc_put_hevc_epel_bi_hv24_8_mmi; - c->put_hevc_epel_bi[7][1][1] = ff_hevc_put_hevc_epel_bi_hv32_8_mmi; - - c->put_hevc_qpel_uni[1][1][1] = ff_hevc_put_hevc_qpel_uni_hv4_8_mmi; - c->put_hevc_qpel_uni[3][1][1] = ff_hevc_put_hevc_qpel_uni_hv8_8_mmi; - c->put_hevc_qpel_uni[4][1][1] = ff_hevc_put_hevc_qpel_uni_hv12_8_mmi; - c->put_hevc_qpel_uni[5][1][1] = ff_hevc_put_hevc_qpel_uni_hv16_8_mmi; - c->put_hevc_qpel_uni[6][1][1] = ff_hevc_put_hevc_qpel_uni_hv24_8_mmi; - c->put_hevc_qpel_uni[7][1][1] = ff_hevc_put_hevc_qpel_uni_hv32_8_mmi; - c->put_hevc_qpel_uni[8][1][1] = ff_hevc_put_hevc_qpel_uni_hv48_8_mmi; - c->put_hevc_qpel_uni[9][1][1] = ff_hevc_put_hevc_qpel_uni_hv64_8_mmi; + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + if (bit_depth == 8) { + c->put_hevc_qpel[1][0][1] = ff_hevc_put_hevc_qpel_h4_8_mmi; + c->put_hevc_qpel[3][0][1] = ff_hevc_put_hevc_qpel_h8_8_mmi; + c->put_hevc_qpel[4][0][1] = ff_hevc_put_hevc_qpel_h12_8_mmi; + c->put_hevc_qpel[5][0][1] = ff_hevc_put_hevc_qpel_h16_8_mmi; + c->put_hevc_qpel[6][0][1] = ff_hevc_put_hevc_qpel_h24_8_mmi; + c->put_hevc_qpel[7][0][1] = ff_hevc_put_hevc_qpel_h32_8_mmi; + c->put_hevc_qpel[8][0][1] = ff_hevc_put_hevc_qpel_h48_8_mmi; + c->put_hevc_qpel[9][0][1] = ff_hevc_put_hevc_qpel_h64_8_mmi; + + c->put_hevc_qpel[1][1][1] = ff_hevc_put_hevc_qpel_hv4_8_mmi; + c->put_hevc_qpel[3][1][1] = ff_hevc_put_hevc_qpel_hv8_8_mmi; + c->put_hevc_qpel[4][1][1] = ff_hevc_put_hevc_qpel_hv12_8_mmi; + c->put_hevc_qpel[5][1][1] = ff_hevc_put_hevc_qpel_hv16_8_mmi; + c->put_hevc_qpel[6][1][1] = ff_hevc_put_hevc_qpel_hv24_8_mmi; + c->put_hevc_qpel[7][1][1] = ff_hevc_put_hevc_qpel_hv32_8_mmi; + c->put_hevc_qpel[8][1][1] = ff_hevc_put_hevc_qpel_hv48_8_mmi; + c->put_hevc_qpel[9][1][1] = ff_hevc_put_hevc_qpel_hv64_8_mmi; + + c->put_hevc_qpel_bi[1][0][1] = ff_hevc_put_hevc_qpel_bi_h4_8_mmi; + c->put_hevc_qpel_bi[3][0][1] = ff_hevc_put_hevc_qpel_bi_h8_8_mmi; + c->put_hevc_qpel_bi[4][0][1] = ff_hevc_put_hevc_qpel_bi_h12_8_mmi; + c->put_hevc_qpel_bi[5][0][1] = ff_hevc_put_hevc_qpel_bi_h16_8_mmi; + c->put_hevc_qpel_bi[6][0][1] = ff_hevc_put_hevc_qpel_bi_h24_8_mmi; + c->put_hevc_qpel_bi[7][0][1] = ff_hevc_put_hevc_qpel_bi_h32_8_mmi; + c->put_hevc_qpel_bi[8][0][1] = ff_hevc_put_hevc_qpel_bi_h48_8_mmi; + c->put_hevc_qpel_bi[9][0][1] = ff_hevc_put_hevc_qpel_bi_h64_8_mmi; + + c->put_hevc_qpel_bi[1][1][1] = ff_hevc_put_hevc_qpel_bi_hv4_8_mmi; + c->put_hevc_qpel_bi[3][1][1] = ff_hevc_put_hevc_qpel_bi_hv8_8_mmi; + c->put_hevc_qpel_bi[4][1][1] = ff_hevc_put_hevc_qpel_bi_hv12_8_mmi; + c->put_hevc_qpel_bi[5][1][1] = ff_hevc_put_hevc_qpel_bi_hv16_8_mmi; + c->put_hevc_qpel_bi[6][1][1] = ff_hevc_put_hevc_qpel_bi_hv24_8_mmi; + c->put_hevc_qpel_bi[7][1][1] = ff_hevc_put_hevc_qpel_bi_hv32_8_mmi; + c->put_hevc_qpel_bi[8][1][1] = ff_hevc_put_hevc_qpel_bi_hv48_8_mmi; + c->put_hevc_qpel_bi[9][1][1] = ff_hevc_put_hevc_qpel_bi_hv64_8_mmi; + + c->put_hevc_qpel_bi[3][0][0] = ff_hevc_put_hevc_pel_bi_pixels8_8_mmi; + c->put_hevc_qpel_bi[5][0][0] = ff_hevc_put_hevc_pel_bi_pixels16_8_mmi; + c->put_hevc_qpel_bi[6][0][0] = ff_hevc_put_hevc_pel_bi_pixels24_8_mmi; + c->put_hevc_qpel_bi[7][0][0] = ff_hevc_put_hevc_pel_bi_pixels32_8_mmi; + c->put_hevc_qpel_bi[8][0][0] = ff_hevc_put_hevc_pel_bi_pixels48_8_mmi; + c->put_hevc_qpel_bi[9][0][0] = ff_hevc_put_hevc_pel_bi_pixels64_8_mmi; + + c->put_hevc_epel_bi[3][0][0] = ff_hevc_put_hevc_pel_bi_pixels8_8_mmi; + c->put_hevc_epel_bi[5][0][0] = ff_hevc_put_hevc_pel_bi_pixels16_8_mmi; + c->put_hevc_epel_bi[6][0][0] = ff_hevc_put_hevc_pel_bi_pixels24_8_mmi; + c->put_hevc_epel_bi[7][0][0] = ff_hevc_put_hevc_pel_bi_pixels32_8_mmi; + + c->put_hevc_epel_bi[1][1][1] = ff_hevc_put_hevc_epel_bi_hv4_8_mmi; + c->put_hevc_epel_bi[3][1][1] = ff_hevc_put_hevc_epel_bi_hv8_8_mmi; + c->put_hevc_epel_bi[4][1][1] = ff_hevc_put_hevc_epel_bi_hv12_8_mmi; + c->put_hevc_epel_bi[5][1][1] = ff_hevc_put_hevc_epel_bi_hv16_8_mmi; + c->put_hevc_epel_bi[6][1][1] = ff_hevc_put_hevc_epel_bi_hv24_8_mmi; + c->put_hevc_epel_bi[7][1][1] = ff_hevc_put_hevc_epel_bi_hv32_8_mmi; + + c->put_hevc_qpel_uni[1][1][1] = ff_hevc_put_hevc_qpel_uni_hv4_8_mmi; + c->put_hevc_qpel_uni[3][1][1] = ff_hevc_put_hevc_qpel_uni_hv8_8_mmi; + c->put_hevc_qpel_uni[4][1][1] = ff_hevc_put_hevc_qpel_uni_hv12_8_mmi; + c->put_hevc_qpel_uni[5][1][1] = ff_hevc_put_hevc_qpel_uni_hv16_8_mmi; + c->put_hevc_qpel_uni[6][1][1] = ff_hevc_put_hevc_qpel_uni_hv24_8_mmi; + c->put_hevc_qpel_uni[7][1][1] = ff_hevc_put_hevc_qpel_uni_hv32_8_mmi; + c->put_hevc_qpel_uni[8][1][1] = ff_hevc_put_hevc_qpel_uni_hv48_8_mmi; + c->put_hevc_qpel_uni[9][1][1] = ff_hevc_put_hevc_qpel_uni_hv64_8_mmi; + } } -} -#endif // #if HAVE_MMI -#if HAVE_MSA -static av_cold void hevc_dsp_init_msa(HEVCDSPContext *c, - const int bit_depth) -{ - if (8 == bit_depth) { - c->put_hevc_qpel[1][0][0] = ff_hevc_put_hevc_pel_pixels4_8_msa; - c->put_hevc_qpel[2][0][0] = ff_hevc_put_hevc_pel_pixels6_8_msa; - c->put_hevc_qpel[3][0][0] = ff_hevc_put_hevc_pel_pixels8_8_msa; - c->put_hevc_qpel[4][0][0] = ff_hevc_put_hevc_pel_pixels12_8_msa; - c->put_hevc_qpel[5][0][0] = ff_hevc_put_hevc_pel_pixels16_8_msa; - c->put_hevc_qpel[6][0][0] = ff_hevc_put_hevc_pel_pixels24_8_msa; - c->put_hevc_qpel[7][0][0] = ff_hevc_put_hevc_pel_pixels32_8_msa; - c->put_hevc_qpel[8][0][0] = ff_hevc_put_hevc_pel_pixels48_8_msa; - c->put_hevc_qpel[9][0][0] = ff_hevc_put_hevc_pel_pixels64_8_msa; - - c->put_hevc_qpel[1][0][1] = ff_hevc_put_hevc_qpel_h4_8_msa; - c->put_hevc_qpel[3][0][1] = ff_hevc_put_hevc_qpel_h8_8_msa; - c->put_hevc_qpel[4][0][1] = ff_hevc_put_hevc_qpel_h12_8_msa; - c->put_hevc_qpel[5][0][1] = ff_hevc_put_hevc_qpel_h16_8_msa; - c->put_hevc_qpel[6][0][1] = ff_hevc_put_hevc_qpel_h24_8_msa; - c->put_hevc_qpel[7][0][1] = ff_hevc_put_hevc_qpel_h32_8_msa; - c->put_hevc_qpel[8][0][1] = ff_hevc_put_hevc_qpel_h48_8_msa; - c->put_hevc_qpel[9][0][1] = ff_hevc_put_hevc_qpel_h64_8_msa; - - c->put_hevc_qpel[1][1][0] = ff_hevc_put_hevc_qpel_v4_8_msa; - c->put_hevc_qpel[3][1][0] = ff_hevc_put_hevc_qpel_v8_8_msa; - c->put_hevc_qpel[4][1][0] = ff_hevc_put_hevc_qpel_v12_8_msa; - c->put_hevc_qpel[5][1][0] = ff_hevc_put_hevc_qpel_v16_8_msa; - c->put_hevc_qpel[6][1][0] = ff_hevc_put_hevc_qpel_v24_8_msa; - c->put_hevc_qpel[7][1][0] = ff_hevc_put_hevc_qpel_v32_8_msa; - c->put_hevc_qpel[8][1][0] = ff_hevc_put_hevc_qpel_v48_8_msa; - c->put_hevc_qpel[9][1][0] = ff_hevc_put_hevc_qpel_v64_8_msa; - - c->put_hevc_qpel[1][1][1] = ff_hevc_put_hevc_qpel_hv4_8_msa; - c->put_hevc_qpel[3][1][1] = ff_hevc_put_hevc_qpel_hv8_8_msa; - c->put_hevc_qpel[4][1][1] = ff_hevc_put_hevc_qpel_hv12_8_msa; - c->put_hevc_qpel[5][1][1] = ff_hevc_put_hevc_qpel_hv16_8_msa; - c->put_hevc_qpel[6][1][1] = ff_hevc_put_hevc_qpel_hv24_8_msa; - c->put_hevc_qpel[7][1][1] = ff_hevc_put_hevc_qpel_hv32_8_msa; - c->put_hevc_qpel[8][1][1] = ff_hevc_put_hevc_qpel_hv48_8_msa; - c->put_hevc_qpel[9][1][1] = ff_hevc_put_hevc_qpel_hv64_8_msa; - - c->put_hevc_epel[1][0][0] = ff_hevc_put_hevc_pel_pixels4_8_msa; - c->put_hevc_epel[2][0][0] = ff_hevc_put_hevc_pel_pixels6_8_msa; - c->put_hevc_epel[3][0][0] = ff_hevc_put_hevc_pel_pixels8_8_msa; - c->put_hevc_epel[4][0][0] = ff_hevc_put_hevc_pel_pixels12_8_msa; - c->put_hevc_epel[5][0][0] = ff_hevc_put_hevc_pel_pixels16_8_msa; - c->put_hevc_epel[6][0][0] = ff_hevc_put_hevc_pel_pixels24_8_msa; - c->put_hevc_epel[7][0][0] = ff_hevc_put_hevc_pel_pixels32_8_msa; - - c->put_hevc_epel[1][0][1] = ff_hevc_put_hevc_epel_h4_8_msa; - c->put_hevc_epel[2][0][1] = ff_hevc_put_hevc_epel_h6_8_msa; - c->put_hevc_epel[3][0][1] = ff_hevc_put_hevc_epel_h8_8_msa; - c->put_hevc_epel[4][0][1] = ff_hevc_put_hevc_epel_h12_8_msa; - c->put_hevc_epel[5][0][1] = ff_hevc_put_hevc_epel_h16_8_msa; - c->put_hevc_epel[6][0][1] = ff_hevc_put_hevc_epel_h24_8_msa; - c->put_hevc_epel[7][0][1] = ff_hevc_put_hevc_epel_h32_8_msa; - - c->put_hevc_epel[1][1][0] = ff_hevc_put_hevc_epel_v4_8_msa; - c->put_hevc_epel[2][1][0] = ff_hevc_put_hevc_epel_v6_8_msa; - c->put_hevc_epel[3][1][0] = ff_hevc_put_hevc_epel_v8_8_msa; - c->put_hevc_epel[4][1][0] = ff_hevc_put_hevc_epel_v12_8_msa; - c->put_hevc_epel[5][1][0] = ff_hevc_put_hevc_epel_v16_8_msa; - c->put_hevc_epel[6][1][0] = ff_hevc_put_hevc_epel_v24_8_msa; - c->put_hevc_epel[7][1][0] = ff_hevc_put_hevc_epel_v32_8_msa; - - c->put_hevc_epel[1][1][1] = ff_hevc_put_hevc_epel_hv4_8_msa; - c->put_hevc_epel[2][1][1] = ff_hevc_put_hevc_epel_hv6_8_msa; - c->put_hevc_epel[3][1][1] = ff_hevc_put_hevc_epel_hv8_8_msa; - c->put_hevc_epel[4][1][1] = ff_hevc_put_hevc_epel_hv12_8_msa; - c->put_hevc_epel[5][1][1] = ff_hevc_put_hevc_epel_hv16_8_msa; - c->put_hevc_epel[6][1][1] = ff_hevc_put_hevc_epel_hv24_8_msa; - c->put_hevc_epel[7][1][1] = ff_hevc_put_hevc_epel_hv32_8_msa; - - c->put_hevc_qpel_uni[3][0][0] = ff_hevc_put_hevc_uni_pel_pixels8_8_msa; - c->put_hevc_qpel_uni[4][0][0] = ff_hevc_put_hevc_uni_pel_pixels12_8_msa; - c->put_hevc_qpel_uni[5][0][0] = ff_hevc_put_hevc_uni_pel_pixels16_8_msa; - c->put_hevc_qpel_uni[6][0][0] = ff_hevc_put_hevc_uni_pel_pixels24_8_msa; - c->put_hevc_qpel_uni[7][0][0] = ff_hevc_put_hevc_uni_pel_pixels32_8_msa; - c->put_hevc_qpel_uni[8][0][0] = ff_hevc_put_hevc_uni_pel_pixels48_8_msa; - c->put_hevc_qpel_uni[9][0][0] = ff_hevc_put_hevc_uni_pel_pixels64_8_msa; - - c->put_hevc_qpel_uni[1][0][1] = ff_hevc_put_hevc_uni_qpel_h4_8_msa; - c->put_hevc_qpel_uni[3][0][1] = ff_hevc_put_hevc_uni_qpel_h8_8_msa; - c->put_hevc_qpel_uni[4][0][1] = ff_hevc_put_hevc_uni_qpel_h12_8_msa; - c->put_hevc_qpel_uni[5][0][1] = ff_hevc_put_hevc_uni_qpel_h16_8_msa; - c->put_hevc_qpel_uni[6][0][1] = ff_hevc_put_hevc_uni_qpel_h24_8_msa; - c->put_hevc_qpel_uni[7][0][1] = ff_hevc_put_hevc_uni_qpel_h32_8_msa; - c->put_hevc_qpel_uni[8][0][1] = ff_hevc_put_hevc_uni_qpel_h48_8_msa; - c->put_hevc_qpel_uni[9][0][1] = ff_hevc_put_hevc_uni_qpel_h64_8_msa; - - c->put_hevc_qpel_uni[1][1][0] = ff_hevc_put_hevc_uni_qpel_v4_8_msa; - c->put_hevc_qpel_uni[3][1][0] = ff_hevc_put_hevc_uni_qpel_v8_8_msa; - c->put_hevc_qpel_uni[4][1][0] = ff_hevc_put_hevc_uni_qpel_v12_8_msa; - c->put_hevc_qpel_uni[5][1][0] = ff_hevc_put_hevc_uni_qpel_v16_8_msa; - c->put_hevc_qpel_uni[6][1][0] = ff_hevc_put_hevc_uni_qpel_v24_8_msa; - c->put_hevc_qpel_uni[7][1][0] = ff_hevc_put_hevc_uni_qpel_v32_8_msa; - c->put_hevc_qpel_uni[8][1][0] = ff_hevc_put_hevc_uni_qpel_v48_8_msa; - c->put_hevc_qpel_uni[9][1][0] = ff_hevc_put_hevc_uni_qpel_v64_8_msa; - - c->put_hevc_qpel_uni[1][1][1] = ff_hevc_put_hevc_uni_qpel_hv4_8_msa; - c->put_hevc_qpel_uni[3][1][1] = ff_hevc_put_hevc_uni_qpel_hv8_8_msa; - c->put_hevc_qpel_uni[4][1][1] = ff_hevc_put_hevc_uni_qpel_hv12_8_msa; - c->put_hevc_qpel_uni[5][1][1] = ff_hevc_put_hevc_uni_qpel_hv16_8_msa; - c->put_hevc_qpel_uni[6][1][1] = ff_hevc_put_hevc_uni_qpel_hv24_8_msa; - c->put_hevc_qpel_uni[7][1][1] = ff_hevc_put_hevc_uni_qpel_hv32_8_msa; - c->put_hevc_qpel_uni[8][1][1] = ff_hevc_put_hevc_uni_qpel_hv48_8_msa; - c->put_hevc_qpel_uni[9][1][1] = ff_hevc_put_hevc_uni_qpel_hv64_8_msa; - - c->put_hevc_epel_uni[3][0][0] = ff_hevc_put_hevc_uni_pel_pixels8_8_msa; - c->put_hevc_epel_uni[4][0][0] = ff_hevc_put_hevc_uni_pel_pixels12_8_msa; - c->put_hevc_epel_uni[5][0][0] = ff_hevc_put_hevc_uni_pel_pixels16_8_msa; - c->put_hevc_epel_uni[6][0][0] = ff_hevc_put_hevc_uni_pel_pixels24_8_msa; - c->put_hevc_epel_uni[7][0][0] = ff_hevc_put_hevc_uni_pel_pixels32_8_msa; - - c->put_hevc_epel_uni[1][0][1] = ff_hevc_put_hevc_uni_epel_h4_8_msa; - c->put_hevc_epel_uni[2][0][1] = ff_hevc_put_hevc_uni_epel_h6_8_msa; - c->put_hevc_epel_uni[3][0][1] = ff_hevc_put_hevc_uni_epel_h8_8_msa; - c->put_hevc_epel_uni[4][0][1] = ff_hevc_put_hevc_uni_epel_h12_8_msa; - c->put_hevc_epel_uni[5][0][1] = ff_hevc_put_hevc_uni_epel_h16_8_msa; - c->put_hevc_epel_uni[6][0][1] = ff_hevc_put_hevc_uni_epel_h24_8_msa; - c->put_hevc_epel_uni[7][0][1] = ff_hevc_put_hevc_uni_epel_h32_8_msa; - - c->put_hevc_epel_uni[1][1][0] = ff_hevc_put_hevc_uni_epel_v4_8_msa; - c->put_hevc_epel_uni[2][1][0] = ff_hevc_put_hevc_uni_epel_v6_8_msa; - c->put_hevc_epel_uni[3][1][0] = ff_hevc_put_hevc_uni_epel_v8_8_msa; - c->put_hevc_epel_uni[4][1][0] = ff_hevc_put_hevc_uni_epel_v12_8_msa; - c->put_hevc_epel_uni[5][1][0] = ff_hevc_put_hevc_uni_epel_v16_8_msa; - c->put_hevc_epel_uni[6][1][0] = ff_hevc_put_hevc_uni_epel_v24_8_msa; - c->put_hevc_epel_uni[7][1][0] = ff_hevc_put_hevc_uni_epel_v32_8_msa; - - c->put_hevc_epel_uni[1][1][1] = ff_hevc_put_hevc_uni_epel_hv4_8_msa; - c->put_hevc_epel_uni[2][1][1] = ff_hevc_put_hevc_uni_epel_hv6_8_msa; - c->put_hevc_epel_uni[3][1][1] = ff_hevc_put_hevc_uni_epel_hv8_8_msa; - c->put_hevc_epel_uni[4][1][1] = ff_hevc_put_hevc_uni_epel_hv12_8_msa; - c->put_hevc_epel_uni[5][1][1] = ff_hevc_put_hevc_uni_epel_hv16_8_msa; - c->put_hevc_epel_uni[6][1][1] = ff_hevc_put_hevc_uni_epel_hv24_8_msa; - c->put_hevc_epel_uni[7][1][1] = ff_hevc_put_hevc_uni_epel_hv32_8_msa; - - c->put_hevc_qpel_uni_w[1][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels4_8_msa; - c->put_hevc_qpel_uni_w[3][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels8_8_msa; - c->put_hevc_qpel_uni_w[4][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels12_8_msa; - c->put_hevc_qpel_uni_w[5][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels16_8_msa; - c->put_hevc_qpel_uni_w[6][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels24_8_msa; - c->put_hevc_qpel_uni_w[7][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels32_8_msa; - c->put_hevc_qpel_uni_w[8][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels48_8_msa; - c->put_hevc_qpel_uni_w[9][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels64_8_msa; - - c->put_hevc_qpel_uni_w[1][0][1] = ff_hevc_put_hevc_uni_w_qpel_h4_8_msa; - c->put_hevc_qpel_uni_w[3][0][1] = ff_hevc_put_hevc_uni_w_qpel_h8_8_msa; - c->put_hevc_qpel_uni_w[4][0][1] = ff_hevc_put_hevc_uni_w_qpel_h12_8_msa; - c->put_hevc_qpel_uni_w[5][0][1] = ff_hevc_put_hevc_uni_w_qpel_h16_8_msa; - c->put_hevc_qpel_uni_w[6][0][1] = ff_hevc_put_hevc_uni_w_qpel_h24_8_msa; - c->put_hevc_qpel_uni_w[7][0][1] = ff_hevc_put_hevc_uni_w_qpel_h32_8_msa; - c->put_hevc_qpel_uni_w[8][0][1] = ff_hevc_put_hevc_uni_w_qpel_h48_8_msa; - c->put_hevc_qpel_uni_w[9][0][1] = ff_hevc_put_hevc_uni_w_qpel_h64_8_msa; - - c->put_hevc_qpel_uni_w[1][1][0] = ff_hevc_put_hevc_uni_w_qpel_v4_8_msa; - c->put_hevc_qpel_uni_w[3][1][0] = ff_hevc_put_hevc_uni_w_qpel_v8_8_msa; - c->put_hevc_qpel_uni_w[4][1][0] = ff_hevc_put_hevc_uni_w_qpel_v12_8_msa; - c->put_hevc_qpel_uni_w[5][1][0] = ff_hevc_put_hevc_uni_w_qpel_v16_8_msa; - c->put_hevc_qpel_uni_w[6][1][0] = ff_hevc_put_hevc_uni_w_qpel_v24_8_msa; - c->put_hevc_qpel_uni_w[7][1][0] = ff_hevc_put_hevc_uni_w_qpel_v32_8_msa; - c->put_hevc_qpel_uni_w[8][1][0] = ff_hevc_put_hevc_uni_w_qpel_v48_8_msa; - c->put_hevc_qpel_uni_w[9][1][0] = ff_hevc_put_hevc_uni_w_qpel_v64_8_msa; - - c->put_hevc_qpel_uni_w[1][1][1] = ff_hevc_put_hevc_uni_w_qpel_hv4_8_msa; - c->put_hevc_qpel_uni_w[3][1][1] = ff_hevc_put_hevc_uni_w_qpel_hv8_8_msa; - c->put_hevc_qpel_uni_w[4][1][1] = - ff_hevc_put_hevc_uni_w_qpel_hv12_8_msa; - c->put_hevc_qpel_uni_w[5][1][1] = - ff_hevc_put_hevc_uni_w_qpel_hv16_8_msa; - c->put_hevc_qpel_uni_w[6][1][1] = - ff_hevc_put_hevc_uni_w_qpel_hv24_8_msa; - c->put_hevc_qpel_uni_w[7][1][1] = - ff_hevc_put_hevc_uni_w_qpel_hv32_8_msa; - c->put_hevc_qpel_uni_w[8][1][1] = - ff_hevc_put_hevc_uni_w_qpel_hv48_8_msa; - c->put_hevc_qpel_uni_w[9][1][1] = - ff_hevc_put_hevc_uni_w_qpel_hv64_8_msa; - - c->put_hevc_epel_uni_w[1][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels4_8_msa; - c->put_hevc_epel_uni_w[2][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels6_8_msa; - c->put_hevc_epel_uni_w[3][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels8_8_msa; - c->put_hevc_epel_uni_w[4][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels12_8_msa; - c->put_hevc_epel_uni_w[5][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels16_8_msa; - c->put_hevc_epel_uni_w[6][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels24_8_msa; - c->put_hevc_epel_uni_w[7][0][0] = - ff_hevc_put_hevc_uni_w_pel_pixels32_8_msa; - - c->put_hevc_epel_uni_w[1][0][1] = ff_hevc_put_hevc_uni_w_epel_h4_8_msa; - c->put_hevc_epel_uni_w[2][0][1] = ff_hevc_put_hevc_uni_w_epel_h6_8_msa; - c->put_hevc_epel_uni_w[3][0][1] = ff_hevc_put_hevc_uni_w_epel_h8_8_msa; - c->put_hevc_epel_uni_w[4][0][1] = ff_hevc_put_hevc_uni_w_epel_h12_8_msa; - c->put_hevc_epel_uni_w[5][0][1] = ff_hevc_put_hevc_uni_w_epel_h16_8_msa; - c->put_hevc_epel_uni_w[6][0][1] = ff_hevc_put_hevc_uni_w_epel_h24_8_msa; - c->put_hevc_epel_uni_w[7][0][1] = ff_hevc_put_hevc_uni_w_epel_h32_8_msa; - - c->put_hevc_epel_uni_w[1][1][0] = ff_hevc_put_hevc_uni_w_epel_v4_8_msa; - c->put_hevc_epel_uni_w[2][1][0] = ff_hevc_put_hevc_uni_w_epel_v6_8_msa; - c->put_hevc_epel_uni_w[3][1][0] = ff_hevc_put_hevc_uni_w_epel_v8_8_msa; - c->put_hevc_epel_uni_w[4][1][0] = ff_hevc_put_hevc_uni_w_epel_v12_8_msa; - c->put_hevc_epel_uni_w[5][1][0] = ff_hevc_put_hevc_uni_w_epel_v16_8_msa; - c->put_hevc_epel_uni_w[6][1][0] = ff_hevc_put_hevc_uni_w_epel_v24_8_msa; - c->put_hevc_epel_uni_w[7][1][0] = ff_hevc_put_hevc_uni_w_epel_v32_8_msa; - - c->put_hevc_epel_uni_w[1][1][1] = ff_hevc_put_hevc_uni_w_epel_hv4_8_msa; - c->put_hevc_epel_uni_w[2][1][1] = ff_hevc_put_hevc_uni_w_epel_hv6_8_msa; - c->put_hevc_epel_uni_w[3][1][1] = ff_hevc_put_hevc_uni_w_epel_hv8_8_msa; - c->put_hevc_epel_uni_w[4][1][1] = - ff_hevc_put_hevc_uni_w_epel_hv12_8_msa; - c->put_hevc_epel_uni_w[5][1][1] = - ff_hevc_put_hevc_uni_w_epel_hv16_8_msa; - c->put_hevc_epel_uni_w[6][1][1] = - ff_hevc_put_hevc_uni_w_epel_hv24_8_msa; - c->put_hevc_epel_uni_w[7][1][1] = - ff_hevc_put_hevc_uni_w_epel_hv32_8_msa; - - c->put_hevc_qpel_bi[1][0][0] = ff_hevc_put_hevc_bi_pel_pixels4_8_msa; - c->put_hevc_qpel_bi[3][0][0] = ff_hevc_put_hevc_bi_pel_pixels8_8_msa; - c->put_hevc_qpel_bi[4][0][0] = ff_hevc_put_hevc_bi_pel_pixels12_8_msa; - c->put_hevc_qpel_bi[5][0][0] = ff_hevc_put_hevc_bi_pel_pixels16_8_msa; - c->put_hevc_qpel_bi[6][0][0] = ff_hevc_put_hevc_bi_pel_pixels24_8_msa; - c->put_hevc_qpel_bi[7][0][0] = ff_hevc_put_hevc_bi_pel_pixels32_8_msa; - c->put_hevc_qpel_bi[8][0][0] = ff_hevc_put_hevc_bi_pel_pixels48_8_msa; - c->put_hevc_qpel_bi[9][0][0] = ff_hevc_put_hevc_bi_pel_pixels64_8_msa; - - c->put_hevc_qpel_bi[1][0][1] = ff_hevc_put_hevc_bi_qpel_h4_8_msa; - c->put_hevc_qpel_bi[3][0][1] = ff_hevc_put_hevc_bi_qpel_h8_8_msa; - c->put_hevc_qpel_bi[4][0][1] = ff_hevc_put_hevc_bi_qpel_h12_8_msa; - c->put_hevc_qpel_bi[5][0][1] = ff_hevc_put_hevc_bi_qpel_h16_8_msa; - c->put_hevc_qpel_bi[6][0][1] = ff_hevc_put_hevc_bi_qpel_h24_8_msa; - c->put_hevc_qpel_bi[7][0][1] = ff_hevc_put_hevc_bi_qpel_h32_8_msa; - c->put_hevc_qpel_bi[8][0][1] = ff_hevc_put_hevc_bi_qpel_h48_8_msa; - c->put_hevc_qpel_bi[9][0][1] = ff_hevc_put_hevc_bi_qpel_h64_8_msa; - - c->put_hevc_qpel_bi[1][1][0] = ff_hevc_put_hevc_bi_qpel_v4_8_msa; - c->put_hevc_qpel_bi[3][1][0] = ff_hevc_put_hevc_bi_qpel_v8_8_msa; - c->put_hevc_qpel_bi[4][1][0] = ff_hevc_put_hevc_bi_qpel_v12_8_msa; - c->put_hevc_qpel_bi[5][1][0] = ff_hevc_put_hevc_bi_qpel_v16_8_msa; - c->put_hevc_qpel_bi[6][1][0] = ff_hevc_put_hevc_bi_qpel_v24_8_msa; - c->put_hevc_qpel_bi[7][1][0] = ff_hevc_put_hevc_bi_qpel_v32_8_msa; - c->put_hevc_qpel_bi[8][1][0] = ff_hevc_put_hevc_bi_qpel_v48_8_msa; - c->put_hevc_qpel_bi[9][1][0] = ff_hevc_put_hevc_bi_qpel_v64_8_msa; - - c->put_hevc_qpel_bi[1][1][1] = ff_hevc_put_hevc_bi_qpel_hv4_8_msa; - c->put_hevc_qpel_bi[3][1][1] = ff_hevc_put_hevc_bi_qpel_hv8_8_msa; - c->put_hevc_qpel_bi[4][1][1] = ff_hevc_put_hevc_bi_qpel_hv12_8_msa; - c->put_hevc_qpel_bi[5][1][1] = ff_hevc_put_hevc_bi_qpel_hv16_8_msa; - c->put_hevc_qpel_bi[6][1][1] = ff_hevc_put_hevc_bi_qpel_hv24_8_msa; - c->put_hevc_qpel_bi[7][1][1] = ff_hevc_put_hevc_bi_qpel_hv32_8_msa; - c->put_hevc_qpel_bi[8][1][1] = ff_hevc_put_hevc_bi_qpel_hv48_8_msa; - c->put_hevc_qpel_bi[9][1][1] = ff_hevc_put_hevc_bi_qpel_hv64_8_msa; - - c->put_hevc_epel_bi[1][0][0] = ff_hevc_put_hevc_bi_pel_pixels4_8_msa; - c->put_hevc_epel_bi[2][0][0] = ff_hevc_put_hevc_bi_pel_pixels6_8_msa; - c->put_hevc_epel_bi[3][0][0] = ff_hevc_put_hevc_bi_pel_pixels8_8_msa; - c->put_hevc_epel_bi[4][0][0] = ff_hevc_put_hevc_bi_pel_pixels12_8_msa; - c->put_hevc_epel_bi[5][0][0] = ff_hevc_put_hevc_bi_pel_pixels16_8_msa; - c->put_hevc_epel_bi[6][0][0] = ff_hevc_put_hevc_bi_pel_pixels24_8_msa; - c->put_hevc_epel_bi[7][0][0] = ff_hevc_put_hevc_bi_pel_pixels32_8_msa; - - c->put_hevc_epel_bi[1][0][1] = ff_hevc_put_hevc_bi_epel_h4_8_msa; - c->put_hevc_epel_bi[2][0][1] = ff_hevc_put_hevc_bi_epel_h6_8_msa; - c->put_hevc_epel_bi[3][0][1] = ff_hevc_put_hevc_bi_epel_h8_8_msa; - c->put_hevc_epel_bi[4][0][1] = ff_hevc_put_hevc_bi_epel_h12_8_msa; - c->put_hevc_epel_bi[5][0][1] = ff_hevc_put_hevc_bi_epel_h16_8_msa; - c->put_hevc_epel_bi[6][0][1] = ff_hevc_put_hevc_bi_epel_h24_8_msa; - c->put_hevc_epel_bi[7][0][1] = ff_hevc_put_hevc_bi_epel_h32_8_msa; - - c->put_hevc_epel_bi[1][1][0] = ff_hevc_put_hevc_bi_epel_v4_8_msa; - c->put_hevc_epel_bi[2][1][0] = ff_hevc_put_hevc_bi_epel_v6_8_msa; - c->put_hevc_epel_bi[3][1][0] = ff_hevc_put_hevc_bi_epel_v8_8_msa; - c->put_hevc_epel_bi[4][1][0] = ff_hevc_put_hevc_bi_epel_v12_8_msa; - c->put_hevc_epel_bi[5][1][0] = ff_hevc_put_hevc_bi_epel_v16_8_msa; - c->put_hevc_epel_bi[6][1][0] = ff_hevc_put_hevc_bi_epel_v24_8_msa; - c->put_hevc_epel_bi[7][1][0] = ff_hevc_put_hevc_bi_epel_v32_8_msa; - - c->put_hevc_epel_bi[1][1][1] = ff_hevc_put_hevc_bi_epel_hv4_8_msa; - c->put_hevc_epel_bi[2][1][1] = ff_hevc_put_hevc_bi_epel_hv6_8_msa; - c->put_hevc_epel_bi[3][1][1] = ff_hevc_put_hevc_bi_epel_hv8_8_msa; - c->put_hevc_epel_bi[4][1][1] = ff_hevc_put_hevc_bi_epel_hv12_8_msa; - c->put_hevc_epel_bi[5][1][1] = ff_hevc_put_hevc_bi_epel_hv16_8_msa; - c->put_hevc_epel_bi[6][1][1] = ff_hevc_put_hevc_bi_epel_hv24_8_msa; - c->put_hevc_epel_bi[7][1][1] = ff_hevc_put_hevc_bi_epel_hv32_8_msa; - - c->put_hevc_qpel_bi_w[1][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels4_8_msa; - c->put_hevc_qpel_bi_w[3][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels8_8_msa; - c->put_hevc_qpel_bi_w[4][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels12_8_msa; - c->put_hevc_qpel_bi_w[5][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels16_8_msa; - c->put_hevc_qpel_bi_w[6][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels24_8_msa; - c->put_hevc_qpel_bi_w[7][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels32_8_msa; - c->put_hevc_qpel_bi_w[8][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels48_8_msa; - c->put_hevc_qpel_bi_w[9][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels64_8_msa; - - c->put_hevc_qpel_bi_w[1][0][1] = ff_hevc_put_hevc_bi_w_qpel_h4_8_msa; - c->put_hevc_qpel_bi_w[3][0][1] = ff_hevc_put_hevc_bi_w_qpel_h8_8_msa; - c->put_hevc_qpel_bi_w[4][0][1] = ff_hevc_put_hevc_bi_w_qpel_h12_8_msa; - c->put_hevc_qpel_bi_w[5][0][1] = ff_hevc_put_hevc_bi_w_qpel_h16_8_msa; - c->put_hevc_qpel_bi_w[6][0][1] = ff_hevc_put_hevc_bi_w_qpel_h24_8_msa; - c->put_hevc_qpel_bi_w[7][0][1] = ff_hevc_put_hevc_bi_w_qpel_h32_8_msa; - c->put_hevc_qpel_bi_w[8][0][1] = ff_hevc_put_hevc_bi_w_qpel_h48_8_msa; - c->put_hevc_qpel_bi_w[9][0][1] = ff_hevc_put_hevc_bi_w_qpel_h64_8_msa; - - c->put_hevc_qpel_bi_w[1][1][0] = ff_hevc_put_hevc_bi_w_qpel_v4_8_msa; - c->put_hevc_qpel_bi_w[3][1][0] = ff_hevc_put_hevc_bi_w_qpel_v8_8_msa; - c->put_hevc_qpel_bi_w[4][1][0] = ff_hevc_put_hevc_bi_w_qpel_v12_8_msa; - c->put_hevc_qpel_bi_w[5][1][0] = ff_hevc_put_hevc_bi_w_qpel_v16_8_msa; - c->put_hevc_qpel_bi_w[6][1][0] = ff_hevc_put_hevc_bi_w_qpel_v24_8_msa; - c->put_hevc_qpel_bi_w[7][1][0] = ff_hevc_put_hevc_bi_w_qpel_v32_8_msa; - c->put_hevc_qpel_bi_w[8][1][0] = ff_hevc_put_hevc_bi_w_qpel_v48_8_msa; - c->put_hevc_qpel_bi_w[9][1][0] = ff_hevc_put_hevc_bi_w_qpel_v64_8_msa; - - c->put_hevc_qpel_bi_w[1][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv4_8_msa; - c->put_hevc_qpel_bi_w[3][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv8_8_msa; - c->put_hevc_qpel_bi_w[4][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv12_8_msa; - c->put_hevc_qpel_bi_w[5][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv16_8_msa; - c->put_hevc_qpel_bi_w[6][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv24_8_msa; - c->put_hevc_qpel_bi_w[7][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv32_8_msa; - c->put_hevc_qpel_bi_w[8][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv48_8_msa; - c->put_hevc_qpel_bi_w[9][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv64_8_msa; - - c->put_hevc_epel_bi_w[1][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels4_8_msa; - c->put_hevc_epel_bi_w[2][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels6_8_msa; - c->put_hevc_epel_bi_w[3][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels8_8_msa; - c->put_hevc_epel_bi_w[4][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels12_8_msa; - c->put_hevc_epel_bi_w[5][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels16_8_msa; - c->put_hevc_epel_bi_w[6][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels24_8_msa; - c->put_hevc_epel_bi_w[7][0][0] = - ff_hevc_put_hevc_bi_w_pel_pixels32_8_msa; - - c->put_hevc_epel_bi_w[1][0][1] = ff_hevc_put_hevc_bi_w_epel_h4_8_msa; - c->put_hevc_epel_bi_w[2][0][1] = ff_hevc_put_hevc_bi_w_epel_h6_8_msa; - c->put_hevc_epel_bi_w[3][0][1] = ff_hevc_put_hevc_bi_w_epel_h8_8_msa; - c->put_hevc_epel_bi_w[4][0][1] = ff_hevc_put_hevc_bi_w_epel_h12_8_msa; - c->put_hevc_epel_bi_w[5][0][1] = ff_hevc_put_hevc_bi_w_epel_h16_8_msa; - c->put_hevc_epel_bi_w[6][0][1] = ff_hevc_put_hevc_bi_w_epel_h24_8_msa; - c->put_hevc_epel_bi_w[7][0][1] = ff_hevc_put_hevc_bi_w_epel_h32_8_msa; - - c->put_hevc_epel_bi_w[1][1][0] = ff_hevc_put_hevc_bi_w_epel_v4_8_msa; - c->put_hevc_epel_bi_w[2][1][0] = ff_hevc_put_hevc_bi_w_epel_v6_8_msa; - c->put_hevc_epel_bi_w[3][1][0] = ff_hevc_put_hevc_bi_w_epel_v8_8_msa; - c->put_hevc_epel_bi_w[4][1][0] = ff_hevc_put_hevc_bi_w_epel_v12_8_msa; - c->put_hevc_epel_bi_w[5][1][0] = ff_hevc_put_hevc_bi_w_epel_v16_8_msa; - c->put_hevc_epel_bi_w[6][1][0] = ff_hevc_put_hevc_bi_w_epel_v24_8_msa; - c->put_hevc_epel_bi_w[7][1][0] = ff_hevc_put_hevc_bi_w_epel_v32_8_msa; - - c->put_hevc_epel_bi_w[1][1][1] = ff_hevc_put_hevc_bi_w_epel_hv4_8_msa; - c->put_hevc_epel_bi_w[2][1][1] = ff_hevc_put_hevc_bi_w_epel_hv6_8_msa; - c->put_hevc_epel_bi_w[3][1][1] = ff_hevc_put_hevc_bi_w_epel_hv8_8_msa; - c->put_hevc_epel_bi_w[4][1][1] = ff_hevc_put_hevc_bi_w_epel_hv12_8_msa; - c->put_hevc_epel_bi_w[5][1][1] = ff_hevc_put_hevc_bi_w_epel_hv16_8_msa; - c->put_hevc_epel_bi_w[6][1][1] = ff_hevc_put_hevc_bi_w_epel_hv24_8_msa; - c->put_hevc_epel_bi_w[7][1][1] = ff_hevc_put_hevc_bi_w_epel_hv32_8_msa; - - c->sao_band_filter[0] = - c->sao_band_filter[1] = - c->sao_band_filter[2] = - c->sao_band_filter[3] = - c->sao_band_filter[4] = ff_hevc_sao_band_filter_0_8_msa; - - c->sao_edge_filter[0] = - c->sao_edge_filter[1] = - c->sao_edge_filter[2] = - c->sao_edge_filter[3] = - c->sao_edge_filter[4] = ff_hevc_sao_edge_filter_8_msa; - - c->hevc_h_loop_filter_luma = ff_hevc_loop_filter_luma_h_8_msa; - c->hevc_v_loop_filter_luma = ff_hevc_loop_filter_luma_v_8_msa; - - c->hevc_h_loop_filter_chroma = ff_hevc_loop_filter_chroma_h_8_msa; - c->hevc_v_loop_filter_chroma = ff_hevc_loop_filter_chroma_v_8_msa; - - c->hevc_h_loop_filter_luma_c = ff_hevc_loop_filter_luma_h_8_msa; - c->hevc_v_loop_filter_luma_c = ff_hevc_loop_filter_luma_v_8_msa; - - c->hevc_h_loop_filter_chroma_c = - ff_hevc_loop_filter_chroma_h_8_msa; - c->hevc_v_loop_filter_chroma_c = - ff_hevc_loop_filter_chroma_v_8_msa; - - c->idct[0] = ff_hevc_idct_4x4_msa; - c->idct[1] = ff_hevc_idct_8x8_msa; - c->idct[2] = ff_hevc_idct_16x16_msa; - c->idct[3] = ff_hevc_idct_32x32_msa; - c->idct_dc[0] = ff_hevc_idct_dc_4x4_msa; - c->idct_dc[1] = ff_hevc_idct_dc_8x8_msa; - c->idct_dc[2] = ff_hevc_idct_dc_16x16_msa; - c->idct_dc[3] = ff_hevc_idct_dc_32x32_msa; - c->add_residual[0] = ff_hevc_addblk_4x4_msa; - c->add_residual[1] = ff_hevc_addblk_8x8_msa; - c->add_residual[2] = ff_hevc_addblk_16x16_msa; - c->add_residual[3] = ff_hevc_addblk_32x32_msa; - c->transform_4x4_luma = ff_hevc_idct_luma_4x4_msa; + if (have_msa(cpu_flags)) { + if (bit_depth == 8) { + c->put_hevc_qpel[1][0][0] = ff_hevc_put_hevc_pel_pixels4_8_msa; + c->put_hevc_qpel[2][0][0] = ff_hevc_put_hevc_pel_pixels6_8_msa; + c->put_hevc_qpel[3][0][0] = ff_hevc_put_hevc_pel_pixels8_8_msa; + c->put_hevc_qpel[4][0][0] = ff_hevc_put_hevc_pel_pixels12_8_msa; + c->put_hevc_qpel[5][0][0] = ff_hevc_put_hevc_pel_pixels16_8_msa; + c->put_hevc_qpel[6][0][0] = ff_hevc_put_hevc_pel_pixels24_8_msa; + c->put_hevc_qpel[7][0][0] = ff_hevc_put_hevc_pel_pixels32_8_msa; + c->put_hevc_qpel[8][0][0] = ff_hevc_put_hevc_pel_pixels48_8_msa; + c->put_hevc_qpel[9][0][0] = ff_hevc_put_hevc_pel_pixels64_8_msa; + + c->put_hevc_qpel[1][0][1] = ff_hevc_put_hevc_qpel_h4_8_msa; + c->put_hevc_qpel[3][0][1] = ff_hevc_put_hevc_qpel_h8_8_msa; + c->put_hevc_qpel[4][0][1] = ff_hevc_put_hevc_qpel_h12_8_msa; + c->put_hevc_qpel[5][0][1] = ff_hevc_put_hevc_qpel_h16_8_msa; + c->put_hevc_qpel[6][0][1] = ff_hevc_put_hevc_qpel_h24_8_msa; + c->put_hevc_qpel[7][0][1] = ff_hevc_put_hevc_qpel_h32_8_msa; + c->put_hevc_qpel[8][0][1] = ff_hevc_put_hevc_qpel_h48_8_msa; + c->put_hevc_qpel[9][0][1] = ff_hevc_put_hevc_qpel_h64_8_msa; + + c->put_hevc_qpel[1][1][0] = ff_hevc_put_hevc_qpel_v4_8_msa; + c->put_hevc_qpel[3][1][0] = ff_hevc_put_hevc_qpel_v8_8_msa; + c->put_hevc_qpel[4][1][0] = ff_hevc_put_hevc_qpel_v12_8_msa; + c->put_hevc_qpel[5][1][0] = ff_hevc_put_hevc_qpel_v16_8_msa; + c->put_hevc_qpel[6][1][0] = ff_hevc_put_hevc_qpel_v24_8_msa; + c->put_hevc_qpel[7][1][0] = ff_hevc_put_hevc_qpel_v32_8_msa; + c->put_hevc_qpel[8][1][0] = ff_hevc_put_hevc_qpel_v48_8_msa; + c->put_hevc_qpel[9][1][0] = ff_hevc_put_hevc_qpel_v64_8_msa; + + c->put_hevc_qpel[1][1][1] = ff_hevc_put_hevc_qpel_hv4_8_msa; + c->put_hevc_qpel[3][1][1] = ff_hevc_put_hevc_qpel_hv8_8_msa; + c->put_hevc_qpel[4][1][1] = ff_hevc_put_hevc_qpel_hv12_8_msa; + c->put_hevc_qpel[5][1][1] = ff_hevc_put_hevc_qpel_hv16_8_msa; + c->put_hevc_qpel[6][1][1] = ff_hevc_put_hevc_qpel_hv24_8_msa; + c->put_hevc_qpel[7][1][1] = ff_hevc_put_hevc_qpel_hv32_8_msa; + c->put_hevc_qpel[8][1][1] = ff_hevc_put_hevc_qpel_hv48_8_msa; + c->put_hevc_qpel[9][1][1] = ff_hevc_put_hevc_qpel_hv64_8_msa; + + c->put_hevc_epel[1][0][0] = ff_hevc_put_hevc_pel_pixels4_8_msa; + c->put_hevc_epel[2][0][0] = ff_hevc_put_hevc_pel_pixels6_8_msa; + c->put_hevc_epel[3][0][0] = ff_hevc_put_hevc_pel_pixels8_8_msa; + c->put_hevc_epel[4][0][0] = ff_hevc_put_hevc_pel_pixels12_8_msa; + c->put_hevc_epel[5][0][0] = ff_hevc_put_hevc_pel_pixels16_8_msa; + c->put_hevc_epel[6][0][0] = ff_hevc_put_hevc_pel_pixels24_8_msa; + c->put_hevc_epel[7][0][0] = ff_hevc_put_hevc_pel_pixels32_8_msa; + + c->put_hevc_epel[1][0][1] = ff_hevc_put_hevc_epel_h4_8_msa; + c->put_hevc_epel[2][0][1] = ff_hevc_put_hevc_epel_h6_8_msa; + c->put_hevc_epel[3][0][1] = ff_hevc_put_hevc_epel_h8_8_msa; + c->put_hevc_epel[4][0][1] = ff_hevc_put_hevc_epel_h12_8_msa; + c->put_hevc_epel[5][0][1] = ff_hevc_put_hevc_epel_h16_8_msa; + c->put_hevc_epel[6][0][1] = ff_hevc_put_hevc_epel_h24_8_msa; + c->put_hevc_epel[7][0][1] = ff_hevc_put_hevc_epel_h32_8_msa; + + c->put_hevc_epel[1][1][0] = ff_hevc_put_hevc_epel_v4_8_msa; + c->put_hevc_epel[2][1][0] = ff_hevc_put_hevc_epel_v6_8_msa; + c->put_hevc_epel[3][1][0] = ff_hevc_put_hevc_epel_v8_8_msa; + c->put_hevc_epel[4][1][0] = ff_hevc_put_hevc_epel_v12_8_msa; + c->put_hevc_epel[5][1][0] = ff_hevc_put_hevc_epel_v16_8_msa; + c->put_hevc_epel[6][1][0] = ff_hevc_put_hevc_epel_v24_8_msa; + c->put_hevc_epel[7][1][0] = ff_hevc_put_hevc_epel_v32_8_msa; + + c->put_hevc_epel[1][1][1] = ff_hevc_put_hevc_epel_hv4_8_msa; + c->put_hevc_epel[2][1][1] = ff_hevc_put_hevc_epel_hv6_8_msa; + c->put_hevc_epel[3][1][1] = ff_hevc_put_hevc_epel_hv8_8_msa; + c->put_hevc_epel[4][1][1] = ff_hevc_put_hevc_epel_hv12_8_msa; + c->put_hevc_epel[5][1][1] = ff_hevc_put_hevc_epel_hv16_8_msa; + c->put_hevc_epel[6][1][1] = ff_hevc_put_hevc_epel_hv24_8_msa; + c->put_hevc_epel[7][1][1] = ff_hevc_put_hevc_epel_hv32_8_msa; + + c->put_hevc_qpel_uni[3][0][0] = ff_hevc_put_hevc_uni_pel_pixels8_8_msa; + c->put_hevc_qpel_uni[4][0][0] = ff_hevc_put_hevc_uni_pel_pixels12_8_msa; + c->put_hevc_qpel_uni[5][0][0] = ff_hevc_put_hevc_uni_pel_pixels16_8_msa; + c->put_hevc_qpel_uni[6][0][0] = ff_hevc_put_hevc_uni_pel_pixels24_8_msa; + c->put_hevc_qpel_uni[7][0][0] = ff_hevc_put_hevc_uni_pel_pixels32_8_msa; + c->put_hevc_qpel_uni[8][0][0] = ff_hevc_put_hevc_uni_pel_pixels48_8_msa; + c->put_hevc_qpel_uni[9][0][0] = ff_hevc_put_hevc_uni_pel_pixels64_8_msa; + + c->put_hevc_qpel_uni[1][0][1] = ff_hevc_put_hevc_uni_qpel_h4_8_msa; + c->put_hevc_qpel_uni[3][0][1] = ff_hevc_put_hevc_uni_qpel_h8_8_msa; + c->put_hevc_qpel_uni[4][0][1] = ff_hevc_put_hevc_uni_qpel_h12_8_msa; + c->put_hevc_qpel_uni[5][0][1] = ff_hevc_put_hevc_uni_qpel_h16_8_msa; + c->put_hevc_qpel_uni[6][0][1] = ff_hevc_put_hevc_uni_qpel_h24_8_msa; + c->put_hevc_qpel_uni[7][0][1] = ff_hevc_put_hevc_uni_qpel_h32_8_msa; + c->put_hevc_qpel_uni[8][0][1] = ff_hevc_put_hevc_uni_qpel_h48_8_msa; + c->put_hevc_qpel_uni[9][0][1] = ff_hevc_put_hevc_uni_qpel_h64_8_msa; + + c->put_hevc_qpel_uni[1][1][0] = ff_hevc_put_hevc_uni_qpel_v4_8_msa; + c->put_hevc_qpel_uni[3][1][0] = ff_hevc_put_hevc_uni_qpel_v8_8_msa; + c->put_hevc_qpel_uni[4][1][0] = ff_hevc_put_hevc_uni_qpel_v12_8_msa; + c->put_hevc_qpel_uni[5][1][0] = ff_hevc_put_hevc_uni_qpel_v16_8_msa; + c->put_hevc_qpel_uni[6][1][0] = ff_hevc_put_hevc_uni_qpel_v24_8_msa; + c->put_hevc_qpel_uni[7][1][0] = ff_hevc_put_hevc_uni_qpel_v32_8_msa; + c->put_hevc_qpel_uni[8][1][0] = ff_hevc_put_hevc_uni_qpel_v48_8_msa; + c->put_hevc_qpel_uni[9][1][0] = ff_hevc_put_hevc_uni_qpel_v64_8_msa; + + c->put_hevc_qpel_uni[1][1][1] = ff_hevc_put_hevc_uni_qpel_hv4_8_msa; + c->put_hevc_qpel_uni[3][1][1] = ff_hevc_put_hevc_uni_qpel_hv8_8_msa; + c->put_hevc_qpel_uni[4][1][1] = ff_hevc_put_hevc_uni_qpel_hv12_8_msa; + c->put_hevc_qpel_uni[5][1][1] = ff_hevc_put_hevc_uni_qpel_hv16_8_msa; + c->put_hevc_qpel_uni[6][1][1] = ff_hevc_put_hevc_uni_qpel_hv24_8_msa; + c->put_hevc_qpel_uni[7][1][1] = ff_hevc_put_hevc_uni_qpel_hv32_8_msa; + c->put_hevc_qpel_uni[8][1][1] = ff_hevc_put_hevc_uni_qpel_hv48_8_msa; + c->put_hevc_qpel_uni[9][1][1] = ff_hevc_put_hevc_uni_qpel_hv64_8_msa; + + c->put_hevc_epel_uni[3][0][0] = ff_hevc_put_hevc_uni_pel_pixels8_8_msa; + c->put_hevc_epel_uni[4][0][0] = ff_hevc_put_hevc_uni_pel_pixels12_8_msa; + c->put_hevc_epel_uni[5][0][0] = ff_hevc_put_hevc_uni_pel_pixels16_8_msa; + c->put_hevc_epel_uni[6][0][0] = ff_hevc_put_hevc_uni_pel_pixels24_8_msa; + c->put_hevc_epel_uni[7][0][0] = ff_hevc_put_hevc_uni_pel_pixels32_8_msa; + + c->put_hevc_epel_uni[1][0][1] = ff_hevc_put_hevc_uni_epel_h4_8_msa; + c->put_hevc_epel_uni[2][0][1] = ff_hevc_put_hevc_uni_epel_h6_8_msa; + c->put_hevc_epel_uni[3][0][1] = ff_hevc_put_hevc_uni_epel_h8_8_msa; + c->put_hevc_epel_uni[4][0][1] = ff_hevc_put_hevc_uni_epel_h12_8_msa; + c->put_hevc_epel_uni[5][0][1] = ff_hevc_put_hevc_uni_epel_h16_8_msa; + c->put_hevc_epel_uni[6][0][1] = ff_hevc_put_hevc_uni_epel_h24_8_msa; + c->put_hevc_epel_uni[7][0][1] = ff_hevc_put_hevc_uni_epel_h32_8_msa; + + c->put_hevc_epel_uni[1][1][0] = ff_hevc_put_hevc_uni_epel_v4_8_msa; + c->put_hevc_epel_uni[2][1][0] = ff_hevc_put_hevc_uni_epel_v6_8_msa; + c->put_hevc_epel_uni[3][1][0] = ff_hevc_put_hevc_uni_epel_v8_8_msa; + c->put_hevc_epel_uni[4][1][0] = ff_hevc_put_hevc_uni_epel_v12_8_msa; + c->put_hevc_epel_uni[5][1][0] = ff_hevc_put_hevc_uni_epel_v16_8_msa; + c->put_hevc_epel_uni[6][1][0] = ff_hevc_put_hevc_uni_epel_v24_8_msa; + c->put_hevc_epel_uni[7][1][0] = ff_hevc_put_hevc_uni_epel_v32_8_msa; + + c->put_hevc_epel_uni[1][1][1] = ff_hevc_put_hevc_uni_epel_hv4_8_msa; + c->put_hevc_epel_uni[2][1][1] = ff_hevc_put_hevc_uni_epel_hv6_8_msa; + c->put_hevc_epel_uni[3][1][1] = ff_hevc_put_hevc_uni_epel_hv8_8_msa; + c->put_hevc_epel_uni[4][1][1] = ff_hevc_put_hevc_uni_epel_hv12_8_msa; + c->put_hevc_epel_uni[5][1][1] = ff_hevc_put_hevc_uni_epel_hv16_8_msa; + c->put_hevc_epel_uni[6][1][1] = ff_hevc_put_hevc_uni_epel_hv24_8_msa; + c->put_hevc_epel_uni[7][1][1] = ff_hevc_put_hevc_uni_epel_hv32_8_msa; + + c->put_hevc_qpel_uni_w[1][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels4_8_msa; + c->put_hevc_qpel_uni_w[3][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels8_8_msa; + c->put_hevc_qpel_uni_w[4][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels12_8_msa; + c->put_hevc_qpel_uni_w[5][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels16_8_msa; + c->put_hevc_qpel_uni_w[6][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels24_8_msa; + c->put_hevc_qpel_uni_w[7][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels32_8_msa; + c->put_hevc_qpel_uni_w[8][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels48_8_msa; + c->put_hevc_qpel_uni_w[9][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels64_8_msa; + + c->put_hevc_qpel_uni_w[1][0][1] = ff_hevc_put_hevc_uni_w_qpel_h4_8_msa; + c->put_hevc_qpel_uni_w[3][0][1] = ff_hevc_put_hevc_uni_w_qpel_h8_8_msa; + c->put_hevc_qpel_uni_w[4][0][1] = ff_hevc_put_hevc_uni_w_qpel_h12_8_msa; + c->put_hevc_qpel_uni_w[5][0][1] = ff_hevc_put_hevc_uni_w_qpel_h16_8_msa; + c->put_hevc_qpel_uni_w[6][0][1] = ff_hevc_put_hevc_uni_w_qpel_h24_8_msa; + c->put_hevc_qpel_uni_w[7][0][1] = ff_hevc_put_hevc_uni_w_qpel_h32_8_msa; + c->put_hevc_qpel_uni_w[8][0][1] = ff_hevc_put_hevc_uni_w_qpel_h48_8_msa; + c->put_hevc_qpel_uni_w[9][0][1] = ff_hevc_put_hevc_uni_w_qpel_h64_8_msa; + + c->put_hevc_qpel_uni_w[1][1][0] = ff_hevc_put_hevc_uni_w_qpel_v4_8_msa; + c->put_hevc_qpel_uni_w[3][1][0] = ff_hevc_put_hevc_uni_w_qpel_v8_8_msa; + c->put_hevc_qpel_uni_w[4][1][0] = ff_hevc_put_hevc_uni_w_qpel_v12_8_msa; + c->put_hevc_qpel_uni_w[5][1][0] = ff_hevc_put_hevc_uni_w_qpel_v16_8_msa; + c->put_hevc_qpel_uni_w[6][1][0] = ff_hevc_put_hevc_uni_w_qpel_v24_8_msa; + c->put_hevc_qpel_uni_w[7][1][0] = ff_hevc_put_hevc_uni_w_qpel_v32_8_msa; + c->put_hevc_qpel_uni_w[8][1][0] = ff_hevc_put_hevc_uni_w_qpel_v48_8_msa; + c->put_hevc_qpel_uni_w[9][1][0] = ff_hevc_put_hevc_uni_w_qpel_v64_8_msa; + + c->put_hevc_qpel_uni_w[1][1][1] = ff_hevc_put_hevc_uni_w_qpel_hv4_8_msa; + c->put_hevc_qpel_uni_w[3][1][1] = ff_hevc_put_hevc_uni_w_qpel_hv8_8_msa; + c->put_hevc_qpel_uni_w[4][1][1] = + ff_hevc_put_hevc_uni_w_qpel_hv12_8_msa; + c->put_hevc_qpel_uni_w[5][1][1] = + ff_hevc_put_hevc_uni_w_qpel_hv16_8_msa; + c->put_hevc_qpel_uni_w[6][1][1] = + ff_hevc_put_hevc_uni_w_qpel_hv24_8_msa; + c->put_hevc_qpel_uni_w[7][1][1] = + ff_hevc_put_hevc_uni_w_qpel_hv32_8_msa; + c->put_hevc_qpel_uni_w[8][1][1] = + ff_hevc_put_hevc_uni_w_qpel_hv48_8_msa; + c->put_hevc_qpel_uni_w[9][1][1] = + ff_hevc_put_hevc_uni_w_qpel_hv64_8_msa; + + c->put_hevc_epel_uni_w[1][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels4_8_msa; + c->put_hevc_epel_uni_w[2][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels6_8_msa; + c->put_hevc_epel_uni_w[3][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels8_8_msa; + c->put_hevc_epel_uni_w[4][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels12_8_msa; + c->put_hevc_epel_uni_w[5][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels16_8_msa; + c->put_hevc_epel_uni_w[6][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels24_8_msa; + c->put_hevc_epel_uni_w[7][0][0] = + ff_hevc_put_hevc_uni_w_pel_pixels32_8_msa; + + c->put_hevc_epel_uni_w[1][0][1] = ff_hevc_put_hevc_uni_w_epel_h4_8_msa; + c->put_hevc_epel_uni_w[2][0][1] = ff_hevc_put_hevc_uni_w_epel_h6_8_msa; + c->put_hevc_epel_uni_w[3][0][1] = ff_hevc_put_hevc_uni_w_epel_h8_8_msa; + c->put_hevc_epel_uni_w[4][0][1] = ff_hevc_put_hevc_uni_w_epel_h12_8_msa; + c->put_hevc_epel_uni_w[5][0][1] = ff_hevc_put_hevc_uni_w_epel_h16_8_msa; + c->put_hevc_epel_uni_w[6][0][1] = ff_hevc_put_hevc_uni_w_epel_h24_8_msa; + c->put_hevc_epel_uni_w[7][0][1] = ff_hevc_put_hevc_uni_w_epel_h32_8_msa; + + c->put_hevc_epel_uni_w[1][1][0] = ff_hevc_put_hevc_uni_w_epel_v4_8_msa; + c->put_hevc_epel_uni_w[2][1][0] = ff_hevc_put_hevc_uni_w_epel_v6_8_msa; + c->put_hevc_epel_uni_w[3][1][0] = ff_hevc_put_hevc_uni_w_epel_v8_8_msa; + c->put_hevc_epel_uni_w[4][1][0] = ff_hevc_put_hevc_uni_w_epel_v12_8_msa; + c->put_hevc_epel_uni_w[5][1][0] = ff_hevc_put_hevc_uni_w_epel_v16_8_msa; + c->put_hevc_epel_uni_w[6][1][0] = ff_hevc_put_hevc_uni_w_epel_v24_8_msa; + c->put_hevc_epel_uni_w[7][1][0] = ff_hevc_put_hevc_uni_w_epel_v32_8_msa; + + c->put_hevc_epel_uni_w[1][1][1] = ff_hevc_put_hevc_uni_w_epel_hv4_8_msa; + c->put_hevc_epel_uni_w[2][1][1] = ff_hevc_put_hevc_uni_w_epel_hv6_8_msa; + c->put_hevc_epel_uni_w[3][1][1] = ff_hevc_put_hevc_uni_w_epel_hv8_8_msa; + c->put_hevc_epel_uni_w[4][1][1] = + ff_hevc_put_hevc_uni_w_epel_hv12_8_msa; + c->put_hevc_epel_uni_w[5][1][1] = + ff_hevc_put_hevc_uni_w_epel_hv16_8_msa; + c->put_hevc_epel_uni_w[6][1][1] = + ff_hevc_put_hevc_uni_w_epel_hv24_8_msa; + c->put_hevc_epel_uni_w[7][1][1] = + ff_hevc_put_hevc_uni_w_epel_hv32_8_msa; + + c->put_hevc_qpel_bi[1][0][0] = ff_hevc_put_hevc_bi_pel_pixels4_8_msa; + c->put_hevc_qpel_bi[3][0][0] = ff_hevc_put_hevc_bi_pel_pixels8_8_msa; + c->put_hevc_qpel_bi[4][0][0] = ff_hevc_put_hevc_bi_pel_pixels12_8_msa; + c->put_hevc_qpel_bi[5][0][0] = ff_hevc_put_hevc_bi_pel_pixels16_8_msa; + c->put_hevc_qpel_bi[6][0][0] = ff_hevc_put_hevc_bi_pel_pixels24_8_msa; + c->put_hevc_qpel_bi[7][0][0] = ff_hevc_put_hevc_bi_pel_pixels32_8_msa; + c->put_hevc_qpel_bi[8][0][0] = ff_hevc_put_hevc_bi_pel_pixels48_8_msa; + c->put_hevc_qpel_bi[9][0][0] = ff_hevc_put_hevc_bi_pel_pixels64_8_msa; + + c->put_hevc_qpel_bi[1][0][1] = ff_hevc_put_hevc_bi_qpel_h4_8_msa; + c->put_hevc_qpel_bi[3][0][1] = ff_hevc_put_hevc_bi_qpel_h8_8_msa; + c->put_hevc_qpel_bi[4][0][1] = ff_hevc_put_hevc_bi_qpel_h12_8_msa; + c->put_hevc_qpel_bi[5][0][1] = ff_hevc_put_hevc_bi_qpel_h16_8_msa; + c->put_hevc_qpel_bi[6][0][1] = ff_hevc_put_hevc_bi_qpel_h24_8_msa; + c->put_hevc_qpel_bi[7][0][1] = ff_hevc_put_hevc_bi_qpel_h32_8_msa; + c->put_hevc_qpel_bi[8][0][1] = ff_hevc_put_hevc_bi_qpel_h48_8_msa; + c->put_hevc_qpel_bi[9][0][1] = ff_hevc_put_hevc_bi_qpel_h64_8_msa; + + c->put_hevc_qpel_bi[1][1][0] = ff_hevc_put_hevc_bi_qpel_v4_8_msa; + c->put_hevc_qpel_bi[3][1][0] = ff_hevc_put_hevc_bi_qpel_v8_8_msa; + c->put_hevc_qpel_bi[4][1][0] = ff_hevc_put_hevc_bi_qpel_v12_8_msa; + c->put_hevc_qpel_bi[5][1][0] = ff_hevc_put_hevc_bi_qpel_v16_8_msa; + c->put_hevc_qpel_bi[6][1][0] = ff_hevc_put_hevc_bi_qpel_v24_8_msa; + c->put_hevc_qpel_bi[7][1][0] = ff_hevc_put_hevc_bi_qpel_v32_8_msa; + c->put_hevc_qpel_bi[8][1][0] = ff_hevc_put_hevc_bi_qpel_v48_8_msa; + c->put_hevc_qpel_bi[9][1][0] = ff_hevc_put_hevc_bi_qpel_v64_8_msa; + + c->put_hevc_qpel_bi[1][1][1] = ff_hevc_put_hevc_bi_qpel_hv4_8_msa; + c->put_hevc_qpel_bi[3][1][1] = ff_hevc_put_hevc_bi_qpel_hv8_8_msa; + c->put_hevc_qpel_bi[4][1][1] = ff_hevc_put_hevc_bi_qpel_hv12_8_msa; + c->put_hevc_qpel_bi[5][1][1] = ff_hevc_put_hevc_bi_qpel_hv16_8_msa; + c->put_hevc_qpel_bi[6][1][1] = ff_hevc_put_hevc_bi_qpel_hv24_8_msa; + c->put_hevc_qpel_bi[7][1][1] = ff_hevc_put_hevc_bi_qpel_hv32_8_msa; + c->put_hevc_qpel_bi[8][1][1] = ff_hevc_put_hevc_bi_qpel_hv48_8_msa; + c->put_hevc_qpel_bi[9][1][1] = ff_hevc_put_hevc_bi_qpel_hv64_8_msa; + + c->put_hevc_epel_bi[1][0][0] = ff_hevc_put_hevc_bi_pel_pixels4_8_msa; + c->put_hevc_epel_bi[2][0][0] = ff_hevc_put_hevc_bi_pel_pixels6_8_msa; + c->put_hevc_epel_bi[3][0][0] = ff_hevc_put_hevc_bi_pel_pixels8_8_msa; + c->put_hevc_epel_bi[4][0][0] = ff_hevc_put_hevc_bi_pel_pixels12_8_msa; + c->put_hevc_epel_bi[5][0][0] = ff_hevc_put_hevc_bi_pel_pixels16_8_msa; + c->put_hevc_epel_bi[6][0][0] = ff_hevc_put_hevc_bi_pel_pixels24_8_msa; + c->put_hevc_epel_bi[7][0][0] = ff_hevc_put_hevc_bi_pel_pixels32_8_msa; + + c->put_hevc_epel_bi[1][0][1] = ff_hevc_put_hevc_bi_epel_h4_8_msa; + c->put_hevc_epel_bi[2][0][1] = ff_hevc_put_hevc_bi_epel_h6_8_msa; + c->put_hevc_epel_bi[3][0][1] = ff_hevc_put_hevc_bi_epel_h8_8_msa; + c->put_hevc_epel_bi[4][0][1] = ff_hevc_put_hevc_bi_epel_h12_8_msa; + c->put_hevc_epel_bi[5][0][1] = ff_hevc_put_hevc_bi_epel_h16_8_msa; + c->put_hevc_epel_bi[6][0][1] = ff_hevc_put_hevc_bi_epel_h24_8_msa; + c->put_hevc_epel_bi[7][0][1] = ff_hevc_put_hevc_bi_epel_h32_8_msa; + + c->put_hevc_epel_bi[1][1][0] = ff_hevc_put_hevc_bi_epel_v4_8_msa; + c->put_hevc_epel_bi[2][1][0] = ff_hevc_put_hevc_bi_epel_v6_8_msa; + c->put_hevc_epel_bi[3][1][0] = ff_hevc_put_hevc_bi_epel_v8_8_msa; + c->put_hevc_epel_bi[4][1][0] = ff_hevc_put_hevc_bi_epel_v12_8_msa; + c->put_hevc_epel_bi[5][1][0] = ff_hevc_put_hevc_bi_epel_v16_8_msa; + c->put_hevc_epel_bi[6][1][0] = ff_hevc_put_hevc_bi_epel_v24_8_msa; + c->put_hevc_epel_bi[7][1][0] = ff_hevc_put_hevc_bi_epel_v32_8_msa; + + c->put_hevc_epel_bi[1][1][1] = ff_hevc_put_hevc_bi_epel_hv4_8_msa; + c->put_hevc_epel_bi[2][1][1] = ff_hevc_put_hevc_bi_epel_hv6_8_msa; + c->put_hevc_epel_bi[3][1][1] = ff_hevc_put_hevc_bi_epel_hv8_8_msa; + c->put_hevc_epel_bi[4][1][1] = ff_hevc_put_hevc_bi_epel_hv12_8_msa; + c->put_hevc_epel_bi[5][1][1] = ff_hevc_put_hevc_bi_epel_hv16_8_msa; + c->put_hevc_epel_bi[6][1][1] = ff_hevc_put_hevc_bi_epel_hv24_8_msa; + c->put_hevc_epel_bi[7][1][1] = ff_hevc_put_hevc_bi_epel_hv32_8_msa; + + c->put_hevc_qpel_bi_w[1][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels4_8_msa; + c->put_hevc_qpel_bi_w[3][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels8_8_msa; + c->put_hevc_qpel_bi_w[4][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels12_8_msa; + c->put_hevc_qpel_bi_w[5][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels16_8_msa; + c->put_hevc_qpel_bi_w[6][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels24_8_msa; + c->put_hevc_qpel_bi_w[7][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels32_8_msa; + c->put_hevc_qpel_bi_w[8][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels48_8_msa; + c->put_hevc_qpel_bi_w[9][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels64_8_msa; + + c->put_hevc_qpel_bi_w[1][0][1] = ff_hevc_put_hevc_bi_w_qpel_h4_8_msa; + c->put_hevc_qpel_bi_w[3][0][1] = ff_hevc_put_hevc_bi_w_qpel_h8_8_msa; + c->put_hevc_qpel_bi_w[4][0][1] = ff_hevc_put_hevc_bi_w_qpel_h12_8_msa; + c->put_hevc_qpel_bi_w[5][0][1] = ff_hevc_put_hevc_bi_w_qpel_h16_8_msa; + c->put_hevc_qpel_bi_w[6][0][1] = ff_hevc_put_hevc_bi_w_qpel_h24_8_msa; + c->put_hevc_qpel_bi_w[7][0][1] = ff_hevc_put_hevc_bi_w_qpel_h32_8_msa; + c->put_hevc_qpel_bi_w[8][0][1] = ff_hevc_put_hevc_bi_w_qpel_h48_8_msa; + c->put_hevc_qpel_bi_w[9][0][1] = ff_hevc_put_hevc_bi_w_qpel_h64_8_msa; + + c->put_hevc_qpel_bi_w[1][1][0] = ff_hevc_put_hevc_bi_w_qpel_v4_8_msa; + c->put_hevc_qpel_bi_w[3][1][0] = ff_hevc_put_hevc_bi_w_qpel_v8_8_msa; + c->put_hevc_qpel_bi_w[4][1][0] = ff_hevc_put_hevc_bi_w_qpel_v12_8_msa; + c->put_hevc_qpel_bi_w[5][1][0] = ff_hevc_put_hevc_bi_w_qpel_v16_8_msa; + c->put_hevc_qpel_bi_w[6][1][0] = ff_hevc_put_hevc_bi_w_qpel_v24_8_msa; + c->put_hevc_qpel_bi_w[7][1][0] = ff_hevc_put_hevc_bi_w_qpel_v32_8_msa; + c->put_hevc_qpel_bi_w[8][1][0] = ff_hevc_put_hevc_bi_w_qpel_v48_8_msa; + c->put_hevc_qpel_bi_w[9][1][0] = ff_hevc_put_hevc_bi_w_qpel_v64_8_msa; + + c->put_hevc_qpel_bi_w[1][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv4_8_msa; + c->put_hevc_qpel_bi_w[3][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv8_8_msa; + c->put_hevc_qpel_bi_w[4][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv12_8_msa; + c->put_hevc_qpel_bi_w[5][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv16_8_msa; + c->put_hevc_qpel_bi_w[6][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv24_8_msa; + c->put_hevc_qpel_bi_w[7][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv32_8_msa; + c->put_hevc_qpel_bi_w[8][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv48_8_msa; + c->put_hevc_qpel_bi_w[9][1][1] = ff_hevc_put_hevc_bi_w_qpel_hv64_8_msa; + + c->put_hevc_epel_bi_w[1][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels4_8_msa; + c->put_hevc_epel_bi_w[2][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels6_8_msa; + c->put_hevc_epel_bi_w[3][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels8_8_msa; + c->put_hevc_epel_bi_w[4][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels12_8_msa; + c->put_hevc_epel_bi_w[5][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels16_8_msa; + c->put_hevc_epel_bi_w[6][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels24_8_msa; + c->put_hevc_epel_bi_w[7][0][0] = + ff_hevc_put_hevc_bi_w_pel_pixels32_8_msa; + + c->put_hevc_epel_bi_w[1][0][1] = ff_hevc_put_hevc_bi_w_epel_h4_8_msa; + c->put_hevc_epel_bi_w[2][0][1] = ff_hevc_put_hevc_bi_w_epel_h6_8_msa; + c->put_hevc_epel_bi_w[3][0][1] = ff_hevc_put_hevc_bi_w_epel_h8_8_msa; + c->put_hevc_epel_bi_w[4][0][1] = ff_hevc_put_hevc_bi_w_epel_h12_8_msa; + c->put_hevc_epel_bi_w[5][0][1] = ff_hevc_put_hevc_bi_w_epel_h16_8_msa; + c->put_hevc_epel_bi_w[6][0][1] = ff_hevc_put_hevc_bi_w_epel_h24_8_msa; + c->put_hevc_epel_bi_w[7][0][1] = ff_hevc_put_hevc_bi_w_epel_h32_8_msa; + + c->put_hevc_epel_bi_w[1][1][0] = ff_hevc_put_hevc_bi_w_epel_v4_8_msa; + c->put_hevc_epel_bi_w[2][1][0] = ff_hevc_put_hevc_bi_w_epel_v6_8_msa; + c->put_hevc_epel_bi_w[3][1][0] = ff_hevc_put_hevc_bi_w_epel_v8_8_msa; + c->put_hevc_epel_bi_w[4][1][0] = ff_hevc_put_hevc_bi_w_epel_v12_8_msa; + c->put_hevc_epel_bi_w[5][1][0] = ff_hevc_put_hevc_bi_w_epel_v16_8_msa; + c->put_hevc_epel_bi_w[6][1][0] = ff_hevc_put_hevc_bi_w_epel_v24_8_msa; + c->put_hevc_epel_bi_w[7][1][0] = ff_hevc_put_hevc_bi_w_epel_v32_8_msa; + + c->put_hevc_epel_bi_w[1][1][1] = ff_hevc_put_hevc_bi_w_epel_hv4_8_msa; + c->put_hevc_epel_bi_w[2][1][1] = ff_hevc_put_hevc_bi_w_epel_hv6_8_msa; + c->put_hevc_epel_bi_w[3][1][1] = ff_hevc_put_hevc_bi_w_epel_hv8_8_msa; + c->put_hevc_epel_bi_w[4][1][1] = ff_hevc_put_hevc_bi_w_epel_hv12_8_msa; + c->put_hevc_epel_bi_w[5][1][1] = ff_hevc_put_hevc_bi_w_epel_hv16_8_msa; + c->put_hevc_epel_bi_w[6][1][1] = ff_hevc_put_hevc_bi_w_epel_hv24_8_msa; + c->put_hevc_epel_bi_w[7][1][1] = ff_hevc_put_hevc_bi_w_epel_hv32_8_msa; + + c->sao_band_filter[0] = + c->sao_band_filter[1] = + c->sao_band_filter[2] = + c->sao_band_filter[3] = + c->sao_band_filter[4] = ff_hevc_sao_band_filter_0_8_msa; + + c->sao_edge_filter[0] = + c->sao_edge_filter[1] = + c->sao_edge_filter[2] = + c->sao_edge_filter[3] = + c->sao_edge_filter[4] = ff_hevc_sao_edge_filter_8_msa; + + c->hevc_h_loop_filter_luma = ff_hevc_loop_filter_luma_h_8_msa; + c->hevc_v_loop_filter_luma = ff_hevc_loop_filter_luma_v_8_msa; + + c->hevc_h_loop_filter_chroma = ff_hevc_loop_filter_chroma_h_8_msa; + c->hevc_v_loop_filter_chroma = ff_hevc_loop_filter_chroma_v_8_msa; + + c->hevc_h_loop_filter_luma_c = ff_hevc_loop_filter_luma_h_8_msa; + c->hevc_v_loop_filter_luma_c = ff_hevc_loop_filter_luma_v_8_msa; + + c->hevc_h_loop_filter_chroma_c = + ff_hevc_loop_filter_chroma_h_8_msa; + c->hevc_v_loop_filter_chroma_c = + ff_hevc_loop_filter_chroma_v_8_msa; + + c->idct[0] = ff_hevc_idct_4x4_msa; + c->idct[1] = ff_hevc_idct_8x8_msa; + c->idct[2] = ff_hevc_idct_16x16_msa; + c->idct[3] = ff_hevc_idct_32x32_msa; + c->idct_dc[0] = ff_hevc_idct_dc_4x4_msa; + c->idct_dc[1] = ff_hevc_idct_dc_8x8_msa; + c->idct_dc[2] = ff_hevc_idct_dc_16x16_msa; + c->idct_dc[3] = ff_hevc_idct_dc_32x32_msa; + c->add_residual[0] = ff_hevc_addblk_4x4_msa; + c->add_residual[1] = ff_hevc_addblk_8x8_msa; + c->add_residual[2] = ff_hevc_addblk_16x16_msa; + c->add_residual[3] = ff_hevc_addblk_32x32_msa; + c->transform_4x4_luma = ff_hevc_idct_luma_4x4_msa; + } } } -#endif // #if HAVE_MSA - -void ff_hevc_dsp_init_mips(HEVCDSPContext *c, const int bit_depth) -{ -#if HAVE_MMI - hevc_dsp_init_mmi(c, bit_depth); -#endif // #if HAVE_MMI -#if HAVE_MSA - hevc_dsp_init_msa(c, bit_depth); -#endif // #if HAVE_MSA -} diff --git a/libavcodec/mips/hevcpred_init_mips.c b/libavcodec/mips/hevcpred_init_mips.c index e987698d66..f7ecb34dcc 100644 --- a/libavcodec/mips/hevcpred_init_mips.c +++ b/libavcodec/mips/hevcpred_init_mips.c @@ -18,32 +18,28 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "config.h" #include "libavutil/attributes.h" #include "libavcodec/mips/hevcpred_mips.h" -#if HAVE_MSA -static av_cold void hevc_pred_init_msa(HEVCPredContext *c, const int bit_depth) -{ - if (8 == bit_depth) { - c->intra_pred[2] = ff_intra_pred_8_16x16_msa; - c->intra_pred[3] = ff_intra_pred_8_32x32_msa; - c->pred_planar[0] = ff_hevc_intra_pred_planar_0_msa; - c->pred_planar[1] = ff_hevc_intra_pred_planar_1_msa; - c->pred_planar[2] = ff_hevc_intra_pred_planar_2_msa; - c->pred_planar[3] = ff_hevc_intra_pred_planar_3_msa; - c->pred_dc = ff_hevc_intra_pred_dc_msa; - c->pred_angular[0] = ff_pred_intra_pred_angular_0_msa; - c->pred_angular[1] = ff_pred_intra_pred_angular_1_msa; - c->pred_angular[2] = ff_pred_intra_pred_angular_2_msa; - c->pred_angular[3] = ff_pred_intra_pred_angular_3_msa; - } -} -#endif // #if HAVE_MSA - void ff_hevc_pred_init_mips(HEVCPredContext *c, const int bit_depth) { -#if HAVE_MSA - hevc_pred_init_msa(c, bit_depth); -#endif // #if HAVE_MSA + int cpu_flags = av_get_cpu_flags(); + + if (have_msa(cpu_flags)) { + if (bit_depth == 8) { + c->intra_pred[2] = ff_intra_pred_8_16x16_msa; + c->intra_pred[3] = ff_intra_pred_8_32x32_msa; + c->pred_planar[0] = ff_hevc_intra_pred_planar_0_msa; + c->pred_planar[1] = ff_hevc_intra_pred_planar_1_msa; + c->pred_planar[2] = ff_hevc_intra_pred_planar_2_msa; + c->pred_planar[3] = ff_hevc_intra_pred_planar_3_msa; + c->pred_dc = ff_hevc_intra_pred_dc_msa; + c->pred_angular[0] = ff_pred_intra_pred_angular_0_msa; + c->pred_angular[1] = ff_pred_intra_pred_angular_1_msa; + c->pred_angular[2] = ff_pred_intra_pred_angular_2_msa; + c->pred_angular[3] = ff_pred_intra_pred_angular_3_msa; + } + } } diff --git a/libavcodec/mips/hpeldsp_init_mips.c b/libavcodec/mips/hpeldsp_init_mips.c index d6f7a9793d..77cbe99fa4 100644 --- a/libavcodec/mips/hpeldsp_init_mips.c +++ b/libavcodec/mips/hpeldsp_init_mips.c @@ -19,104 +19,94 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "../hpeldsp.h" #include "libavcodec/mips/hpeldsp_mips.h" -#if HAVE_MSA -static void ff_hpeldsp_init_msa(HpelDSPContext *c, int flags) -{ - c->put_pixels_tab[0][0] = ff_put_pixels16_msa; - c->put_pixels_tab[0][1] = ff_put_pixels16_x2_msa; - c->put_pixels_tab[0][2] = ff_put_pixels16_y2_msa; - c->put_pixels_tab[0][3] = ff_put_pixels16_xy2_msa; - - c->put_pixels_tab[1][0] = ff_put_pixels8_msa; - c->put_pixels_tab[1][1] = ff_put_pixels8_x2_msa; - c->put_pixels_tab[1][2] = ff_put_pixels8_y2_msa; - c->put_pixels_tab[1][3] = ff_put_pixels8_xy2_msa; - - c->put_pixels_tab[2][1] = ff_put_pixels4_x2_msa; - c->put_pixels_tab[2][2] = ff_put_pixels4_y2_msa; - c->put_pixels_tab[2][3] = ff_put_pixels4_xy2_msa; - - c->put_no_rnd_pixels_tab[0][0] = ff_put_pixels16_msa; - c->put_no_rnd_pixels_tab[0][1] = ff_put_no_rnd_pixels16_x2_msa; - c->put_no_rnd_pixels_tab[0][2] = ff_put_no_rnd_pixels16_y2_msa; - c->put_no_rnd_pixels_tab[0][3] = ff_put_no_rnd_pixels16_xy2_msa; - - c->put_no_rnd_pixels_tab[1][0] = ff_put_pixels8_msa; - c->put_no_rnd_pixels_tab[1][1] = ff_put_no_rnd_pixels8_x2_msa; - c->put_no_rnd_pixels_tab[1][2] = ff_put_no_rnd_pixels8_y2_msa; - c->put_no_rnd_pixels_tab[1][3] = ff_put_no_rnd_pixels8_xy2_msa; - - c->avg_pixels_tab[0][0] = ff_avg_pixels16_msa; - c->avg_pixels_tab[0][1] = ff_avg_pixels16_x2_msa; - c->avg_pixels_tab[0][2] = ff_avg_pixels16_y2_msa; - c->avg_pixels_tab[0][3] = ff_avg_pixels16_xy2_msa; - - c->avg_pixels_tab[1][0] = ff_avg_pixels8_msa; - c->avg_pixels_tab[1][1] = ff_avg_pixels8_x2_msa; - c->avg_pixels_tab[1][2] = ff_avg_pixels8_y2_msa; - c->avg_pixels_tab[1][3] = ff_avg_pixels8_xy2_msa; - - c->avg_pixels_tab[2][0] = ff_avg_pixels4_msa; - c->avg_pixels_tab[2][1] = ff_avg_pixels4_x2_msa; - c->avg_pixels_tab[2][2] = ff_avg_pixels4_y2_msa; - c->avg_pixels_tab[2][3] = ff_avg_pixels4_xy2_msa; -} -#endif // #if HAVE_MSA - -#if HAVE_MMI -static void ff_hpeldsp_init_mmi(HpelDSPContext *c, int flags) -{ - c->put_pixels_tab[0][0] = ff_put_pixels16_8_mmi; - c->put_pixels_tab[0][1] = ff_put_pixels16_x2_8_mmi; - c->put_pixels_tab[0][2] = ff_put_pixels16_y2_8_mmi; - c->put_pixels_tab[0][3] = ff_put_pixels16_xy2_8_mmi; - - c->put_pixels_tab[1][0] = ff_put_pixels8_8_mmi; - c->put_pixels_tab[1][1] = ff_put_pixels8_x2_8_mmi; - c->put_pixels_tab[1][2] = ff_put_pixels8_y2_8_mmi; - c->put_pixels_tab[1][3] = ff_put_pixels8_xy2_8_mmi; - - c->put_pixels_tab[2][0] = ff_put_pixels4_8_mmi; - c->put_pixels_tab[2][1] = ff_put_pixels4_x2_8_mmi; - c->put_pixels_tab[2][2] = ff_put_pixels4_y2_8_mmi; - c->put_pixels_tab[2][3] = ff_put_pixels4_xy2_8_mmi; - - c->put_no_rnd_pixels_tab[0][0] = ff_put_pixels16_8_mmi; - c->put_no_rnd_pixels_tab[0][1] = ff_put_no_rnd_pixels16_x2_8_mmi; - c->put_no_rnd_pixels_tab[0][2] = ff_put_no_rnd_pixels16_y2_8_mmi; - c->put_no_rnd_pixels_tab[0][3] = ff_put_no_rnd_pixels16_xy2_8_mmi; - - c->put_no_rnd_pixels_tab[1][0] = ff_put_pixels8_8_mmi; - c->put_no_rnd_pixels_tab[1][1] = ff_put_no_rnd_pixels8_x2_8_mmi; - c->put_no_rnd_pixels_tab[1][2] = ff_put_no_rnd_pixels8_y2_8_mmi; - c->put_no_rnd_pixels_tab[1][3] = ff_put_no_rnd_pixels8_xy2_8_mmi; - - c->avg_pixels_tab[0][0] = ff_avg_pixels16_8_mmi; - c->avg_pixels_tab[0][1] = ff_avg_pixels16_x2_8_mmi; - c->avg_pixels_tab[0][2] = ff_avg_pixels16_y2_8_mmi; - c->avg_pixels_tab[0][3] = ff_avg_pixels16_xy2_8_mmi; - - c->avg_pixels_tab[1][0] = ff_avg_pixels8_8_mmi; - c->avg_pixels_tab[1][1] = ff_avg_pixels8_x2_8_mmi; - c->avg_pixels_tab[1][2] = ff_avg_pixels8_y2_8_mmi; - c->avg_pixels_tab[1][3] = ff_avg_pixels8_xy2_8_mmi; - - c->avg_pixels_tab[2][0] = ff_avg_pixels4_8_mmi; - c->avg_pixels_tab[2][1] = ff_avg_pixels4_x2_8_mmi; - c->avg_pixels_tab[2][2] = ff_avg_pixels4_y2_8_mmi; - c->avg_pixels_tab[2][3] = ff_avg_pixels4_xy2_8_mmi; -} -#endif // #if HAVE_MMI - void ff_hpeldsp_init_mips(HpelDSPContext *c, int flags) { -#if HAVE_MMI - ff_hpeldsp_init_mmi(c, flags); -#endif // #if HAVE_MMI -#if HAVE_MSA - ff_hpeldsp_init_msa(c, flags); -#endif // #if HAVE_MSA + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + c->put_pixels_tab[0][0] = ff_put_pixels16_8_mmi; + c->put_pixels_tab[0][1] = ff_put_pixels16_x2_8_mmi; + c->put_pixels_tab[0][2] = ff_put_pixels16_y2_8_mmi; + c->put_pixels_tab[0][3] = ff_put_pixels16_xy2_8_mmi; + + c->put_pixels_tab[1][0] = ff_put_pixels8_8_mmi; + c->put_pixels_tab[1][1] = ff_put_pixels8_x2_8_mmi; + c->put_pixels_tab[1][2] = ff_put_pixels8_y2_8_mmi; + c->put_pixels_tab[1][3] = ff_put_pixels8_xy2_8_mmi; + + c->put_pixels_tab[2][0] = ff_put_pixels4_8_mmi; + c->put_pixels_tab[2][1] = ff_put_pixels4_x2_8_mmi; + c->put_pixels_tab[2][2] = ff_put_pixels4_y2_8_mmi; + c->put_pixels_tab[2][3] = ff_put_pixels4_xy2_8_mmi; + + c->put_no_rnd_pixels_tab[0][0] = ff_put_pixels16_8_mmi; + c->put_no_rnd_pixels_tab[0][1] = ff_put_no_rnd_pixels16_x2_8_mmi; + c->put_no_rnd_pixels_tab[0][2] = ff_put_no_rnd_pixels16_y2_8_mmi; + c->put_no_rnd_pixels_tab[0][3] = ff_put_no_rnd_pixels16_xy2_8_mmi; + + c->put_no_rnd_pixels_tab[1][0] = ff_put_pixels8_8_mmi; + c->put_no_rnd_pixels_tab[1][1] = ff_put_no_rnd_pixels8_x2_8_mmi; + c->put_no_rnd_pixels_tab[1][2] = ff_put_no_rnd_pixels8_y2_8_mmi; + c->put_no_rnd_pixels_tab[1][3] = ff_put_no_rnd_pixels8_xy2_8_mmi; + + c->avg_pixels_tab[0][0] = ff_avg_pixels16_8_mmi; + c->avg_pixels_tab[0][1] = ff_avg_pixels16_x2_8_mmi; + c->avg_pixels_tab[0][2] = ff_avg_pixels16_y2_8_mmi; + c->avg_pixels_tab[0][3] = ff_avg_pixels16_xy2_8_mmi; + + c->avg_pixels_tab[1][0] = ff_avg_pixels8_8_mmi; + c->avg_pixels_tab[1][1] = ff_avg_pixels8_x2_8_mmi; + c->avg_pixels_tab[1][2] = ff_avg_pixels8_y2_8_mmi; + c->avg_pixels_tab[1][3] = ff_avg_pixels8_xy2_8_mmi; + + c->avg_pixels_tab[2][0] = ff_avg_pixels4_8_mmi; + c->avg_pixels_tab[2][1] = ff_avg_pixels4_x2_8_mmi; + c->avg_pixels_tab[2][2] = ff_avg_pixels4_y2_8_mmi; + c->avg_pixels_tab[2][3] = ff_avg_pixels4_xy2_8_mmi; + } + + if (have_msa(cpu_flags)) { + c->put_pixels_tab[0][0] = ff_put_pixels16_msa; + c->put_pixels_tab[0][1] = ff_put_pixels16_x2_msa; + c->put_pixels_tab[0][2] = ff_put_pixels16_y2_msa; + c->put_pixels_tab[0][3] = ff_put_pixels16_xy2_msa; + + c->put_pixels_tab[1][0] = ff_put_pixels8_msa; + c->put_pixels_tab[1][1] = ff_put_pixels8_x2_msa; + c->put_pixels_tab[1][2] = ff_put_pixels8_y2_msa; + c->put_pixels_tab[1][3] = ff_put_pixels8_xy2_msa; + + c->put_pixels_tab[2][1] = ff_put_pixels4_x2_msa; + c->put_pixels_tab[2][2] = ff_put_pixels4_y2_msa; + c->put_pixels_tab[2][3] = ff_put_pixels4_xy2_msa; + + c->put_no_rnd_pixels_tab[0][0] = ff_put_pixels16_msa; + c->put_no_rnd_pixels_tab[0][1] = ff_put_no_rnd_pixels16_x2_msa; + c->put_no_rnd_pixels_tab[0][2] = ff_put_no_rnd_pixels16_y2_msa; + c->put_no_rnd_pixels_tab[0][3] = ff_put_no_rnd_pixels16_xy2_msa; + + c->put_no_rnd_pixels_tab[1][0] = ff_put_pixels8_msa; + c->put_no_rnd_pixels_tab[1][1] = ff_put_no_rnd_pixels8_x2_msa; + c->put_no_rnd_pixels_tab[1][2] = ff_put_no_rnd_pixels8_y2_msa; + c->put_no_rnd_pixels_tab[1][3] = ff_put_no_rnd_pixels8_xy2_msa; + + c->avg_pixels_tab[0][0] = ff_avg_pixels16_msa; + c->avg_pixels_tab[0][1] = ff_avg_pixels16_x2_msa; + c->avg_pixels_tab[0][2] = ff_avg_pixels16_y2_msa; + c->avg_pixels_tab[0][3] = ff_avg_pixels16_xy2_msa; + + c->avg_pixels_tab[1][0] = ff_avg_pixels8_msa; + c->avg_pixels_tab[1][1] = ff_avg_pixels8_x2_msa; + c->avg_pixels_tab[1][2] = ff_avg_pixels8_y2_msa; + c->avg_pixels_tab[1][3] = ff_avg_pixels8_xy2_msa; + + c->avg_pixels_tab[2][0] = ff_avg_pixels4_msa; + c->avg_pixels_tab[2][1] = ff_avg_pixels4_x2_msa; + c->avg_pixels_tab[2][2] = ff_avg_pixels4_y2_msa; + c->avg_pixels_tab[2][3] = ff_avg_pixels4_xy2_msa; + } } diff --git a/libavcodec/mips/idctdsp_init_mips.c b/libavcodec/mips/idctdsp_init_mips.c index 85b76ca478..23efd9ed58 100644 --- a/libavcodec/mips/idctdsp_init_mips.c +++ b/libavcodec/mips/idctdsp_init_mips.c @@ -19,56 +19,44 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "idctdsp_mips.h" #include "xvididct_mips.h" -#if HAVE_MSA -static av_cold void idctdsp_init_msa(IDCTDSPContext *c, AVCodecContext *avctx, - unsigned high_bit_depth) +av_cold void ff_idctdsp_init_mips(IDCTDSPContext *c, AVCodecContext *avctx, + unsigned high_bit_depth) { - if ((avctx->lowres != 1) && (avctx->lowres != 2) && (avctx->lowres != 3) && - (avctx->bits_per_raw_sample != 10) && - (avctx->bits_per_raw_sample != 12) && - (avctx->idct_algo == FF_IDCT_AUTO)) { - c->idct_put = ff_simple_idct_put_msa; - c->idct_add = ff_simple_idct_add_msa; - c->idct = ff_simple_idct_msa; - c->perm_type = FF_IDCT_PERM_NONE; - } + int cpu_flags = av_get_cpu_flags(); - c->put_pixels_clamped = ff_put_pixels_clamped_msa; - c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_msa; - c->add_pixels_clamped = ff_add_pixels_clamped_msa; -} -#endif // #if HAVE_MSA + if (have_mmi(cpu_flags)) { + if ((avctx->lowres != 1) && (avctx->lowres != 2) && (avctx->lowres != 3) && + (avctx->bits_per_raw_sample != 10) && + (avctx->bits_per_raw_sample != 12) && + ((avctx->idct_algo == FF_IDCT_AUTO) || (avctx->idct_algo == FF_IDCT_SIMPLE))) { + c->idct_put = ff_simple_idct_put_8_mmi; + c->idct_add = ff_simple_idct_add_8_mmi; + c->idct = ff_simple_idct_8_mmi; + c->perm_type = FF_IDCT_PERM_NONE; + } -#if HAVE_MMI -static av_cold void idctdsp_init_mmi(IDCTDSPContext *c, AVCodecContext *avctx, - unsigned high_bit_depth) -{ - if ((avctx->lowres != 1) && (avctx->lowres != 2) && (avctx->lowres != 3) && - (avctx->bits_per_raw_sample != 10) && - (avctx->bits_per_raw_sample != 12) && - ((avctx->idct_algo == FF_IDCT_AUTO) || (avctx->idct_algo == FF_IDCT_SIMPLE))) { - c->idct_put = ff_simple_idct_put_8_mmi; - c->idct_add = ff_simple_idct_add_8_mmi; - c->idct = ff_simple_idct_8_mmi; - c->perm_type = FF_IDCT_PERM_NONE; + c->put_pixels_clamped = ff_put_pixels_clamped_mmi; + c->add_pixels_clamped = ff_add_pixels_clamped_mmi; + c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_mmi; } - c->put_pixels_clamped = ff_put_pixels_clamped_mmi; - c->add_pixels_clamped = ff_add_pixels_clamped_mmi; - c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_mmi; -} -#endif /* HAVE_MMI */ + if (have_msa(cpu_flags)) { + if ((avctx->lowres != 1) && (avctx->lowres != 2) && (avctx->lowres != 3) && + (avctx->bits_per_raw_sample != 10) && + (avctx->bits_per_raw_sample != 12) && + (avctx->idct_algo == FF_IDCT_AUTO)) { + c->idct_put = ff_simple_idct_put_msa; + c->idct_add = ff_simple_idct_add_msa; + c->idct = ff_simple_idct_msa; + c->perm_type = FF_IDCT_PERM_NONE; + } -av_cold void ff_idctdsp_init_mips(IDCTDSPContext *c, AVCodecContext *avctx, - unsigned high_bit_depth) -{ -#if HAVE_MMI - idctdsp_init_mmi(c, avctx, high_bit_depth); -#endif /* HAVE_MMI */ -#if HAVE_MSA - idctdsp_init_msa(c, avctx, high_bit_depth); -#endif // #if HAVE_MSA + c->put_pixels_clamped = ff_put_pixels_clamped_msa; + c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_msa; + c->add_pixels_clamped = ff_add_pixels_clamped_msa; + } } diff --git a/libavcodec/mips/me_cmp_init_mips.c b/libavcodec/mips/me_cmp_init_mips.c index 219a0dc00c..e3e33b8e5e 100644 --- a/libavcodec/mips/me_cmp_init_mips.c +++ b/libavcodec/mips/me_cmp_init_mips.c @@ -18,39 +18,35 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "me_cmp_mips.h" -#if HAVE_MSA -static av_cold void me_cmp_msa(MECmpContext *c, AVCodecContext *avctx) +av_cold void ff_me_cmp_init_mips(MECmpContext *c, AVCodecContext *avctx) { + int cpu_flags = av_get_cpu_flags(); + + if (have_msa(cpu_flags)) { #if BIT_DEPTH == 8 - c->pix_abs[0][0] = ff_pix_abs16_msa; - c->pix_abs[0][1] = ff_pix_abs16_x2_msa; - c->pix_abs[0][2] = ff_pix_abs16_y2_msa; - c->pix_abs[0][3] = ff_pix_abs16_xy2_msa; - c->pix_abs[1][0] = ff_pix_abs8_msa; - c->pix_abs[1][1] = ff_pix_abs8_x2_msa; - c->pix_abs[1][2] = ff_pix_abs8_y2_msa; - c->pix_abs[1][3] = ff_pix_abs8_xy2_msa; + c->pix_abs[0][0] = ff_pix_abs16_msa; + c->pix_abs[0][1] = ff_pix_abs16_x2_msa; + c->pix_abs[0][2] = ff_pix_abs16_y2_msa; + c->pix_abs[0][3] = ff_pix_abs16_xy2_msa; + c->pix_abs[1][0] = ff_pix_abs8_msa; + c->pix_abs[1][1] = ff_pix_abs8_x2_msa; + c->pix_abs[1][2] = ff_pix_abs8_y2_msa; + c->pix_abs[1][3] = ff_pix_abs8_xy2_msa; - c->hadamard8_diff[0] = ff_hadamard8_diff16_msa; - c->hadamard8_diff[1] = ff_hadamard8_diff8x8_msa; + c->hadamard8_diff[0] = ff_hadamard8_diff16_msa; + c->hadamard8_diff[1] = ff_hadamard8_diff8x8_msa; - c->hadamard8_diff[4] = ff_hadamard8_intra16_msa; - c->hadamard8_diff[5] = ff_hadamard8_intra8x8_msa; + c->hadamard8_diff[4] = ff_hadamard8_intra16_msa; + c->hadamard8_diff[5] = ff_hadamard8_intra8x8_msa; - c->sad[0] = ff_pix_abs16_msa; - c->sad[1] = ff_pix_abs8_msa; - c->sse[0] = ff_sse16_msa; - c->sse[1] = ff_sse8_msa; - c->sse[2] = ff_sse4_msa; + c->sad[0] = ff_pix_abs16_msa; + c->sad[1] = ff_pix_abs8_msa; + c->sse[0] = ff_sse16_msa; + c->sse[1] = ff_sse8_msa; + c->sse[2] = ff_sse4_msa; #endif -} -#endif // #if HAVE_MSA - -av_cold void ff_me_cmp_init_mips(MECmpContext *c, AVCodecContext *avctx) -{ -#if HAVE_MSA - me_cmp_msa(c, avctx); -#endif // #if HAVE_MSA + } } diff --git a/libavcodec/mips/mpegvideo_init_mips.c b/libavcodec/mips/mpegvideo_init_mips.c index be77308140..bfda90bbcc 100644 --- a/libavcodec/mips/mpegvideo_init_mips.c +++ b/libavcodec/mips/mpegvideo_init_mips.c @@ -18,41 +18,31 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "h263dsp_mips.h" #include "mpegvideo_mips.h" -#if HAVE_MSA -static av_cold void dct_unquantize_init_msa(MpegEncContext *s) +av_cold void ff_mpv_common_init_mips(MpegEncContext *s) { - s->dct_unquantize_h263_intra = ff_dct_unquantize_h263_intra_msa; - s->dct_unquantize_h263_inter = ff_dct_unquantize_h263_inter_msa; - if (!s->q_scale_type) - s->dct_unquantize_mpeg2_inter = ff_dct_unquantize_mpeg2_inter_msa; -} -#endif // #if HAVE_MSA + int cpu_flags = av_get_cpu_flags(); -#if HAVE_MMI -static av_cold void dct_unquantize_init_mmi(MpegEncContext *s) -{ - s->dct_unquantize_h263_intra = ff_dct_unquantize_h263_intra_mmi; - s->dct_unquantize_h263_inter = ff_dct_unquantize_h263_inter_mmi; - s->dct_unquantize_mpeg1_intra = ff_dct_unquantize_mpeg1_intra_mmi; - s->dct_unquantize_mpeg1_inter = ff_dct_unquantize_mpeg1_inter_mmi; + if (have_mmi(cpu_flags)) { + s->dct_unquantize_h263_intra = ff_dct_unquantize_h263_intra_mmi; + s->dct_unquantize_h263_inter = ff_dct_unquantize_h263_inter_mmi; + s->dct_unquantize_mpeg1_intra = ff_dct_unquantize_mpeg1_intra_mmi; + s->dct_unquantize_mpeg1_inter = ff_dct_unquantize_mpeg1_inter_mmi; - if (!(s->avctx->flags & AV_CODEC_FLAG_BITEXACT)) - if (!s->q_scale_type) - s->dct_unquantize_mpeg2_intra = ff_dct_unquantize_mpeg2_intra_mmi; + if (!(s->avctx->flags & AV_CODEC_FLAG_BITEXACT)) + if (!s->q_scale_type) + s->dct_unquantize_mpeg2_intra = ff_dct_unquantize_mpeg2_intra_mmi; - s->denoise_dct= ff_denoise_dct_mmi; -} -#endif /* HAVE_MMI */ + s->denoise_dct= ff_denoise_dct_mmi; + } -av_cold void ff_mpv_common_init_mips(MpegEncContext *s) -{ -#if HAVE_MMI - dct_unquantize_init_mmi(s); -#endif /* HAVE_MMI */ -#if HAVE_MSA - dct_unquantize_init_msa(s); -#endif // #if HAVE_MSA + if (have_msa(cpu_flags)) { + s->dct_unquantize_h263_intra = ff_dct_unquantize_h263_intra_msa; + s->dct_unquantize_h263_inter = ff_dct_unquantize_h263_inter_msa; + if (!s->q_scale_type) + s->dct_unquantize_mpeg2_inter = ff_dct_unquantize_mpeg2_inter_msa; + } } diff --git a/libavcodec/mips/mpegvideoencdsp_init_mips.c b/libavcodec/mips/mpegvideoencdsp_init_mips.c index 9bfe94e4cd..71831a61ac 100644 --- a/libavcodec/mips/mpegvideoencdsp_init_mips.c +++ b/libavcodec/mips/mpegvideoencdsp_init_mips.c @@ -18,23 +18,18 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "libavcodec/bit_depth_template.c" #include "h263dsp_mips.h" -#if HAVE_MSA -static av_cold void mpegvideoencdsp_init_msa(MpegvideoEncDSPContext *c, - AVCodecContext *avctx) -{ -#if BIT_DEPTH == 8 - c->pix_sum = ff_pix_sum_msa; -#endif -} -#endif // #if HAVE_MSA - av_cold void ff_mpegvideoencdsp_init_mips(MpegvideoEncDSPContext *c, AVCodecContext *avctx) { -#if HAVE_MSA - mpegvideoencdsp_init_msa(c, avctx); -#endif // #if HAVE_MSA + int cpu_flags = av_get_cpu_flags(); + + if (have_msa(cpu_flags)) { +#if BIT_DEPTH == 8 + c->pix_sum = ff_pix_sum_msa; +#endif + } } diff --git a/libavcodec/mips/pixblockdsp_init_mips.c b/libavcodec/mips/pixblockdsp_init_mips.c index fd0238d79b..2e2d70953b 100644 --- a/libavcodec/mips/pixblockdsp_init_mips.c +++ b/libavcodec/mips/pixblockdsp_init_mips.c @@ -19,51 +19,38 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "pixblockdsp_mips.h" -#if HAVE_MSA -static av_cold void pixblockdsp_init_msa(PixblockDSPContext *c, - AVCodecContext *avctx, - unsigned high_bit_depth) +void ff_pixblockdsp_init_mips(PixblockDSPContext *c, AVCodecContext *avctx, + unsigned high_bit_depth) { - c->diff_pixels = ff_diff_pixels_msa; + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + c->diff_pixels = ff_diff_pixels_mmi; - switch (avctx->bits_per_raw_sample) { - case 9: - case 10: - case 12: - case 14: - c->get_pixels = ff_get_pixels_16_msa; - break; - default: - if (avctx->bits_per_raw_sample <= 8 || avctx->codec_type != - AVMEDIA_TYPE_VIDEO) { - c->get_pixels = ff_get_pixels_8_msa; + if (!high_bit_depth || avctx->codec_type != AVMEDIA_TYPE_VIDEO) { + c->get_pixels = ff_get_pixels_8_mmi; } - break; } -} -#endif // #if HAVE_MSA -#if HAVE_MMI -static av_cold void pixblockdsp_init_mmi(PixblockDSPContext *c, - AVCodecContext *avctx, unsigned high_bit_depth) -{ - c->diff_pixels = ff_diff_pixels_mmi; + if (have_msa(cpu_flags)) { + c->diff_pixels = ff_diff_pixels_msa; - if (!high_bit_depth || avctx->codec_type != AVMEDIA_TYPE_VIDEO) { - c->get_pixels = ff_get_pixels_8_mmi; + switch (avctx->bits_per_raw_sample) { + case 9: + case 10: + case 12: + case 14: + c->get_pixels = ff_get_pixels_16_msa; + break; + default: + if (avctx->bits_per_raw_sample <= 8 || avctx->codec_type != + AVMEDIA_TYPE_VIDEO) { + c->get_pixels = ff_get_pixels_8_msa; + } + break; + } } } -#endif /* HAVE_MMI */ - -void ff_pixblockdsp_init_mips(PixblockDSPContext *c, AVCodecContext *avctx, - unsigned high_bit_depth) -{ -#if HAVE_MMI - pixblockdsp_init_mmi(c, avctx, high_bit_depth); -#endif /* HAVE_MMI */ -#if HAVE_MSA - pixblockdsp_init_msa(c, avctx, high_bit_depth); -#endif // #if HAVE_MSA -} diff --git a/libavcodec/mips/qpeldsp_init_mips.c b/libavcodec/mips/qpeldsp_init_mips.c index 140e8f89c9..cccf9d4429 100644 --- a/libavcodec/mips/qpeldsp_init_mips.c +++ b/libavcodec/mips/qpeldsp_init_mips.c @@ -18,150 +18,146 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "qpeldsp_mips.h" -#if HAVE_MSA -static av_cold void qpeldsp_init_msa(QpelDSPContext *c) +void ff_qpeldsp_init_mips(QpelDSPContext *c) { - c->put_qpel_pixels_tab[0][0] = ff_copy_16x16_msa; - c->put_qpel_pixels_tab[0][1] = ff_horiz_mc_qpel_aver_src0_16width_msa; - c->put_qpel_pixels_tab[0][2] = ff_horiz_mc_qpel_16width_msa; - c->put_qpel_pixels_tab[0][3] = ff_horiz_mc_qpel_aver_src1_16width_msa; - c->put_qpel_pixels_tab[0][4] = ff_vert_mc_qpel_aver_src0_16x16_msa; - c->put_qpel_pixels_tab[0][5] = ff_hv_mc_qpel_aver_hv_src00_16x16_msa; - c->put_qpel_pixels_tab[0][6] = ff_hv_mc_qpel_aver_v_src0_16x16_msa; - c->put_qpel_pixels_tab[0][7] = ff_hv_mc_qpel_aver_hv_src10_16x16_msa; - c->put_qpel_pixels_tab[0][8] = ff_vert_mc_qpel_16x16_msa; - c->put_qpel_pixels_tab[0][9] = ff_hv_mc_qpel_aver_h_src0_16x16_msa; - c->put_qpel_pixels_tab[0][10] = ff_hv_mc_qpel_16x16_msa; - c->put_qpel_pixels_tab[0][11] = ff_hv_mc_qpel_aver_h_src1_16x16_msa; - c->put_qpel_pixels_tab[0][12] = ff_vert_mc_qpel_aver_src1_16x16_msa; - c->put_qpel_pixels_tab[0][13] = ff_hv_mc_qpel_aver_hv_src01_16x16_msa; - c->put_qpel_pixels_tab[0][14] = ff_hv_mc_qpel_aver_v_src1_16x16_msa; - c->put_qpel_pixels_tab[0][15] = ff_hv_mc_qpel_aver_hv_src11_16x16_msa; + int cpu_flags = av_get_cpu_flags(); - c->put_qpel_pixels_tab[1][0] = ff_copy_8x8_msa; - c->put_qpel_pixels_tab[1][1] = ff_horiz_mc_qpel_aver_src0_8width_msa; - c->put_qpel_pixels_tab[1][2] = ff_horiz_mc_qpel_8width_msa; - c->put_qpel_pixels_tab[1][3] = ff_horiz_mc_qpel_aver_src1_8width_msa; - c->put_qpel_pixels_tab[1][4] = ff_vert_mc_qpel_aver_src0_8x8_msa; - c->put_qpel_pixels_tab[1][5] = ff_hv_mc_qpel_aver_hv_src00_8x8_msa; - c->put_qpel_pixels_tab[1][6] = ff_hv_mc_qpel_aver_v_src0_8x8_msa; - c->put_qpel_pixels_tab[1][7] = ff_hv_mc_qpel_aver_hv_src10_8x8_msa; - c->put_qpel_pixels_tab[1][8] = ff_vert_mc_qpel_8x8_msa; - c->put_qpel_pixels_tab[1][9] = ff_hv_mc_qpel_aver_h_src0_8x8_msa; - c->put_qpel_pixels_tab[1][10] = ff_hv_mc_qpel_8x8_msa; - c->put_qpel_pixels_tab[1][11] = ff_hv_mc_qpel_aver_h_src1_8x8_msa; - c->put_qpel_pixels_tab[1][12] = ff_vert_mc_qpel_aver_src1_8x8_msa; - c->put_qpel_pixels_tab[1][13] = ff_hv_mc_qpel_aver_hv_src01_8x8_msa; - c->put_qpel_pixels_tab[1][14] = ff_hv_mc_qpel_aver_v_src1_8x8_msa; - c->put_qpel_pixels_tab[1][15] = ff_hv_mc_qpel_aver_hv_src11_8x8_msa; + if (have_msa(cpu_flags)) { + c->put_qpel_pixels_tab[0][0] = ff_copy_16x16_msa; + c->put_qpel_pixels_tab[0][1] = ff_horiz_mc_qpel_aver_src0_16width_msa; + c->put_qpel_pixels_tab[0][2] = ff_horiz_mc_qpel_16width_msa; + c->put_qpel_pixels_tab[0][3] = ff_horiz_mc_qpel_aver_src1_16width_msa; + c->put_qpel_pixels_tab[0][4] = ff_vert_mc_qpel_aver_src0_16x16_msa; + c->put_qpel_pixels_tab[0][5] = ff_hv_mc_qpel_aver_hv_src00_16x16_msa; + c->put_qpel_pixels_tab[0][6] = ff_hv_mc_qpel_aver_v_src0_16x16_msa; + c->put_qpel_pixels_tab[0][7] = ff_hv_mc_qpel_aver_hv_src10_16x16_msa; + c->put_qpel_pixels_tab[0][8] = ff_vert_mc_qpel_16x16_msa; + c->put_qpel_pixels_tab[0][9] = ff_hv_mc_qpel_aver_h_src0_16x16_msa; + c->put_qpel_pixels_tab[0][10] = ff_hv_mc_qpel_16x16_msa; + c->put_qpel_pixels_tab[0][11] = ff_hv_mc_qpel_aver_h_src1_16x16_msa; + c->put_qpel_pixels_tab[0][12] = ff_vert_mc_qpel_aver_src1_16x16_msa; + c->put_qpel_pixels_tab[0][13] = ff_hv_mc_qpel_aver_hv_src01_16x16_msa; + c->put_qpel_pixels_tab[0][14] = ff_hv_mc_qpel_aver_v_src1_16x16_msa; + c->put_qpel_pixels_tab[0][15] = ff_hv_mc_qpel_aver_hv_src11_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][0] = ff_copy_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][1] = - ff_horiz_mc_qpel_no_rnd_aver_src0_16width_msa; - c->put_no_rnd_qpel_pixels_tab[0][2] = ff_horiz_mc_qpel_no_rnd_16width_msa; - c->put_no_rnd_qpel_pixels_tab[0][3] = - ff_horiz_mc_qpel_no_rnd_aver_src1_16width_msa; - c->put_no_rnd_qpel_pixels_tab[0][4] = - ff_vert_mc_qpel_no_rnd_aver_src0_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][5] = - ff_hv_mc_qpel_no_rnd_aver_hv_src00_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][6] = - ff_hv_mc_qpel_no_rnd_aver_v_src0_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][7] = - ff_hv_mc_qpel_no_rnd_aver_hv_src10_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][8] = ff_vert_mc_qpel_no_rnd_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][9] = - ff_hv_mc_qpel_no_rnd_aver_h_src0_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][10] = ff_hv_mc_qpel_no_rnd_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][11] = - ff_hv_mc_qpel_no_rnd_aver_h_src1_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][12] = - ff_vert_mc_qpel_no_rnd_aver_src1_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][13] = - ff_hv_mc_qpel_no_rnd_aver_hv_src01_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][14] = - ff_hv_mc_qpel_no_rnd_aver_v_src1_16x16_msa; - c->put_no_rnd_qpel_pixels_tab[0][15] = - ff_hv_mc_qpel_no_rnd_aver_hv_src11_16x16_msa; + c->put_qpel_pixels_tab[1][0] = ff_copy_8x8_msa; + c->put_qpel_pixels_tab[1][1] = ff_horiz_mc_qpel_aver_src0_8width_msa; + c->put_qpel_pixels_tab[1][2] = ff_horiz_mc_qpel_8width_msa; + c->put_qpel_pixels_tab[1][3] = ff_horiz_mc_qpel_aver_src1_8width_msa; + c->put_qpel_pixels_tab[1][4] = ff_vert_mc_qpel_aver_src0_8x8_msa; + c->put_qpel_pixels_tab[1][5] = ff_hv_mc_qpel_aver_hv_src00_8x8_msa; + c->put_qpel_pixels_tab[1][6] = ff_hv_mc_qpel_aver_v_src0_8x8_msa; + c->put_qpel_pixels_tab[1][7] = ff_hv_mc_qpel_aver_hv_src10_8x8_msa; + c->put_qpel_pixels_tab[1][8] = ff_vert_mc_qpel_8x8_msa; + c->put_qpel_pixels_tab[1][9] = ff_hv_mc_qpel_aver_h_src0_8x8_msa; + c->put_qpel_pixels_tab[1][10] = ff_hv_mc_qpel_8x8_msa; + c->put_qpel_pixels_tab[1][11] = ff_hv_mc_qpel_aver_h_src1_8x8_msa; + c->put_qpel_pixels_tab[1][12] = ff_vert_mc_qpel_aver_src1_8x8_msa; + c->put_qpel_pixels_tab[1][13] = ff_hv_mc_qpel_aver_hv_src01_8x8_msa; + c->put_qpel_pixels_tab[1][14] = ff_hv_mc_qpel_aver_v_src1_8x8_msa; + c->put_qpel_pixels_tab[1][15] = ff_hv_mc_qpel_aver_hv_src11_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][0] = ff_copy_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][1] = - ff_horiz_mc_qpel_no_rnd_aver_src0_8width_msa; - c->put_no_rnd_qpel_pixels_tab[1][2] = ff_horiz_mc_qpel_no_rnd_8width_msa; - c->put_no_rnd_qpel_pixels_tab[1][3] = - ff_horiz_mc_qpel_no_rnd_aver_src1_8width_msa; - c->put_no_rnd_qpel_pixels_tab[1][4] = - ff_vert_mc_qpel_no_rnd_aver_src0_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][5] = - ff_hv_mc_qpel_no_rnd_aver_hv_src00_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][6] = - ff_hv_mc_qpel_no_rnd_aver_v_src0_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][7] = - ff_hv_mc_qpel_no_rnd_aver_hv_src10_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][8] = ff_vert_mc_qpel_no_rnd_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][9] = - ff_hv_mc_qpel_no_rnd_aver_h_src0_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][10] = ff_hv_mc_qpel_no_rnd_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][11] = - ff_hv_mc_qpel_no_rnd_aver_h_src1_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][12] = - ff_vert_mc_qpel_no_rnd_aver_src1_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][13] = - ff_hv_mc_qpel_no_rnd_aver_hv_src01_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][14] = - ff_hv_mc_qpel_no_rnd_aver_v_src1_8x8_msa; - c->put_no_rnd_qpel_pixels_tab[1][15] = - ff_hv_mc_qpel_no_rnd_aver_hv_src11_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[0][0] = ff_copy_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][1] = + ff_horiz_mc_qpel_no_rnd_aver_src0_16width_msa; + c->put_no_rnd_qpel_pixels_tab[0][2] = ff_horiz_mc_qpel_no_rnd_16width_msa; + c->put_no_rnd_qpel_pixels_tab[0][3] = + ff_horiz_mc_qpel_no_rnd_aver_src1_16width_msa; + c->put_no_rnd_qpel_pixels_tab[0][4] = + ff_vert_mc_qpel_no_rnd_aver_src0_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][5] = + ff_hv_mc_qpel_no_rnd_aver_hv_src00_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][6] = + ff_hv_mc_qpel_no_rnd_aver_v_src0_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][7] = + ff_hv_mc_qpel_no_rnd_aver_hv_src10_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][8] = ff_vert_mc_qpel_no_rnd_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][9] = + ff_hv_mc_qpel_no_rnd_aver_h_src0_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][10] = ff_hv_mc_qpel_no_rnd_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][11] = + ff_hv_mc_qpel_no_rnd_aver_h_src1_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][12] = + ff_vert_mc_qpel_no_rnd_aver_src1_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][13] = + ff_hv_mc_qpel_no_rnd_aver_hv_src01_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][14] = + ff_hv_mc_qpel_no_rnd_aver_v_src1_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[0][15] = + ff_hv_mc_qpel_no_rnd_aver_hv_src11_16x16_msa; - c->avg_qpel_pixels_tab[0][0] = ff_avg_width16_msa; - c->avg_qpel_pixels_tab[0][1] = - ff_horiz_mc_qpel_avg_dst_aver_src0_16width_msa; - c->avg_qpel_pixels_tab[0][2] = ff_horiz_mc_qpel_avg_dst_16width_msa; - c->avg_qpel_pixels_tab[0][3] = - ff_horiz_mc_qpel_avg_dst_aver_src1_16width_msa; - c->avg_qpel_pixels_tab[0][4] = ff_vert_mc_qpel_avg_dst_aver_src0_16x16_msa; - c->avg_qpel_pixels_tab[0][5] = - ff_hv_mc_qpel_avg_dst_aver_hv_src00_16x16_msa; - c->avg_qpel_pixels_tab[0][6] = ff_hv_mc_qpel_avg_dst_aver_v_src0_16x16_msa; - c->avg_qpel_pixels_tab[0][7] = - ff_hv_mc_qpel_avg_dst_aver_hv_src10_16x16_msa; - c->avg_qpel_pixels_tab[0][8] = ff_vert_mc_qpel_avg_dst_16x16_msa; - c->avg_qpel_pixels_tab[0][9] = ff_hv_mc_qpel_avg_dst_aver_h_src0_16x16_msa; - c->avg_qpel_pixels_tab[0][10] = ff_hv_mc_qpel_avg_dst_16x16_msa; - c->avg_qpel_pixels_tab[0][11] = ff_hv_mc_qpel_avg_dst_aver_h_src1_16x16_msa; - c->avg_qpel_pixels_tab[0][12] = ff_vert_mc_qpel_avg_dst_aver_src1_16x16_msa; - c->avg_qpel_pixels_tab[0][13] = - ff_hv_mc_qpel_avg_dst_aver_hv_src01_16x16_msa; - c->avg_qpel_pixels_tab[0][14] = ff_hv_mc_qpel_avg_dst_aver_v_src1_16x16_msa; - c->avg_qpel_pixels_tab[0][15] = - ff_hv_mc_qpel_avg_dst_aver_hv_src11_16x16_msa; + c->put_no_rnd_qpel_pixels_tab[1][0] = ff_copy_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][1] = + ff_horiz_mc_qpel_no_rnd_aver_src0_8width_msa; + c->put_no_rnd_qpel_pixels_tab[1][2] = ff_horiz_mc_qpel_no_rnd_8width_msa; + c->put_no_rnd_qpel_pixels_tab[1][3] = + ff_horiz_mc_qpel_no_rnd_aver_src1_8width_msa; + c->put_no_rnd_qpel_pixels_tab[1][4] = + ff_vert_mc_qpel_no_rnd_aver_src0_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][5] = + ff_hv_mc_qpel_no_rnd_aver_hv_src00_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][6] = + ff_hv_mc_qpel_no_rnd_aver_v_src0_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][7] = + ff_hv_mc_qpel_no_rnd_aver_hv_src10_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][8] = ff_vert_mc_qpel_no_rnd_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][9] = + ff_hv_mc_qpel_no_rnd_aver_h_src0_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][10] = ff_hv_mc_qpel_no_rnd_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][11] = + ff_hv_mc_qpel_no_rnd_aver_h_src1_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][12] = + ff_vert_mc_qpel_no_rnd_aver_src1_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][13] = + ff_hv_mc_qpel_no_rnd_aver_hv_src01_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][14] = + ff_hv_mc_qpel_no_rnd_aver_v_src1_8x8_msa; + c->put_no_rnd_qpel_pixels_tab[1][15] = + ff_hv_mc_qpel_no_rnd_aver_hv_src11_8x8_msa; - c->avg_qpel_pixels_tab[1][0] = ff_avg_width8_msa; - c->avg_qpel_pixels_tab[1][1] = - ff_horiz_mc_qpel_avg_dst_aver_src0_8width_msa; - c->avg_qpel_pixels_tab[1][2] = ff_horiz_mc_qpel_avg_dst_8width_msa; - c->avg_qpel_pixels_tab[1][3] = - ff_horiz_mc_qpel_avg_dst_aver_src1_8width_msa; - c->avg_qpel_pixels_tab[1][4] = ff_vert_mc_qpel_avg_dst_aver_src0_8x8_msa; - c->avg_qpel_pixels_tab[1][5] = ff_hv_mc_qpel_avg_dst_aver_hv_src00_8x8_msa; - c->avg_qpel_pixels_tab[1][6] = ff_hv_mc_qpel_avg_dst_aver_v_src0_8x8_msa; - c->avg_qpel_pixels_tab[1][7] = ff_hv_mc_qpel_avg_dst_aver_hv_src10_8x8_msa; - c->avg_qpel_pixels_tab[1][8] = ff_vert_mc_qpel_avg_dst_8x8_msa; - c->avg_qpel_pixels_tab[1][9] = ff_hv_mc_qpel_avg_dst_aver_h_src0_8x8_msa; - c->avg_qpel_pixels_tab[1][10] = ff_hv_mc_qpel_avg_dst_8x8_msa; - c->avg_qpel_pixels_tab[1][11] = ff_hv_mc_qpel_avg_dst_aver_h_src1_8x8_msa; - c->avg_qpel_pixels_tab[1][12] = ff_vert_mc_qpel_avg_dst_aver_src1_8x8_msa; - c->avg_qpel_pixels_tab[1][13] = ff_hv_mc_qpel_avg_dst_aver_hv_src01_8x8_msa; - c->avg_qpel_pixels_tab[1][14] = ff_hv_mc_qpel_avg_dst_aver_v_src1_8x8_msa; - c->avg_qpel_pixels_tab[1][15] = ff_hv_mc_qpel_avg_dst_aver_hv_src11_8x8_msa; -} -#endif // #if HAVE_MSA + c->avg_qpel_pixels_tab[0][0] = ff_avg_width16_msa; + c->avg_qpel_pixels_tab[0][1] = + ff_horiz_mc_qpel_avg_dst_aver_src0_16width_msa; + c->avg_qpel_pixels_tab[0][2] = ff_horiz_mc_qpel_avg_dst_16width_msa; + c->avg_qpel_pixels_tab[0][3] = + ff_horiz_mc_qpel_avg_dst_aver_src1_16width_msa; + c->avg_qpel_pixels_tab[0][4] = ff_vert_mc_qpel_avg_dst_aver_src0_16x16_msa; + c->avg_qpel_pixels_tab[0][5] = + ff_hv_mc_qpel_avg_dst_aver_hv_src00_16x16_msa; + c->avg_qpel_pixels_tab[0][6] = ff_hv_mc_qpel_avg_dst_aver_v_src0_16x16_msa; + c->avg_qpel_pixels_tab[0][7] = + ff_hv_mc_qpel_avg_dst_aver_hv_src10_16x16_msa; + c->avg_qpel_pixels_tab[0][8] = ff_vert_mc_qpel_avg_dst_16x16_msa; + c->avg_qpel_pixels_tab[0][9] = ff_hv_mc_qpel_avg_dst_aver_h_src0_16x16_msa; + c->avg_qpel_pixels_tab[0][10] = ff_hv_mc_qpel_avg_dst_16x16_msa; + c->avg_qpel_pixels_tab[0][11] = ff_hv_mc_qpel_avg_dst_aver_h_src1_16x16_msa; + c->avg_qpel_pixels_tab[0][12] = ff_vert_mc_qpel_avg_dst_aver_src1_16x16_msa; + c->avg_qpel_pixels_tab[0][13] = + ff_hv_mc_qpel_avg_dst_aver_hv_src01_16x16_msa; + c->avg_qpel_pixels_tab[0][14] = ff_hv_mc_qpel_avg_dst_aver_v_src1_16x16_msa; + c->avg_qpel_pixels_tab[0][15] = + ff_hv_mc_qpel_avg_dst_aver_hv_src11_16x16_msa; -void ff_qpeldsp_init_mips(QpelDSPContext *c) -{ -#if HAVE_MSA - qpeldsp_init_msa(c); -#endif // #if HAVE_MSA + c->avg_qpel_pixels_tab[1][0] = ff_avg_width8_msa; + c->avg_qpel_pixels_tab[1][1] = + ff_horiz_mc_qpel_avg_dst_aver_src0_8width_msa; + c->avg_qpel_pixels_tab[1][2] = ff_horiz_mc_qpel_avg_dst_8width_msa; + c->avg_qpel_pixels_tab[1][3] = + ff_horiz_mc_qpel_avg_dst_aver_src1_8width_msa; + c->avg_qpel_pixels_tab[1][4] = ff_vert_mc_qpel_avg_dst_aver_src0_8x8_msa; + c->avg_qpel_pixels_tab[1][5] = ff_hv_mc_qpel_avg_dst_aver_hv_src00_8x8_msa; + c->avg_qpel_pixels_tab[1][6] = ff_hv_mc_qpel_avg_dst_aver_v_src0_8x8_msa; + c->avg_qpel_pixels_tab[1][7] = ff_hv_mc_qpel_avg_dst_aver_hv_src10_8x8_msa; + c->avg_qpel_pixels_tab[1][8] = ff_vert_mc_qpel_avg_dst_8x8_msa; + c->avg_qpel_pixels_tab[1][9] = ff_hv_mc_qpel_avg_dst_aver_h_src0_8x8_msa; + c->avg_qpel_pixels_tab[1][10] = ff_hv_mc_qpel_avg_dst_8x8_msa; + c->avg_qpel_pixels_tab[1][11] = ff_hv_mc_qpel_avg_dst_aver_h_src1_8x8_msa; + c->avg_qpel_pixels_tab[1][12] = ff_vert_mc_qpel_avg_dst_aver_src1_8x8_msa; + c->avg_qpel_pixels_tab[1][13] = ff_hv_mc_qpel_avg_dst_aver_hv_src01_8x8_msa; + c->avg_qpel_pixels_tab[1][14] = ff_hv_mc_qpel_avg_dst_aver_v_src1_8x8_msa; + c->avg_qpel_pixels_tab[1][15] = ff_hv_mc_qpel_avg_dst_aver_hv_src11_8x8_msa; + } } diff --git a/libavcodec/mips/vc1dsp_init_mips.c b/libavcodec/mips/vc1dsp_init_mips.c index c0007ff650..94126f3a9d 100644 --- a/libavcodec/mips/vc1dsp_init_mips.c +++ b/libavcodec/mips/vc1dsp_init_mips.c @@ -18,6 +18,7 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "libavutil/attributes.h" #include "libavcodec/vc1dsp.h" #include "vc1dsp_mips.h" @@ -27,104 +28,93 @@ dsp->OP##vc1_mspel_pixels_tab[1][X+4*Y] = ff_##OP##vc1_mspel_mc##X##Y##INSN; \ dsp->OP##vc1_mspel_pixels_tab[0][X+4*Y] = ff_##OP##vc1_mspel_mc##X##Y##_16##INSN -#if HAVE_MMI -static av_cold void vc1dsp_init_mmi(VC1DSPContext *dsp) -{ -#if _MIPS_SIM != _ABIO32 - dsp->vc1_inv_trans_8x8 = ff_vc1_inv_trans_8x8_mmi; - dsp->vc1_inv_trans_4x8 = ff_vc1_inv_trans_4x8_mmi; - dsp->vc1_inv_trans_8x4 = ff_vc1_inv_trans_8x4_mmi; -#endif - dsp->vc1_inv_trans_4x4 = ff_vc1_inv_trans_4x4_mmi; - dsp->vc1_inv_trans_8x8_dc = ff_vc1_inv_trans_8x8_dc_mmi; - dsp->vc1_inv_trans_4x8_dc = ff_vc1_inv_trans_4x8_dc_mmi; - dsp->vc1_inv_trans_8x4_dc = ff_vc1_inv_trans_8x4_dc_mmi; - dsp->vc1_inv_trans_4x4_dc = ff_vc1_inv_trans_4x4_dc_mmi; - - dsp->vc1_h_overlap = ff_vc1_h_overlap_mmi; - dsp->vc1_v_overlap = ff_vc1_v_overlap_mmi; - dsp->vc1_h_s_overlap = ff_vc1_h_s_overlap_mmi; - dsp->vc1_v_s_overlap = ff_vc1_v_s_overlap_mmi; - - dsp->vc1_v_loop_filter4 = ff_vc1_v_loop_filter4_mmi; - dsp->vc1_h_loop_filter4 = ff_vc1_h_loop_filter4_mmi; - dsp->vc1_v_loop_filter8 = ff_vc1_v_loop_filter8_mmi; - dsp->vc1_h_loop_filter8 = ff_vc1_h_loop_filter8_mmi; - dsp->vc1_v_loop_filter16 = ff_vc1_v_loop_filter16_mmi; - dsp->vc1_h_loop_filter16 = ff_vc1_h_loop_filter16_mmi; - - FN_ASSIGN(put_, 0, 0, _mmi); - FN_ASSIGN(put_, 0, 1, _mmi); - FN_ASSIGN(put_, 0, 2, _mmi); - FN_ASSIGN(put_, 0, 3, _mmi); - - FN_ASSIGN(put_, 1, 0, _mmi); - //FN_ASSIGN(put_, 1, 1, _mmi);//FIXME - //FN_ASSIGN(put_, 1, 2, _mmi);//FIXME - //FN_ASSIGN(put_, 1, 3, _mmi);//FIXME - - FN_ASSIGN(put_, 2, 0, _mmi); - //FN_ASSIGN(put_, 2, 1, _mmi);//FIXME - //FN_ASSIGN(put_, 2, 2, _mmi);//FIXME - //FN_ASSIGN(put_, 2, 3, _mmi);//FIXME - - FN_ASSIGN(put_, 3, 0, _mmi); - //FN_ASSIGN(put_, 3, 1, _mmi);//FIXME - //FN_ASSIGN(put_, 3, 2, _mmi);//FIXME - //FN_ASSIGN(put_, 3, 3, _mmi);//FIXME - - FN_ASSIGN(avg_, 0, 0, _mmi); - FN_ASSIGN(avg_, 0, 1, _mmi); - FN_ASSIGN(avg_, 0, 2, _mmi); - FN_ASSIGN(avg_, 0, 3, _mmi); - - FN_ASSIGN(avg_, 1, 0, _mmi); - //FN_ASSIGN(avg_, 1, 1, _mmi);//FIXME - //FN_ASSIGN(avg_, 1, 2, _mmi);//FIXME - //FN_ASSIGN(avg_, 1, 3, _mmi);//FIXME - - FN_ASSIGN(avg_, 2, 0, _mmi); - //FN_ASSIGN(avg_, 2, 1, _mmi);//FIXME - //FN_ASSIGN(avg_, 2, 2, _mmi);//FIXME - //FN_ASSIGN(avg_, 2, 3, _mmi);//FIXME - - FN_ASSIGN(avg_, 3, 0, _mmi); - //FN_ASSIGN(avg_, 3, 1, _mmi);//FIXME - //FN_ASSIGN(avg_, 3, 2, _mmi);//FIXME - //FN_ASSIGN(avg_, 3, 3, _mmi);//FIXME - - dsp->put_no_rnd_vc1_chroma_pixels_tab[0] = ff_put_no_rnd_vc1_chroma_mc8_mmi; - dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_no_rnd_vc1_chroma_mc8_mmi; - dsp->put_no_rnd_vc1_chroma_pixels_tab[1] = ff_put_no_rnd_vc1_chroma_mc4_mmi; - dsp->avg_no_rnd_vc1_chroma_pixels_tab[1] = ff_avg_no_rnd_vc1_chroma_mc4_mmi; -} -#endif /* HAVE_MMI */ - -#if HAVE_MSA -static av_cold void vc1dsp_init_msa(VC1DSPContext *dsp) -{ - dsp->vc1_inv_trans_8x8 = ff_vc1_inv_trans_8x8_msa; - dsp->vc1_inv_trans_4x8 = ff_vc1_inv_trans_4x8_msa; - dsp->vc1_inv_trans_8x4 = ff_vc1_inv_trans_8x4_msa; - - FN_ASSIGN(put_, 1, 1, _msa); - FN_ASSIGN(put_, 1, 2, _msa); - FN_ASSIGN(put_, 1, 3, _msa); - FN_ASSIGN(put_, 2, 1, _msa); - FN_ASSIGN(put_, 2, 2, _msa); - FN_ASSIGN(put_, 2, 3, _msa); - FN_ASSIGN(put_, 3, 1, _msa); - FN_ASSIGN(put_, 3, 2, _msa); - FN_ASSIGN(put_, 3, 3, _msa); -} -#endif /* HAVE_MSA */ - av_cold void ff_vc1dsp_init_mips(VC1DSPContext *dsp) { -#if HAVE_MMI - vc1dsp_init_mmi(dsp); -#endif /* HAVE_MMI */ -#if HAVE_MSA - vc1dsp_init_msa(dsp); -#endif /* HAVE_MSA */ + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + #if _MIPS_SIM != _ABIO32 + dsp->vc1_inv_trans_8x8 = ff_vc1_inv_trans_8x8_mmi; + dsp->vc1_inv_trans_4x8 = ff_vc1_inv_trans_4x8_mmi; + dsp->vc1_inv_trans_8x4 = ff_vc1_inv_trans_8x4_mmi; +#endif + dsp->vc1_inv_trans_4x4 = ff_vc1_inv_trans_4x4_mmi; + dsp->vc1_inv_trans_8x8_dc = ff_vc1_inv_trans_8x8_dc_mmi; + dsp->vc1_inv_trans_4x8_dc = ff_vc1_inv_trans_4x8_dc_mmi; + dsp->vc1_inv_trans_8x4_dc = ff_vc1_inv_trans_8x4_dc_mmi; + dsp->vc1_inv_trans_4x4_dc = ff_vc1_inv_trans_4x4_dc_mmi; + + dsp->vc1_h_overlap = ff_vc1_h_overlap_mmi; + dsp->vc1_v_overlap = ff_vc1_v_overlap_mmi; + dsp->vc1_h_s_overlap = ff_vc1_h_s_overlap_mmi; + dsp->vc1_v_s_overlap = ff_vc1_v_s_overlap_mmi; + + dsp->vc1_v_loop_filter4 = ff_vc1_v_loop_filter4_mmi; + dsp->vc1_h_loop_filter4 = ff_vc1_h_loop_filter4_mmi; + dsp->vc1_v_loop_filter8 = ff_vc1_v_loop_filter8_mmi; + dsp->vc1_h_loop_filter8 = ff_vc1_h_loop_filter8_mmi; + dsp->vc1_v_loop_filter16 = ff_vc1_v_loop_filter16_mmi; + dsp->vc1_h_loop_filter16 = ff_vc1_h_loop_filter16_mmi; + + FN_ASSIGN(put_, 0, 0, _mmi); + FN_ASSIGN(put_, 0, 1, _mmi); + FN_ASSIGN(put_, 0, 2, _mmi); + FN_ASSIGN(put_, 0, 3, _mmi); + + FN_ASSIGN(put_, 1, 0, _mmi); + //FN_ASSIGN(put_, 1, 1, _mmi);//FIXME + //FN_ASSIGN(put_, 1, 2, _mmi);//FIXME + //FN_ASSIGN(put_, 1, 3, _mmi);//FIXME + + FN_ASSIGN(put_, 2, 0, _mmi); + //FN_ASSIGN(put_, 2, 1, _mmi);//FIXME + //FN_ASSIGN(put_, 2, 2, _mmi);//FIXME + //FN_ASSIGN(put_, 2, 3, _mmi);//FIXME + + FN_ASSIGN(put_, 3, 0, _mmi); + //FN_ASSIGN(put_, 3, 1, _mmi);//FIXME + //FN_ASSIGN(put_, 3, 2, _mmi);//FIXME + //FN_ASSIGN(put_, 3, 3, _mmi);//FIXME + + FN_ASSIGN(avg_, 0, 0, _mmi); + FN_ASSIGN(avg_, 0, 1, _mmi); + FN_ASSIGN(avg_, 0, 2, _mmi); + FN_ASSIGN(avg_, 0, 3, _mmi); + + FN_ASSIGN(avg_, 1, 0, _mmi); + //FN_ASSIGN(avg_, 1, 1, _mmi);//FIXME + //FN_ASSIGN(avg_, 1, 2, _mmi);//FIXME + //FN_ASSIGN(avg_, 1, 3, _mmi);//FIXME + + FN_ASSIGN(avg_, 2, 0, _mmi); + //FN_ASSIGN(avg_, 2, 1, _mmi);//FIXME + //FN_ASSIGN(avg_, 2, 2, _mmi);//FIXME + //FN_ASSIGN(avg_, 2, 3, _mmi);//FIXME + + FN_ASSIGN(avg_, 3, 0, _mmi); + //FN_ASSIGN(avg_, 3, 1, _mmi);//FIXME + //FN_ASSIGN(avg_, 3, 2, _mmi);//FIXME + //FN_ASSIGN(avg_, 3, 3, _mmi);//FIXME + + dsp->put_no_rnd_vc1_chroma_pixels_tab[0] = ff_put_no_rnd_vc1_chroma_mc8_mmi; + dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_no_rnd_vc1_chroma_mc8_mmi; + dsp->put_no_rnd_vc1_chroma_pixels_tab[1] = ff_put_no_rnd_vc1_chroma_mc4_mmi; + dsp->avg_no_rnd_vc1_chroma_pixels_tab[1] = ff_avg_no_rnd_vc1_chroma_mc4_mmi; + } + + if (have_msa(cpu_flags)) { + dsp->vc1_inv_trans_8x8 = ff_vc1_inv_trans_8x8_msa; + dsp->vc1_inv_trans_4x8 = ff_vc1_inv_trans_4x8_msa; + dsp->vc1_inv_trans_8x4 = ff_vc1_inv_trans_8x4_msa; + + FN_ASSIGN(put_, 1, 1, _msa); + FN_ASSIGN(put_, 1, 2, _msa); + FN_ASSIGN(put_, 1, 3, _msa); + FN_ASSIGN(put_, 2, 1, _msa); + FN_ASSIGN(put_, 2, 2, _msa); + FN_ASSIGN(put_, 2, 3, _msa); + FN_ASSIGN(put_, 3, 1, _msa); + FN_ASSIGN(put_, 3, 2, _msa); + FN_ASSIGN(put_, 3, 3, _msa); + } } diff --git a/libavcodec/mips/videodsp_init.c b/libavcodec/mips/videodsp_init.c index 817040420b..07c23bcf7e 100644 --- a/libavcodec/mips/videodsp_init.c +++ b/libavcodec/mips/videodsp_init.c @@ -18,12 +18,12 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "config.h" #include "libavutil/attributes.h" #include "libavutil/mips/asmdefs.h" #include "libavcodec/videodsp.h" -#if HAVE_MSA static void prefetch_mips(uint8_t *mem, ptrdiff_t stride, int h) { register const uint8_t *p = mem; @@ -41,11 +41,11 @@ static void prefetch_mips(uint8_t *mem, ptrdiff_t stride, int h) : [stride] "r" (stride) ); } -#endif // #if HAVE_MSA av_cold void ff_videodsp_init_mips(VideoDSPContext *ctx, int bpc) { -#if HAVE_MSA - ctx->prefetch = prefetch_mips; -#endif // #if HAVE_MSA + int cpu_flags = av_get_cpu_flags(); + + if (have_msa(cpu_flags)) + ctx->prefetch = prefetch_mips; } diff --git a/libavcodec/mips/vp3dsp_init_mips.c b/libavcodec/mips/vp3dsp_init_mips.c index e183db35b6..4252ff790e 100644 --- a/libavcodec/mips/vp3dsp_init_mips.c +++ b/libavcodec/mips/vp3dsp_init_mips.c @@ -19,42 +19,32 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "config.h" #include "libavutil/attributes.h" #include "libavcodec/avcodec.h" #include "libavcodec/vp3dsp.h" #include "vp3dsp_mips.h" -#if HAVE_MSA -static av_cold void vp3dsp_init_msa(VP3DSPContext *c, int flags) +av_cold void ff_vp3dsp_init_mips(VP3DSPContext *c, int flags) { - c->put_no_rnd_pixels_l2 = ff_put_no_rnd_pixels_l2_msa; + int cpu_flags = av_get_cpu_flags(); - c->idct_add = ff_vp3_idct_add_msa; - c->idct_put = ff_vp3_idct_put_msa; - c->idct_dc_add = ff_vp3_idct_dc_add_msa; - c->v_loop_filter = ff_vp3_v_loop_filter_msa; - c->h_loop_filter = ff_vp3_h_loop_filter_msa; -} -#endif /* HAVE_MSA */ + if (have_mmi(cpu_flags)) { + c->put_no_rnd_pixels_l2 = ff_put_no_rnd_pixels_l2_mmi; -#if HAVE_MMI -static av_cold void vp3dsp_init_mmi(VP3DSPContext *c, int flags) -{ - c->put_no_rnd_pixels_l2 = ff_put_no_rnd_pixels_l2_mmi; + c->idct_add = ff_vp3_idct_add_mmi; + c->idct_put = ff_vp3_idct_put_mmi; + c->idct_dc_add = ff_vp3_idct_dc_add_mmi; + } - c->idct_add = ff_vp3_idct_add_mmi; - c->idct_put = ff_vp3_idct_put_mmi; - c->idct_dc_add = ff_vp3_idct_dc_add_mmi; -} -#endif /* HAVE_MMI */ + if (have_msa(cpu_flags)) { + c->put_no_rnd_pixels_l2 = ff_put_no_rnd_pixels_l2_msa; -av_cold void ff_vp3dsp_init_mips(VP3DSPContext *c, int flags) -{ -#if HAVE_MMI - vp3dsp_init_mmi(c, flags); -#endif /* HAVE_MMI */ -#if HAVE_MSA - vp3dsp_init_msa(c, flags); -#endif /* HAVE_MSA */ + c->idct_add = ff_vp3_idct_add_msa; + c->idct_put = ff_vp3_idct_put_msa; + c->idct_dc_add = ff_vp3_idct_dc_add_msa; + c->v_loop_filter = ff_vp3_v_loop_filter_msa; + c->h_loop_filter = ff_vp3_h_loop_filter_msa; + } } diff --git a/libavcodec/mips/vp8dsp_init_mips.c b/libavcodec/mips/vp8dsp_init_mips.c index 7fd8fb0d32..92d8c792cb 100644 --- a/libavcodec/mips/vp8dsp_init_mips.c +++ b/libavcodec/mips/vp8dsp_init_mips.c @@ -24,6 +24,7 @@ * VP8 compatible video decoder */ +#include "libavutil/mips/cpu.h" #include "config.h" #include "libavutil/attributes.h" #include "libavcodec/vp8dsp.h" @@ -71,132 +72,123 @@ dsp->put_vp8_bilinear_pixels_tab[IDX][0][0] = \ ff_put_vp8_pixels##SIZE##_msa; -#if HAVE_MSA -static av_cold void vp8dsp_init_msa(VP8DSPContext *dsp) -{ - dsp->vp8_luma_dc_wht = ff_vp8_luma_dc_wht_msa; - dsp->vp8_idct_add = ff_vp8_idct_add_msa; - dsp->vp8_idct_dc_add = ff_vp8_idct_dc_add_msa; - dsp->vp8_idct_dc_add4y = ff_vp8_idct_dc_add4y_msa; - dsp->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_msa; - - VP8_MC_MIPS_FUNC(0, 16); - VP8_MC_MIPS_FUNC(1, 8); - VP8_MC_MIPS_FUNC(2, 4); - - VP8_BILINEAR_MC_MIPS_FUNC(0, 16); - VP8_BILINEAR_MC_MIPS_FUNC(1, 8); - VP8_BILINEAR_MC_MIPS_FUNC(2, 4); - - VP8_MC_MIPS_COPY(0, 16); - VP8_MC_MIPS_COPY(1, 8); - - dsp->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_msa; - dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_msa; - dsp->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_msa; - dsp->vp8_h_loop_filter8uv = ff_vp8_h_loop_filter8uv_msa; - - dsp->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_msa; - dsp->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_msa; - dsp->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_msa; - dsp->vp8_h_loop_filter8uv_inner = ff_vp8_h_loop_filter8uv_inner_msa; - - dsp->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter_simple_msa; - dsp->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter_simple_msa; -} -#endif // #if HAVE_MSA -#if HAVE_MMI -static av_cold void vp8dsp_init_mmi(VP8DSPContext *dsp) -{ - dsp->vp8_luma_dc_wht = ff_vp8_luma_dc_wht_mmi; - dsp->vp8_luma_dc_wht_dc = ff_vp8_luma_dc_wht_dc_mmi; - dsp->vp8_idct_add = ff_vp8_idct_add_mmi; - dsp->vp8_idct_dc_add = ff_vp8_idct_dc_add_mmi; - dsp->vp8_idct_dc_add4y = ff_vp8_idct_dc_add4y_mmi; - dsp->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_mmi; - - dsp->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_mmi; - dsp->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_mmi; - dsp->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_mmi; - dsp->put_vp8_epel_pixels_tab[0][1][1] = ff_put_vp8_epel16_h4v4_mmi; - dsp->put_vp8_epel_pixels_tab[0][1][2] = ff_put_vp8_epel16_h6v4_mmi; - dsp->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_mmi; - dsp->put_vp8_epel_pixels_tab[0][2][1] = ff_put_vp8_epel16_h4v6_mmi; - dsp->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_mmi; - - dsp->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_mmi; - dsp->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_mmi; - dsp->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_mmi; - dsp->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_mmi; - dsp->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_mmi; - dsp->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_mmi; - dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_mmi; - dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_mmi; - - dsp->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_mmi; - dsp->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_mmi; - dsp->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_mmi; - dsp->put_vp8_epel_pixels_tab[2][1][1] = ff_put_vp8_epel4_h4v4_mmi; - dsp->put_vp8_epel_pixels_tab[2][1][2] = ff_put_vp8_epel4_h6v4_mmi; - dsp->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_mmi; - dsp->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_mmi; - dsp->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_mmi; - - dsp->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilinear16_h_mmi; - dsp->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilinear16_h_mmi; - dsp->put_vp8_bilinear_pixels_tab[0][1][0] = ff_put_vp8_bilinear16_v_mmi; - dsp->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilinear16_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilinear16_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[0][2][0] = ff_put_vp8_bilinear16_v_mmi; - dsp->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilinear16_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilinear16_hv_mmi; - - dsp->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilinear8_h_mmi; - dsp->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilinear8_h_mmi; - dsp->put_vp8_bilinear_pixels_tab[1][1][0] = ff_put_vp8_bilinear8_v_mmi; - dsp->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilinear8_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilinear8_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilinear8_v_mmi; - dsp->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilinear8_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilinear8_hv_mmi; - - dsp->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilinear4_h_mmi; - dsp->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilinear4_h_mmi; - dsp->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilinear4_v_mmi; - dsp->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilinear4_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilinear4_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilinear4_v_mmi; - dsp->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilinear4_hv_mmi; - dsp->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilinear4_hv_mmi; - - dsp->put_vp8_epel_pixels_tab[0][0][0] = ff_put_vp8_pixels16_mmi; - dsp->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_mmi; - - dsp->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_mmi; - dsp->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_mmi; - - dsp->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_mmi; - dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_mmi; - dsp->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_mmi; - dsp->vp8_h_loop_filter8uv = ff_vp8_h_loop_filter8uv_mmi; - - dsp->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_mmi; - dsp->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_mmi; - dsp->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_mmi; - dsp->vp8_h_loop_filter8uv_inner = ff_vp8_h_loop_filter8uv_inner_mmi; - - dsp->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter_simple_mmi; - dsp->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter_simple_mmi; -} -#endif /* HAVE_MMI */ av_cold void ff_vp8dsp_init_mips(VP8DSPContext *dsp) { -#if HAVE_MMI - vp8dsp_init_mmi(dsp); -#endif /* HAVE_MMI */ -#if HAVE_MSA - vp8dsp_init_msa(dsp); -#endif // #if HAVE_MSA + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + dsp->vp8_luma_dc_wht = ff_vp8_luma_dc_wht_mmi; + dsp->vp8_luma_dc_wht_dc = ff_vp8_luma_dc_wht_dc_mmi; + dsp->vp8_idct_add = ff_vp8_idct_add_mmi; + dsp->vp8_idct_dc_add = ff_vp8_idct_dc_add_mmi; + dsp->vp8_idct_dc_add4y = ff_vp8_idct_dc_add4y_mmi; + dsp->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_mmi; + + dsp->put_vp8_epel_pixels_tab[0][0][1] = ff_put_vp8_epel16_h4_mmi; + dsp->put_vp8_epel_pixels_tab[0][0][2] = ff_put_vp8_epel16_h6_mmi; + dsp->put_vp8_epel_pixels_tab[0][1][0] = ff_put_vp8_epel16_v4_mmi; + dsp->put_vp8_epel_pixels_tab[0][1][1] = ff_put_vp8_epel16_h4v4_mmi; + dsp->put_vp8_epel_pixels_tab[0][1][2] = ff_put_vp8_epel16_h6v4_mmi; + dsp->put_vp8_epel_pixels_tab[0][2][0] = ff_put_vp8_epel16_v6_mmi; + dsp->put_vp8_epel_pixels_tab[0][2][1] = ff_put_vp8_epel16_h4v6_mmi; + dsp->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_mmi; + + dsp->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_mmi; + dsp->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_mmi; + dsp->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_mmi; + dsp->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_mmi; + dsp->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_mmi; + dsp->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_mmi; + dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_mmi; + dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_mmi; + + dsp->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_mmi; + dsp->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_mmi; + dsp->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_mmi; + dsp->put_vp8_epel_pixels_tab[2][1][1] = ff_put_vp8_epel4_h4v4_mmi; + dsp->put_vp8_epel_pixels_tab[2][1][2] = ff_put_vp8_epel4_h6v4_mmi; + dsp->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_mmi; + dsp->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_mmi; + dsp->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_mmi; + + dsp->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilinear16_h_mmi; + dsp->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilinear16_h_mmi; + dsp->put_vp8_bilinear_pixels_tab[0][1][0] = ff_put_vp8_bilinear16_v_mmi; + dsp->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilinear16_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilinear16_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[0][2][0] = ff_put_vp8_bilinear16_v_mmi; + dsp->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilinear16_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilinear16_hv_mmi; + + dsp->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilinear8_h_mmi; + dsp->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilinear8_h_mmi; + dsp->put_vp8_bilinear_pixels_tab[1][1][0] = ff_put_vp8_bilinear8_v_mmi; + dsp->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilinear8_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilinear8_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilinear8_v_mmi; + dsp->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilinear8_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilinear8_hv_mmi; + + dsp->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilinear4_h_mmi; + dsp->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilinear4_h_mmi; + dsp->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilinear4_v_mmi; + dsp->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilinear4_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilinear4_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilinear4_v_mmi; + dsp->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilinear4_hv_mmi; + dsp->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilinear4_hv_mmi; + + dsp->put_vp8_epel_pixels_tab[0][0][0] = ff_put_vp8_pixels16_mmi; + dsp->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_mmi; + + dsp->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_mmi; + dsp->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_mmi; + + dsp->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_mmi; + dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_mmi; + dsp->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_mmi; + dsp->vp8_h_loop_filter8uv = ff_vp8_h_loop_filter8uv_mmi; + + dsp->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_mmi; + dsp->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_mmi; + dsp->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_mmi; + dsp->vp8_h_loop_filter8uv_inner = ff_vp8_h_loop_filter8uv_inner_mmi; + + dsp->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter_simple_mmi; + dsp->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter_simple_mmi; + } + + if (have_msa(cpu_flags)) { + dsp->vp8_luma_dc_wht = ff_vp8_luma_dc_wht_msa; + dsp->vp8_idct_add = ff_vp8_idct_add_msa; + dsp->vp8_idct_dc_add = ff_vp8_idct_dc_add_msa; + dsp->vp8_idct_dc_add4y = ff_vp8_idct_dc_add4y_msa; + dsp->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_msa; + + VP8_MC_MIPS_FUNC(0, 16); + VP8_MC_MIPS_FUNC(1, 8); + VP8_MC_MIPS_FUNC(2, 4); + + VP8_BILINEAR_MC_MIPS_FUNC(0, 16); + VP8_BILINEAR_MC_MIPS_FUNC(1, 8); + VP8_BILINEAR_MC_MIPS_FUNC(2, 4); + + VP8_MC_MIPS_COPY(0, 16); + VP8_MC_MIPS_COPY(1, 8); + + dsp->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_msa; + dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_msa; + dsp->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_msa; + dsp->vp8_h_loop_filter8uv = ff_vp8_h_loop_filter8uv_msa; + + dsp->vp8_v_loop_filter16y_inner = ff_vp8_v_loop_filter16_inner_msa; + dsp->vp8_h_loop_filter16y_inner = ff_vp8_h_loop_filter16_inner_msa; + dsp->vp8_v_loop_filter8uv_inner = ff_vp8_v_loop_filter8uv_inner_msa; + dsp->vp8_h_loop_filter8uv_inner = ff_vp8_h_loop_filter8uv_inner_msa; + + dsp->vp8_v_loop_filter_simple = ff_vp8_v_loop_filter_simple_msa; + dsp->vp8_h_loop_filter_simple = ff_vp8_h_loop_filter_simple_msa; + } } diff --git a/libavcodec/mips/vp9dsp_init_mips.c b/libavcodec/mips/vp9dsp_init_mips.c index 5990fa6952..5a8c599e7e 100644 --- a/libavcodec/mips/vp9dsp_init_mips.c +++ b/libavcodec/mips/vp9dsp_init_mips.c @@ -18,6 +18,7 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "config.h" #include "libavutil/common.h" #include "libavcodec/vp9dsp.h" @@ -209,10 +210,17 @@ static av_cold void vp9dsp_init_mmi(VP9DSPContext *dsp, int bpp) av_cold void ff_vp9dsp_init_mips(VP9DSPContext *dsp, int bpp) { +#if HAVE_MSA || HAVE_MMI + int cpu_flags = av_get_cpu_flags(); +#endif + #if HAVE_MMI - vp9dsp_init_mmi(dsp, bpp); -#endif // #if HAVE_MMI + if (have_mmi(cpu_flags)) + vp9dsp_init_mmi(dsp, bpp); +#endif + #if HAVE_MSA - vp9dsp_init_msa(dsp, bpp); -#endif // #if HAVE_MSA + if (have_msa(cpu_flags)) + vp9dsp_init_msa(dsp, bpp); +#endif } diff --git a/libavcodec/mips/wmv2dsp_init_mips.c b/libavcodec/mips/wmv2dsp_init_mips.c index 51dd2078d9..af1400731a 100644 --- a/libavcodec/mips/wmv2dsp_init_mips.c +++ b/libavcodec/mips/wmv2dsp_init_mips.c @@ -18,21 +18,17 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "config.h" #include "libavutil/attributes.h" #include "wmv2dsp_mips.h" -#if HAVE_MMI -static av_cold void wmv2dsp_init_mmi(WMV2DSPContext *c) -{ - c->idct_add = ff_wmv2_idct_add_mmi; - c->idct_put = ff_wmv2_idct_put_mmi; -} -#endif /* HAVE_MMI */ - av_cold void ff_wmv2dsp_init_mips(WMV2DSPContext *c) { -#if HAVE_MMI - wmv2dsp_init_mmi(c); -#endif /* HAVE_MMI */ + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + c->idct_add = ff_wmv2_idct_add_mmi; + c->idct_put = ff_wmv2_idct_put_mmi; + } } diff --git a/libavcodec/mips/wmv2dsp_mips.h b/libavcodec/mips/wmv2dsp_mips.h index 22894c505d..f7313460fb 100644 --- a/libavcodec/mips/wmv2dsp_mips.h +++ b/libavcodec/mips/wmv2dsp_mips.h @@ -23,7 +23,7 @@ #include "libavcodec/wmv2dsp.h" -void ff_wmv2_idct_add_mmi(uint8_t *dest, int line_size, int16_t *block); -void ff_wmv2_idct_put_mmi(uint8_t *dest, int line_size, int16_t *block); +void ff_wmv2_idct_add_mmi(uint8_t *dest, long int line_size, int16_t *block); +void ff_wmv2_idct_put_mmi(uint8_t *dest, long int line_size, int16_t *block); #endif /* AVCODEC_MIPS_WMV2DSP_MIPS_H */ diff --git a/libavcodec/mips/wmv2dsp_mmi.c b/libavcodec/mips/wmv2dsp_mmi.c index 1f6ccb299b..8796ebe195 100644 --- a/libavcodec/mips/wmv2dsp_mmi.c +++ b/libavcodec/mips/wmv2dsp_mmi.c @@ -95,7 +95,7 @@ static void wmv2_idct_col_mmi(short * b) b[56] = (a0 + a2 - a1 - a5 + 8192) >> 14; } -void ff_wmv2_idct_add_mmi(uint8_t *dest, int line_size, int16_t *block) +void ff_wmv2_idct_add_mmi(uint8_t *dest, long int line_size, int16_t *block) { int i; double ftmp[11]; @@ -212,7 +212,7 @@ void ff_wmv2_idct_add_mmi(uint8_t *dest, int line_size, int16_t *block) ); } -void ff_wmv2_idct_put_mmi(uint8_t *dest, int line_size, int16_t *block) +void ff_wmv2_idct_put_mmi(uint8_t *dest, long int line_size, int16_t *block) { int i; double ftmp[8]; diff --git a/libavcodec/mips/xvididct_init_mips.c b/libavcodec/mips/xvididct_init_mips.c index c1d82cc30c..ed545cfe17 100644 --- a/libavcodec/mips/xvididct_init_mips.c +++ b/libavcodec/mips/xvididct_init_mips.c @@ -18,28 +18,23 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "libavutil/mips/cpu.h" #include "xvididct_mips.h" -#if HAVE_MMI -static av_cold void xvid_idct_init_mmi(IDCTDSPContext *c, AVCodecContext *avctx, +av_cold void ff_xvid_idct_init_mips(IDCTDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth) { - if (!high_bit_depth) { - if (avctx->idct_algo == FF_IDCT_AUTO || - avctx->idct_algo == FF_IDCT_XVID) { - c->idct_put = ff_xvid_idct_put_mmi; - c->idct_add = ff_xvid_idct_add_mmi; - c->idct = ff_xvid_idct_mmi; - c->perm_type = FF_IDCT_PERM_NONE; + int cpu_flags = av_get_cpu_flags(); + + if (have_mmi(cpu_flags)) { + if (!high_bit_depth) { + if (avctx->idct_algo == FF_IDCT_AUTO || + avctx->idct_algo == FF_IDCT_XVID) { + c->idct_put = ff_xvid_idct_put_mmi; + c->idct_add = ff_xvid_idct_add_mmi; + c->idct = ff_xvid_idct_mmi; + c->perm_type = FF_IDCT_PERM_NONE; + } } } } -#endif /* HAVE_MMI */ - -av_cold void ff_xvid_idct_init_mips(IDCTDSPContext *c, AVCodecContext *avctx, - unsigned high_bit_depth) -{ -#if HAVE_MMI - xvid_idct_init_mmi(c, avctx, high_bit_depth); -#endif /* HAVE_MMI */ -} From patchwork Thu Jul 2 15:46:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 20784 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 346D544B923 for ; Thu, 2 Jul 2020 18:48:00 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2815168B1B1; Thu, 2 Jul 2020 18:48:00 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from relay3.mymailcheap.com (relay3.mymailcheap.com [217.182.119.157]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 2AA5C68AC48 for ; Thu, 2 Jul 2020 18:47:54 +0300 (EEST) Received: from filter2.mymailcheap.com (filter2.mymailcheap.com [91.134.140.82]) by relay3.mymailcheap.com (Postfix) with ESMTPS id A36EA3ECDF for ; Thu, 2 Jul 2020 17:47:53 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by filter2.mymailcheap.com (Postfix) with ESMTP id 8233D2A4F0 for ; Thu, 2 Jul 2020 17:47:53 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=mymailcheap.com; s=default; t=1593704873; bh=n1QtYZ/lwhawEOwLCs283lcUCVmoLk35iYgo//63jD4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Q2rSOvoBsU1DLW0yRni4R/cLBtIYJTm15nW354YAulDPZi7D/c10w5+g9H6DgsSFt YM9EKrUWVCNYFcN+sFcG5ULgEiIXpouxTXdPZUfad0YphzpZtWJFTve8tjtuMwkqtG En6+xXSKQXlHU+Z/mZPDCgLSB/5xm7uWXs2lUVAM= X-Virus-Scanned: Debian amavisd-new at filter2.mymailcheap.com Received: from filter2.mymailcheap.com ([127.0.0.1]) by localhost (filter2.mymailcheap.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id s1BwLj8Kk8I1 for ; Thu, 2 Jul 2020 17:47:52 +0200 (CEST) Received: from mail20.mymailcheap.com (mail20.mymailcheap.com [51.83.111.147]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by filter2.mymailcheap.com (Postfix) with ESMTPS for ; Thu, 2 Jul 2020 17:47:52 +0200 (CEST) Received: from [148.251.23.173] (ml.mymailcheap.com [148.251.23.173]) by mail20.mymailcheap.com (Postfix) with ESMTP id D0CBF4085D; Thu, 2 Jul 2020 15:47:51 +0000 (UTC) Authentication-Results: mail20.mymailcheap.com; dkim=pass (1024-bit key; unprotected) header.d=flygoat.com header.i=@flygoat.com header.b="IYvjrUrv"; dkim-atps=neutral AI-Spam-Status: Not processed Received: from halation.202.net.flygoat.com (unknown [115.227.164.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail20.mymailcheap.com (Postfix) with ESMTPSA id A4003403B3; Thu, 2 Jul 2020 15:46:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=flygoat.com; s=default; t=1593704806; bh=n1QtYZ/lwhawEOwLCs283lcUCVmoLk35iYgo//63jD4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=IYvjrUrvJxtFK5jAht4DK3PejBPdOwOjw5bAIHk6f9uUzdQLCKHL67v0TVjrtXqYD LndU6f+3toye9mpOmdXjEVpZI37OekAsN9F3+6IF7kLz/UyLKKgUhhnynEZ/0/JB1/ nuyJMmA5I4rOKvXBEcnxanxmNnneJj/dnztMIUS0= From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Thu, 2 Jul 2020 23:46:21 +0800 Message-Id: <20200702154622.21280-6-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200702154622.21280-1-jiaxun.yang@flygoat.com> References: <20200702154622.21280-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: D0CBF4085D X-Spamd-Result: default: False [4.90 / 10.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_DKIM_ALLOW(0.00)[flygoat.com:s=default]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; R_SPF_SOFTFAIL(0.00)[~all:c]; ML_SERVERS(-3.10)[148.251.23.173]; DKIM_TRACE(0.00)[flygoat.com:+]; RCPT_COUNT_TWO(0.00)[2]; MID_CONTAINS_FROM(1.00)[]; DMARC_POLICY_ALLOW(0.00)[flygoat.com,none]; DMARC_POLICY_ALLOW_WITH_FAILURES(0.00)[]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:148.251.0.0/16, country:DE]; RCVD_COUNT_TWO(0.00)[2]; HFILTER_HELO_BAREIP(3.00)[148.251.23.173,1] X-Rspamd-Server: mail20.mymailcheap.com X-Spam: Yes Subject: [FFmpeg-devel] [PATCH v5 5/6] libavcodec: MIPS: MMI: Fix type mismatches X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" GCC complains about them. Signed-off-by: Jiaxun Yang --- libavcodec/mips/h264dsp_mips.h | 18 +++++++++--------- libavcodec/mips/h264dsp_mmi.c | 18 +++++++++--------- libavcodec/mips/xvid_idct_mmi.c | 4 ++-- libavcodec/mips/xvididct_mips.h | 4 ++-- 4 files changed, 22 insertions(+), 22 deletions(-) diff --git a/libavcodec/mips/h264dsp_mips.h b/libavcodec/mips/h264dsp_mips.h index 21b7de06f0..7b2a9fabe5 100644 --- a/libavcodec/mips/h264dsp_mips.h +++ b/libavcodec/mips/h264dsp_mips.h @@ -357,23 +357,23 @@ void ff_h264_biweight_pixels4_8_mmi(uint8_t *dst, uint8_t *src, void ff_deblock_v_chroma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta, int8_t *tc0); -void ff_deblock_v_chroma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_v_chroma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta); -void ff_deblock_h_chroma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, +void ff_deblock_h_chroma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta, int8_t *tc0); -void ff_deblock_h_chroma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_h_chroma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta); -void ff_deblock_v_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, +void ff_deblock_v_luma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta, int8_t *tc0); -void ff_deblock_v_luma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_v_luma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta); -void ff_deblock_h_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, +void ff_deblock_h_luma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta, int8_t *tc0); -void ff_deblock_h_luma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_h_luma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta); -void ff_deblock_v8_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, +void ff_deblock_v8_luma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta, int8_t *tc0); -void ff_deblock_v8_luma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_v8_luma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta); void ff_put_h264_qpel16_mc00_mmi(uint8_t *dst, const uint8_t *src, diff --git a/libavcodec/mips/h264dsp_mmi.c b/libavcodec/mips/h264dsp_mmi.c index 0459711b82..7a60ee7c2b 100644 --- a/libavcodec/mips/h264dsp_mmi.c +++ b/libavcodec/mips/h264dsp_mmi.c @@ -1433,7 +1433,7 @@ void ff_h264_biweight_pixels4_8_mmi(uint8_t *dst, uint8_t *src, } } -void ff_deblock_v8_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, +void ff_deblock_v8_luma_8_mmi(uint8_t *pix, pixdiff_t stride, int alpha, int beta, int8_t *tc0) { double ftmp[12]; @@ -1561,7 +1561,7 @@ void ff_deblock_v8_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, ); } -static void deblock_v8_luma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +static void deblock_v8_luma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta) { DECLARE_ALIGNED(8, const uint64_t, stack[0x0a]); @@ -1871,7 +1871,7 @@ void ff_deblock_v_chroma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, ); } -void ff_deblock_v_chroma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_v_chroma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta) { double ftmp[9]; @@ -1949,7 +1949,7 @@ void ff_deblock_v_chroma_intra_8_mmi(uint8_t *pix, int stride, int alpha, ); } -void ff_deblock_h_chroma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, +void ff_deblock_h_chroma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta, int8_t *tc0) { double ftmp[11]; @@ -2089,7 +2089,7 @@ void ff_deblock_h_chroma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, ); } -void ff_deblock_h_chroma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_h_chroma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta) { double ftmp[11]; @@ -2222,7 +2222,7 @@ void ff_deblock_h_chroma_intra_8_mmi(uint8_t *pix, int stride, int alpha, ); } -void ff_deblock_v_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, +void ff_deblock_v_luma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta, int8_t *tc0) { if ((tc0[0] & tc0[1]) >= 0) @@ -2231,14 +2231,14 @@ void ff_deblock_v_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, ff_deblock_v8_luma_8_mmi(pix + 8, stride, alpha, beta, tc0 + 2); } -void ff_deblock_v_luma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_v_luma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta) { deblock_v8_luma_intra_8_mmi(pix + 0, stride, alpha, beta); deblock_v8_luma_intra_8_mmi(pix + 8, stride, alpha, beta); } -void ff_deblock_h_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, +void ff_deblock_h_luma_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta, int8_t *tc0) { DECLARE_ALIGNED(8, const uint64_t, stack[0x0d]); @@ -2457,7 +2457,7 @@ void ff_deblock_h_luma_8_mmi(uint8_t *pix, int stride, int alpha, int beta, ); } -void ff_deblock_h_luma_intra_8_mmi(uint8_t *pix, int stride, int alpha, +void ff_deblock_h_luma_intra_8_mmi(uint8_t *pix, ptrdiff_t stride, int alpha, int beta) { DECLARE_ALIGNED(8, const uint64_t, ptmp[0x11]); diff --git a/libavcodec/mips/xvid_idct_mmi.c b/libavcodec/mips/xvid_idct_mmi.c index d3f9acb0e2..b822b8add8 100644 --- a/libavcodec/mips/xvid_idct_mmi.c +++ b/libavcodec/mips/xvid_idct_mmi.c @@ -240,13 +240,13 @@ void ff_xvid_idct_mmi(int16_t *block) ); } -void ff_xvid_idct_put_mmi(uint8_t *dest, int32_t line_size, int16_t *block) +void ff_xvid_idct_put_mmi(uint8_t *dest, ptrdiff_t line_size, int16_t *block) { ff_xvid_idct_mmi(block); ff_put_pixels_clamped_mmi(block, dest, line_size); } -void ff_xvid_idct_add_mmi(uint8_t *dest, int32_t line_size, int16_t *block) +void ff_xvid_idct_add_mmi(uint8_t *dest, ptrdiff_t line_size, int16_t *block) { ff_xvid_idct_mmi(block); ff_add_pixels_clamped_mmi(block, dest, line_size); diff --git a/libavcodec/mips/xvididct_mips.h b/libavcodec/mips/xvididct_mips.h index 0768aaa26f..bee03c1391 100644 --- a/libavcodec/mips/xvididct_mips.h +++ b/libavcodec/mips/xvididct_mips.h @@ -24,7 +24,7 @@ #include "libavcodec/xvididct.h" void ff_xvid_idct_mmi(int16_t *block); -void ff_xvid_idct_put_mmi(uint8_t *dest, int32_t line_size, int16_t *block); -void ff_xvid_idct_add_mmi(uint8_t *dest, int32_t line_size, int16_t *block); +void ff_xvid_idct_put_mmi(uint8_t *dest, ptrdiff_t line_size, int16_t *block); +void ff_xvid_idct_add_mmi(uint8_t *dest, ptrdiff_t line_size, int16_t *block); #endif /* AVCODEC_MIPS_XVIDIDCT_MIPS_H */ From patchwork Thu Jul 2 15:46:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 20782 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id B3EDC44B923 for ; Thu, 2 Jul 2020 18:47:15 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 9BD3E68AF9A; Thu, 2 Jul 2020 18:47:15 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from relay3.mymailcheap.com (relay3.mymailcheap.com [217.182.119.157]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C2D3E68AEE4 for ; Thu, 2 Jul 2020 18:47:09 +0300 (EEST) Received: from filter1.mymailcheap.com (filter1.mymailcheap.com [149.56.130.247]) by relay3.mymailcheap.com (Postfix) with ESMTPS id 1543D3F15F for ; Thu, 2 Jul 2020 17:47:09 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by filter1.mymailcheap.com (Postfix) with ESMTP id 47E7E2A3A7 for ; Thu, 2 Jul 2020 11:47:08 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=mymailcheap.com; s=default; t=1593704828; bh=XrlQuz1en2p7gn00iAyTOv2UcWCEl+7gRUCBYY8P7Gc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FyXdW+LMSPoapRJObcJ63FPLWmgDR2+4xVJ5Ru3WfcS8PQ7bwYKb6p8j7xqvZaBoa 8ZSIgEc/ST8A7YzGdFWZvhxRmjvIOcHBfkzOO9Q4g4Vbyft1ECmHhsOAX36ffCGumW 4ZZTW4ejD2jWW4Gdpr/EcPpFzXSlejDKdgb9y2sI= X-Virus-Scanned: Debian amavisd-new at filter1.mymailcheap.com Received: from filter1.mymailcheap.com ([127.0.0.1]) by localhost (filter1.mymailcheap.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id iOg4Ofv3ijK6 for ; Thu, 2 Jul 2020 11:47:07 -0400 (EDT) Received: from mail20.mymailcheap.com (mail20.mymailcheap.com [51.83.111.147]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by filter1.mymailcheap.com (Postfix) with ESMTPS for ; Thu, 2 Jul 2020 11:47:07 -0400 (EDT) Received: from [213.133.102.83] (ml.mymailcheap.com [213.133.102.83]) by mail20.mymailcheap.com (Postfix) with ESMTP id 395CC403B3; Thu, 2 Jul 2020 15:47:06 +0000 (UTC) Authentication-Results: mail20.mymailcheap.com; dkim=pass (1024-bit key; unprotected) header.d=flygoat.com header.i=@flygoat.com header.b="KsdJu06d"; dkim-atps=neutral AI-Spam-Status: Not processed Received: from halation.202.net.flygoat.com (unknown [115.227.164.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail20.mymailcheap.com (Postfix) with ESMTPSA id E7E51403B3; Thu, 2 Jul 2020 15:46:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=flygoat.com; s=default; t=1593704810; bh=XrlQuz1en2p7gn00iAyTOv2UcWCEl+7gRUCBYY8P7Gc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KsdJu06dzl3hvd53T8HgbmCau5T1qR4eUb575Tuth009qVdxv1uTIxDWPaRlSWMUz /oc/AaBDop3Zje9zUfykw6SKTfupWJTh9Vrskkh9zRTF5ZGY0YuH/LZydZEY1u4HsH SOl393at3vSieaLRwJ6xm87FTbjSAuPtEQH4D7sg= From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Thu, 2 Jul 2020 23:46:22 +0800 Message-Id: <20200702154622.21280-7-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200702154622.21280-1-jiaxun.yang@flygoat.com> References: <20200702154622.21280-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 395CC403B3 X-Spamd-Result: default: False [0.90 / 10.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_DKIM_ALLOW(0.00)[flygoat.com:s=default]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; R_SPF_SOFTFAIL(0.00)[~all:c]; ML_SERVERS(-3.10)[213.133.102.83]; DKIM_TRACE(0.00)[flygoat.com:+]; RCPT_COUNT_TWO(0.00)[2]; MID_CONTAINS_FROM(1.00)[]; RCVD_IN_DNSWL_NONE(0.00)[213.133.102.83:from]; DMARC_POLICY_ALLOW(0.00)[flygoat.com,none]; DMARC_POLICY_ALLOW_WITH_FAILURES(0.00)[]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:213.133.96.0/19, country:DE]; RCVD_COUNT_TWO(0.00)[2]; HFILTER_HELO_BAREIP(3.00)[213.133.102.83,1] X-Rspamd-Server: mail20.mymailcheap.com Subject: [FFmpeg-devel] [PATCH v5 6/6] libavcodec: MIPS: MMI: Move stack pointer out of the cobber list X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" GCC compplains: warning: listing the stack pointer register ‘$29’ in a clobber list is deprecated [-Wdeprecated] Actually stack pointer was restored at the end of the inline assembly so there is no reason to add it to the cobber list. Also use $sp insted of $29 to make our intention much more clear. Signed-off-by: Jiaxun Yang --- libavcodec/mips/h264dsp_mmi.c | 38 +++++++++++++++++------------------ 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/libavcodec/mips/h264dsp_mmi.c b/libavcodec/mips/h264dsp_mmi.c index 7a60ee7c2b..20df39806f 100644 --- a/libavcodec/mips/h264dsp_mmi.c +++ b/libavcodec/mips/h264dsp_mmi.c @@ -177,7 +177,7 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) __asm__ volatile ( "lhu %[tmp0], 0x00(%[block]) \n\t" - PTR_ADDI "$29, $29, -0x20 \n\t" + PTR_ADDI "$sp, $sp, -0x20 \n\t" PTR_ADDIU "%[tmp0], %[tmp0], 0x20 \n\t" MMI_LDC1(%[ftmp1], %[block], 0x10) "sh %[tmp0], 0x00(%[block]) \n\t" @@ -254,8 +254,8 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "punpckhwd %[ftmp3], %[ftmp6], %[ftmp0] \n\t" "punpcklwd %[ftmp6], %[ftmp6], %[ftmp0] \n\t" MMI_LDC1(%[ftmp0], %[block], 0x00) - MMI_SDC1(%[ftmp7], $29, 0x00) - MMI_SDC1(%[ftmp1], $29, 0x10) + MMI_SDC1(%[ftmp7], $sp, 0x00) + MMI_SDC1(%[ftmp1], $sp, 0x10) "dmfc1 %[tmp1], %[ftmp6] \n\t" "dmfc1 %[tmp3], %[ftmp3] \n\t" "punpckhhw %[ftmp3], %[ftmp5], %[ftmp2] \n\t" @@ -266,8 +266,8 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "punpcklwd %[ftmp5], %[ftmp5], %[ftmp4] \n\t" "punpckhwd %[ftmp4], %[ftmp3], %[ftmp2] \n\t" "punpcklwd %[ftmp3], %[ftmp3], %[ftmp2] \n\t" - MMI_SDC1(%[ftmp5], $29, 0x08) - MMI_SDC1(%[ftmp0], $29, 0x18) + MMI_SDC1(%[ftmp5], $sp, 0x08) + MMI_SDC1(%[ftmp0], $sp, 0x18) "dmfc1 %[tmp2], %[ftmp3] \n\t" "dmfc1 %[tmp4], %[ftmp4] \n\t" MMI_LDC1(%[ftmp1], %[block], 0x18) @@ -359,7 +359,7 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) PTR_ADDIU "%[addr0], %[dst], 0x04 \n\t" "mov.d %[ftmp7], %[ftmp10] \n\t" "dmtc1 %[tmp3], %[ftmp6] \n\t" - MMI_LDC1(%[ftmp1], $29, 0x10) + MMI_LDC1(%[ftmp1], $sp, 0x10) "dmtc1 %[tmp1], %[ftmp3] \n\t" "mov.d %[ftmp4], %[ftmp1] \n\t" "psrah %[ftmp1], %[ftmp1], %[ftmp8] \n\t" @@ -392,7 +392,7 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "psrah %[ftmp0], %[ftmp3], %[ftmp8] \n\t" "paddh %[ftmp2], %[ftmp2], %[ftmp3] \n\t" "psubh %[ftmp0], %[ftmp0], %[ftmp7] \n\t" - MMI_LDC1(%[ftmp3], $29, 0x00) + MMI_LDC1(%[ftmp3], $sp, 0x00) "dmtc1 %[tmp5], %[ftmp7] \n\t" "paddh %[ftmp7], %[ftmp7], %[ftmp3] \n\t" "paddh %[ftmp3], %[ftmp3], %[ftmp3] \n\t" @@ -414,9 +414,9 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "paddh %[ftmp1], %[ftmp1], %[ftmp7] \n\t" "psubh %[ftmp3], %[ftmp3], %[ftmp6] \n\t" "paddh %[ftmp7], %[ftmp7], %[ftmp7] \n\t" - MMI_SDC1(%[ftmp3], $29, 0x00) + MMI_SDC1(%[ftmp3], $sp, 0x00) "psubh %[ftmp7], %[ftmp7], %[ftmp1] \n\t" - MMI_SDC1(%[ftmp0], $29, 0x10) + MMI_SDC1(%[ftmp0], $sp, 0x10) "dmfc1 %[tmp1], %[ftmp2] \n\t" "xor %[ftmp2], %[ftmp2], %[ftmp2] \n\t" MMI_SDC1(%[ftmp2], %[block], 0x00) @@ -463,8 +463,8 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "packushb %[ftmp0], %[ftmp0], %[ftmp2] \n\t" MMI_SWC1(%[ftmp3], %[dst], 0x00) MMI_SWXC1(%[ftmp0], %[dst], %[stride], 0x00) - MMI_LDC1(%[ftmp5], $29, 0x00) - MMI_LDC1(%[ftmp4], $29, 0x10) + MMI_LDC1(%[ftmp5], $sp, 0x00) + MMI_LDC1(%[ftmp4], $sp, 0x10) "dmtc1 %[tmp1], %[ftmp6] \n\t" PTR_ADDU "%[dst], %[dst], %[stride] \n\t" PTR_ADDU "%[dst], %[dst], %[stride] \n\t" @@ -496,7 +496,7 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) MMI_SWXC1(%[ftmp0], %[dst], %[stride], 0x00) "dmtc1 %[tmp4], %[ftmp1] \n\t" "dmtc1 %[tmp2], %[ftmp6] \n\t" - MMI_LDC1(%[ftmp4], $29, 0x18) + MMI_LDC1(%[ftmp4], $sp, 0x18) "mov.d %[ftmp5], %[ftmp4] \n\t" "psrah %[ftmp4], %[ftmp4], %[ftmp8] \n\t" "psrah %[ftmp7], %[ftmp11], %[ftmp8] \n\t" @@ -528,7 +528,7 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "psrah %[ftmp7], %[ftmp6], %[ftmp8] \n\t" "paddh %[ftmp0], %[ftmp0], %[ftmp6] \n\t" "psubh %[ftmp7], %[ftmp7], %[ftmp3] \n\t" - MMI_LDC1(%[ftmp6], $29, 0x08) + MMI_LDC1(%[ftmp6], $sp, 0x08) "dmtc1 %[tmp6], %[ftmp3] \n\t" "paddh %[ftmp3], %[ftmp3], %[ftmp6] \n\t" "paddh %[ftmp6], %[ftmp6], %[ftmp6] \n\t" @@ -550,9 +550,9 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "paddh %[ftmp4], %[ftmp4], %[ftmp3] \n\t" "psubh %[ftmp6], %[ftmp6], %[ftmp1] \n\t" "paddh %[ftmp3], %[ftmp3], %[ftmp3] \n\t" - MMI_SDC1(%[ftmp6], $29, 0x08) + MMI_SDC1(%[ftmp6], $sp, 0x08) "psubh %[ftmp3], %[ftmp3], %[ftmp4] \n\t" - MMI_SDC1(%[ftmp7], $29, 0x18) + MMI_SDC1(%[ftmp7], $sp, 0x18) "dmfc1 %[tmp2], %[ftmp0] \n\t" "xor %[ftmp0], %[ftmp0], %[ftmp0] \n\t" MMI_ULWC1(%[ftmp6], %[addr0], 0x00) @@ -581,8 +581,8 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "packushb %[ftmp7], %[ftmp7], %[ftmp0] \n\t" MMI_SWC1(%[ftmp6], %[addr0], 0x00) MMI_SWXC1(%[ftmp7], %[addr0], %[stride], 0x00) - MMI_LDC1(%[ftmp2], $29, 0x08) - MMI_LDC1(%[ftmp5], $29, 0x18) + MMI_LDC1(%[ftmp2], $sp, 0x08) + MMI_LDC1(%[ftmp5], $sp, 0x18) PTR_ADDU "%[addr0], %[addr0], %[stride] \n\t" "dmtc1 %[tmp2], %[ftmp1] \n\t" PTR_ADDU "%[addr0], %[addr0], %[stride] \n\t" @@ -612,7 +612,7 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) "packushb %[ftmp7], %[ftmp7], %[ftmp0] \n\t" MMI_SWC1(%[ftmp6], %[addr0], 0x00) MMI_SWXC1(%[ftmp7], %[addr0], %[stride], 0x00) - PTR_ADDIU "$29, $29, 0x20 \n\t" + PTR_ADDIU "$sp, $sp, 0x20 \n\t" : [ftmp0]"=&f"(ftmp[0]), [ftmp1]"=&f"(ftmp[1]), [ftmp2]"=&f"(ftmp[2]), [ftmp3]"=&f"(ftmp[3]), [ftmp4]"=&f"(ftmp[4]), [ftmp5]"=&f"(ftmp[5]), @@ -630,7 +630,7 @@ void ff_h264_idct8_add_8_mmi(uint8_t *dst, int16_t *block, int stride) [addr0]"=&r"(addr[0]) : [dst]"r"(dst), [block]"r"(block), [stride]"r"((mips_reg)stride) - : "$29","memory" + : "memory" ); }