From patchwork Thu Jul 2 15:46:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaxun Yang X-Patchwork-Id: 20780 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 591ED44B923 for ; Thu, 2 Jul 2020 18:47:07 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 43CC368AD9A; Thu, 2 Jul 2020 18:47:07 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from relay4.mymailcheap.com (relay4.mymailcheap.com [137.74.80.156]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 73B9F68A0BE for ; Thu, 2 Jul 2020 18:46:59 +0300 (EEST) Received: from filter2.mymailcheap.com (filter2.mymailcheap.com [91.134.140.82]) by relay4.mymailcheap.com (Postfix) with ESMTPS id 959313F1CF for ; Thu, 2 Jul 2020 17:46:57 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by filter2.mymailcheap.com (Postfix) with ESMTP id 63B542A4F0 for ; Thu, 2 Jul 2020 17:46:57 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=mymailcheap.com; s=default; t=1593704817; bh=WENfaguslmkx/sed9tpjYBmjFwSvSsMpcPLT6uOLDEE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HGIol5+xFFbAaq4G+hItxbpwK1L+7SemHZGJcp6rjcdXccX8f2peuQ82tiFTBKnB0 xUt2VRNgonIrm+kkXZ/cGPNrdnNUadOhhF3fPfJ56QqwJ6iTYjfFG5VaHtH2qXCEaY fZSTzyKEMHFovZF6U6L8zwMEqk2tNL3Va1zvU7uY= X-Virus-Scanned: Debian amavisd-new at filter2.mymailcheap.com Received: from filter2.mymailcheap.com ([127.0.0.1]) by localhost (filter2.mymailcheap.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EEo4bPzEHbU7 for ; Thu, 2 Jul 2020 17:46:55 +0200 (CEST) Received: from mail20.mymailcheap.com (mail20.mymailcheap.com [51.83.111.147]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by filter2.mymailcheap.com (Postfix) with ESMTPS for ; Thu, 2 Jul 2020 17:46:55 +0200 (CEST) Received: from [213.133.102.83] (ml.mymailcheap.com [213.133.102.83]) by mail20.mymailcheap.com (Postfix) with ESMTP id 6A349403B3; Thu, 2 Jul 2020 15:46:53 +0000 (UTC) Authentication-Results: mail20.mymailcheap.com; dkim=pass (1024-bit key; unprotected) header.d=flygoat.com header.i=@flygoat.com header.b="XNK5d6vx"; dkim-atps=neutral AI-Spam-Status: Not processed Received: from halation.202.net.flygoat.com (unknown [115.227.164.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mail20.mymailcheap.com (Postfix) with ESMTPSA id 949084084C; Thu, 2 Jul 2020 15:46:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=flygoat.com; s=default; t=1593704795; bh=WENfaguslmkx/sed9tpjYBmjFwSvSsMpcPLT6uOLDEE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XNK5d6vxCQad6v+qdPoI7Am7AT3OUVcxr8savJ54AVty27mqb1SiJ306qkjlfgv7G X9xrgu/tOTtp+DytZn2oS8YaWbG4QBZWSPpg+iulzwgNp2cItau9w1T0UyzV81hCHl JGhIccyexsJ832FB6o8WLhBPGw9XaCqb/I2ESEtI= From: Jiaxun Yang To: ffmpeg-devel@ffmpeg.org Date: Thu, 2 Jul 2020 23:46:17 +0800 Message-Id: <20200702154622.21280-2-jiaxun.yang@flygoat.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200702154622.21280-1-jiaxun.yang@flygoat.com> References: <20200702154622.21280-1-jiaxun.yang@flygoat.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 6A349403B3 X-Spamd-Result: default: False [4.90 / 10.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[]; R_DKIM_ALLOW(0.00)[flygoat.com:s=default]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; R_SPF_SOFTFAIL(0.00)[~all]; ML_SERVERS(-3.10)[213.133.102.83]; DKIM_TRACE(0.00)[flygoat.com:+]; RCPT_COUNT_TWO(0.00)[2]; MID_CONTAINS_FROM(1.00)[]; RCVD_IN_DNSWL_NONE(0.00)[213.133.102.83:from]; DMARC_POLICY_ALLOW(0.00)[flygoat.com,none]; DMARC_POLICY_ALLOW_WITH_FAILURES(0.00)[]; RCVD_NO_TLS_LAST(0.10)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:24940, ipnet:213.133.96.0/19, country:DE]; RCVD_COUNT_TWO(0.00)[2]; HFILTER_HELO_BAREIP(3.00)[213.133.102.83,1] X-Rspamd-Server: mail20.mymailcheap.com X-Spam: Yes Subject: [FFmpeg-devel] [PATCH v5 1/6] ffbuild: Refine MIPS handling X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Jiaxun Yang Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" To enable runtime detection for MIPS, we need to refine ffbuild part to support buildding these feature together. Firstly, we fixed configure, let it probe native ability of toolchain to decide wether a feature can to be enabled, also clearly marked the conflictions between loongson2 & loongson3 and Release 6 & rest. Secondly, we compile MMI and MSA C sources with their own flags to ensure their flags won't pollute the whole program and generate illegal code. Signed-off-by: Jiaxun Yang --- v5: Minor fixes --- configure | 180 +++++++++++++++++++++++---------------- ffbuild/common.mak | 10 ++- libavcodec/mips/Makefile | 3 +- 3 files changed, 118 insertions(+), 75 deletions(-) diff --git a/configure b/configure index 7495f35faa..385a9e5f5f 100755 --- a/configure +++ b/configure @@ -2551,7 +2551,7 @@ mips64r6_deps="mips" mipsfpu_deps="mips" mipsdsp_deps="mips" mipsdspr2_deps="mips" -mmi_deps="mips" +mmi_deps_any="loongson2 loongson3" msa_deps="mipsfpu" msa2_deps="msa" @@ -5002,8 +5002,6 @@ elif enabled bfin; then elif enabled mips; then - cpuflags="-march=$cpu" - if [ "$cpu" != "generic" ]; then disable mips32r2 disable mips32r5 @@ -5012,19 +5010,61 @@ elif enabled mips; then disable mips64r6 disable loongson2 disable loongson3 + disable mipsdsp + disable mipsdspr2 + disable msa + disable mmi + + cpuflags="-march=$cpu" case $cpu in - 24kc|24kf*|24kec|34kc|1004kc|24kef*|34kf*|1004kf*|74kc|74kf) + # General ISA levels + mips1|mips3) + ;; + mips32r2) + enable msa enable mips32r2 - disable msa ;; - p5600|i6400|p6600) - disable mipsdsp - disable mipsdspr2 + mips32r5) + enable msa + enable mips32r2 + enable mips32r5 ;; - loongson*) - enable loongson2 + mips64r2|mips64r5) + enable msa + enable mmi + enable mips64r2 enable loongson3 + ;; + # Cores from MIPS(MTI) + 24kc) + disable mipsfpu + enable mips32r2 + ;; + 24kf*|24kec|34kc|74Kc|1004kc) + enable mips32r2 + ;; + 24kef*|34kf*|1004kf*) + enable mipsdsp + enable mips32r2 + ;; + p5600) + enable msa + enable mips32r2 + enable mips32r5 + check_cflags "-mtune=p5600" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" + ;; + i6400) + enable mips64r6 + check_cflags "-mtune=i6400 -mabi=64" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" && check_ldflags "-mabi=64" + ;; + p6600) + enable mips64r6 + check_cflags "-mtune=p6600 -mabi=64" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" && check_ldflags "-mabi=64" + ;; + # Cores from Loongson + loongson2e|loongson2f|loongson3*) + enable mmi enable local_aligned enable simd_align_16 enable fast_64bit @@ -5032,75 +5072,44 @@ elif enabled mips; then enable fast_cmov enable fast_unaligned disable aligned_stack - disable mipsdsp - disable mipsdspr2 # When gcc version less than 5.3.0, add -fno-expensive-optimizations flag. - if [ $cc == gcc ]; then - gcc_version=$(gcc -dumpversion) - if [ "$(echo "$gcc_version 5.3.0" | tr " " "\n" | sort -rV | head -n 1)" == "$gcc_version" ]; then - expensive_optimization_flag="" - else + if test "$cc_type" = "gcc"; then + case $gcc_basever in + 2|2.*|3.*|4.*|5.0|5.1|5.2) expensive_optimization_flag="-fno-expensive-optimizations" - fi + ;; + *) + expensive_optimization_flag="" + ;; + esac fi + case $cpu in loongson3*) + enable loongson3 + enable mips64r2 + enable msa cpuflags="-march=loongson3a -mhard-float $expensive_optimization_flag" ;; loongson2e) + enable loongson2 cpuflags="-march=loongson2e -mhard-float $expensive_optimization_flag" ;; loongson2f) + enable loongson2 cpuflags="-march=loongson2f -mhard-float $expensive_optimization_flag" ;; esac ;; *) - # Unknown CPU. Disable everything. - warn "unknown CPU. Disabling all MIPS optimizations." - disable mipsfpu - disable mipsdsp - disable mipsdspr2 - disable msa - disable mmi + warn "unknown MIPS CPU" ;; esac - case $cpu in - 24kc) - disable mipsfpu - disable mipsdsp - disable mipsdspr2 - ;; - 24kf*) - disable mipsdsp - disable mipsdspr2 - ;; - 24kec|34kc|1004kc) - disable mipsfpu - disable mipsdspr2 - ;; - 24kef*|34kf*|1004kf*) - disable mipsdspr2 - ;; - 74kc) - disable mipsfpu - ;; - p5600) - enable mips32r5 - check_cflags "-mtune=p5600" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" - ;; - i6400) - enable mips64r6 - check_cflags "-mtune=i6400 -mabi=64" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" && check_ldflags "-mabi=64" - ;; - p6600) - enable mips64r6 - check_cflags "-mtune=p6600 -mabi=64" && check_cflags "-msched-weight -mload-store-pairs -funroll-loops" && check_ldflags "-mabi=64" - ;; - esac else - # We do not disable anything. Is up to the user to disable the unwanted features. + disable mipsdsp + disable mipsdspr2 + # Disable DSP stuff for generic CPU, it can't be detected at runtime. warn 'generic cpu selected' fi @@ -5847,28 +5856,51 @@ EOF elif enabled mips; then - enabled loongson2 && check_inline_asm loongson2 '"dmult.g $8, $9, $10"' - enabled loongson3 && check_inline_asm loongson3 '"gsldxc1 $f0, 0($2, $3)"' - enabled mmi && check_inline_asm mmi '"punpcklhw $f0, $f0, $f0"' - # Enable minimum ISA based on selected options + # Check toolchain ISA level if enabled mips64; then - enabled mips64r6 && check_inline_asm_flags mips64r6 '"dlsa $0, $0, $0, 1"' '-mips64r6' - enabled mips64r2 && check_inline_asm_flags mips64r2 '"dext $0, $0, 0, 1"' '-mips64r2' - disabled mips64r6 && disabled mips64r2 && check_inline_asm_flags mips64r1 '"daddi $0, $0, 0"' '-mips64' + enabled mips64r6 && check_inline_asm mips64r6 '"dlsa $0, $0, $0, 1"' && + disable mips64r2 + + enabled mips64r2 && check_inline_asm mips64r2 '"dext $0, $0, 0, 1"' + + disable mips32r6 && disable mips32r5 && disable mips32r2 else - enabled mips32r6 && check_inline_asm_flags mips32r6 '"aui $0, $0, 0"' '-mips32r6' - enabled mips32r5 && check_inline_asm_flags mips32r5 '"eretnc"' '-mips32r5' - enabled mips32r2 && check_inline_asm_flags mips32r2 '"ext $0, $0, 0, 1"' '-mips32r2' - disabled mips32r6 && disabled mips32r5 && disabled mips32r2 && check_inline_asm_flags mips32r1 '"addi $0, $0, 0"' '-mips32' + enabled mips32r6 && check_inline_asm mips32r6 '"aui $0, $0, 0"' && + disable mips32r5 && disable mips32r2 + + enabled mips32r5 && check_inline_asm mips32r5 '"eretnc"' + enabled mips32r2 && check_inline_asm mips32r2 '"ext $0, $0, 0, 1"' + + disable mips64r6 && disable mips64r5 && disable mips64r2 fi - enabled mipsfpu && check_inline_asm_flags mipsfpu '"cvt.d.l $f0, $f2"' '-mhard-float' + enabled mipsfpu && check_inline_asm mipsfpu '"cvt.d.l $f0, $f2"' enabled mipsfpu && (enabled mips32r5 || enabled mips32r6 || enabled mips64r6) && check_inline_asm_flags mipsfpu '"cvt.d.l $f0, $f1"' '-mfp64' - enabled mipsfpu && enabled msa && check_inline_asm_flags msa '"addvi.b $w0, $w1, 1"' '-mmsa' && check_headers msa.h || disable msa + enabled mipsdsp && check_inline_asm_flags mipsdsp '"addu.qb $t0, $t1, $t2"' '-mdsp' enabled mipsdspr2 && check_inline_asm_flags mipsdspr2 '"absq_s.qb $t0, $t1"' '-mdspr2' - enabled msa && enabled msa2 && check_inline_asm_flags msa2 '"nxbits.any.b $w0, $w0"' '-mmsa2' && check_headers msa2.h || disable msa2 + + # MSA and MSA2 can be detected at runtime so we supply extra flags here + enabled mipsfpu && enabled msa && check_inline_asm msa '"addvi.b $w0, $w1, 1"' '-mmsa' && append MSAFLAGS '-mmsa' + enabled msa && enabled msa2 && check_inline_asm msa2 '"nxbits.any.b $w0, $w0"' '-mmsa2' && append MSAFLAGS '-mmsa2' + + # loongson2 have no switch cflag so we can only probe toolchain ability + enabled loongson2 && check_inline_asm loongson2 '"dmult.g $8, $9, $10"' + if enabled loongson2 ; then + disable loongson3 + fi + + # loongson3 is paired with MMI + enabled mips64r2 && enabled loongson3 && check_inline_asm loongson3 '"gsldxc1 $f0, 0($2, $3)"' '-mloongson-ext' && append MMIFLAGS '-mloongson-ext' + + # MMI must come together with loongson2 or loongson3 + if disabled loongson2 && disabled loongson3; then + disable mmi + fi + + # MMI can be detected at runtime too + enabled mmi && check_inline_asm mmi '"punpcklhw $f0, $f0, $f0"' '-mloongson-mmi' && append MMIFLAGS '-mloongson-mmi' if enabled bigendian && enabled msa; then disable msa @@ -7443,6 +7475,8 @@ LDSOFLAGS=$LDSOFLAGS SHFLAGS=$(echo $($ldflags_filter $SHFLAGS)) ASMSTRIPFLAGS=$ASMSTRIPFLAGS X86ASMFLAGS=$X86ASMFLAGS +MSAFLAGS=$MSAFLAGS +MMIFLAGS=$MMIFLAGS BUILDSUF=$build_suffix PROGSSUF=$progs_suffix FULLNAME=$FULLNAME diff --git a/ffbuild/common.mak b/ffbuild/common.mak index a60d27c9bd..6b95a17fbb 100644 --- a/ffbuild/common.mak +++ b/ffbuild/common.mak @@ -44,7 +44,7 @@ LDFLAGS := $(ALLFFLIBS:%=$(LD_PATH)lib%) $(LDFLAGS) define COMPILE $(call $(1)DEP,$(1)) - $($(1)) $($(1)FLAGS) $($(1)_DEPFLAGS) $($(1)_C) $($(1)_O) $(patsubst $(SRC_PATH)/%,$(SRC_LINK)/%,$<) + $($(1)) $($(1)FLAGS) $($(1)_DEPFLAGS) $($(1)_C) $($(1)_O) $($(2)) $(patsubst $(SRC_PATH)/%,$(SRC_LINK)/%,$<) endef COMPILE_C = $(call COMPILE,CC) @@ -54,6 +54,14 @@ COMPILE_M = $(call COMPILE,OBJCC) COMPILE_X86ASM = $(call COMPILE,X86ASM) COMPILE_HOSTC = $(call COMPILE,HOSTCC) COMPILE_NVCC = $(call COMPILE,NVCC) +COMPILE_MMI = $(call COMPILE,CC,MMIFLAGS) +COMPILE_MSA = $(call COMPILE,CC,MSAFLAGS) + +%_mmi.o: %_mmi.c + $(COMPILE_MMI) + +%_msa.o: %_msa.c + $(COMPILE_MSA) %.o: %.c $(COMPILE_C) diff --git a/libavcodec/mips/Makefile b/libavcodec/mips/Makefile index b4993f6e76..2be4d9b8a2 100644 --- a/libavcodec/mips/Makefile +++ b/libavcodec/mips/Makefile @@ -71,6 +71,8 @@ MSA-OBJS-$(CONFIG_IDCTDSP) += mips/idctdsp_msa.o \ MSA-OBJS-$(CONFIG_MPEGVIDEO) += mips/mpegvideo_msa.o MSA-OBJS-$(CONFIG_MPEGVIDEOENC) += mips/mpegvideoencdsp_msa.o MSA-OBJS-$(CONFIG_ME_CMP) += mips/me_cmp_msa.o +MSA-OBJS-$(CONFIG_VC1_DECODER) += mips/vc1dsp_msa.o + MMI-OBJS += mips/constants.o MMI-OBJS-$(CONFIG_H264DSP) += mips/h264dsp_mmi.o MMI-OBJS-$(CONFIG_H264CHROMA) += mips/h264chroma_mmi.o @@ -89,4 +91,3 @@ MMI-OBJS-$(CONFIG_WMV2DSP) += mips/wmv2dsp_mmi.o MMI-OBJS-$(CONFIG_HEVC_DECODER) += mips/hevcdsp_mmi.o MMI-OBJS-$(CONFIG_VP3DSP) += mips/vp3dsp_idct_mmi.o MMI-OBJS-$(CONFIG_VP9_DECODER) += mips/vp9_mc_mmi.o -MSA-OBJS-$(CONFIG_VC1_DECODER) += mips/vc1dsp_msa.o