From patchwork Sun Jun 11 09:34:46 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ivan Kalvachev
X-Patchwork-Id: 3905
In-Reply-To: <25d77459-7eb9-292e-d500-a649b57e9f02@gmail.com>
References: <20170609100848.GA4759@nb4> <25d77459-7eb9-292e-d500-a649b57e9f02@gmail.com>
From: Ivan Kalvachev
Date: Sun, 11 Jun 2017 12:34:46 +0300
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [WIP][PATCH] Opus Piramid Vector Quantization Search in x86 SIMD asm

On 6/10/17, James Darnley wrote:
> On 2017-06-09 13:41, Ivan Kalvachev wrote:
>> On 6/9/17, Michael Niedermayer wrote:
>>> seems this breaks build with mingw64, didn't investigate but it
>>> fails with these errors:
>>>
>>> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x2d):
>>> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
>>> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x3fd):
>>> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
>>> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x7a1):
>>> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
>>> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0xb48):
>>> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
>>> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x2d):
>>> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
>>> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x3fd):
>>> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
>>> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x7a1):
>>> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
>>> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0xb48):
>>> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
>>> collect2: error: ld returned 1 exit status
>>> collect2: error: ld returned 1 exit status
>>> make: *** [ffmpeg_g.exe] Error 1
>>> make: *** Waiting for unfinished jobs....
>>> make: *** [ffprobe_g.exe] Error 1
>>
>>
>> const_*_edge is used in only one place in the code.
>> Would you check if this patch fixes the issue?
>>
>> I expected that the addresses would be pre-calculated
>> by n/yasm as one value and indexed
>> relative to the section start.
>> Instead it seems that each entry is represented with
>> its own address and offset from it.
>> Since the offset is negative it uses all 64 bits and
>> it makes a difference if it is truncated to 32 bits.
>>
>> Same issue could happen with clang tools.
>
> The problem is with the relative addressing. You need to load the real
> address first before you can offset with another register at runtime. So
> something like:
>
>> mov reg1, [read_only_const]

lea ?
>> mova mmreg, [reg1 + reg2]
.

OK,
Getting mingw for my distro is a problem, and
compiling one myself would take a bit more effort/time.

So I'm posting a patch that "should" work.

    lea     r4q, [Nq-mmsize]    ; Nq is rounded up (aligned up) to mmsize, so r4q can't become negative here, unless N=0.
    movups  m2, [inXq + r4q]

======
What I find surprising is that PIC is enabled only on Windows
and does not seem to depend on CONFIG_PIC, so
textrels are used all over the assembly code.

Am I missing something?
Are there option(s) to warn or error out when a textrel is
being used in code that should be PIC?
======

--- a/libavcodec/x86/opus_pvq_search.asm
+++ b/libavcodec/x86/opus_pvq_search.asm
@@ -406,7 +406,7 @@ align 16
 ; uint32 N - Number of vector elements. Must be 0 < N < 8192
 ;
 %macro PVQ_FAST_SEARCH 0
-cglobal pvq_search,4,5,8, mmsize, inX, outY, K, N
+cglobal pvq_search,4,6,8, mmsize, inX, outY, K, N
 %define tmpX rsp
 ; movsxdifnidn Nq, Nd
@@ -419,7 +419,12 @@ cglobal pvq_search,4,5,8, mmsize, inX, outY, K, N
     add Nq, r4q ; Nq = align(Nq, mmsize)
     sub rsp, Nq ; allocate tmpX[Nq]
+%ifdef PIC
+    lea r5q, [const_align_abs_edge] ; rip+const
+    movups m3, [r5q+r4q-mmsize] ; this is the bit mask for the padded read at the end of the input
+%else
     movups m3, [const_align_abs_edge-mmsize+r4q] ; this is the bit mask for the padded read at the end of the input
+%endif
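
======
For reference, the "relocation truncated to fit: R_X86_64_32" message quoted
above can be read as a failed range check. The sketch below is in C rather
than asm so that it compiles stand-alone. It models the check as the x86-64
psABI describes it: for R_X86_64_32 the stored 32-bit value must zero-extend
back to the full 64-bit value, and for R_X86_64_32S it must sign-extend back.
The function names and the sample addresses are made up for illustration.
That a mingw64 image ends up above 4 GiB is an assumption here, not
something verified in this thread.

```c
#include <stdint.h>

/*
 * Model of the linker's range check for R_X86_64_32:
 * only the low 32 bits are stored, and zero-extending them
 * must reproduce the original 64-bit value.
 * (Hypothetical helper name, for illustration only.)
 */
static int fits_r_x86_64_32(uint64_t value)
{
    return (uint64_t)(uint32_t)value == value;
}

/*
 * Model of the check for R_X86_64_32S:
 * the low 32 bits are stored, and sign-extending them
 * must reproduce the original 64-bit value. That holds for
 * the lowest 2 GiB and the highest 2 GiB of the address space.
 * (Hypothetical helper name, for illustration only.)
 */
static int fits_r_x86_64_32s(uint64_t value)
{
    return value <= UINT64_C(0x000000007FFFFFFF) ||
           value >= UINT64_C(0xFFFFFFFF80000000);
}
```

With these definitions, fits_r_x86_64_32(0x401000) holds, while
fits_r_x86_64_32(0x140001000) does not. The second case is the situation ld
reports as "relocation truncated to fit". A small negative value such as
0xFFFFFFFFFFFFFFF0 passes only the sign-extending check. Loading the table
address with a RIP-relative lea first, as in the PIC branch of the patch,
sidesteps the absolute 32-bit address entirely.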