From patchwork Fri Jun 30 01:53:06 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 4159
To: ffmpeg-devel@ffmpeg.org
References: <20170622125628.24516-1-matthieu.bouron@gmail.com>
 <8eb1a3b6-e135-13c4-fd5a-c6c8fcfc5762@gmail.com>
 <20170623150135.GD26847@tsuri.lan>
 <20170628124855.GF26847@tsuri.lan>
 <20170629235805.GB4727@nb4>
From: James Almer
Date: Thu, 29 Jun 2017 22:53:06 -0300
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.2.1
Content-Language: en-US
Subject: Re: [FFmpeg-devel] [PATCH 1/2] checkasm: add sbrdsp tests
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Sender: "ffmpeg-devel"

On 6/29/2017 10:14 PM, Henrik Gramner wrote:
> On Fri, Jun 30, 2017 at 1:58 AM, Michael Niedermayer wrote:
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x0000000000684919 in ff_sbr_hf_gen_sse ()
>
>> 0x0000000000684909 :  sub    %r9,%r8
>
>> => 0x0000000000684919 :  movaps (%rsi,%r8,1),%xmm0
>
>> r9 0xdeadbeef00000080 -2401053092612145024
>
> Another case of a 32-bit int being used as part of a 64-bit operation.

I can't reproduce it on my ArchLinux x86_64 environment for some reason,
but based on what you said i assume the attached patch should fix it.

From f4646091b450b7c4c5479fbb4163ef89615a4a8d Mon Sep 17 00:00:00 2001
From: James Almer
Date: Thu, 29 Jun 2017 22:51:04 -0300
Subject: [PATCH] x86/sbrdsp: zero extend start and end gprs in
 ff_sbr_hf_gen_sse

Signed-off-by: James Almer
---
 libavcodec/x86/sbrdsp.asm | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/libavcodec/x86/sbrdsp.asm b/libavcodec/x86/sbrdsp.asm
index d0f774b277..c716184b14 100644
--- a/libavcodec/x86/sbrdsp.asm
+++ b/libavcodec/x86/sbrdsp.asm
@@ -149,19 +149,19 @@ cglobal sbr_hf_gen, 4,4,8, X_high, X_low, alpha0, alpha1, BW, S, E
     ; start and end 6th and 7th args on stack
     mov         r2d, Sm
     mov         r3d, Em
-%define  start r2q
-%define  end   r3q
+    DEFINE_ARGS X_high, X_low, start, end
 %else
 ; BW does not actually occupy a register, so shift by 1
-%define  start BWq
-%define  end   Sq
+    DEFINE_ARGS X_high, X_low, alpha0, alpha1, start, end
+    movsxd      startq, startd
+    movsxd      endq, endd
 %endif
-    sub         start, end          ; neg num of loops
-    lea         X_highq, [X_highq + end*2*4]
-    lea         X_lowq,  [X_lowq  + end*2*4 - 2*2*4]
-    shl         start, 3            ; offset from num loops
+    sub         startq, endq        ; neg num of loops
+    lea         X_highq, [X_highq + endq*2*4]
+    lea         X_lowq,  [X_lowq  + endq*2*4 - 2*2*4]
+    shl         startq, 3           ; offset from num loops
 
-    mova        m0, [X_lowq + start]
+    mova        m0, [X_lowq + startq]
     shufps      m3, m3, q1111
     shufps      m4, m4, q1111
     xorps       m3, [ps_mask]
@@ -169,7 +169,7 @@ cglobal sbr_hf_gen, 4,4,8, X_high, X_low, alpha0, alpha1, BW, S, E
     shufps      m2, m2, q0000
     xorps       m4, [ps_mask]
 .loop2:
-    movu        m7, [X_lowq + start + 8]    ; BbCc
+    movu        m7, [X_lowq + startq + 8]   ; BbCc
     mova        m6, m0
     mova        m5, m7
     shufps      m0, m0, q2301               ; aAbB
@@ -179,12 +179,12 @@ cglobal sbr_hf_gen, 4,4,8, X_high, X_low, alpha0, alpha1, BW, S, E
     mulps       m6, m2
     mulps       m5, m1
     addps       m7, m0
-    mova        m0, [X_lowq + start +16]    ; CcDd
+    mova        m0, [X_lowq + startq + 16]  ; CcDd
     addps       m7, m0
     addps       m6, m5
     addps       m7, m6
-    mova        [X_highq + start], m7
-    add         start, 16
+    mova        [X_highq + startq], m7
+    add         startq, 16
     jnz         .loop2
     RET
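
For readers less familiar with the failure mode, here is a rough C model of "a 32-bit int being used as part of a 64-bit operation". It is only a sketch and not FFmpeg code: the helper names (dirty_reg, sign_extend32, byte_offset_raw, byte_offset_extended) are made up for illustration, and the 0xdeadbeef upper half is simply the pattern visible in the r9 dump quoted above. Note that movsxd sign-extends; for the non-negative start and end values this gives the same result as the zero extension named in the subject line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit register that carries a 32-bit int
 * argument in its low half and leftover garbage in its high half. */
static uint64_t dirty_reg(int32_t value)
{
    return 0xdeadbeef00000000ULL | (uint32_t)value;
}

/* What movsxd does: keep the low 32 bits, sign-extend them to 64 bits. */
static int64_t sign_extend32(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}

/* Byte offset end*2*4 computed on the full register, garbage included
 * (the situation before the patch). */
static uint64_t byte_offset_raw(uint64_t end_reg)
{
    return end_reg * 2 * 4;
}

/* Same offset after extending the 32-bit value first
 * (the situation after the patch). */
static int64_t byte_offset_extended(uint64_t end_reg)
{
    return sign_extend32(end_reg) * 2 * 4;
}

/* Self-check with end = 0x80, the low half of r9 in the backtrace. */
static void demo(void)
{
    uint64_t end_reg = dirty_reg(128);

    assert(end_reg == 0xdeadbeef00000080ULL);
    assert(byte_offset_raw(end_reg) != 1024);      /* wild offset */
    assert(byte_offset_extended(end_reg) == 1024); /* 128 * 8 */
}
```

Calling demo() passes silently. The raw offset lands far outside the X_low buffer, which is consistent with the SIGSEGV on the movaps load, while the extended value gives the intended 1024 bytes.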