From patchwork Tue Jul 4 18:15:56 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 4212 Delivered-To: ffmpegpatchwork@gmail.com Received: by 10.103.1.76 with SMTP id 73csp1294139vsb; Tue, 4 Jul 2017 11:16:11 -0700 (PDT) X-Received: by 10.28.223.86 with SMTP id w83mr27338127wmg.9.1499192171139; Tue, 04 Jul 2017 11:16:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1499192171; cv=none; d=google.com; s=arc-20160816; b=zfc6TNdWDh46iFcomTb3lOfTWHLxlP/KMklZBliNfDhlr7a3IW1fmzVYpABEX2BvbL Uc/VIIICmCV2axFgUwHRId4E0Xhpkd4TniEi+lQpeT9FXmnCJ4ctDWEkTcb4lXjAFFMt w7+xpgYJP3iLjfXspPEtnfo+AU1Ldpd6Y9WaYXw0F2zMVwYEhNMEjoD4A8UQHPUXIjt4 H1Gi7UhyiKAlK34MAKbT9/PGbbaplTeV0Wm6YKHpA642qDYllgNhNWq7BTN35Sa6DQw2 j5gGYOliwom9cE9lMFqZ3YVjieqqCSaB2yijJgTA9WMgzWhlxmDVFWeyz0n/b87tKLqx Djfg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:subject :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:to:dkim-signature:delivered-to :arc-authentication-results; bh=3x+PArIyESi6zLfvs5JVNH9NPheJm0H0RYLedRQJbVE=; b=RcrAz3gacF26bbmKTP4ktpriaH/cpJVeBOWAOAC7b5YFmWfe3SSosXxWGTAlR/8N0X iKkfGbSYrJ1ZVGUX4t3rnqbqY7vKyy2Y//pEIjb3C288zGi6rgQ2QhTxR1+sLA+dSpIn rtkXuWLbI17OFKZ6OOuqiafzBsBhW6PehG6MUqRYWRTvCh4SNoWTc0y4+74ZHSN8aNkx YUynGhjG4X3Ihg4IyWroYk85AEvcvbtZ6tJAyOLlVth0F51N9DI+z/cbTgIf6NoH6laW VtNQOuXML4g3IXdG8YvzNIfzqafmzXVitQowtmXCQ0fM4aPiDKXqOnbtaWNZTJHX/6cK jheQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.b=eB+Ys4RJ; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id v23si14617154wra.229.2017.07.04.11.16.10; Tue, 04 Jul 2017 11:16:11 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.b=eB+Ys4RJ; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2301B68075B; Tue, 4 Jul 2017 21:16:06 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-qk0-f172.google.com (mail-qk0-f172.google.com [209.85.220.172]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 726C9680481 for ; Tue, 4 Jul 2017 21:15:59 +0300 (EEST) Received: by mail-qk0-f172.google.com with SMTP id d78so173183515qkb.1 for ; Tue, 04 Jul 2017 11:16:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language; bh=+VnAFBJod0ro1OJO5FaRlk+tUFRgfX6i1xl6xYCGae0=; b=eB+Ys4RJGi4/zw7kJTS8XPnhGuijRrV2wn+KVlXaeEDJ9MpHklvElcuHB0WZVD6WD5 DA+SJepS6CijYHzuWJXQsM3UpCwTiJmxJXpAMyDHPFtgOiV+6EP2/sEhjBIZgDOXWCb0 p1tqnFEhAYycSTIYWQHn7ShbXhlGHFlFqrDahlBwaEe7D24d/GEeNNy30QEgiIF7rpUV 9uMsGIbLCw3WK25B9XuM/SCUkkBdXt56EgtRqM6/WKNwwsau7AFThjAL8+sHIs8vlRUh pUXjnQwlzhJs/YqmltadZYWS2UsOIk2TvSmnOrKi9OQny0jnl8Tp5ZBw4Fr+neMbkDGt 4Pmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language; bh=+VnAFBJod0ro1OJO5FaRlk+tUFRgfX6i1xl6xYCGae0=; b=QD/eVoBXUvRbeSVmC2NlPrl5nwvT5BgOP9psksh5dhvUfYd2dL8+Ub5+qisQkXrMjE BvUR3YvkAP8o8XAL/onKk6cgevzlhcLL6BOARXTOzuvmnZevQjkWQdknsX0QEdHrYqmm 2SlHExBk+sbacx77U40Kjrl/rYvK75uMhXByRjrjafOpbI0gLoX9xpEd5u+S1dch+a/2 trcfjdcFAOm7m+rFG7SnmyXfD1SseYrfsxZUwXohU+a8pWM2YrOj7yx/RHyv2nOsGKiu P0rbBkAa59lzq66+HPyfIVnVmJsI55QHgyGoA5pgICdltaPBs/Mum2OhTZvgpzYshBVl DOew== X-Gm-Message-State: AKS2vOx6SlOTUkCB654DL1MGdw4hWxhCqXzY4rm0NOjYVRJZpN3LpG5N /vyii69iN3ovVc8cfcc= X-Received: by 10.55.197.136 with SMTP id k8mr50646076qkl.211.1499192159902; Tue, 04 Jul 2017 11:15:59 -0700 (PDT) Received: from [192.168.0.4] ([181.231.116.134]) by smtp.gmail.com with ESMTPSA id v63sm14605238qkv.46.2017.07.04.11.15.58 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Jul 2017 11:15:59 -0700 (PDT) To: ffmpeg-devel@ffmpeg.org References: <20170622125628.24516-1-matthieu.bouron@gmail.com> <8eb1a3b6-e135-13c4-fd5a-c6c8fcfc5762@gmail.com> <20170623150135.GD26847@tsuri.lan> <20170628124855.GF26847@tsuri.lan> <20170629235805.GB4727@nb4> <20170630135552.GE4727@nb4> <20170630151637.GG26847@tsuri.lan> <20170703123228.GH26847@tsuri.lan> <20170704173159.GY4727@nb4> From: James Almer Message-ID: <6b3830b9-8bbd-f0df-c256-67e2eaf44919@gmail.com> Date: Tue, 4 Jul 2017 15:15:56 -0300 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170704173159.GY4727@nb4> Content-Language: en-US Subject: Re: [FFmpeg-devel] [PATCH 1/2] checkasm: add sbrdsp tests X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" On 7/4/2017 2:31 PM, Michael Niedermayer wrote: > On Mon, Jul 03, 2017 at 02:32:28PM +0200, Matthieu Bouron wrote: >> On Fri, Jun 30, 2017 at 05:16:37PM +0200, Matthieu Bouron wrote: >>> On Fri, Jun 30, 2017 at 03:55:52PM +0200, Michael Niedermayer wrote: >>>> On Thu, Jun 29, 2017 at 10:53:06PM -0300, James Almer wrote: >>>>> On 6/29/2017 10:14 PM, Henrik Gramner wrote: >>>>>> On Fri, Jun 30, 2017 at 1:58 AM, Michael Niedermayer >>>>>> wrote: >>>>>>> Program received signal SIGSEGV, Segmentation fault. >>>>>>> 0x0000000000684919 in ff_sbr_hf_gen_sse () >>>>>> >>>>>>> 0x0000000000684909 : sub %r9,%r8 >>>>>> >>>>>>> => 0x0000000000684919 : movaps (%rsi,%r8,1),%xmm0 >>>>>> >>>>>>> r9 0xdeadbeef00000080 -2401053092612145024 >>>>>> >>>>>> Another case of a 32-bit int being used as part of a 64-bit operation. >>>>> >>>>> I can't reproduce it on my ArchLinux x86_64 environment for some reason, >>>>> but based on what you said i assume the attached patch should fix it. >>>> >>>> no crash occurs here with this, so it seems fixed >>> >>> Should i push the patchset or wait a little bit longer ? >> >> Patchset applied. > > it seems theres some issue still in this: > > checkasm: using random seed 3655967467 > MMX: > - audiodsp.audiodsp [OK] > - blockdsp.blockdsp [OK] > - h264dsp.idct [OK] > - h264pred.pred4x4 [OK] > - h264pred.pred8x8 [OK] > - h264pred.pred16x16 [OK] > - pixblockdsp.get_pixels [OK] > - pixblockdsp.diff_pixels [OK] > - vp8dsp.idct [OK] > - vp8dsp.mc [OK] > - vp9dsp.ipred [OK] > - vp9dsp.itxfm [OK] > - vp9dsp.mc [OK] > MMXEXT: > - audiodsp.audiodsp [OK] > - h264dsp.idct [OK] > - h264pred.pred4x4 [OK] > - h264pred.pred8x8 [OK] > - h264pred.pred16x16 [OK] > - h264pred.pred8x8l [OK] > - h264qpel.put [OK] > - h264qpel.avg [OK] > - hevc_add_res.add_residual [OK] > - hevc_idct.idct_dc [OK] > - vp8dsp.mc [OK] > - vp9dsp.ipred [OK] > - vp9dsp.itxfm [OK] > - vp9dsp.loopfilter [OK] > - vp9dsp.mc [OK] > SSE: > - aacpsdsp.add_squares [OK] > - aacpsdsp.mul_pair_single [OK] > - aacpsdsp.hybrid_analysis [OK] > - sbrdsp.sum64x5 [OK] > - sbrdsp.sum_square [OK] > - sbrdsp.neg_odd_64 [OK] > - sbrdsp.qmf_post_shuffle [OK] > - sbrdsp.qmf_deint_neg [OK] > - sbrdsp.qmf_deint_bfly [OK] > - sbrdsp.autocorrelate [OK] > - sbrdsp.hf_gen [OK] > - sbrdsp.hf_g_filt [OK] > - audiodsp.audiodsp [OK] > - blockdsp.blockdsp [OK] > - fmtconvert.fmtconvert [OK] > - h264pred.pred16x16 [OK] > - vp8dsp.idct [OK] > - vp8dsp.mc [OK] > - vp9dsp.ipred [OK] > - vp9dsp.mc [OK] > - float_dsp.vector_fmul [OK] > - float_dsp.vector_fmac [OK] > - float_dsp.butterflies_float [OK] > - float_dsp.scalarproduct_float [OK] > SSE2: > - sbrdsp.qmf_pre_shuffle [OK] > - sbrdsp.qmf_deint_bfly [OK] > > Program received signal SIGSEGV, Segmentation fault. > apply_noise_main.loop () at libavcodec/x86/sbrdsp.asm:418 > 418 movu m7, [Yq + 2*count + mmsize] > (gdb) bt > Python Exception No module named gdb.frames: > #0 apply_noise_main.loop () at libavcodec/x86/sbrdsp.asm:418 > #1 0x000000000043659b in checkasm_checked_call () at tests/checkasm/x86/checkasm.asm:77 > #2 0xdeadbeefdeadbeef in ?? () > #3 0xdeadbeefdeadbeef in ?? () > #4 0xdeadbeefdeadbeef in ?? () > #5 0xdeadbeefdeadbeef in ?? () > #6 0xdeadbeefdeadbeef in ?? () > #7 0xdeadbeefdeadbeef in ?? () > #8 0xdeadbeefdeadbeef in ?? () > #9 0xdeadbeefdeadbeef in ?? () > #10 0xdeadbeefdeadbeef in ?? () > #11 0xdeadbeefdeadbeef in ?? () > #12 0xdeadbeefdeadbeef in ?? () > #13 0xdeadbeefdeadbeef in ?? () > #14 0xdeadbeefdeadbeef in ?? () > #15 0xdeadbeefdeadbeef in ?? () > #16 0xdeadbeefdeadbeef in ?? () > #17 0xdeadbeefdeadbeef in ?? () > #18 0xdeadbeefdeadbeef in ?? () > #19 0x00007fffffffd870 in ?? () > #20 0x00007fffffffcc70 in ?? () > #21 0x00007fffffffce70 in ?? () > #22 0x0000000000000000 in ?? () > (gdb) info all-registers > rax 0x0 0 > rbx 0xed56bb2dcb3c7736 -1344681633365854410 > rcx 0x8e8 2280 > rdx 0x7ab77bbbffffd070 8842672440749314160 > rsi 0x7ab77bbbffffce70 8842672440749313648 > rdi 0xf56e7777ffffdc70 -761539929699263376 > rbp 0x8bda43d3fd1a7e06 0x8bda43d3fd1a7e06 > rsp 0x7fffffffcae8 0x7fffffffcae8 > r8 0xdeadbeef00000000 -2401053092612145152 > r9 0x85490444000009c0 -8842531703260968512 > r10 0x684bf0 6835184 > r11 0x1 1 > r12 0x4a75479abd64e097 5365273261009854615 > r13 0x249214109d5d1c88 2635190793557318792 > r14 0xb64a9c9e5d318408 -5311260606547786744 > r15 0xdf9a54b303f1d3a3 -2334460328996121693 > rip 0x684cc9 0x684cc9 > eflags 0x10206 [ PF IF RF ] > cs 0x33 51 > ss 0x2b 43 > ds 0x0 0 > es 0x0 0 > fs 0x0 0 > gs 0x0 0 > st0 -nan(0x0fffb0005) (raw 0xffff00000000fffb0005) > st1 -nan(0x334fe50ff28fc84) (raw 0xffff0334fe50ff28fc84) > st2 -nan(0x0ff640150) (raw 0xffff00000000ff640150) > st3 -nan(0x0005e005a) (raw 0xffff00000000005e005a) > st4 -nan(0x0ff5bffe7) (raw 0xffff00000000ff5bffe7) > st5 -nan(0xff63fc2cfe94fee5) (raw 0xffffff63fc2cfe94fee5) > st6 -nan(0x01c4df38a) (raw 0xffff000000001c4df38a) > st7 -nan(0x06215436f) (raw 0xffff000000006215436f) > Does the attached patch fix it? From 14c4b77569af06ae181e521330aef6290f29fca1 Mon Sep 17 00:00:00 2001 From: James Almer Date: Tue, 4 Jul 2017 15:05:47 -0300 Subject: [PATCH] x86/sbrdsp: zero extend m_max in apply_noise_main Signed-off-by: James Almer --- libavcodec/x86/sbrdsp.asm | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/libavcodec/x86/sbrdsp.asm b/libavcodec/x86/sbrdsp.asm index c716184b14..62bbe512ec 100644 --- a/libavcodec/x86/sbrdsp.asm +++ b/libavcodec/x86/sbrdsp.asm @@ -378,24 +378,24 @@ cglobal sbr_hf_apply_noise_3, 5,5+NREGS+UNIX64,8, Y,s_m,q_filt,noise,kx,m_max apply_noise_main: %if ARCH_X86_64 == 0 || WIN64 mov kxd, m_maxm -%define count kxq + DEFINE_ARGS Y, s_m, q_filt, noise, count %else -%define count m_maxq + DEFINE_ARGS Y, s_m, q_filt, noise, kx, count %endif movsxdifnidn noiseq, noised dec noiseq - shl count, 2 + shl countd, 2 %ifdef PIC lea NOISE_TABLE, [sbr_noise_table] %endif - lea Yq, [Yq + 2*count] - add s_mq, count - add q_filtq, count + lea Yq, [Yq + 2*countq] + add s_mq, countq + add q_filtq, countq shl noiseq, 3 pxor m5, m5 - neg count + neg countq .loop: - mova m1, [q_filtq + count] + mova m1, [q_filtq + countq] movu m3, [noiseq + NOISE_TABLE + 1*mmsize] movu m4, [noiseq + NOISE_TABLE + 2*mmsize] add noiseq, 2*mmsize @@ -404,7 +404,7 @@ apply_noise_main: punpckldq m1, m1 mulps m1, m3 ; m2 = q_filt[m] * ff_sbr_noise_table[noise] mulps m2, m4 ; m2 = q_filt[m] * ff_sbr_noise_table[noise] - mova m3, [s_mq + count] + mova m3, [s_mq + countq] ; TODO: replace by a vpermd in AVX2 punpckhdq m4, m3, m3 punpckldq m3, m3 @@ -414,15 +414,15 @@ apply_noise_main: mulps m4, m0 ; s_m[m] * phi_sign pand m1, m6 pand m2, m7 - movu m6, [Yq + 2*count] - movu m7, [Yq + 2*count + mmsize] + movu m6, [Yq + 2*countq] + movu m7, [Yq + 2*countq + mmsize] addps m3, m1 addps m4, m2 addps m6, m3 addps m7, m4 - movu [Yq + 2*count], m6 - movu [Yq + 2*count + mmsize], m7 - add count, mmsize + movu [Yq + 2*countq], m6 + movu [Yq + 2*countq + mmsize], m7 + add countq, mmsize jl .loop RET