From patchwork Fri Aug 25 15:38:29 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ivan Kalvachev
X-Patchwork-Id: 4839
From: Ivan Kalvachev
Date: Fri, 25 Aug 2017 18:38:29 +0300
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] opus_pvq_search.asm: Handle zero vector input differently.

Instead of returning all zeroes as result and Syy=1.0,
place all the K pulses in the first element y[0] and return Syy=K*K.

This is how the original opus function handles the case.
This is how the existing pvq_search_c handles the case.

Also, according to Rostislav, the encoded all-zeroes vector
would be decoded as such a y[0]=K vector, before dequantization.
So it is better to do that explicitly and calculate the proper gain in the encoder.
---
I must point out that ppp_pvq_search_c() does generate a y[0]=-K vector, not +K.
This is because FFSIGN(0.0) returns -1.

I do consider this a bug; however, I'm not quite sure what the best way to handle it is.

1. Fix locally
#undef FFSIGN
#define FFSIGN(a) ((a) >= 0 ? 1 : -1)

2. Use a different name for that macro
#define OPUS_SIGN(a) ...

3. Fix by special case in ppp_pvq_search_c():
if( !(res > 0.0) ) { y[0]=K; for(i=1;i
[...]

Date: Fri, 25 Aug 2017 17:14:28 +0300
Subject: [PATCH] opus_pvq_search.asm: Handle zero vector input differently.

Instead of returning all zeroes as result and Syy=1.0,
place all the K pulses in the first element y[0] and return Syy=K*K.

This is how the original opus function handles the case.
This is how the existing pvq_search_c handles the case.

Also, according to Rostislav, the encoded all-zeroes vector
would be decoded as such a y[0]=K vector, before dequantization.
So it is better to do that explicitly and calculate the proper gain in the encoder.

Signed-off-by: Ivan Kalvachev
---
 libavcodec/x86/opus_pvq_search.asm | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libavcodec/x86/opus_pvq_search.asm b/libavcodec/x86/opus_pvq_search.asm
index 5c1e6d6174..adf3e6f87c 100644
--- a/libavcodec/x86/opus_pvq_search.asm
+++ b/libavcodec/x86/opus_pvq_search.asm
@@ -252,9 +252,9 @@ align 16
 
     xorps      m0, m0
     comiss    xm0, xm1                  ;
+    cvtsi2ss  xm0, dword Kd             ; m0 = K
     jz   %%zero_input                   ; if (Sx==0) goto zero_input
 
-    cvtsi2ss  xm0, dword Kd             ; m0 = K
 %if USE_APPROXIMATION == 1
     rcpss     xm1, xm1                  ; m1 = approx(1/Sx)
     mulss     xm0, xm1                  ; m0 = K*(1/Sx)
@@ -355,7 +355,8 @@ align 16
     RET
 
 align 16
-%%zero_input:
+%%zero_input:                           ; expected m0 = K
+    mulss     xm6, xm0, xm0             ; Syy_norm = K*K
     lea       r4d, [Nd - mmsize]
     xorps      m0, m0
 %%zero_loop:
@@ -363,7 +364,7 @@ align 16
     sub       r4d, mmsize
     jnc   %%zero_loop
 
-    movaps     m6, [const_float_1]
+    mov  [outYq], Kd
     jmp   %%return
 %endmacro
 
-- 
2.14.1
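
For illustration, the zero-input behaviour the patch aims for and the FFSIGN(0.0) issue mentioned above can be sketched in plain C. This is only a minimal sketch, not code from the patch or from FFmpeg: zero_input_fallback() and its signature are hypothetical, FFSIGN is assumed to be defined as in libavutil/common.h (consistent with it returning -1 for 0.0), and the body of OPUS_SIGN is an assumption that simply reuses the definition proposed in option 1.

```c
#include <assert.h>
#include <string.h>

/* Assumed to match libavutil/common.h: zero maps to -1. */
#define FFSIGN(a) ((a) > 0 ? 1 : -1)

/* Assumption for illustration: same body as the macro proposed in
 * option 1, so zero maps to +1. */
#define OPUS_SIGN(a) ((a) >= 0 ? 1 : -1)

/*
 * Hypothetical helper, not part of FFmpeg.
 * For an all-zero input of length N, put all K pulses into y[0],
 * clear the remaining elements, and return Syy = K*K, which is the
 * sum of squares of the resulting y[].
 */
static float zero_input_fallback(int *y, int K, int N)
{
    assert(y && K > 0 && N > 0);
    memset(y, 0, (size_t)N * sizeof(*y)); /* y[1..N-1] = 0 */
    y[0] = K;                             /* all pulses in the first element */
    return (float)K * (float)K;           /* Syy = K*K */
}
```

With K=5 and N=8 the helper leaves y = {5, 0, 0, 0, 0, 0, 0, 0} and returns 25.0, while FFSIGN(0.0) evaluates to -1 and OPUS_SIGN(0.0) to +1, which is the sign difference between the C path and the +K result produced by the assembly.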