From patchwork Thu Aug 15 10:47:02 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lynne
X-Patchwork-Id: 14523
Date: Thu, 15 Aug 2019 12:47:02 +0200 (CEST)
From: Lynne
To: Ffmpeg Devel
Subject: [FFmpeg-devel] [PATCH 1/2] opusdsp: adjust and optimize C function to
 match assembly
List-Id: FFmpeg development discussions and patches

The C and asm versions behaved differently _outside_ of the codec.

The C version returned a pre-multiplied 'state' for the next execution to use
right away, while the assembly version output a non-multiplied 'state' for the
next execution to multiply, which saves instructions. Since the initial state
on init or seek is always 0, and since the C and asm versions were never
mixed, there was no issue.

However, comparing outputs directly in checkasm doesn't work without dividing
the initial state by CELT_EMPH_COEFF and multiplying the returned state by
CELT_EMPH_COEFF for the assembly function.

Since it's actually faster to do this in C as well, copy the behavior the asm
versions use. As a reminder, add a note explaining the difference from libopus
on coefficient init.

From b7f2fc24387310cf12d57dbe1ce06f0284a2a390 Mon Sep 17 00:00:00 2001
From: Lynne
Date: Thu, 15 Aug 2019 11:13:35 +0100
Subject: [PATCH 1/2] opusdsp: adjust and optimize C function to match assembly

The C and asm versions behaved differently _outside_ of the codec.

The C version returned a pre-multiplied 'state' for the next execution to use
right away, while the assembly version output a non-multiplied 'state' for the
next execution to multiply, which saves instructions. Since the initial state
on init or seek is always 0, and since the C and asm versions were never
mixed, there was no issue.

However, comparing outputs directly in checkasm doesn't work without dividing
the initial state by CELT_EMPH_COEFF and multiplying the returned state by
CELT_EMPH_COEFF for the assembly function.

Since it's actually faster to do this in C as well, copy the behavior the asm
versions use.

As a reminder, the initial state 0 is divided by CELT_EMPH_COEFF on seek and
init. This is in case the initial value is changed in the future: it's
technically more correct to init with CELT_EMPH_COEFF than with 0, but when
seeking that results in more audible pops, whereas with 0 the output gets in
sync over a few samples.
---
 libavcodec/opus_celt.c |  6 +++++-
 libavcodec/opusdsp.c   | 11 +++--------
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/libavcodec/opus_celt.c b/libavcodec/opus_celt.c
index 4655172b09..9dbeff1927 100644
--- a/libavcodec/opus_celt.c
+++ b/libavcodec/opus_celt.c
@@ -507,7 +507,11 @@ void ff_celt_flush(CeltFrame *f)
         memset(block->pf_gains_old, 0, sizeof(block->pf_gains_old));
         memset(block->pf_gains_new, 0, sizeof(block->pf_gains_new));
 
-        block->emph_coeff = 0.0;
+        /* libopus uses CELT_EMPH_COEFF on init, but 0 is better since there's
+         * a lesser discontinuity when seeking.
+         * The deemphasis functions differ from libopus in that they require
+         * an initial state divided by the coefficient. */
+        block->emph_coeff = 0.0f / CELT_EMPH_COEFF;
     }
 
     f->seed = 0;
diff --git a/libavcodec/opusdsp.c b/libavcodec/opusdsp.c
index 0e179c98c9..08df87ffbe 100644
--- a/libavcodec/opusdsp.c
+++ b/libavcodec/opusdsp.c
@@ -43,15 +43,10 @@ static void postfilter_c(float *data, int period, float *gains, int len)
 
 static float deemphasis_c(float *y, float *x, float coeff, int len)
 {
-    float state = coeff;
+    for (int i = 0; i < len; i++)
+        coeff = y[i] = x[i] + coeff*CELT_EMPH_COEFF;
 
-    for (int i = 0; i < len; i++) {
-        const float tmp = x[i] + state;
-        state = tmp * CELT_EMPH_COEFF;
-        y[i] = tmp;
-    }
-
-    return state;
+    return coeff;
 }
 
 av_cold void ff_opus_dsp_init(OpusDSP *ctx)
-- 
2.23.0.rc1
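
For illustration only (not part of the patch): a minimal standalone sketch of
the state convention change described in the commit message. It puts the old
and new C loops side by side and compares them after converting the state the
way the message describes, i.e. the old function receives the initial state
multiplied by the coefficient, and the new function's returned state is
multiplied by the coefficient before comparing. The names deemphasis_old,
deemphasis_new, max_mismatch and MAX_LEN are hypothetical, and the value of
EMPH_COEFF is an assumption made for the example; the real constant is
CELT_EMPH_COEFF in libavcodec.

```c
#include <assert.h>

/* Assumed value, for illustration only; see CELT_EMPH_COEFF in libavcodec. */
#define EMPH_COEFF 0.8500061035f
#define MAX_LEN    64   /* hypothetical buffer limit for this sketch */

/* Old C behavior: takes and returns a pre-multiplied state. */
static float deemphasis_old(float *y, const float *x, float state, int len)
{
    for (int i = 0; i < len; i++) {
        const float tmp = x[i] + state;
        state = tmp * EMPH_COEFF;
        y[i]  = tmp;
    }
    return state;
}

/* New behavior (matches the asm): takes and returns a non-multiplied state. */
static float deemphasis_new(float *y, const float *x, float coeff, int len)
{
    for (int i = 0; i < len; i++)
        coeff = y[i] = x[i] + coeff * EMPH_COEFF;
    return coeff;
}

/* Runs both variants with converted states and returns the largest absolute
 * difference seen in the output samples and in the (converted) final state. */
static float max_mismatch(const float *x, int len, float init)
{
    float y_old[MAX_LEN], y_new[MAX_LEN];
    float worst = 0.0f;

    assert(len > 0 && len <= MAX_LEN);

    /* The old function expects the state already multiplied. */
    const float s_old = deemphasis_old(y_old, x, init * EMPH_COEFF, len);
    const float s_new = deemphasis_new(y_new, x, init, len);

    for (int i = 0; i < len; i++) {
        float d = y_old[i] - y_new[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }

    /* The new function's return value must be multiplied to match the old. */
    float d = s_old - s_new * EMPH_COEFF;
    if (d < 0.0f)
        d = -d;
    return d > worst ? d : worst;
}
```

With an initial state of 0 no conversion is needed on input, which is why the
mismatch never showed up inside the codec. The comparison uses a tolerance
rather than exact equality because a compiler may contract the multiply-add
differently in the two loops.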