From patchwork Tue Dec 10 20:10:39 2019
X-Patchwork-Submitter: Martin Storsjö
X-Patchwork-Id: 16699
From: Martin Storsjö
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 10 Dec 2019 22:10:39 +0200
Message-Id: <20191210201040.22050-1-martin@martin.st>
X-Mailer: git-send-email 2.17.1
Subject: [FFmpeg-devel] [PATCH 1/2] checkasm: aacpsdsp: Tolerate extra intermediate precision in stereo_interpolate

The stereo_interpolate functions add h_step to the values h
BUF_SIZE times. Within the stereo_interpolate C functions, the
values h (h0-h3, h00-h13) are declared as local float variables,
but the compiler is free to keep them in a register with extra
precision.

If the accumulation is rounded to 32-bit float precision after each
step, the less significant bits of h_step end up ignored and the sum
can deviate, affecting the end result more than the currently set EPS.

By clearing the log2(BUF_SIZE) lower bits of h_step, we make sure
that the accumulation shouldn't differ significantly, regardless of
any extra precision in the accumulating register/variable.

This fixes the aacpsdsp checkasm test when built with clang for
mingw/x86_32.
---
 tests/checkasm/aacpsdsp.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tests/checkasm/aacpsdsp.c b/tests/checkasm/aacpsdsp.c
index ea68b39fa9..2ceef4341f 100644
--- a/tests/checkasm/aacpsdsp.c
+++ b/tests/checkasm/aacpsdsp.c
@@ -17,6 +17,7 @@
  */
 
 #include "libavcodec/aacpsdsp.h"
+#include "libavutil/intfloat.h"
 
 #include "checkasm.h"
 
@@ -34,6 +35,16 @@
 
 #define EPS 0.005
 
+static void clear_less_significant_bits(INTFLOAT *buf, int len, int bits)
+{
+    int i;
+    for (i = 0; i < len; i++) {
+        union av_intfloat32 u = { .f = buf[i] };
+        u.i &= (0xffffffff << bits);
+        buf[i] = u.f;
+    }
+}
+
 static void test_add_squares(void)
 {
     LOCAL_ALIGNED_16(INTFLOAT, dst0, [BUF_SIZE]);
@@ -198,6 +209,13 @@ static void test_stereo_interpolate(PSDSPContext *psdsp)
 
             randomize((INTFLOAT *)h, 2 * 4);
             randomize((INTFLOAT *)h_step, 2 * 4);
+            // Clear the least significant 14 bits of h_step, to avoid
+            // divergence when accumulating h_step BUF_SIZE times into
+            // a float variable which may or may not have extra intermediate
+            // precision. Therefore clear roughly log2(BUF_SIZE) less
+            // significant bits, to get the same result regardless of any
+            // extra precision in the accumulator.
+            clear_less_significant_bits((INTFLOAT *)h_step, 2 * 4, 14);
 
             call_ref(l0, r0, h, h_step, BUF_SIZE);
             call_new(l1, r1, h, h_step, BUF_SIZE);
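
For illustration only (not part of the patch): a minimal standalone sketch
of why clearing the low mantissa bits of the step makes the accumulation
independent of the accumulator width. The helper names clear_low_bits,
accumulate_float and accumulate_wide are hypothetical, a double accumulator
stands in for a register with extra precision, and the count of 4096
iterations is an assumed value, not taken from the test. The sketch starts
from 0 so that every partial sum is exactly representable; with a random
start value, as in the test, the results only agree approximately.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the idea of the patch: zero the
 * `bits` least significant bits of the float's bit pattern. */
static float clear_low_bits(float x, int bits)
{
    uint32_t u;
    memcpy(&u, &x, sizeof(u));      /* reinterpret the float as an integer */
    u &= 0xffffffffu << bits;       /* drop the low mantissa bits */
    memcpy(&x, &u, sizeof(x));
    return x;
}

/* Add `step` to `start` n times, rounding to 32-bit float after each
 * addition. The volatile store forces that rounding even if the compiler
 * would otherwise keep the sum in a wider register. */
static float accumulate_float(float start, float step, int n)
{
    volatile float acc = start;
    for (int i = 0; i < n; i++)
        acc = acc + step;
    return acc;
}

/* Same accumulation in double, standing in for an accumulator that
 * carries extra intermediate precision. */
static double accumulate_wide(float start, float step, int n)
{
    double acc = start;
    for (int i = 0; i < n; i++)
        acc += step;
    return acc;
}
```

With 14 low bits cleared, the step has at most 10 significant bits, so a sum
of up to 2^12 steps starting from 0 needs at most 22 bits and fits the 24-bit
float significand. Under that assumption, accumulate_float(0.0f, step, 4096)
and accumulate_wide(0.0f, step, 4096) return the same value for
step = clear_low_bits(0.1f, 14).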