From patchwork Wed Feb 28 20:14:15 2018
X-Patchwork-Submitter: Ivan Kalvachev
X-Patchwork-Id: 7768
From: Ivan Kalvachev
Date: Wed, 28 Feb 2018 22:14:15 +0200
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH][RFC] Improve and fix put_vc2_ue_uint() function.

Replace the two bit-handling loops and the internal conditional
branch with a simple formula that uses a few logical operations.

The old function generates wrong output if the input does
not fit into 15 bits.

Fix this by using 64-bit math and put_bits64().
This case should be quite rare, since the bug has not shown up
in practice so far.
---

This started as an attempt at speed optimization, but in the process
it turned out that the function also needs a bug fix.

I only tested the old case of the code, to confirm that I've
implemented the correct function. I haven't done any benchmarks
or run FATE.

It should be faster, especially because coefficients below 2048
are currently written using a lookup table and bypass this function.

If you like it, use it.

Best Regards
   Ivan Kalvachev.
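For reference, the size limit follows from the code length: a value is written in 2*bits + 1 bits, where bits = floor(log2(val + 1)), so the code word stops fitting the 32-bit int used for pbits once bits reaches 16. A minimal sketch of that arithmetic, assuming the hypothetical helper names `ilog2` and `vc2_ue_len` (they are not FFmpeg functions):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ff_log2(): index of the highest set bit. */
static int ilog2(uint64_t v)
{
    int n = 0;
    assert(v != 0);
    while (v >>= 1)
        n++;
    return n;
}

/* Length in bits of the interleaved exp-Golomb code word for val.
 * val + 1 is computed in 64 bits so that UINT32_MAX does not wrap. */
static int vc2_ue_len(uint32_t val)
{
    return 2 * ilog2((uint64_t)val + 1) + 1;
}
```

For example, vc2_ue_len(65534) gives 31 and vc2_ue_len(65535) gives 33, which is the first length that needs the 64-bit path.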
From 1f7fd38fcb6c64281bc458c09c711fc567b3ef0f Mon Sep 17 00:00:00 2001
From: Ivan Kalvachev
Date: Wed, 28 Feb 2018 17:48:40 +0200
Subject: [PATCH] Improve and fix put_vc2_ue_uint() function.

Replace the two bit-handling loops and the internal conditional
branch with a simple formula that uses a few logical operations.

The old function generates wrong output if the input does
not fit into 15 bits.

Fix this by using 64-bit math and put_bits64().
This case should be quite rare, since the bug has not shown up
in practice so far.

Signed-off-by: Ivan Kalvachev
---
 libavcodec/vc2enc.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/libavcodec/vc2enc.c b/libavcodec/vc2enc.c
index b7adcd3d36..b2f1611ea3 100644
--- a/libavcodec/vc2enc.c
+++ b/libavcodec/vc2enc.c
@@ -187,28 +187,33 @@ typedef struct VC2EncContext {
 
 static av_always_inline void put_vc2_ue_uint(PutBitContext *pb, uint32_t val)
 {
-    int i;
-    int pbits = 0, bits = 0, topbit = 1, maxval = 1;
+    int bits = 0;
+    uint64_t pbits = 0;
 
     if (!val++) {
         put_bits(pb, 1, 1);
         return;
     }
 
-    while (val > maxval) {
-        topbit <<= 1;
-        maxval <<= 1;
-        maxval |= 1;
-    }
+    bits = ff_log2(val);
 
-    bits = ff_log2(topbit);
+    if (bits > 15) {
+        pbits = val;
 
-    for (i = 0; i < bits; i++) {
-        topbit >>= 1;
-        pbits <<= 2;
-        if (val & topbit)
-            pbits |= 0x1;
+        pbits = ((pbits<<16)|pbits)&0x0000FFFF0000FFFFULL;
+        pbits = ((pbits<< 8)|pbits)&0x00FF00FF00FF00FFULL;
+        pbits = ((pbits<< 4)|pbits)&0x0F0F0F0F0F0F0F0FULL;
+        pbits = ((pbits<< 2)|pbits)&0x3333333333333333ULL;
+        pbits = ((pbits<< 1)|pbits)&0x5555555555555555ULL;
+
+        put_bits64(pb, bits*2 + 1, (pbits << 1) | 1);
+        return;
     }
+                                              // ____'____ ____'____ ponm'lkji hgfe'dcba
+    val = ( (val << 8) | val ) & 0x00FF00FF;  // ____'____ ponm'lkji ____'____ hgfe'dcba
+    val = ( (val << 4) | val ) & 0x0F0F0F0F;  // ____'ponm ____'lkji ____'hgfe ____'dcba
+    val = ( (val << 2) | val ) & 0x33333333;  // __po'__nm __lk'__ji __hg'__fe __dc'__ba
+    val = ( (val << 1) | val ) & 0x55555555;  // _p_o'_n_m _l_k'_j_i _h_g'_f_e _d_c'_b_a
 
     put_bits(pb, bits*2 + 1, (pbits << 1) | 1);
 }
-- 
2.16.2
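
The formula can be checked against the loop outside of FFmpeg. The sketch below is a standalone test aid, not the patched function: `ilog2`, `vc2_ue_code_loop` and `vc2_ue_code_spread` are hypothetical names, both functions return the code word instead of writing to a PutBitContext, and all math is done in 64 bits. Unlike the diff above, it clears the leading 1 bit of val + 1 before spreading (the loop never emits that bit) and always builds the code word from the spread value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ff_log2(): index of the highest set bit. */
static int ilog2(uint64_t v)
{
    int n = 0;
    assert(v != 0);
    while (v >>= 1)
        n++;
    return n;
}

/* Reference: the loop-based construction, widened to 64 bits.
 * Returns the code word; *len receives its length in bits. */
static uint64_t vc2_ue_code_loop(uint32_t val, int *len)
{
    uint64_t v = (uint64_t)val + 1;
    int bits = ilog2(v);
    uint64_t pbits = 0;
    int i;

    assert(bits <= 31);              /* val == UINT32_MAX is out of scope */
    for (i = bits - 1; i >= 0; i--) {
        pbits <<= 2;                 /* leave a 0 between data bits */
        if (v & (1ULL << i))
            pbits |= 1;
    }
    *len = 2 * bits + 1;
    return (pbits << 1) | 1;         /* append the stop bit */
}

/* Same code word built by spreading the bits with shift-and-mask steps. */
static uint64_t vc2_ue_code_spread(uint32_t val, int *len)
{
    uint64_t v = (uint64_t)val + 1;
    int bits = ilog2(v);
    uint64_t p = v ^ (1ULL << bits); /* drop the leading 1 bit */

    assert(bits <= 31);
    p = ((p << 16) | p) & 0x0000FFFF0000FFFFULL;
    p = ((p <<  8) | p) & 0x00FF00FF00FF00FFULL;
    p = ((p <<  4) | p) & 0x0F0F0F0F0F0F0F0FULL;
    p = ((p <<  2) | p) & 0x3333333333333333ULL;
    p = ((p <<  1) | p) & 0x5555555555555555ULL;

    *len = 2 * bits + 1;
    return (p << 1) | 1;             /* append the stop bit */
}
```

Comparing the two functions over a range of inputs, including values on both sides of 65535, is a cheap way to confirm the formula before running FATE.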