From patchwork Sat Aug 6 23:51:29 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rostislav Pehlivanov
X-Patchwork-Id: 112
From: Rostislav Pehlivanov
To: ffmpeg-devel@ffmpeg.org
Cc: Rostislav Pehlivanov
Date: Sun, 7 Aug 2016 00:51:29 +0100
Message-Id: <20160806235130.13284-1-atomnuker@gmail.com>
X-Mailer: git-send-email 2.8.1.369.geae769a
Subject: [FFmpeg-devel] [PATCH 1/2] aacenc: add a faster version of twoloop as the "fast" coder

Does nothing fancy but still sounds very decent at 128kbps.
Still room to improve by bringing in the low pass and PNS management
from the main big twoloop which should improve its quality but not
sacrifice that much speed.

Signed-off-by: Rostislav Pehlivanov
Acked-by: Michael
---
 libavcodec/aaccoder.c | 154 +++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 134 insertions(+), 20 deletions(-)

diff --git a/libavcodec/aaccoder.c b/libavcodec/aaccoder.c
index bca1f59..edf29f4 100644
--- a/libavcodec/aaccoder.c
+++ b/libavcodec/aaccoder.c
@@ -396,34 +396,148 @@ static void search_for_quantizers_fast(AVCodecContext *avctx, AACEncContext *s,
                                        SingleChannelElement *sce,
                                        const float lambda)
 {
-    int i, w, w2, g;
-    int minq = 255;
-
-    memset(sce->sf_idx, 0, sizeof(sce->sf_idx));
+    int start = 0, i, w, w2, g;
+    int destbits = avctx->bit_rate * 1024.0 / avctx->sample_rate / avctx->channels * (lambda / 120.f);
+    float dists[128] = { 0 }, uplims[128] = { 0 };
+    float maxvals[128];
+    int fflag, minscaler;
+    int its = 0;
+    int allz = 0;
+    float minthr = INFINITY;
+
+    // for values above this the decoder might end up in an endless loop
+    // due to always having more bits than what can be encoded.
+    destbits = FFMIN(destbits, 5800);
+    //XXX: some heuristic to determine initial quantizers will reduce search time
+    //determine zero bands and upper limits
     for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
-        for (g = 0; g < sce->ics.num_swb; g++) {
+        start = 0;
+        for (g = 0; g < sce->ics.num_swb; g++) {
+            int nz = 0;
+            float uplim = 0.0f, energy = 0.0f;
             for (w2 = 0; w2 < sce->ics.group_len[w]; w2++) {
                 FFPsyBand *band = &s->psy.ch[s->cur_channel].psy_bands[(w+w2)*16+g];
-                if (band->energy <= band->threshold) {
-                    sce->sf_idx[(w+w2)*16+g] = 218;
+                uplim += band->threshold;
+                energy += band->energy;
+                if (band->energy <= band->threshold || band->threshold == 0.0f) {
                     sce->zeroes[(w+w2)*16+g] = 1;
-                } else {
-                    sce->sf_idx[(w+w2)*16+g] = av_clip(SCALE_ONE_POS - SCALE_DIV_512 + log2f(band->threshold), 80, 218);
-                    sce->zeroes[(w+w2)*16+g] = 0;
+                    continue;
                 }
-                minq = FFMIN(minq, sce->sf_idx[(w+w2)*16+g]);
+                nz = 1;
             }
+            uplims[w*16+g] = uplim *512;
+            sce->band_type[w*16+g] = 0;
+            sce->zeroes[w*16+g] = !nz;
+            if (nz)
+                minthr = FFMIN(minthr, uplim);
+            allz |= nz;
+            start += sce->ics.swb_sizes[g];
         }
     }
-    for (i = 0; i < 128; i++) {
-        sce->sf_idx[i] = 140;
-        //av_clip(sce->sf_idx[i], minq, minq + SCALE_MAX_DIFF - 1);
+    for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
+        for (g = 0; g < sce->ics.num_swb; g++) {
+            if (sce->zeroes[w*16+g]) {
+                sce->sf_idx[w*16+g] = SCALE_ONE_POS;
+                continue;
+            }
+            sce->sf_idx[w*16+g] = SCALE_ONE_POS + FFMIN(log2f(uplims[w*16+g]/minthr)*4,59);
+        }
     }
-    //set the same quantizers inside window groups
-    for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w])
-        for (g = 0; g < sce->ics.num_swb; g++)
-            for (w2 = 1; w2 < sce->ics.group_len[w]; w2++)
-                sce->sf_idx[(w+w2)*16+g] = sce->sf_idx[w*16+g];
+
+    if (!allz)
+        return;
+    abs_pow34_v(s->scoefs, sce->coeffs, 1024);
+    ff_quantize_band_cost_cache_init(s);
+
+    for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
+        start = w*128;
+        for (g = 0; g < sce->ics.num_swb; g++) {
+            const float *scaled = s->scoefs + start;
+            maxvals[w*16+g] = find_max_val(sce->ics.group_len[w], sce->ics.swb_sizes[g], scaled);
+            start += sce->ics.swb_sizes[g];
+        }
+    }
+
+    //perform two-loop search
+    //outer loop - improve quality
+    do {
+        int tbits, qstep;
+        minscaler = sce->sf_idx[0];
+        //inner loop - quantize spectrum to fit into given number of bits
+        qstep = its ? 1 : 32;
+        do {
+            int prev = -1;
+            tbits = 0;
+            for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
+                start = w*128;
+                for (g = 0; g < sce->ics.num_swb; g++) {
+                    const float *coefs = sce->coeffs + start;
+                    const float *scaled = s->scoefs + start;
+                    int bits = 0;
+                    int cb;
+                    float dist = 0.0f;
+
+                    if (sce->zeroes[w*16+g] || sce->sf_idx[w*16+g] >= 218) {
+                        start += sce->ics.swb_sizes[g];
+                        continue;
+                    }
+                    minscaler = FFMIN(minscaler, sce->sf_idx[w*16+g]);
+                    cb = find_min_book(maxvals[w*16+g], sce->sf_idx[w*16+g]);
+                    for (w2 = 0; w2 < sce->ics.group_len[w]; w2++) {
+                        int b;
+                        dist += quantize_band_cost_cached(s, w + w2, g,
+                                                          coefs + w2*128,
+                                                          scaled + w2*128,
+                                                          sce->ics.swb_sizes[g],
+                                                          sce->sf_idx[w*16+g],
+                                                          cb, 1.0f, INFINITY,
+                                                          &b, NULL, 0);
+                        bits += b;
+                    }
+                    dists[w*16+g] = dist - bits;
+                    if (prev != -1) {
+                        bits += ff_aac_scalefactor_bits[sce->sf_idx[w*16+g] - prev + SCALE_DIFF_ZERO];
+                    }
+                    tbits += bits;
+                    start += sce->ics.swb_sizes[g];
+                    prev = sce->sf_idx[w*16+g];
+                }
+            }
+            if (tbits > destbits) {
+                for (i = 0; i < 128; i++)
+                    if (sce->sf_idx[i] < 218 - qstep)
+                        sce->sf_idx[i] += qstep;
+            } else {
+                for (i = 0; i < 128; i++)
+                    if (sce->sf_idx[i] > 60 - qstep)
+                        sce->sf_idx[i] -= qstep;
+            }
+            qstep >>= 1;
+            if (!qstep && tbits > destbits*1.02 && sce->sf_idx[0] < 217)
+                qstep = 1;
+        } while (qstep);
+
+        fflag = 0;
+        minscaler = av_clip(minscaler, 60, 255 - SCALE_MAX_DIFF);
+
+        for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
+            for (g = 0; g < sce->ics.num_swb; g++) {
+                int prevsc = sce->sf_idx[w*16+g];
+                if (dists[w*16+g] > uplims[w*16+g] && sce->sf_idx[w*16+g] > 60) {
+                    if (find_min_book(maxvals[w*16+g], sce->sf_idx[w*16+g]-1))
+                        sce->sf_idx[w*16+g]--;
+                    else //Try to make sure there is some energy in every band
+                        sce->sf_idx[w*16+g]-=2;
+                }
+                sce->sf_idx[w*16+g] = av_clip(sce->sf_idx[w*16+g], minscaler, minscaler + SCALE_MAX_DIFF);
+                sce->sf_idx[w*16+g] = FFMIN(sce->sf_idx[w*16+g], 219);
+                if (sce->sf_idx[w*16+g] != prevsc)
+                    fflag = 1;
+                sce->band_type[w*16+g] = find_min_book(maxvals[w*16+g], sce->sf_idx[w*16+g]);
+            }
+        }
+        its++;
+    } while (fflag && its < 10);
 }
 
 static void search_for_pns(AACEncContext *s, AVCodecContext *avctx, SingleChannelElement *sce)
@@ -828,7 +942,7 @@ AACCoefficientsEncoder ff_aac_coders[AAC_CODER_NB] = {
     },
     [AAC_CODER_FAST] = {
         search_for_quantizers_fast,
-        encode_window_bands_info,
+        codebook_trellis_rate,
         quantize_and_encode_band,
         ff_aac_encode_tns_info,
         ff_aac_encode_ltp_info,
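
Illustration only, not part of the patch: a standalone sketch of the two rate-control pieces the new coder relies on, namely the per-frame bit budget (destbits) and the step-halving inner loop. The helper names frame_bit_budget, toy_bits and fit_to_budget are hypothetical, and the cost model (255 - sf bits per band) is an assumption standing in for quantize_band_cost_cached(); only the arithmetic and the loop structure are taken from the diff above.

```c
#include <assert.h>

#define MAX_FRAME_BITS 5800 /* cap applied to destbits in the patch */

/* Per-channel bit budget for one 1024-sample frame, mirroring the
 * destbits expression in the patch. Hypothetical helper name. */
static int frame_bit_budget(long long bit_rate, int sample_rate,
                            int channels, float lambda)
{
    assert(sample_rate > 0 && channels > 0);
    int bits = (int)(bit_rate * 1024.0 / sample_rate / channels *
                     (lambda / 120.f));
    return bits < MAX_FRAME_BITS ? bits : MAX_FRAME_BITS;
}

/* ASSUMED toy cost model: a band costs (255 - sf) bits, so a larger
 * scalefactor index (coarser quantization) is cheaper. The real
 * encoder measures this with quantize_band_cost_cached(). */
static int toy_bits(const int *sf, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += 255 - sf[i];
    return total;
}

/* Inner loop of the search: move every scalefactor by qstep towards
 * the budget, halving qstep each pass (32, 16, ..., 1). If the frame
 * is still more than 2% over budget once qstep reaches zero, keep
 * stepping by 1. The bounds 218, 60 and 217 are copied from the patch.
 * Returns the bit count measured on the final pass. */
static int fit_to_budget(int *sf, int n, int destbits)
{
    int qstep = 32, tbits;
    do {
        tbits = toy_bits(sf, n);
        if (tbits > destbits) {
            for (int i = 0; i < n; i++)
                if (sf[i] < 218 - qstep)
                    sf[i] += qstep;   /* too many bits: quantize coarser */
        } else {
            for (int i = 0; i < n; i++)
                if (sf[i] > 60 - qstep)
                    sf[i] -= qstep;   /* bits to spare: quantize finer */
        }
        qstep >>= 1;
        if (!qstep && tbits > destbits * 1.02 && sf[0] < 217)
            qstep = 1;
    } while (qstep);
    return tbits;
}
```

For a 128 kbps stereo stream at 44.1 kHz with lambda = 120, frame_bit_budget(128000, 44100, 2, 120.0f) gives 128000 * 1024 / 44100 / 2, truncated to 1486 bits per channel per frame. Starting at qstep = 32 lets the first outer iteration cover the scalefactor range in six passes; later outer iterations start at qstep = 1 because only small corrections are expected by then.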