From patchwork Fri Sep 2 14:20:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martijn van Beurden X-Patchwork-Id: 37613 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:139a:b0:8f:1db5:eae2 with SMTP id w26csp980432pzh; Fri, 2 Sep 2022 07:21:19 -0700 (PDT) X-Google-Smtp-Source: AA6agR4SmVOXCBcxGNn1BZ4FjvKtlqZGSx6dY/4eH6IoZ10bvuXY4f66CH+5eEkldl7t1/3R21xq X-Received: by 2002:a17:907:94d0:b0:730:7a68:3093 with SMTP id dn16-20020a17090794d000b007307a683093mr28048963ejc.760.1662128478740; Fri, 02 Sep 2022 07:21:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662128478; cv=none; d=google.com; s=arc-20160816; b=LlBi9zY8DzXDbGbuAxj0Mly9qbPSyBSVKpp4acZV8N/jFXsV7Bl7UkvF0tdxeXdU3J Mg+MH2UgmWCjwWM06rkgZ08aSAIwA45gPVpOqfmQ1sKGjIAxjzeKhOWTrnAGZEKlsgFf gqtRJuWx0GCLhF/Q8MUodlXwT/I7emvbSStrTRz0fmqyLu+7wtxgPFly1lAjWKLqczPz vK9iGyRUP9IHW7YQHbdquMuh+Hm/wt8jxi73zgO6szL6622qWZoPhYgvvCxvuY8aY/ZK 9SKjuYMDbR0qJfw+XZLpmYknn6ZTHOXngYBS/Jt0JUzpAmxbChfkbpYai+NRDNIHN6Ta pulg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:cc:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:delivered-to; bh=FoWVcyip/NBIh76uWog6gaveQvqRpDoodQutHLwWRbs=; b=lzibVyxAczv8g0eXQ1v/tjsLs+pXYAZ0iSPIv5mWlDhO3gGDwcEoiim6dZTk49EitI jHGRYsCULVeBqzo7qPnW9J3BdMcxdQ7Wdjc01CLfO7PkHPxxSRVfDvQoER3S0E9b/6Pa KxpGtWbh8PyL2QlCvbB1TNqMYJQFjDeLtGvhAOjRTqRiGVS56Up8fygdOjBE9i+717Ty wT/poG5m32ueENKLz63OxzOg/nOpJD62USyxykULFrKTg9haX1C2Qy37ODfWeJ0vDItz /FYS9tUUgxOnOWDDDpdagNTg0q5IyGTtyENRKigHbiFyACX2LNGnaFoUwcxsWMZUvez7 EV+g== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20210112 header.b=d9xQzKs3; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id j7-20020a05640211c700b0044785508ac0si4843edw.51.2022.09.02.07.21.18; Fri, 02 Sep 2022 07:21:18 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20210112 header.b=d9xQzKs3; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 4EB2468BA1A; Fri, 2 Sep 2022 17:21:02 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-ed1-f48.google.com (mail-ed1-f48.google.com [209.85.208.48]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 4DC1A68B98E for ; Fri, 2 Sep 2022 17:20:55 +0300 (EEST) Received: by mail-ed1-f48.google.com with SMTP id b44so2880379edf.9 for ; Fri, 02 Sep 2022 07:20:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date; bh=T8k3kgSVN6LIXksmiGIKAj2SG23Tlox8sakpHmhtQWM=; b=d9xQzKs3OY2KmdmEFeSRfRWxB3DH73Pj2LNq+ekB9M17AU76buLQYr6cwDx2xErgtK inVpFIV3BW69Dbfb6157z79ldBRC+CYcqnigTHgfRv6bplSOugTq9wqz68OmO0mHRbrS hm7Rvto1loEnZo1LXdjXJEbc2GxyJrLbA9SUkIPMzghLDi2Y2Hk0K3HhcdSwxDkpG88i ajY+O3dK2VmP/Q6fTjDuKPSBVrqdhjLyRqRkgnCNDEXshjAvYQIHlAFk/NsH6lVV/nup l0lzf7yfSrrNoPsEDhLapiAuCI3PG33BE1BddsWUvX+VmZNwfb6GzyGIg8KEOR7VsWsz zkJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=T8k3kgSVN6LIXksmiGIKAj2SG23Tlox8sakpHmhtQWM=; b=kaU1KqGTMms+xtYgunmvrdLhqXoobCKppJrhwhXvRuAD7mVnxJNHAJhEflpWPU/9xr 1/0d5bht6HT61swF5cK/IVvhNvi0nhZK0Mg+9SmuLWqJY/H25EsO4K56Ba1FXI3yoRd5 QJeEdIKlaOx8mkWXKSQVqjv4vYTdK/ClJJrxQ/h/OQ3Eh87lxzxEp/BAsSM9CV9QAUji 32s1d4ivOkX3+LNc8jdErrmH0eBzNwFHGYNUaozQyEukmJBapyLx6cf/4UTi0mwOtSDC ZUP/eW1G3dyr7sVGOrwujYg/0Iggu7ExfIMIimh2Pt4lLoLbpipzllPlI3ipfX9T2lSe u10A== X-Gm-Message-State: ACgBeo2ut8Afu7A0erDoiC37h5pAFoqthYih0KrLNbZS9uwBYoIcxGIY H1AUz3sw7bzDrT7qYAouJ8DcgGctP0F25Q== X-Received: by 2002:a05:6402:1848:b0:448:5f2e:e322 with SMTP id v8-20020a056402184800b004485f2ee322mr21602327edy.140.1662128454224; Fri, 02 Sep 2022 07:20:54 -0700 (PDT) Received: from localhost.localdomain (92-64-99-37.biz.kpn.net. [92.64.99.37]) by smtp.googlemail.com with ESMTPSA id d20-20020a17090694d400b007417c5dbfeesm1328814ejy.70.2022.09.02.07.20.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Sep 2022 07:20:53 -0700 (PDT) From: Martijn van Beurden To: ffmpeg-devel@ffmpeg.org Date: Fri, 2 Sep 2022 16:20:28 +0200 Message-Id: <20220902142028.261480-3-mvanb1@gmail.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220902142028.261480-1-mvanb1@gmail.com> References: <20220902142028.261480-1-mvanb1@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v3 2/2] libavcodec/flacenc: Implement encoding of 32 bit-per-sample PCM X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: Martijn van Beurden Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: MIhNc/+cv4om Add encoding of 32 bit-per-sample PCM to FLAC files to libavcodec. Coding to this format is at this point considered experimental and -strict -2 is needed to get ffmpeg to encode such files. --- libavcodec/flacenc.c | 499 ++++++++++++++++++++++++++++++++-------- libavcodec/put_bits.h | 7 + libavcodec/put_golomb.h | 19 +- 3 files changed, 418 insertions(+), 107 deletions(-) diff --git a/libavcodec/flacenc.c b/libavcodec/flacenc.c index 0170e02ae8..84973bb2a5 100644 --- a/libavcodec/flacenc.c +++ b/libavcodec/flacenc.c @@ -95,6 +95,7 @@ typedef struct FlacSubframe { typedef struct FlacFrame { FlacSubframe subframes[FLAC_MAX_CHANNELS]; + int64_t samples_33bps[FLAC_MAX_BLOCKSIZE]; int blocksize; int bs_code[2]; uint8_t crc8; @@ -282,10 +283,23 @@ static av_cold int flac_encode_init(AVCodecContext *avctx) s->bps_code = 4; break; case AV_SAMPLE_FMT_S32: - if (avctx->bits_per_raw_sample != 24) - av_log(avctx, AV_LOG_WARNING, "encoding as 24 bits-per-sample\n"); - avctx->bits_per_raw_sample = 24; - s->bps_code = 6; + if (avctx->bits_per_raw_sample <= 24) { + if (avctx->bits_per_raw_sample < 24) + av_log(avctx, AV_LOG_WARNING, "encoding as 24 bits-per-sample\n"); + avctx->bits_per_raw_sample = 24; + s->bps_code = 6; + } else if (avctx->strict_std_compliance > FF_COMPLIANCE_EXPERIMENTAL) { + av_log(avctx, AV_LOG_WARNING, + "encoding as 24 bits-per-sample, more is considered " + "experimental. Add -strict %d if you want to encode " + "more than 24 bits-per-sample\n", + FF_COMPLIANCE_EXPERIMENTAL); + avctx->bits_per_raw_sample = 24; + s->bps_code = 6; + } else { + avctx->bits_per_raw_sample = 32; + s->bps_code = 7; + } break; } @@ -533,8 +547,7 @@ static uint64_t rice_count_exact(const int32_t *res, int n, int k) uint64_t count = 0; for (i = 0; i < n; i++) { - int32_t v = -2 * res[i] - 1; - v ^= v >> 31; + unsigned v = ((unsigned)(res[i]) << 1) ^ (res[i] >> 31); count += (v >> k) + 1 + k; } return count; @@ -713,8 +726,8 @@ static uint64_t calc_rice_params(RiceContext *rc, tmp_rc.coding_mode = rc->coding_mode; - for (i = 0; i < n; i++) - udata[i] = (2 * data[i]) ^ (data[i] >> 31); + for (i = pred_order; i < n; i++) + udata[i] = ((unsigned)(data[i]) << 1) ^ (data[i] >> 31); calc_sum_top(pmax, exact ? kmax : 0, udata, n, pred_order, sums); @@ -812,6 +825,180 @@ static void encode_residual_fixed(int32_t *res, const int32_t *smp, int n, } +static int encode_residual_fixed_with_residual_limit(int32_t *res, const int32_t *smp, + int n, int order) +{ + /* This function checks for every residual whether it can be + * contained in INT32_MAX) + return 1; + res[i] = res64; + } + } else if (order == 2) { + for (i = order; i < n; i++) { + res64 = (int64_t)smp[i] - 2*(int64_t)smp[i-1] + smp[i-2]; + if (res64 <= INT32_MIN || res64 > INT32_MAX) + return 1; + res[i] = res64; + } + } else if (order == 3) { + for (i = order; i < n; i++) { + res64 = (int64_t)smp[i] - 3*(int64_t)smp[i-1] + 3*(int64_t)smp[i-2] - smp[i-3]; + if (res64 <= INT32_MIN || res64 > INT32_MAX) + return 1; + res[i] = res64; + } + } else { + for (i = order; i < n; i++) { + res64 = (int64_t)smp[i] - 4*(int64_t)smp[i-1] + 6*(int64_t)smp[i-2] - 4*(int64_t)smp[i-3] + smp[i-4]; + if (res64 <= INT32_MIN || res64 > INT32_MAX) + return 1; + res[i] = res64; + } + } + return 0; +} + + +static int encode_residual_fixed_with_residual_limit_33bps(int32_t *res, const int64_t *smp, + int n, int order) +{ + /* This function checks for every residual whether it can be + * contained in INT32_MAX) + return 1; + res[i] = smp[i]; + } + } else if (order == 1) { + for (i = order; i < n; i++) { + res64 = smp[i] - smp[i-1]; + if (res64 <= INT32_MIN || res64 > INT32_MAX) + return 1; + res[i] = res64; + } + } else if (order == 2) { + for (i = order; i < n; i++) { + res64 = smp[i] - 2*smp[i-1] + smp[i-2]; + if (res64 <= INT32_MIN || res64 > INT32_MAX) + return 1; + res[i] = res64; + } + } else if (order == 3) { + for (i = order; i < n; i++) { + res64 = smp[i] - 3*smp[i-1] + 3*smp[i-2] - smp[i-3]; + if (res64 <= INT32_MIN || res64 > INT32_MAX) + return 1; + res[i] = res64; + } + } else { + for (i = order; i < n; i++) { + res64 = smp[i] - 4*smp[i-1] + 6*smp[i-2] - 4*smp[i-3] + smp[i-4]; + if (res64 <= INT32_MIN || res64 > INT32_MAX) + return 1; + res[i] = res64; + } + } + return 0; +} + + +static int lpc_encode_with_residual_limit(int32_t *res, const int32_t *smp, int len, + int order, int32_t *coefs, int shift) +{ + /* This function checks for every residual whether it can be + * contained in >= shift; + tmp = smp[i] - p; + if (tmp <= INT32_MIN || tmp > INT32_MAX) + return 1; + res[i] = tmp; + } + return 0; +} + +static int lpc_encode_with_residual_limit_33bps(int32_t *res, const int64_t *smp, int len, + int order, int32_t *coefs, int shift) +{ + /* This function checks for every residual whether it can be + * contained in >= shift; + tmp = smp[i] - p; + if (tmp <= INT32_MIN || tmp > INT32_MAX) + return 1; + res[i] = tmp; + } + return 0; +} + +static int lpc_encode_choose_datapath(FlacEncodeContext *s, int32_t bps, + int32_t *res, const int32_t *smp, + const int64_t *smp_33bps, int len, + int order, int32_t *coefs, int shift) +{ + uint64_t max_residual_value = 0; + int64_t max_sample_value = ((int64_t)(1) << (bps-1)); + /* This calculates the max size of any residual with the current + * predictor, so we know whether we need to check the residual */ + for (int i = 0; i < order; i++) + max_residual_value += FFABS(max_sample_value * coefs[i]); + max_residual_value >>= shift; + max_residual_value += max_sample_value; + if (bps > 32) { + if (lpc_encode_with_residual_limit_33bps(res, smp_33bps, len, order, coefs, shift)) + return 1; + } else if (max_residual_value > INT32_MAX) { + if (lpc_encode_with_residual_limit(res, smp, len, order, coefs, shift)) + return 1; + } else if (bps + s->options.lpc_coeff_precision + av_log2(order) <= 32) { + s->flac_dsp.lpc16_encode(res, smp, len, order, coefs, shift); + } else { + s->flac_dsp.lpc32_encode(res, smp, len, order, coefs, shift); + } + return 0; +} + +#define DEFAULT_TO_VERBATIM() \ +{ \ + sub->type = sub->type_code = FLAC_SUBFRAME_VERBATIM; \ + if (sub->obits <= 32) \ + memcpy(res, smp, n * sizeof(int32_t)); \ + return subframe_count_exact(s, sub, 0); \ +} + static int encode_residual_ch(FlacEncodeContext *s, int ch) { int i, n; @@ -821,28 +1008,38 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch) int32_t coefs[MAX_LPC_ORDER][MAX_LPC_ORDER]; int shift[MAX_LPC_ORDER]; int32_t *res, *smp; + int64_t *smp_33bps; - frame = &s->frame; - sub = &frame->subframes[ch]; - res = sub->residual; - smp = sub->samples; - n = frame->blocksize; + frame = &s->frame; + sub = &frame->subframes[ch]; + res = sub->residual; + smp = sub->samples; + smp_33bps = frame->samples_33bps; + n = frame->blocksize; /* CONSTANT */ - for (i = 1; i < n; i++) - if(smp[i] != smp[0]) - break; - if (i == n) { - sub->type = sub->type_code = FLAC_SUBFRAME_CONSTANT; - res[0] = smp[0]; - return subframe_count_exact(s, sub, 0); + if (sub->obits > 32) { + for (i = 1; i < n; i++) + if(smp_33bps[i] != smp_33bps[0]) + break; + if (i == n) { + sub->type = sub->type_code = FLAC_SUBFRAME_CONSTANT; + return subframe_count_exact(s, sub, 0); + } + } else { + for (i = 1; i < n; i++) + if(smp[i] != smp[0]) + break; + if (i == n) { + sub->type = sub->type_code = FLAC_SUBFRAME_CONSTANT; + res[0] = smp[0]; + return subframe_count_exact(s, sub, 0); + } } /* VERBATIM */ if (frame->verbatim_only || n < 5) { - sub->type = sub->type_code = FLAC_SUBFRAME_VERBATIM; - memcpy(res, smp, n * sizeof(int32_t)); - return subframe_count_exact(s, sub, 0); + DEFAULT_TO_VERBATIM(); } min_order = s->options.min_prediction_order; @@ -859,15 +1056,32 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch) opt_order = 0; bits[0] = UINT32_MAX; for (i = min_order; i <= max_order; i++) { - encode_residual_fixed(res, smp, n, i); + if (sub->obits == 33) { + if (encode_residual_fixed_with_residual_limit_33bps(res, smp_33bps, n, i)) + continue; + } else if (sub->obits + i >= 32) { + if (encode_residual_fixed_with_residual_limit(res, smp, n, i)) + continue; + } else + encode_residual_fixed(res, smp, n, i); bits[i] = find_subframe_rice_params(s, sub, i); if (bits[i] < bits[opt_order]) opt_order = i; } + if (opt_order == 0 && bits[0] == UINT32_MAX) { + /* No predictor found with residuals within order = opt_order; sub->type_code = sub->type | sub->order; if (sub->order != max_order) { - encode_residual_fixed(res, smp, n, sub->order); + if (sub->obits == 33) + encode_residual_fixed_with_residual_limit_33bps(res, smp_33bps, n, sub->order); + else if (sub->obits + i >= 32) + encode_residual_fixed_with_residual_limit(res, smp, n, sub->order); + else + encode_residual_fixed(res, smp, n, sub->order); find_subframe_rice_params(s, sub, sub->order); } return subframe_count_exact(s, sub, sub->order); @@ -875,6 +1089,14 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch) /* LPC */ sub->type = FLAC_SUBFRAME_LPC; + if (sub->obits == 33) + /* As ff_lpc_calc_coefs is shared with other codecs and the LSB + * probably isn't predictable anyway, throw away LSB for analysis + * so it fits 32 bit int and existing function can be used + * unmodified */ + for (i = 0; i < n; i++) + smp[i] = smp_33bps[i] >> 1; + opt_order = ff_lpc_calc_coefs(&s->lpc_ctx, smp, n, min_order, max_order, s->options.lpc_coeff_precision, coefs, shift, s->options.lpc_type, s->options.lpc_passes, omethod, @@ -895,13 +1117,8 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch) order = av_clip(order, min_order - 1, max_order - 1); if (order == last_order) continue; - if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(order) <= 32) { - s->flac_dsp.lpc16_encode(res, smp, n, order+1, coefs[order], - shift[order]); - } else { - s->flac_dsp.lpc32_encode(res, smp, n, order+1, coefs[order], - shift[order]); - } + if(lpc_encode_choose_datapath(s, sub->obits, res, smp, smp_33bps, n, order+1, coefs[order], shift[order])) + continue; bits[i] = find_subframe_rice_params(s, sub, order+1); if (bits[i] < bits[opt_index]) { opt_index = i; @@ -915,11 +1132,8 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch) opt_order = 0; bits[0] = UINT32_MAX; for (i = min_order-1; i < max_order; i++) { - if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(i) <= 32) { - s->flac_dsp.lpc16_encode(res, smp, n, i+1, coefs[i], shift[i]); - } else { - s->flac_dsp.lpc32_encode(res, smp, n, i+1, coefs[i], shift[i]); - } + if(lpc_encode_choose_datapath(s, sub->obits, res, smp, smp_33bps, n, i+1, coefs[i], shift[i])) + continue; bits[i] = find_subframe_rice_params(s, sub, i+1); if (bits[i] < bits[opt_order]) opt_order = i; @@ -937,11 +1151,8 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch) for (i = last-step; i <= last+step; i += step) { if (i < min_order-1 || i >= max_order || bits[i] < UINT32_MAX) continue; - if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(i) <= 32) { - s->flac_dsp.lpc32_encode(res, smp, n, i+1, coefs[i], shift[i]); - } else { - s->flac_dsp.lpc16_encode(res, smp, n, i+1, coefs[i], shift[i]); - } + if(lpc_encode_choose_datapath(s, sub->obits, res, smp, smp_33bps, n, i+1, coefs[i], shift[i])) + continue; bits[i] = find_subframe_rice_params(s, sub, i+1); if (bits[i] < bits[opt_order]) opt_order = i; @@ -978,11 +1189,8 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch) if (diffsum >8) continue; - if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(opt_order - 1) <= 32) { - s->flac_dsp.lpc16_encode(res, smp, n, opt_order, lpc_try, shift[opt_order-1]); - } else { - s->flac_dsp.lpc32_encode(res, smp, n, opt_order, lpc_try, shift[opt_order-1]); - } + if(lpc_encode_choose_datapath(s, sub->obits, res, smp, smp_33bps, n, opt_order, lpc_try, shift[opt_order-1])) + continue; score = find_subframe_rice_params(s, sub, opt_order); if (score < best_score) { best_score = score; @@ -999,10 +1207,10 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch) for (i = 0; i < sub->order; i++) sub->coefs[i] = coefs[sub->order-1][i]; - if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(opt_order) <= 32) { - s->flac_dsp.lpc16_encode(res, smp, n, sub->order, sub->coefs, sub->shift); - } else { - s->flac_dsp.lpc32_encode(res, smp, n, sub->order, sub->coefs, sub->shift); + if(lpc_encode_choose_datapath(s, sub->obits, res, smp, smp_33bps, n, sub->order, sub->coefs, sub->shift)) { + /* No predictor found with residuals within order); @@ -1069,54 +1277,88 @@ static int encode_frame(FlacEncodeContext *s) static void remove_wasted_bits(FlacEncodeContext *s) { - int ch, i; + int ch, i, wasted_bits; for (ch = 0; ch < s->channels; ch++) { FlacSubframe *sub = &s->frame.subframes[ch]; - int32_t v = 0; - for (i = 0; i < s->frame.blocksize; i++) { - v |= sub->samples[i]; - if (v & 1) - break; - } + if (sub->obits > 32) { + int64_t v = 0; + for (i = 0; i < s->frame.blocksize; i++) { + v |= s->frame.samples_33bps[i]; + if (v & 1) + break; + } + + if (!v || (v & 1)) + return; - if (v && !(v & 1)) { v = ff_ctz(v); + /* If any wasted bits are found, samples are moved + * from frame.samples_33bps to frame.subframes[ch] */ for (i = 0; i < s->frame.blocksize; i++) - sub->samples[i] >>= v; + sub->samples[i] = s->frame.samples_33bps[i] >> v; + wasted_bits = v; + } else { + int32_t v = 0; + for (i = 0; i < s->frame.blocksize; i++) { + v |= sub->samples[i]; + if (v & 1) + break; + } + + if (!v || (v & 1)) + return; - sub->wasted = v; - sub->obits -= v; + v = ff_ctz(v); - /* for 24-bit, check if removing wasted bits makes the range better - suited for using RICE instead of RICE2 for entropy coding */ - if (sub->obits <= 17) - sub->rc.coding_mode = CODING_MODE_RICE; + for (i = 0; i < s->frame.blocksize; i++) + sub->samples[i] >>= v; + wasted_bits = v; } + + sub->wasted = wasted_bits; + sub->obits -= wasted_bits; + + /* for 24-bit, check if removing wasted bits makes the range better + * suited for using RICE instead of RICE2 for entropy coding */ + if (sub->obits <= 17) + sub->rc.coding_mode = CODING_MODE_RICE; } } static int estimate_stereo_mode(const int32_t *left_ch, const int32_t *right_ch, int n, - int max_rice_param) + int max_rice_param, int bps) { int i, best; - int32_t lt, rt; uint64_t sum[4]; uint64_t score[4]; int k; /* calculate sum of 2nd order residual for each channel */ sum[0] = sum[1] = sum[2] = sum[3] = 0; - for (i = 2; i < n; i++) { - lt = left_ch[i] - 2*left_ch[i-1] + left_ch[i-2]; - rt = right_ch[i] - 2*right_ch[i-1] + right_ch[i-2]; - sum[2] += FFABS((lt + rt) >> 1); - sum[3] += FFABS(lt - rt); - sum[0] += FFABS(lt); - sum[1] += FFABS(rt); + if(bps < 30) { + int32_t lt, rt; + for (i = 2; i < n; i++) { + lt = left_ch[i] - 2*left_ch[i-1] + left_ch[i-2]; + rt = right_ch[i] - 2*right_ch[i-1] + right_ch[i-2]; + sum[2] += FFABS((lt + rt) >> 1); + sum[3] += FFABS(lt - rt); + sum[0] += FFABS(lt); + sum[1] += FFABS(rt); + } + } else { + int64_t lt, rt; + for (i = 2; i < n; i++) { + lt = (int64_t)left_ch[i] - 2*(int64_t)left_ch[i-1] + left_ch[i-2]; + rt = (int64_t)right_ch[i] - 2*(int64_t)right_ch[i-1] + right_ch[i-2]; + sum[2] += FFABS((lt + rt) >> 1); + sum[3] += FFABS(lt - rt); + sum[0] += FFABS(lt); + sum[1] += FFABS(rt); + } } /* estimate bit counts */ for (i = 0; i < 4; i++) { @@ -1147,12 +1389,14 @@ static void channel_decorrelation(FlacEncodeContext *s) { FlacFrame *frame; int32_t *left, *right; + int64_t *side_33bps; int i, n; - frame = &s->frame; - n = frame->blocksize; - left = frame->subframes[0].samples; - right = frame->subframes[1].samples; + frame = &s->frame; + n = frame->blocksize; + left = frame->subframes[0].samples; + right = frame->subframes[1].samples; + side_33bps = frame->samples_33bps; if (s->channels != 2) { frame->ch_mode = FLAC_CHMODE_INDEPENDENT; @@ -1161,29 +1405,49 @@ static void channel_decorrelation(FlacEncodeContext *s) if (s->options.ch_mode < 0) { int max_rice_param = (1 << frame->subframes[0].rc.coding_mode) - 2; - frame->ch_mode = estimate_stereo_mode(left, right, n, max_rice_param); + frame->ch_mode = estimate_stereo_mode(left, right, n, max_rice_param, s->avctx->bits_per_raw_sample); } else frame->ch_mode = s->options.ch_mode; /* perform decorrelation and adjust bits-per-sample */ if (frame->ch_mode == FLAC_CHMODE_INDEPENDENT) return; - if (frame->ch_mode == FLAC_CHMODE_MID_SIDE) { - int32_t tmp; - for (i = 0; i < n; i++) { - tmp = left[i]; - left[i] = (tmp + right[i]) >> 1; - right[i] = tmp - right[i]; + if(s->avctx->bits_per_raw_sample == 32) { + if (frame->ch_mode == FLAC_CHMODE_MID_SIDE) { + int64_t tmp; + for (i = 0; i < n; i++) { + tmp = left[i]; + left[i] = (tmp + right[i]) >> 1; + side_33bps[i] = tmp - right[i]; + } + frame->subframes[1].obits++; + } else if (frame->ch_mode == FLAC_CHMODE_LEFT_SIDE) { + for (i = 0; i < n; i++) + side_33bps[i] = (int64_t)left[i] - right[i]; + frame->subframes[1].obits++; + } else { + for (i = 0; i < n; i++) + side_33bps[i] = (int64_t)left[i] - right[i]; + frame->subframes[0].obits++; } - frame->subframes[1].obits++; - } else if (frame->ch_mode == FLAC_CHMODE_LEFT_SIDE) { - for (i = 0; i < n; i++) - right[i] = left[i] - right[i]; - frame->subframes[1].obits++; } else { - for (i = 0; i < n; i++) - left[i] -= right[i]; - frame->subframes[0].obits++; + if (frame->ch_mode == FLAC_CHMODE_MID_SIDE) { + int32_t tmp; + for (i = 0; i < n; i++) { + tmp = left[i]; + left[i] = (tmp + right[i]) >> 1; + right[i] = tmp - right[i]; + } + frame->subframes[1].obits++; + } else if (frame->ch_mode == FLAC_CHMODE_LEFT_SIDE) { + for (i = 0; i < n; i++) + right[i] = left[i] - right[i]; + frame->subframes[1].obits++; + } else { + for (i = 0; i < n; i++) + left[i] -= right[i]; + frame->subframes[0].obits++; + } } } @@ -1252,14 +1516,38 @@ static void write_subframes(FlacEncodeContext *s) /* subframe */ if (sub->type == FLAC_SUBFRAME_CONSTANT) { - put_sbits(&s->pb, sub->obits, res[0]); + if(sub->obits == 33) + put_sbits64(&s->pb, 33, s->frame.samples_33bps[0]); + else if(sub->obits == 32) + put_bits32(&s->pb, res[0]); + else + put_sbits(&s->pb, sub->obits, res[0]); } else if (sub->type == FLAC_SUBFRAME_VERBATIM) { - while (res < frame_end) - put_sbits(&s->pb, sub->obits, *res++); + if (sub->obits == 33) { + int64_t *res64 = s->frame.samples_33bps; + int64_t *frame_end64 = &s->frame.samples_33bps[s->frame.blocksize]; + while (res64 < frame_end64) + put_sbits64(&s->pb, 33, (*res64++)); + } else if (sub->obits == 32) { + while (res < frame_end) + put_bits32(&s->pb, *res++); + } else { + while (res < frame_end) + put_sbits(&s->pb, sub->obits, *res++); + } } else { /* warm-up samples */ - for (i = 0; i < sub->order; i++) - put_sbits(&s->pb, sub->obits, *res++); + if (sub->obits == 33) { + for (i = 0; i < sub->order; i++) + put_sbits64(&s->pb, 33, s->frame.samples_33bps[i]); + res += sub->order; + } else if (sub->obits == 32) { + for (i = 0; i < sub->order; i++) + put_bits32(&s->pb, *res++); + } else { + for (i = 0; i < sub->order; i++) + put_sbits(&s->pb, sub->obits, *res++); + } /* LPC coefficients */ if (sub->type == FLAC_SUBFRAME_LPC) { @@ -1284,7 +1572,7 @@ static void write_subframes(FlacEncodeContext *s) int k = sub->rc.params[p]; put_bits(&s->pb, sub->rc.coding_mode, k); while (res < part_end) - set_sr_golomb_flac(&s->pb, *res++, k, INT32_MAX, 0); + set_sr_golomb_flac(&s->pb, *res++, k); part_end = FFMIN(frame_end, part_end + psize); } } @@ -1332,7 +1620,7 @@ static int update_md5_sum(FlacEncodeContext *s, const void *samples) (const uint16_t *) samples, buf_size / 2); buf = s->md5_buffer; #endif - } else { + } else if (s->avctx->bits_per_raw_sample <= 24) { int i; const int32_t *samples0 = samples; uint8_t *tmp = s->md5_buffer; @@ -1342,6 +1630,15 @@ static int update_md5_sum(FlacEncodeContext *s, const void *samples) AV_WL24(tmp + 3*i, v); } buf = s->md5_buffer; + } else { + /* s->avctx->bits_per_raw_sample <= 32 */ + int i; + const int32_t *samples0 = samples; + uint8_t *tmp = s->md5_buffer; + + for (i = 0; i < s->frame.blocksize * s->channels; i++) + AV_WL32(tmp + 4*i, samples0[i]); + buf = s->md5_buffer; } av_md5_update(s->md5ctx, buf, buf_size); diff --git a/libavcodec/put_bits.h b/libavcodec/put_bits.h index 4b4f977ad5..11360fe7a4 100644 --- a/libavcodec/put_bits.h +++ b/libavcodec/put_bits.h @@ -363,6 +363,13 @@ static inline void put_bits64(PutBitContext *s, int n, uint64_t value) } } +static inline void put_sbits64(PutBitContext *pb, int n, int64_t value) +{ + av_assert2(n >= 0 && n <= 64); + + put_bits64(pb, n, (uint64_t)(value) & (~(UINT64_MAX << n))); +} + /** * Return the pointer to the byte where the bitstream writer will put * the next bit. diff --git a/libavcodec/put_golomb.h b/libavcodec/put_golomb.h index 9ca911fc3c..ca4b182ae8 100644 --- a/libavcodec/put_golomb.h +++ b/libavcodec/put_golomb.h @@ -154,15 +154,22 @@ static inline void set_sr_golomb(PutBitContext *pb, int i, int k, int limit, /** * write signed golomb rice code (flac). */ -static inline void set_sr_golomb_flac(PutBitContext *pb, int i, int k, - int limit, int esc_len) +static inline void set_sr_golomb_flac(PutBitContext *pb, int i, int k) { - int v; + unsigned v, e; - v = -2 * i - 1; - v ^= (v >> 31); + v = ((unsigned)(i) << 1) ^ (i >> 31); - set_ur_golomb_jpegls(pb, v, k, limit, esc_len); + e = (v >> k) + 1; + while (e > 31) { + put_bits(pb, 31, 0); + e -= 31; + } + put_bits(pb, e, 1); + if (k) { + unsigned mask = UINT32_MAX >> (32-k); + put_bits(pb, k, v & mask); + } } #endif /* AVCODEC_PUT_GOLOMB_H */