From patchwork Thu Dec 16 19:43:21 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martijn van Beurden
X-Patchwork-Id: 32663
From: Martijn van Beurden
To: ffmpeg-devel@ffmpeg.org
Cc: Martijn van Beurden
Date: Thu, 16 Dec 2021 20:43:21 +0100
Message-Id: <20211216194321.18669-1-mvanb1@gmail.com>
X-Mailer: git-send-email 2.30.2
List-Id: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] Add 32 bit-per-sample capability to FLAC encoder

This commit lets ffmpeg create FLAC files with up to 32 bits per sample,
up from the previous limit of 24 bits. It follows a feature request to
RAWcooked: apparently the archiving community needs to store files with
32-bit integer audio samples.
See https://github.com/MediaArea/RAWcooked/issues/356

Care has been taken to create files that are compatible with existing
decoders, which were never tested with such files. Stereo decorrelation
is disabled at 32 bits per sample, because a side channel would need
33 bits per sample, causing problems in existing decoders that use
32-bit integers. Also, only LPC encoding is enabled, because decoders
capable of processing 24-bit files already use 64-bit processing for
LPC, but not for fixed subframes.

Additionally, checks are added on encoding to make certain that no
prediction overflows a 32-bit signed integer and that no residual is
larger than 2^30 or smaller than -2^30 (the code enforces
+/-0x3FFFFFFF), because most decoders (including ffmpeg) convert the
unsigned folded representation to a signed one before dividing by two
(part of the unfolding).

Testing has been done with CDDA audio converted to 32-bit-per-sample
files with SoX, using a hilbert filter to fill the added 16 bits. A
32-bit-per-sample file was also generated with the SoX synth filter.

ffmpeg has been unintentionally forward-compatible with this change
since commit c720b9ce98, committed in May 2015. ffmpeg n2.7 was the
first release containing this commit. libFLAC has been
forward-compatible since release 1.2.1 (September 2007). The flac
command line tool, however, blocks 32-bit files out of caution, because
they have been untested until now.
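
The residual limit described above can be illustrated with a small,
self-contained sketch. It is not code from this patch or from any
decoder: fold(), unfold_unsigned() and unfold_via_signed() are
hypothetical names, and the "signed" decoder is modelled in 64-bit
arithmetic purely to keep the example free of implementation-defined
behaviour. It shows that a residual of 2^30 folds to a value with bit 31
set, which a decoder that first treats the folded value as a signed
32-bit integer unfolds incorrectly.

```c
#include <stdint.h>

/* Fold a signed residual into the unsigned value that is Rice-coded:
 * 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ... */
static uint32_t fold(int32_t r)
{
    return ((uint32_t)r << 1) ^ (r < 0 ? UINT32_MAX : 0u);
}

/* Unfold entirely in unsigned arithmetic; exact for every 32-bit input. */
static int32_t unfold_unsigned(uint32_t u)
{
    uint32_t v = (u >> 1) ^ (0u - (u & 1));
    /* Map the bit pattern back to a signed value without relying on
     * an implementation-defined conversion. */
    return v >= 0x80000000u ? (int32_t)((int64_t)v - 4294967296LL)
                            : (int32_t)v;
}

/* Model of a decoder that reinterprets the folded value as a signed
 * 32-bit integer before halving it (assumption: halving behaves like
 * an arithmetic right shift, i.e. floor division by two). */
static int32_t unfold_via_signed(uint32_t u)
{
    int64_t s    = u >= 0x80000000u ? (int64_t)u - 4294967296LL : (int64_t)u;
    int64_t half = (s - (s & 1)) / 2;      /* floor(s / 2) */
    return (int32_t)(half ^ -(s & 1));     /* invert when the low bit is set */
}
```

With these helpers, +/-0x3FFFFFFF round-trips through either unfold
function, while 0x40000000 only survives unfold_unsigned(); that is the
case the encoder-side residual check is meant to exclude.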
---
 libavcodec/flacdsp.c | 25 +++++++++++
 libavcodec/flacdsp.h |  3 ++
 libavcodec/flacenc.c | 99 +++++++++++++++++++++++++++++++++++++-------
 3 files changed, 113 insertions(+), 14 deletions(-)

diff --git a/libavcodec/flacdsp.c b/libavcodec/flacdsp.c
index bc9a5dbed9..84d8b9571a 100644
--- a/libavcodec/flacdsp.c
+++ b/libavcodec/flacdsp.c
@@ -43,6 +43,31 @@
 #define PLANAR 1
 #include "flacdsp_template.c"
 
+#define ZIGZAG_32BIT_MAX 0x3FFFFFFF
+#define ZIGZAG_32BIT_MIN -0x3FFFFFFF
+
+int ff_flacdsp_lpc_encode_c_32_overflow_detect(int32_t *res, const int32_t *smp, int len,
+                                               int order, const int32_t *coefs, int shift)
+{
+    int i;
+    for (i = 0; i < order; i++)
+        res[i] = smp[i];
+    for (int i = order; i < len; i++) {
+        int64_t p = 0, tmp = 0;
+        for (int j = 0; j < order; j++) {
+            p += (int64_t)coefs[j]*smp[(i-1)-j];
+        }
+        p >>= shift;
+        tmp = smp[i] - p;
+        if(p > INT32_MAX || p < INT32_MIN ||
+           tmp > ZIGZAG_32BIT_MAX || tmp < ZIGZAG_32BIT_MIN)
+            return 0;
+        res[i] = tmp;
+    }
+    return 1;
+}
+
+
 static void flac_lpc_16_c(int32_t *decoded, const int coeffs[32],
                           int pred_order, int qlevel, int len)
 {
diff --git a/libavcodec/flacdsp.h b/libavcodec/flacdsp.h
index 7bb0dd0e9a..7441e4ca62 100644
--- a/libavcodec/flacdsp.h
+++ b/libavcodec/flacdsp.h
@@ -40,4 +40,7 @@ void ff_flacdsp_init(FLACDSPContext *c, enum AVSampleFormat fmt, int channels, i
 void ff_flacdsp_init_arm(FLACDSPContext *c, enum AVSampleFormat fmt, int channels, int bps);
 void ff_flacdsp_init_x86(FLACDSPContext *c, enum AVSampleFormat fmt, int channels, int bps);
 
+int ff_flacdsp_lpc_encode_c_32_overflow_detect(int32_t *res, const int32_t *smp, int len,
+                                               int order, const int32_t *coefs, int shift);
+
 #endif /* AVCODEC_FLACDSP_H */
diff --git a/libavcodec/flacenc.c b/libavcodec/flacenc.c
index 595928927d..f9c1451771 100644
--- a/libavcodec/flacenc.c
+++ b/libavcodec/flacenc.c
@@ -254,10 +254,29 @@ static av_cold int flac_encode_init(AVCodecContext *avctx)
         s->bps_code = 4;
         break;
     case AV_SAMPLE_FMT_S32:
-        if (avctx->bits_per_raw_sample != 24)
-            av_log(avctx, AV_LOG_WARNING, "encoding as 24 bits-per-sample\n");
-        avctx->bits_per_raw_sample = 24;
-        s->bps_code = 6;
+        if (avctx->bits_per_raw_sample > 0 && avctx->bits_per_raw_sample <= 24){
+            if(avctx->bits_per_raw_sample < 24)
+                av_log(avctx, AV_LOG_WARNING, "encoding as 24 bits-per-sample\n");
+            avctx->bits_per_raw_sample = 24;
+            s->bps_code = 6;
+        } else {
+            av_log(avctx, AV_LOG_WARNING, "non-streamable bits-per-sample\n");
+            s->bps_code = 0;
+            if (avctx->bits_per_raw_sample == 0)
+                avctx->bits_per_raw_sample = 32;
+            if(s->options.lpc_type != FF_LPC_TYPE_LEVINSON){
+                av_log(avctx, AV_LOG_WARNING, "forcing lpc_type levinson, others not supported with >24 bits-per-sample FLAC\n");
+                s->options.lpc_type = FF_LPC_TYPE_LEVINSON;
+            }
+            if (avctx->bits_per_raw_sample == 32){
+                /* Because stereo decorrelation can raise the bitdepth of
+                 * a subframe to 33 bits, we disable it */
+                if(s->options.ch_mode != FLAC_CHMODE_INDEPENDENT){
+                    av_log(avctx, AV_LOG_WARNING, "disabling stereo decorrelation, not supported with 32 bits-per-sample FLAC\n");
+                    s->options.ch_mode = FLAC_CHMODE_INDEPENDENT;
+                }
+            }
+        }
         break;
     }
 
@@ -686,7 +705,7 @@ static uint64_t calc_rice_params(RiceContext *rc,
 
     tmp_rc.coding_mode = rc->coding_mode;
 
-    for (i = 0; i < n; i++)
+    for (i = pred_order; i < n; i++)
         udata[i] = (2 * data[i]) ^ (data[i] >> 31);
 
     calc_sum_top(pmax, exact ? kmax : 0, udata, n, pred_order, sums);
@@ -868,7 +887,11 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch)
             order = av_clip(order, min_order - 1, max_order - 1);
             if (order == last_order)
                 continue;
-            if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(order) <= 32) {
+            if (s->avctx->bits_per_raw_sample > 24) {
+                if(!ff_flacdsp_lpc_encode_c_32_overflow_detect(res, smp, n, order+1,
+                                                               coefs[order], shift[order]))
+                    continue;
+            } else if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(order) <= 32) {
                 s->flac_dsp.lpc16_encode(res, smp, n, order+1, coefs[order],
                                          shift[order]);
             } else {
@@ -888,7 +911,11 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch)
         opt_order = 0;
         bits[0] = UINT32_MAX;
         for (i = min_order-1; i < max_order; i++) {
-            if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(i) <= 32) {
+            if (s->avctx->bits_per_raw_sample > 24) {
+                if(!ff_flacdsp_lpc_encode_c_32_overflow_detect(res, smp, n, i+1,
+                                                               coefs[i], shift[i]))
+                    continue;
+            } else if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(i) <= 32) {
                 s->flac_dsp.lpc16_encode(res, smp, n, i+1, coefs[i], shift[i]);
             } else {
                 s->flac_dsp.lpc32_encode(res, smp, n, i+1, coefs[i], shift[i]);
@@ -910,7 +937,11 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch)
             for (i = last-step; i <= last+step; i += step) {
                 if (i < min_order-1 || i >= max_order || bits[i] < UINT32_MAX)
                     continue;
-                if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(i) <= 32) {
+                if (s->avctx->bits_per_raw_sample > 24) {
+                    if(!ff_flacdsp_lpc_encode_c_32_overflow_detect(res, smp, n, i+1,
+                                                                   coefs[i], shift[i]))
+                        continue;
+                } else if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(i) <= 32) {
                     s->flac_dsp.lpc32_encode(res, smp, n, i+1, coefs[i], shift[i]);
                 } else {
                     s->flac_dsp.lpc16_encode(res, smp, n, i+1, coefs[i], shift[i]);
@@ -951,7 +982,11 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch)
                 if (diffsum >8)
                     continue;
 
-                if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(opt_order - 1) <= 32) {
+                if (s->avctx->bits_per_raw_sample > 24) {
+                    if(!ff_flacdsp_lpc_encode_c_32_overflow_detect(res, smp, n, opt_order,
+                                                                   lpc_try, shift[opt_order-1]))
+                        continue;
+                } else if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(opt_order-1) <= 32) {
                     s->flac_dsp.lpc16_encode(res, smp, n, opt_order, lpc_try, shift[opt_order-1]);
                 } else {
                     s->flac_dsp.lpc32_encode(res, smp, n, opt_order, lpc_try, shift[opt_order-1]);
@@ -972,7 +1007,25 @@ static int encode_residual_ch(FlacEncodeContext *s, int ch)
     for (i = 0; i < sub->order; i++)
         sub->coefs[i] = coefs[sub->order-1][i];
 
-    if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(opt_order) <= 32) {
+    if (s->avctx->bits_per_raw_sample > 24) {
+        if (!ff_flacdsp_lpc_encode_c_32_overflow_detect(res, smp, n, sub->order,
+                                                        sub->coefs, sub->shift)) {
+            /* The found LPC coefficients produce predictions that overflow
+             * 32-bit signed integer or produce residuals that do not fall
+             * between -2^30 and 2^30. First try again with slightly smaller
+             * coefficients so that the prediction undershoots, if that
+             * doesn't help return a verbatim subframe instead */
+            for (i = 0; i < sub->order; i++) {
+                sub->coefs[i] = sub->coefs[i]*0.98;
+                if (!ff_flacdsp_lpc_encode_c_32_overflow_detect(res, smp, n, sub->order,
+                                                                sub->coefs, sub->shift)) {
+                    sub->type = sub->type_code = FLAC_SUBFRAME_VERBATIM;
+                    memcpy(res, smp, n * sizeof(int32_t));
+                    return subframe_count_exact(s, sub, 0);
+                }
+            }
+        }
+    } else if (s->bps_code * 4 + s->options.lpc_coeff_precision + av_log2(sub->order) <= 32) {
         s->flac_dsp.lpc16_encode(res, smp, n, sub->order, sub->coefs, sub->shift);
     } else {
         s->flac_dsp.lpc32_encode(res, smp, n, sub->order, sub->coefs, sub->shift);
@@ -1226,13 +1279,21 @@ static void write_subframes(FlacEncodeContext *s)
             /* subframe */
             if (sub->type == FLAC_SUBFRAME_CONSTANT) {
                 put_sbits(&s->pb, sub->obits, res[0]);
-            } else if (sub->type == FLAC_SUBFRAME_VERBATIM) {
+            } else if (sub->type == FLAC_SUBFRAME_VERBATIM && sub->obits < 32) {
                 while (res < frame_end)
                     put_sbits(&s->pb, sub->obits, *res++);
+            } else if (sub->type == FLAC_SUBFRAME_VERBATIM) {
+                while (res < frame_end)
+                    put_bits32(&s->pb, *res++);
             } else {
                 /* warm-up samples */
-                for (i = 0; i < sub->order; i++)
-                    put_sbits(&s->pb, sub->obits, *res++);
+                if(sub->obits < 32){
+                    for (i = 0; i < sub->order; i++)
+                        put_sbits(&s->pb, sub->obits, *res++);
+                }else{
+                    for (i = 0; i < sub->order; i++)
+                        put_bits32(&s->pb, *res++);
+                }
 
                 /* LPC coefficients */
                 if (sub->type == FLAC_SUBFRAME_LPC) {
@@ -1305,7 +1366,7 @@ static int update_md5_sum(FlacEncodeContext *s, const void *samples)
                             (const uint16_t *) samples, buf_size / 2);
         buf = s->md5_buffer;
 #endif
-    } else {
+    } else if (s->avctx->bits_per_raw_sample <= 24) {
         int i;
         const int32_t *samples0 = samples;
         uint8_t *tmp = s->md5_buffer;
@@ -1315,6 +1376,16 @@ static int update_md5_sum(FlacEncodeContext *s, const void *samples)
             AV_WL24(tmp + 3*i, v);
         }
         buf = s->md5_buffer;
+    } else {
+        /* s->avctx->bits_per_raw_sample <= 32 */
+        int i;
+        const int32_t *samples0 = samples;
+        uint8_t *tmp = s->md5_buffer;
+
+        for (i = 0; i < s->frame.blocksize * s->channels; i++) {
+            AV_WL32(tmp + 4*i, samples0[i]);
+        }
+        buf = s->md5_buffer;
     }
     av_md5_update(s->md5ctx, buf, buf_size);
 
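
The fallback comment in encode_residual_ch() describes a retry with
slightly smaller coefficients before giving up on LPC. As the hunk reads
above, the re-check sits inside the loop that scales the coefficients,
so the verbatim path can be taken after only the first coefficient has
been scaled. The standalone sketch below shows one possible reading of
the intended behaviour: scale every coefficient, then re-check once. It
is only an illustration under stated assumptions, not the patch's code:
lpc_encode_checked() mirrors the overflow-detecting encoder from
flacdsp.c, lpc_encode_with_fallback() is a hypothetical helper, and a
right shift of a negative prediction is assumed to be arithmetic, as on
mainstream compilers.

```c
#include <stdint.h>

#define RES_MAX   0x3FFFFFFF
#define RES_MIN (-0x3FFFFFFF)

/* Compute LPC residuals with 64-bit accumulation. Returns 1 on success,
 * or 0 as soon as a prediction does not fit in int32_t or a residual
 * falls outside [RES_MIN, RES_MAX]. */
static int lpc_encode_checked(int32_t *res, const int32_t *smp, int len,
                              int order, const int32_t *coefs, int shift)
{
    for (int i = 0; i < order; i++)
        res[i] = smp[i];                    /* warm-up samples are copied */
    for (int i = order; i < len; i++) {
        int64_t p = 0;
        for (int j = 0; j < order; j++)
            p += (int64_t)coefs[j] * smp[i - 1 - j];
        p >>= shift;                        /* assumed arithmetic shift */
        int64_t r = smp[i] - p;
        if (p > INT32_MAX || p < INT32_MIN || r > RES_MAX || r < RES_MIN)
            return 0;
        res[i] = (int32_t)r;
    }
    return 1;
}

/* Hypothetical retry: if the first attempt fails, shrink all
 * coefficients by 2% (truncating toward zero) and check once more.
 * Returns 1 if LPC residuals are usable, 0 if the caller should fall
 * back to a verbatim subframe. Note that coefs[] is modified. */
static int lpc_encode_with_fallback(int32_t *res, const int32_t *smp, int len,
                                    int order, int32_t *coefs, int shift)
{
    if (lpc_encode_checked(res, smp, len, order, coefs, shift))
        return 1;
    for (int i = 0; i < order; i++)
        coefs[i] = (int32_t)(coefs[i] * 0.98);
    return lpc_encode_checked(res, smp, len, order, coefs, shift);
}
```

A caller would emit a verbatim subframe only when
lpc_encode_with_fallback() returns 0; whether a single 2% reduction is
the right retry policy is a tuning question this sketch does not settle.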