From patchwork Mon Mar 27 02:58:33 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tyler Jones <tdjones879@gmail.com>
X-Patchwork-Id: 3111
Date: Sun, 26 Mar 2017 20:58:33 -0600
From: Tyler Jones <tdjones879@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Message-ID: <20170327025833.GA1669@tdjones879>
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.24 (2015-08-30)
Subject: [FFmpeg-devel] [PATCH 2/2] avcodec/vorbisenc: Implement transient
	detection in Vorbis encoder
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <http://ffmpeg.org/mailman/options/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <http://ffmpeg.org/pipermail/ffmpeg-devel/>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <http://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
	<mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

Use the existing AAC psychoacoustic model to detect transients in the
Vorbis encoder. This is a first step toward a more complete psychoacoustic
model for the Vorbis encoder. More specifically, it allows the cancellation
of the pre-echo effects that frequently occur with this codec.

Signed-off-by: Tyler Jones <tdjones879@gmail.com>
---
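For reviewers unfamiliar with transient detection, the general idea can be
sketched in isolation. The example below flags a block as transient when the
energy of one sub-block jumps well above the energy of the sub-block before
it. It is only an illustration under stated assumptions: it is not the
algorithm used by ff_aac_psy_model, and detect_transient, block_energy,
SUBBLOCKS, ATTACK_RATIO and ENERGY_FLOOR are hypothetical names and values
chosen for this sketch.

```c
#include <stddef.h>

/* Hypothetical parameters, chosen only for this illustration. */
#define SUBBLOCKS    8
#define ATTACK_RATIO 8.0f
#define ENERGY_FLOOR 1e-9f

/* Sum of squares of n samples. */
float block_energy(const float *x, size_t n)
{
    float e = 0.0f;
    for (size_t i = 0; i < n; i++)
        e += x[i] * x[i];
    return e;
}

/*
 * Return 1 if any sub-block carries more than ATTACK_RATIO times the energy
 * of the sub-block before it, 0 otherwise. n is expected to be a multiple
 * of SUBBLOCKS; trailing samples beyond that are ignored.
 */
int detect_transient(const float *samples, size_t n)
{
    size_t len = n / SUBBLOCKS;
    float prev;

    if (!len)
        return 0;

    prev = block_energy(samples, len);
    for (size_t b = 1; b < SUBBLOCKS; b++) {
        float cur = block_energy(samples + b * len, len);
        /* The floor keeps pure silence from matching on rounding noise. */
        if (cur > ATTACK_RATIO * (prev + ENERGY_FLOOR))
            return 1;
        prev = cur;
    }
    return 0;
}
```

An encoder would typically switch to short blocks for a frame flagged this
way, which confines quantization noise to a shorter window ahead of the
attack and so reduces pre-echo.
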
 libavcodec/psymodel.c  |  1 +
 libavcodec/vorbisenc.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)
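
The sample-rate lookup done in vorbis_encode_init() can also be shown on its
own. The sketch below is self-contained and makes some assumptions: the table
lists the 13 sampling frequencies defined by MPEG-4 Audio, and
find_samplerate_index and band_table_len are hypothetical names that do not
exist in FFmpeg. It returns -1 for an unknown rate, or for an index that
falls outside the band tables, so a caller can bail out before indexing
anything.

```c
#include <stddef.h>

/* The 13 sampling frequencies defined by MPEG-4 Audio, in index order. */
static const int mpeg4_rates[] = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025,  8000,  7350
};

/*
 * Hypothetical helper: map a sample rate to its table index.
 * band_table_len is the number of entries in the per-rate band tables.
 * Returns -1 if the rate is unknown or the index would be out of range.
 */
int find_samplerate_index(int sample_rate, size_t band_table_len)
{
    size_t n = sizeof(mpeg4_rates) / sizeof(mpeg4_rates[0]);

    for (size_t i = 0; i < n; i++) {
        if (mpeg4_rates[i] == sample_rate)
            return i < band_table_len ? (int)i : -1;
    }
    return -1;
}
```

The point of the sentinel return is that the caller has to stop on failure;
logging the error alone is not enough before the tables are indexed.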

diff --git a/libavcodec/psymodel.c b/libavcodec/psymodel.c
index 2b5f111..38831ce 100644
--- a/libavcodec/psymodel.c
+++ b/libavcodec/psymodel.c
@@ -62,6 +62,7 @@ av_cold int ff_psy_init(FFPsyContext *ctx, AVCodecContext *avctx, int num_lens,
 
     switch (ctx->avctx->codec_id) {
     case AV_CODEC_ID_AAC:
+    case AV_CODEC_ID_VORBIS:
         ctx->model = &ff_aac_psy_model;
         break;
     }
diff --git a/libavcodec/vorbisenc.c b/libavcodec/vorbisenc.c
index 2974ca2..e4ec822 100644
--- a/libavcodec/vorbisenc.c
+++ b/libavcodec/vorbisenc.c
@@ -33,6 +33,8 @@
 #include "vorbis.h"
 #include "vorbis_enc_data.h"
 
+#include "psymodel.h"
+
 #define BITSTREAM_WRITER_LE
 #include "put_bits.h"
 
@@ -126,6 +128,9 @@ typedef struct vorbis_enc_context {
     vorbis_enc_mode *modes;
 
     int64_t next_pts;
+
+    FFPsyContext psy;
+    struct FFPsyPreprocessContext* psypp;
 } vorbis_enc_context;
 
 #define MAX_CHANNELS     2
@@ -1024,10 +1029,38 @@ static int vorbis_encode_frame(AVCodecContext *avctx, AVPacket *avpkt,
     vorbis_enc_context *venc = avctx->priv_data;
     float **audio = frame ? (float **)frame->extended_data : NULL;
     int samples = frame ? frame->nb_samples : 0;
+    float *samples2, *la, *overlap;
     vorbis_enc_mode *mode;
     vorbis_enc_mapping *mapping;
     PutBitContext pb;
     int i, ret;
+    int start_ch, ch, chans, cur_channel;
+    FFPsyWindowInfo windows[MAX_CHANNELS];
+    enum WindowSequence window_sequence[MAX_CHANNELS] = { 0 };
+
+    if (!avctx->frame_number)
+        return 0;
+
+    if (frame && venc->psypp)
+        ff_psy_preprocess(venc->psypp, audio, venc->channels);
+
+    if (frame) {
+        start_ch = 0;
+        cur_channel = 0;
+        for (i = 0; i < venc->channels - 1; i++) {
+            FFPsyWindowInfo* wi = windows + start_ch;
+            chans = 2;
+            for (ch = 0; ch < 2; ch++) {
+                cur_channel = start_ch + ch;
+                overlap = &audio[cur_channel][0];
+                samples2 = overlap + 1024;
+                la = samples2 + (448+64);
+                wi[ch] = venc->psy.model->window(&venc->psy, samples2, la,
+                                                 cur_channel, window_sequence[0]);
+            }
+            start_ch += chans;
+        }
+    }
 
     if (!apply_window_and_mdct(venc, audio, samples))
         return 0;
@@ -1158,7 +1191,10 @@ static av_cold int vorbis_encode_close(AVCodecContext *avctx)
 
     ff_mdct_end(&venc->mdct[0]);
     ff_mdct_end(&venc->mdct[1]);
+    ff_psy_end(&venc->psy);
 
+    if (venc->psypp)
+        ff_psy_preprocess_end(venc->psypp);
     av_freep(&avctx->extradata);
 
     return 0 ;
@@ -1168,6 +1204,10 @@ static av_cold int vorbis_encode_init(AVCodecContext *avctx)
 {
     vorbis_enc_context *venc = avctx->priv_data;
     int ret;
+    const uint8_t *sizes[MAX_CHANNELS];
+    uint8_t grouping[MAX_CHANNELS];
+    int lengths[MAX_CHANNELS];
+    int samplerate_index;
 
     if (avctx->channels != 2) {
         av_log(avctx, AV_LOG_ERROR, "Current FFmpeg Vorbis encoder only supports 2 channels.\n");
@@ -1190,6 +1230,28 @@ static av_cold int vorbis_encode_init(AVCodecContext *avctx)
 
     avctx->frame_size = 1 << (venc->log2_blocksize[0] - 1);
 
+    for (samplerate_index = 0; samplerate_index < 16; samplerate_index++)
+        if (avctx->sample_rate == mpeg4audio_sample_rates[samplerate_index])
+            break;
+    if (samplerate_index == 16 ||
+        samplerate_index >= ff_vorbis_swb_size_1024_len ||
+        samplerate_index >= ff_vorbis_swb_size_128_len) {
+        av_log(avctx, AV_LOG_ERROR, "Unsupported sample rate %d\n", avctx->sample_rate);
+        ret = AVERROR(EINVAL);
+        goto error;
+    }
+
+    sizes[0]   = ff_vorbis_swb_size_1024[samplerate_index];
+    sizes[1]   = ff_vorbis_swb_size_128[samplerate_index];
+    lengths[0] = ff_vorbis_num_swb_1024[samplerate_index];
+    lengths[1] = ff_vorbis_num_swb_128[samplerate_index];
+    grouping[0] = 1;
+
+    if ((ret = ff_psy_init(&venc->psy, avctx, 2,
+                           sizes, lengths,
+                           1, grouping)) < 0)
+        goto error;
+    venc->psypp = ff_psy_preprocess_init(avctx);
+
     return 0;
 error:
     vorbis_encode_close(avctx);