From patchwork Sun Dec 3 22:23:48 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 44893
From: Paul B Mahol
Date: Sun, 3 Dec 2023 23:23:48 +0100
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] libavfilter/asrc_flite: fixes and improvements
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

Attached.

From e8aad4411ee0f8bc4bd50d5e3a10b7f712687f60 Mon Sep 17 00:00:00 2001
From: Paul B Mahol
Date: Sun, 3 Dec 2023 22:50:11 +0100
Subject: [PATCH 2/2] avfilter/asrc_flite: use streaming function

Fix continuous accumulation of audio samples for big txt inputs.

Signed-off-by: Paul B Mahol
---
 libavfilter/asrc_flite.c | 84 ++++++++++++++++++++++++++++++----------
 1 file changed, 64 insertions(+), 20 deletions(-)

diff --git a/libavfilter/asrc_flite.c b/libavfilter/asrc_flite.c
index 74c8414b5c..70a2fd3e40 100644
--- a/libavfilter/asrc_flite.c
+++ b/libavfilter/asrc_flite.c
@@ -24,6 +24,8 @@
  */
 
 #include <flite/flite.h>
+#include "libavutil/audio_fifo.h"
+#include "libavutil/avstring.h"
 #include "libavutil/channel_layout.h"
 #include "libavutil/file.h"
 #include "libavutil/opt.h"
@@ -39,11 +41,14 @@ typedef struct FliteContext {
     char *voice_str;
     char *textfile;
     char *text;
-    cst_wave *wave;
-    int16_t *wave_samples;
-    int wave_nb_samples;
+    char *text_p;
+    char *text_saveptr;
+    int nb_channels;
+    int sample_rate;
+    AVAudioFifo *fifo;
     int list_voices;
     cst_voice *voice;
+    cst_audio_streaming_info *asi;
     struct voice_entry *voice_entry;
     int64_t pts;
     int frame_nb_samples; ///< number of samples per frame
@@ -140,10 +145,30 @@ static int select_voice(struct voice_entry **entry_ret, const char *voice_name,
     return AVERROR(EINVAL);
 }
 
+static int audio_stream_chunk_by_word(const cst_wave *w, int start, int size,
+                                      int last, cst_audio_streaming_info *asi)
+{
+    FliteContext *flite = asi->userdata;
+    void *const ptr[8] = { &w->samples[start] };
+
+    flite->nb_channels = w->num_channels;
+    flite->sample_rate = w->sample_rate;
+    if (!flite->fifo) {
+        flite->fifo = av_audio_fifo_alloc(AV_SAMPLE_FMT_S16, flite->nb_channels, size);
+        if (!flite->fifo)
+            return CST_AUDIO_STREAM_STOP;
+    }
+
+    av_audio_fifo_write(flite->fifo, ptr, size);
+
+    return CST_AUDIO_STREAM_CONT;
+}
+
 static av_cold int init(AVFilterContext *ctx)
 {
     FliteContext *flite = ctx->priv;
     int ret = 0;
+    char *text;
 
     if (flite->list_voices) {
         list_voices(ctx, "\n");
@@ -197,10 +222,21 @@ static av_cold int init(AVFilterContext *ctx)
         return AVERROR(EINVAL);
     }
 
-    /* synth all the file data in block */
-    flite->wave = flite_text_to_wave(flite->text, flite->voice);
-    flite->wave_samples = flite->wave->samples;
-    flite->wave_nb_samples = flite->wave->num_samples;
+    flite->asi = new_audio_streaming_info();
+    if (!flite->asi)
+        return AVERROR_BUG;
+
+    flite->asi->asc = audio_stream_chunk_by_word;
+    flite->asi->userdata = flite;
+    feat_set(flite->voice->features, "streaming_info", audio_streaming_info_val(flite->asi));
+
+    flite->text_p = flite->text;
+    if (!(text = av_strtok(flite->text_p, "\n", &flite->text_saveptr)))
+        return AVERROR(EINVAL);
+    flite->text_p = NULL;
+
+    flite_text_to_speech(text, flite->voice, "none");
+
     return 0;
 }
 
@@ -216,8 +252,7 @@ static av_cold void uninit(AVFilterContext *ctx)
         }
         pthread_mutex_unlock(&flite_mutex);
     }
-    delete_wave(flite->wave);
-    flite->wave = NULL;
+    av_audio_fifo_free(flite->fifo);
 }
 
 static int query_formats(AVFilterContext *ctx)
@@ -230,13 +265,13 @@ static int query_formats(AVFilterContext *ctx)
     AVFilterFormats *sample_rates = NULL;
     AVChannelLayout chlayout = { 0 };
 
-    av_channel_layout_default(&chlayout, flite->wave->num_channels);
+    av_channel_layout_default(&chlayout, flite->nb_channels);
 
     if ((ret = ff_add_channel_layout         (&chlayouts     , &chlayout               )) < 0 ||
         (ret = ff_set_common_channel_layouts (ctx            , chlayouts               )) < 0 ||
         (ret = ff_add_format                 (&sample_formats, AV_SAMPLE_FMT_S16       )) < 0 ||
         (ret = ff_set_common_formats         (ctx            , sample_formats          )) < 0 ||
-        (ret = ff_add_format                 (&sample_rates  , flite->wave->sample_rate)) < 0 ||
+        (ret = ff_add_format                 (&sample_rates  , flite->sample_rate      )) < 0 ||
         (ret = ff_set_common_samplerates     (ctx            , sample_rates            )) < 0)
         return ret;
 
@@ -248,12 +283,13 @@ static int config_props(AVFilterLink *outlink)
     AVFilterContext *ctx = outlink->src;
     FliteContext *flite = ctx->priv;
 
-    outlink->sample_rate = flite->wave->sample_rate;
-    outlink->time_base = (AVRational){1, flite->wave->sample_rate};
+    outlink->sample_rate = flite->sample_rate;
+    outlink->time_base = (AVRational){1, flite->sample_rate};
 
     av_log(ctx, AV_LOG_VERBOSE, "voice:%s fmt:%s sample_rate:%d\n",
            flite->voice_str,
            av_get_sample_fmt_name(outlink->format), outlink->sample_rate);
+
     return 0;
 }
 
@@ -261,14 +297,23 @@ static int activate(AVFilterContext *ctx)
 {
     AVFilterLink *outlink = ctx->outputs[0];
     FliteContext *flite = ctx->priv;
-    int nb_samples = FFMIN(flite->wave_nb_samples, flite->frame_nb_samples);
     AVFrame *samplesref;
+    int nb_samples;
 
     if (!ff_outlink_frame_wanted(outlink))
         return FFERROR_NOT_READY;
 
+    nb_samples = FFMIN(av_audio_fifo_size(flite->fifo), flite->frame_nb_samples);
     if (!nb_samples) {
-        ff_outlink_set_status(outlink, AVERROR_EOF, flite->pts);
+        char *text;
+
+        if (!(text = av_strtok(flite->text_p, "\n", &flite->text_saveptr))) {
+            ff_outlink_set_status(outlink, AVERROR_EOF, flite->pts);
+            return 0;
+        }
+
+        flite_text_to_speech(text, flite->voice, "none");
+        ff_filter_set_ready(ctx, 100);
         return 0;
     }
 
@@ -276,13 +321,12 @@ static int activate(AVFilterContext *ctx)
     if (!samplesref)
         return AVERROR(ENOMEM);
 
-    memcpy(samplesref->data[0], flite->wave_samples,
-           nb_samples * flite->wave->num_channels * 2);
+    av_audio_fifo_read(flite->fifo, (void **)samplesref->extended_data,
+                       nb_samples);
+
     samplesref->pts = flite->pts;
-    samplesref->sample_rate = flite->wave->sample_rate;
+    samplesref->sample_rate = flite->sample_rate;
     flite->pts += nb_samples;
-    flite->wave_samples += nb_samples * flite->wave->num_channels;
-    flite->wave_nb_samples -= nb_samples;
 
     return ff_filter_frame(outlink, samplesref);
 }
-- 
2.42.1
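
Illustration only, not part of the patch: the change above replaces "synthesize the whole text once, then slice it" with "synthesize one line whenever the FIFO runs dry". A minimal standalone sketch of that pull pattern follows. It assumes POSIX strtok_r in place of av_strtok and a plain array in place of AVAudioFifo. The names LineSource, fake_synth, source_init, source_pull, FRAME_SAMPLES and FIFO_CAP are hypothetical and exist neither in FFmpeg nor in Flite; fake_synth merely emits one sample per character so the control flow can be followed without a real synthesizer.

```c
#define _POSIX_C_SOURCE 200809L /* for strtok_r */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SAMPLES 4  /* hypothetical samples per output frame */
#define FIFO_CAP      64 /* hypothetical FIFO capacity in samples */

typedef struct LineSource {
    char  *first;          /* text buffer for the first strtok_r call, NULL afterwards */
    char  *saveptr;        /* strtok_r state between calls */
    short  fifo[FIFO_CAP]; /* samples synthesized but not yet output */
    size_t fifo_len;       /* number of valid samples in fifo */
    size_t peak;           /* largest fill level seen, to show the memory bound */
} LineSource;

/* Stand-in for a synthesizer: one sample per character (assumption). */
static size_t fake_synth(const char *line, short *out, size_t cap)
{
    size_t n = strlen(line);
    if (n > cap)
        n = cap;
    for (size_t i = 0; i < n; i++)
        out[i] = (short)line[i];
    return n;
}

/* text must be writable: strtok_r overwrites delimiters with NUL bytes. */
static void source_init(LineSource *s, char *text)
{
    memset(s, 0, sizeof(*s));
    s->first = text;
}

/* Copy up to FRAME_SAMPLES samples into frame; return 0 once the text is used up. */
static size_t source_pull(LineSource *s, short *frame)
{
    /* Synthesize the next line only when nothing is buffered. */
    while (s->fifo_len == 0) {
        char *line = strtok_r(s->first, "\n", &s->saveptr);
        s->first = NULL;
        if (!line)
            return 0;
        s->fifo_len = fake_synth(line, s->fifo, FIFO_CAP);
        if (s->fifo_len > s->peak)
            s->peak = s->fifo_len;
    }

    size_t n = s->fifo_len < FRAME_SAMPLES ? s->fifo_len : FRAME_SAMPLES;
    assert(n <= s->fifo_len);
    memcpy(frame, s->fifo, n * sizeof(*frame));
    /* Shift the remaining samples to the front of the buffer. */
    memmove(s->fifo, s->fifo + n, (s->fifo_len - n) * sizeof(*s->fifo));
    s->fifo_len -= n;
    return n;
}
```

A caller would run source_init once on a writable buffer and then call source_pull until it returns 0. In this sketch the buffered data never exceeds one line's worth of samples, which is the property the commit message appears to be after. The patch itself differs in that activate() returns after synthesizing a line and reschedules itself with ff_filter_set_ready rather than looping.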