From patchwork Thu Feb 1 13:01:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Harshit Karwal
X-Patchwork-Id: 45957
From: Harshit Karwal
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 1 Feb 2024 18:31:56 +0530
Message-Id: <20240201130157.38822-2-karwalharshit@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)
In-Reply-To: <20240201130157.38822-1-karwalharshit@gmail.com>
References: <20240201130157.38822-1-karwalharshit@gmail.com>
Subject: [FFmpeg-devel] [PATCH v5 1/2] avfilter: add audio overlay filter

Co-authored-by: Paul B Mahol
Signed-off-by: Harshit Karwal
---
 doc/filters.texi          |  40 +++
 libavfilter/Makefile      |   1 +
 libavfilter/af_aoverlay.c | 548 ++++++++++++++++++++++++++++++++++++++
 libavfilter/allfilters.c  |   1 +
 4 files changed, 590 insertions(+)
 create mode 100644 libavfilter/af_aoverlay.c

diff --git a/doc/filters.texi b/doc/filters.texi
index 20c91bab3a..f36ad9a2fd 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -2779,6 +2779,46 @@ This filter supports the same commands as options, excluding option @code{order}
 
 Pass the audio source unchanged to the output.
 
+@section aoverlay
+
+Replace a specified section of an audio stream with another input audio stream.
+
+If no @option{enable} expression for timeline editing is specified, the second audio stream is
+output wherever the first stream has a gap in its PTS (Presentation TimeStamp) values,
+so that the output stream's PTS values are monotonically increasing.
+
+This filter also supports linear cross fading when transitioning from one
+input stream to another.
+
+The filter accepts the following options:
+
+@table @option
+@item cf_duration
+Set duration (in seconds) for cross fade between the inputs. Default value is @code{100} milliseconds.
+@end table
+
+@subsection Examples
+
+@itemize
+@item
+Replace the first stream with the second stream from @code{t=10} seconds to @code{t=20} seconds:
+@example
+ffmpeg -i first.wav -i second.wav -filter_complex "aoverlay=enable='between(t,10,20)'" output.wav
+@end example
+
+@item
+Do the same as above, but with crossfading for @code{2} seconds between the streams:
+@example
+ffmpeg -i first.wav -i second.wav -filter_complex "aoverlay=cf_duration=2:enable='between(t,10,20)'" output.wav
+@end example
+
+@item
+Introduce a PTS gap from @code{t=4} seconds to @code{t=8} seconds in the first stream and output the second stream during this gap:
+@example
+ffmpeg -i first.wav -i second.wav -filter_complex "[0]aselect='not(between(t,4,8))'[temp];[temp][1]aoverlay[out]" -map "[out]" output.wav
+@end example
+@end itemize
+
 @section apad
 
 Pad the end of an audio stream with silence.
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index bba0219876..0f2b403441 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -81,6 +81,7 @@ OBJS-$(CONFIG_ANLMDN_FILTER)                 += af_anlmdn.o
 OBJS-$(CONFIG_ANLMF_FILTER)                  += af_anlms.o
 OBJS-$(CONFIG_ANLMS_FILTER)                  += af_anlms.o
 OBJS-$(CONFIG_ANULL_FILTER)                  += af_anull.o
+OBJS-$(CONFIG_AOVERLAY_FILTER)               += af_aoverlay.o
 OBJS-$(CONFIG_APAD_FILTER)                   += af_apad.o
 OBJS-$(CONFIG_APERMS_FILTER)                 += f_perms.o
 OBJS-$(CONFIG_APHASER_FILTER)                += af_aphaser.o generate_wave_table.o
diff --git a/libavfilter/af_aoverlay.c b/libavfilter/af_aoverlay.c
new file mode 100644
index 0000000000..8dd2d02951
--- /dev/null
+++ b/libavfilter/af_aoverlay.c
@@ -0,0 +1,548 @@
+/*
+ * Copyright (c) 2023 Harshit Karwal
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/opt.h"
+#include "libavutil/audio_fifo.h"
+
+#include "audio.h"
+#include "avfilter.h"
+#include "filters.h"
+#include "internal.h"
+
+typedef struct AOverlayContext {
+    const AVClass *class;
+    AVFrame *main_input;
+    AVFrame *overlay_input;
+    int64_t pts;
+    int main_eof;
+    int overlay_eof;
+
+    int default_mode;
+    int previous_samples;
+    int64_t pts_gap;
+    int64_t previous_pts;
+    int64_t pts_gap_start;
+    int64_t pts_gap_end;
+
+    int is_disabled;
+    int nb_channels;
+    int crossfade_ready;
+    AVAudioFifo *main_sample_buffers;
+    AVAudioFifo *overlay_sample_buffers;
+    int64_t cf_duration;
+    int64_t cf_samples;
+    void (*crossfade_samples)(uint8_t **dst, uint8_t * const *cf0,
+                              uint8_t * const *cf1,
+                              int nb_samples, int channels);
+
+    int64_t transition_pts;
+    int64_t transition_pts2;
+
+    uint8_t **cf0;
+    uint8_t **cf1;
+} AOverlayContext;
+
+static const enum AVSampleFormat sample_fmts[] = {
+    AV_SAMPLE_FMT_DBLP, AV_SAMPLE_FMT_FLTP,
+    AV_SAMPLE_FMT_S16P, AV_SAMPLE_FMT_S32P,
+    AV_SAMPLE_FMT_NONE
+};
+
+enum CrossfadeModes {
+    MODE_TIMELINE,
+    MODE_DEFAULT,
+    MODE_OVERLAY_EOF
+};
+
+#define SEGMENT_SIZE 1024
+#define OFFSET(x) offsetof(AOverlayContext, x)
+#define FLAGS AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_FILTERING_PARAM
+
+static const AVOption aoverlay_options[] = {
+    { "cf_duration", "set duration for cross fade between the inputs", OFFSET(cf_duration), AV_OPT_TYPE_DURATION, {.i64 = 100000}, 0, 60000000, FLAGS },
+    { NULL }
+};
+
+AVFILTER_DEFINE_CLASS(aoverlay);
+
+#define CROSSFADE_PLANAR(name, type) \
+static void crossfade_samples_## name ##p(uint8_t **dst, uint8_t * const *cf0, \
+                                          uint8_t * const *cf1, \
+                                          int nb_samples, int channels) \
+{ \
+    for (int i = 0; i < nb_samples; i++) { \
+        double main_gain = av_clipd(1.0 * (nb_samples - 1 - i) / nb_samples, 0, 1.); \
+        double overlay_gain = av_clipd(1.0 * i / nb_samples, 0, 1.); \
+        for (int c = 0; c < channels; c++) { \
+            type *d = (type *)dst[c]; \
+            const type *s0 = (type *)cf0[c]; \
+            const type *s1 = (type *)cf1[c]; \
+ \
+            d[i] = s0[i] * main_gain + s1[i] * overlay_gain; \
+        } \
+    } \
+}
+
+CROSSFADE_PLANAR(dbl, double)
+CROSSFADE_PLANAR(flt, float)
+CROSSFADE_PLANAR(s16, int16_t)
+CROSSFADE_PLANAR(s32, int32_t)
+
+static av_cold int init(AVFilterContext *ctx)
+{
+    AOverlayContext *s = ctx->priv;
+
+    s->is_disabled = 1;
+    s->transition_pts = AV_NOPTS_VALUE;
+    s->transition_pts2 = AV_NOPTS_VALUE;
+
+    return 0;
+}
+
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    AOverlayContext *s = ctx->priv;
+
+    av_audio_fifo_free(s->main_sample_buffers);
+    av_audio_fifo_free(s->overlay_sample_buffers);
+
+    for (int i = 0; i < s->nb_channels; i++) {
+        if (s->cf0)
+            av_freep(&s->cf0[i]);
+        if (s->cf1)
+            av_freep(&s->cf1[i]);
+    }
+    av_freep(&s->cf0);
+    av_freep(&s->cf1);
+
+    av_frame_free(&s->main_input);
+    av_frame_free(&s->overlay_input);
+}
+
+static int crossfade_prepare(AOverlayContext *s, AVFilterLink *main_inlink, AVFilterLink *overlay_inlink, AVFilterLink *outlink,
+                             int nb_samples, AVFrame **main_buffer, AVFrame **overlay_buffer, enum CrossfadeModes mode)
+{
+    int ret;
+
+    *main_buffer = ff_get_audio_buffer(outlink, nb_samples);
+    if (!(*main_buffer))
+        return AVERROR(ENOMEM);
+
+    (*main_buffer)->pts = s->pts;
+    s->pts += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base);
+
+    if ((ret = av_audio_fifo_read(s->main_sample_buffers, (void **)(*main_buffer)->extended_data, nb_samples)) < 0)
+        return ret;
+
+    if (mode == MODE_DEFAULT) {
+        s->previous_samples = (*main_buffer)->nb_samples;
+    } else if (mode == MODE_OVERLAY_EOF || (mode == MODE_TIMELINE && s->is_disabled)) {
+        *overlay_buffer = ff_get_audio_buffer(outlink, nb_samples);
+        if (!(*overlay_buffer))
+            return AVERROR(ENOMEM);
+
+        if ((ret = av_audio_fifo_read(s->overlay_sample_buffers, (void **)(*overlay_buffer)->extended_data, nb_samples)) < 0)
+            return ret;
+
+        (*overlay_buffer)->pts = (*main_buffer)->pts;
+    }
+
+    s->crossfade_ready = 1;
+
+    return 0;
+}
+
+static int crossfade_samples(AOverlayContext *s, AVFilterLink *main_inlink, AVFilterLink *overlay_inlink, AVFilterLink *outlink,
+                             int nb_samples, AVFrame **out, enum CrossfadeModes mode)
+{
+    int ret;
+
+    *out = ff_get_audio_buffer(outlink, nb_samples);
+    if (!(*out))
+        return AVERROR(ENOMEM);
+
+    if ((ret = av_audio_fifo_read(s->main_sample_buffers, (void **) s->cf0, nb_samples)) < 0)
+        return ret;
+
+    if ((ret = av_audio_fifo_read(s->overlay_sample_buffers, (void **) s->cf1, nb_samples)) < 0)
+        return ret;
+
+    if (mode == MODE_TIMELINE) {
+        s->is_disabled ? s->crossfade_samples((*out)->extended_data, s->cf1, s->cf0, nb_samples, (*out)->ch_layout.nb_channels)
+                       : s->crossfade_samples((*out)->extended_data, s->cf0, s->cf1, nb_samples, (*out)->ch_layout.nb_channels);
+    } else if (mode == MODE_OVERLAY_EOF) {
+        s->crossfade_samples((*out)->extended_data, s->cf1, s->cf0, s->cf_samples, (*out)->ch_layout.nb_channels);
+    } else if (mode == MODE_DEFAULT) {
+        s->transition_pts2 != AV_NOPTS_VALUE ? s->crossfade_samples((*out)->extended_data, s->cf1, s->cf0, nb_samples, (*out)->ch_layout.nb_channels)
+                                             : s->crossfade_samples((*out)->extended_data, s->cf0, s->cf1, nb_samples, (*out)->ch_layout.nb_channels);
+    }
+
+    (*out)->pts = s->pts;
+    s->pts += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base);
+    s->transition_pts = AV_NOPTS_VALUE;
+    s->transition_pts2 = AV_NOPTS_VALUE;
+    s->crossfade_ready = 0;
+
+    return 0;
+}
+
+static int consume_samples(AOverlayContext *s, AVFilterLink *overlay_inlink, AVFilterLink *outlink)
+{
+    int ret, status, nb_samples;
+    int64_t pts;
+
+    nb_samples = FFMIN(SEGMENT_SIZE, av_audio_fifo_space(s->overlay_sample_buffers));
+
+    ret = ff_inlink_consume_samples(overlay_inlink, nb_samples, nb_samples, &s->overlay_input);
+    if (ret < 0) {
+        return ret;
+    } else if (ff_inlink_acknowledge_status(overlay_inlink, &status, &pts)) {
+        s->overlay_eof = 1;
+        return 0;
+    } else if (!ret) {
+        if (ff_outlink_frame_wanted(outlink))
+            ff_inlink_request_frame(overlay_inlink);
+        return 0;
+    }
+
+    ret = av_audio_fifo_write(s->overlay_sample_buffers, (void **)s->overlay_input->extended_data, nb_samples);
+    av_frame_free(&s->overlay_input);
+    if (ret < 0)
+        return ret;
+
+    return 1;
+}
+
+static int activate(AVFilterContext *ctx)
+{
+    AOverlayContext *s = ctx->priv;
+    int status, ret, nb_samples;
+    int64_t pts;
+    AVFrame *out = NULL, *main_buffer = NULL, *overlay_buffer = NULL;
+
+    AVFilterLink *main_inlink = ctx->inputs[0];
+    AVFilterLink *overlay_inlink = ctx->inputs[1];
+    AVFilterLink *outlink = ctx->outputs[0];
+
+    FF_FILTER_FORWARD_STATUS_BACK_ALL(outlink, ctx);
+
+    if (s->default_mode && (s->pts_gap_end - s->pts_gap_start <= 0 || s->overlay_eof)) {
+        s->default_mode = 0;
+        s->transition_pts2 = s->pts_gap_end;
+    }
+
+    if (av_audio_fifo_space(s->main_sample_buffers) != 0 && !s->main_eof && !s->default_mode) {
+        nb_samples = FFMIN(SEGMENT_SIZE, av_audio_fifo_space(s->main_sample_buffers));
+
+        ret = ff_inlink_consume_samples(main_inlink, nb_samples, nb_samples, &s->main_input);
+        if (ret > 0) {
+            if (ctx->enable_str && s->is_disabled != ctx->is_disabled && !s->overlay_eof) {
+                s->is_disabled = ctx->is_disabled;
+                s->transition_pts = s->main_input->pts;
+
+                if (s->main_input->nb_samples < av_audio_fifo_space(s->main_sample_buffers))
+                    s->crossfade_ready = 1;
+                if (av_audio_fifo_size(s->main_sample_buffers) == 0) {
+                    s->transition_pts = AV_NOPTS_VALUE;
+                    s->crossfade_ready = 0;
+                }
+            }
+            else if (!ctx->enable_str && !s->default_mode) {
+                if (s->previous_pts + av_rescale_q(s->previous_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base) >= s->main_input->pts) {
+                    s->default_mode = 0;
+                    s->previous_pts = s->main_input->pts;
+                    s->previous_samples = s->main_input->nb_samples;
+                } else if (!s->overlay_eof) {
+                    s->pts_gap_start = s->previous_pts;
+                    if (s->pts > 0 || av_audio_fifo_size(s->main_sample_buffers) > 0)
+                        s->transition_pts = s->pts_gap_start;
+                    s->pts_gap_end = s->main_input->pts;
+                    s->default_mode = 1;
+                }
+            }
+
+            ret = av_audio_fifo_write(s->main_sample_buffers, (void **)s->main_input->extended_data, nb_samples);
+            av_frame_free(&s->main_input);
+            if (ret < 0)
+                return ret;
+        } else if (ret < 0) {
+            return ret;
+        } else if (ff_inlink_acknowledge_status(main_inlink, &status, &pts)) {
+            s->main_eof = 1;
+            s->crossfade_ready = 1;
+        } else if (!ret) {
+            if (ff_outlink_frame_wanted(outlink))
+                ff_inlink_request_frame(main_inlink);
+            return 0;
+        }
+    }
+
+    if (s->main_eof && av_audio_fifo_size(s->main_sample_buffers) == 0 && ff_inlink_acknowledge_status(main_inlink, &status, &pts)) {
+        ff_outlink_set_status(outlink, status, pts);
+        return 0;
+    }
+
+    if (av_audio_fifo_space(s->main_sample_buffers) > 0 &&
+        (s->transition_pts == AV_NOPTS_VALUE || av_audio_fifo_size(s->main_sample_buffers) != s->cf_samples) && !s->default_mode) {
+        if (ff_inlink_acknowledge_status(main_inlink, &status, &pts)) {
+            s->main_eof = 1;
+            s->crossfade_ready = 1;
+        } else {
+            ff_inlink_request_frame(main_inlink);
+            return 0;
+        }
+    }
+
+    if (!s->overlay_eof) {
+        if (av_audio_fifo_space(s->overlay_sample_buffers) > 0) {
+            ret = consume_samples(s, overlay_inlink, outlink);
+            if (ret <= 0) {
+                if (!s->overlay_eof)
+                    return ret;
+            }
+        }
+
+        if (av_audio_fifo_space(s->overlay_sample_buffers) > 0) {
+            if (ff_inlink_acknowledge_status(overlay_inlink, &status, &pts)) {
+                s->overlay_eof = 1;
+                s->transition_pts = s->pts + av_rescale_q(av_audio_fifo_size(s->overlay_sample_buffers) - (s->cf_samples / 2),
+                                                          (AVRational){ 1, outlink->sample_rate }, outlink->time_base);
+                s->is_disabled = 1;
+            } else {
+                ff_inlink_request_frame(overlay_inlink);
+                return 0;
+            }
+        }
+    }
+
+    if (!ctx->enable_str) {
+        if (s->transition_pts != AV_NOPTS_VALUE && av_audio_fifo_size(s->main_sample_buffers) > s->cf_samples + SEGMENT_SIZE) {
+            nb_samples = av_audio_fifo_size(s->main_sample_buffers) + av_audio_fifo_space(s->main_sample_buffers) - s->cf_samples - SEGMENT_SIZE;
+
+            if ((ret = crossfade_prepare(s, main_inlink, overlay_inlink, outlink, nb_samples, &main_buffer, &overlay_buffer, MODE_DEFAULT)) < 0)
+                return ret;
+
+            return ff_filter_frame(outlink, main_buffer);
+        } else if (s->transition_pts != AV_NOPTS_VALUE || s->transition_pts2 != AV_NOPTS_VALUE) {
+            nb_samples = FFMIN(s->cf_samples, av_audio_fifo_size(s->main_sample_buffers) - SEGMENT_SIZE);
+
+            if ((ret = crossfade_samples(s, main_inlink, overlay_inlink, outlink, nb_samples, &out, MODE_DEFAULT)) < 0)
+                return ret;
+
+            av_log(ctx, AV_LOG_DEBUG, "PTS at stream transition: %"PRId64"\n", out->pts);
+
+            return ff_filter_frame(outlink, out);
+        } else if (!s->default_mode) {
+            nb_samples = FFMIN(av_audio_fifo_size(s->main_sample_buffers), SEGMENT_SIZE);
+
+            main_buffer = ff_get_audio_buffer(outlink, nb_samples);
+            if (!main_buffer)
+                return AVERROR(ENOMEM);
+
+            main_buffer->pts = s->pts;
+            s->pts += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base);
+
+            if ((ret = av_audio_fifo_read(s->main_sample_buffers, (void **)main_buffer->extended_data, nb_samples)) < 0)
+                return ret;
+        }
+
+        if (!s->default_mode || s->overlay_eof) {
+            s->previous_samples = main_buffer->nb_samples;
+            return ff_filter_frame(outlink, main_buffer);
+        }
+
+        s->pts_gap = s->pts_gap_end - s->pts_gap_start;
+
+        nb_samples = FFMIN(SEGMENT_SIZE, av_rescale_q(s->pts_gap, outlink->time_base, (AVRational){ 1, outlink->sample_rate }));
+
+        overlay_buffer = ff_get_audio_buffer(outlink, nb_samples);
+        if (!overlay_buffer)
+            return AVERROR(ENOMEM);
+
+        if ((ret = av_audio_fifo_read(s->overlay_sample_buffers, (void **)overlay_buffer->extended_data, nb_samples)) < 0)
+            return ret;
+
+        s->previous_samples = nb_samples;
+        s->previous_pts += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base);
+        s->pts_gap_start += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base);
+
+        overlay_buffer->pts = s->pts;
+        s->pts += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base);
+
+        av_frame_free(&main_buffer);
+
+        return ff_filter_frame(outlink, overlay_buffer);
+    }
+
+    if (s->overlay_eof && av_audio_fifo_size(s->overlay_sample_buffers) > 0) {
+        if (av_audio_fifo_size(s->overlay_sample_buffers) > s->cf_samples) {
+            nb_samples = av_audio_fifo_size(s->overlay_sample_buffers) - s->cf_samples;
+
+            if ((ret = crossfade_prepare(s, main_inlink, overlay_inlink, outlink, nb_samples, &main_buffer, &overlay_buffer, MODE_OVERLAY_EOF)) < 0)
+                return ret;
+
+            return ff_filter_frame(outlink, overlay_buffer);
+        } else if (av_audio_fifo_size(s->overlay_sample_buffers) >= s->cf_samples) {
+            if ((ret = crossfade_samples(s, main_inlink, overlay_inlink, outlink, s->cf_samples, &out, MODE_OVERLAY_EOF)) < 0)
+                return ret;
+
+            av_log(ctx, AV_LOG_DEBUG, "PTS at stream transition: %"PRId64"\n", out->pts);
+
+            return ff_filter_frame(outlink, out);
+        }
+    }
+
+    if (s->transition_pts != AV_NOPTS_VALUE && !s->crossfade_ready) {
+        nb_samples = av_rescale_q(s->transition_pts - (s->cf_samples / 2) - s->pts, outlink->time_base, (AVRational) { 1, outlink->sample_rate });
+
+        if ((ret = crossfade_prepare(s, main_inlink, overlay_inlink, outlink, nb_samples, &main_buffer, &overlay_buffer, MODE_TIMELINE)) < 0)
+            return ret;
+    } else if (s->transition_pts != AV_NOPTS_VALUE) {
+        nb_samples = s->main_eof ? av_audio_fifo_size(s->main_sample_buffers) : s->cf_samples;
+        if (s->transition_pts < av_rescale_q(s->cf_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base)) {
+            nb_samples = av_rescale_q(s->transition_pts, outlink->time_base, (AVRational){ 1, outlink->sample_rate });
+        }
+
+        if ((ret = crossfade_samples(s, main_inlink, overlay_inlink, outlink, nb_samples, &out, MODE_TIMELINE)) < 0)
+            return ret;
+
+        av_log(ctx, AV_LOG_DEBUG, "PTS at stream transition: %"PRId64"\n", out->pts);
+
+        return ff_filter_frame(outlink, out);
+    } else {
+        nb_samples = FFMIN(av_audio_fifo_size(s->main_sample_buffers), SEGMENT_SIZE);
+        main_buffer = ff_get_audio_buffer(outlink, nb_samples);
+        if (!main_buffer)
+            return AVERROR(ENOMEM);
+
+        main_buffer->pts = s->pts;
+        s->pts += av_rescale_q(nb_samples, (AVRational){ 1, outlink->sample_rate }, outlink->time_base);
+
+        if ((ret = av_audio_fifo_read(s->main_sample_buffers, (void **)main_buffer->extended_data, nb_samples)) < 0)
+            return ret;
+    }
+
+    if (!ff_inlink_evaluate_timeline_at_frame(main_inlink, main_buffer) || (s->overlay_eof && av_audio_fifo_size(s->overlay_sample_buffers) == 0)) {
+        return ff_filter_frame(outlink, main_buffer);
+    } else {
+        if (s->transition_pts == AV_NOPTS_VALUE) {
+            nb_samples = FFMIN(av_audio_fifo_size(s->overlay_sample_buffers), SEGMENT_SIZE);
+            overlay_buffer = ff_get_audio_buffer(outlink, nb_samples);
+            if (!overlay_buffer)
+                return AVERROR(ENOMEM);
+
+            if ((ret = av_audio_fifo_read(s->overlay_sample_buffers, (void **)overlay_buffer->extended_data, nb_samples)) < 0)
+                return ret;
+
+            overlay_buffer->pts = main_buffer->pts;
+        }
+        av_frame_free(&main_buffer);
+        return ff_filter_frame(outlink, overlay_buffer);
+    }
+}
+
+static int config_output(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    AOverlayContext *s = ctx->priv;
+    int size, fifo_size;
+
+    switch (outlink->format) {
+    case AV_SAMPLE_FMT_DBLP: s->crossfade_samples = crossfade_samples_dblp;
+                             size = sizeof(double);
+                             break;
+    case AV_SAMPLE_FMT_FLTP: s->crossfade_samples = crossfade_samples_fltp;
+                             size = sizeof(float);
+                             break;
+    case AV_SAMPLE_FMT_S16P: s->crossfade_samples = crossfade_samples_s16p;
+                             size = sizeof(int16_t);
+                             break;
+    case AV_SAMPLE_FMT_S32P: s->crossfade_samples = crossfade_samples_s32p;
+                             size = sizeof(int32_t);
+                             break;
+    }
+
+    if (s->cf_duration)
+        s->cf_samples = av_rescale(s->cf_duration, outlink->sample_rate, AV_TIME_BASE);
+
+    s->nb_channels = outlink->ch_layout.nb_channels;
+
+    fifo_size = SEGMENT_SIZE + SEGMENT_SIZE * (1 + ((s->cf_samples - 1) / SEGMENT_SIZE));
+
+    s->main_sample_buffers = av_audio_fifo_alloc(outlink->format, s->nb_channels, fifo_size);
+    if (!s->main_sample_buffers)
+        return AVERROR(ENOMEM);
+
+    s->overlay_sample_buffers = av_audio_fifo_alloc(outlink->format, s->nb_channels, fifo_size);
+    if (!s->overlay_sample_buffers)
+        return AVERROR(ENOMEM);
+
+    s->cf0 = av_calloc(s->nb_channels, sizeof(*s->cf0));
+    if (!s->cf0)
+        return AVERROR(ENOMEM);
+
+    s->cf1 = av_calloc(s->nb_channels, sizeof(*s->cf1));
+    if (!s->cf1)
+        return AVERROR(ENOMEM);
+
+    for (int i = 0; i < s->nb_channels; i++) {
+        s->cf0[i] = av_malloc_array(s->cf_samples, size);
+        if (!s->cf0[i])
+            return AVERROR(ENOMEM);
+        s->cf1[i] = av_malloc_array(s->cf_samples, size);
+        if (!s->cf1[i])
+            return AVERROR(ENOMEM);
+    }
+
+    return 0;
+}
+
+static const AVFilterPad inputs[] = {
+    {
+        .name = "main",
+        .type = AVMEDIA_TYPE_AUDIO,
+    },
+    {
+        .name = "overlay",
+        .type = AVMEDIA_TYPE_AUDIO,
+    },
+};
+
+static const AVFilterPad outputs[] = {
+    {
+        .name         = "default",
+        .type         = AVMEDIA_TYPE_AUDIO,
+        .config_props = config_output,
+    },
+};
+
+const AVFilter ff_af_aoverlay = {
+    .name           = "aoverlay",
+    .description    = NULL_IF_CONFIG_SMALL("Replace a specified section of an audio stream with another audio input."),
+    .priv_size      = sizeof(AOverlayContext),
+    .priv_class     = &aoverlay_class,
+    .activate       = activate,
+    .init           = init,
+    .uninit         = uninit,
+    FILTER_INPUTS(inputs),
+    FILTER_OUTPUTS(outputs),
+    FILTER_SAMPLEFMTS_ARRAY(sample_fmts),
+    .flags          = AVFILTER_FLAG_SUPPORT_TIMELINE_INTERNAL,
+};
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index af84aa3d97..2310cbb250 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -67,6 +67,7 @@ extern const AVFilter ff_af_anlmdn;
 extern const AVFilter ff_af_anlmf;
 extern const AVFilter ff_af_anlms;
 extern const AVFilter ff_af_anull;
+extern const AVFilter ff_af_aoverlay;
 extern const AVFilter ff_af_apad;
 extern const AVFilter ff_af_aperms;
 extern const AVFilter ff_af_aphaser;
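
Note on the cross fade arithmetic: the standalone C sketch below mirrors the gain formula of CROSSFADE_PLANAR and the FIFO sizing in config_output() from the patch above, outside of libavfilter, so the numbers can be checked by hand. It is an illustration under assumptions and not part of the patch: the helper names cf_samples_from_duration(), fifo_size_for() and crossfade_linear_fltp() are hypothetical, only the planar float case is shown, and the duration conversion stands in for av_rescale() with plain round-to-nearest integer arithmetic for non-negative inputs.

```c
#include <stdint.h>

#define SEGMENT_SIZE 1024 /* same constant as in the patch */

/* Hypothetical helper: samples covered by a cross fade of duration_us
 * microseconds at sample_rate Hz, rounded to nearest. The patch uses
 * av_rescale(cf_duration, sample_rate, AV_TIME_BASE) for this step. */
static int64_t cf_samples_from_duration(int64_t duration_us, int sample_rate)
{
    return (duration_us * sample_rate + 500000) / 1000000;
}

/* Hypothetical helper: FIFO capacity as computed in config_output():
 * one segment plus cf_samples rounded up to a whole number of segments. */
static int64_t fifo_size_for(int64_t cf_samples)
{
    return SEGMENT_SIZE + SEGMENT_SIZE * (1 + ((cf_samples - 1) / SEGMENT_SIZE));
}

/* Linear cross fade over planar float data, using the same gain formula
 * as CROSSFADE_PLANAR: "from" fades out while "to" fades in. */
static void crossfade_linear_fltp(float **dst, float *const *from,
                                  float *const *to,
                                  int nb_samples, int channels)
{
    for (int i = 0; i < nb_samples; i++) {
        double from_gain = 1.0 * (nb_samples - 1 - i) / nb_samples;
        double to_gain   = 1.0 * i / nb_samples;

        for (int c = 0; c < channels; c++)
            dst[c][i] = from[c][i] * from_gain + to[c][i] * to_gain;
    }
}
```

With the default cf_duration of 100 ms at 44100 Hz this gives 4410 cross fade samples and a FIFO of 6144 samples. With nb_samples = 4, a constant 1.0 signal fading into silence comes out as 0.75, 0.5, 0.25, 0.0, which shows that with this formula the two gains sum to (nb_samples - 1) / nb_samples rather than to 1.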