From patchwork Fri Oct 14 18:09:51 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Greg Rowe
X-Patchwork-Id: 1005
From: Greg Rowe
To: FFmpeg development discussions and patches
Date: Fri, 14 Oct 2016 18:09:51 +0000
In-Reply-To: <20161014014348.GA4602@nb4>
Subject: Re: [FFmpeg-devel] [PATCH] avfilter/af_silenceremove: add optional tone when silence is removed
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

Michael,

In the attached patch I've tried to make all of the changes you've pointed out. I also renamed tone_hz to tone_frequency on Moritz Barsnick's suggestion.

Is there a good way to generate the tone while avoiding floating point operations? If there is then don't bother reviewing this patch and I'll make that change once I know better how to do it.

I removed the unrelated changes.
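On the integer-only question above, one possible approach is a 32-bit phase accumulator combined with Bhaskara I's rational sine approximation, sin(x) ≈ 16x(π−x) / (5π² − 4x(π−x)) on [0, π], which needs only integer multiplies and one divide per sample. The following is a minimal sketch under that assumption, not code from this patch or from FFmpeg: the helper names `approx_sin_q15` and `fill_tone_s16` are made up for illustration, and the output is signed 16-bit, so a conversion step would still be needed for the double samples the filter handles internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg API): approximate a full-scale sine
 * for a 32-bit phase, where 2^32 phase steps make up one period.
 * Uses Bhaskara I's approximation with integer arithmetic only. */
static int16_t approx_sin_q15(uint32_t phase)
{
    const int64_t half = 32768;            /* phase steps per half period */
    int64_t p = (phase >> 16) & 0x7FFF;    /* position inside the half period */
    int64_t q = p * (half - p);            /* corresponds to x * (pi - x) */
    int64_t v = (32767 * 16 * q) / (5 * half * half - 4 * q);

    /* The top phase bit selects the negative half of the waveform. */
    return (int16_t)((phase & 0x80000000u) ? -v : v);
}

/* Hypothetical helper: fill buf with nb_samples of a mono tone.
 * Assumes sample_rate > 0 and frequency below sample_rate / 2. */
static void fill_tone_s16(int16_t *buf, size_t nb_samples,
                          int frequency, int sample_rate)
{
    /* Phase advance per sample, computed once with a 64-bit integer divide. */
    uint32_t step = (uint32_t)(((uint64_t)frequency << 32) /
                               (uint64_t)sample_rate);
    uint32_t phase = 0;
    size_t i;

    for (i = 0; i < nb_samples; i++) {
        buf[i] = approx_sin_q15(phase);
        phase += step;                     /* wraps naturally at 2^32 */
    }
}
```

For a 1000 Hz tone at 8000 Hz the period is exactly eight samples, so the third sample hits the positive peak and the seventh the negative peak. The approximation is not exact: at 45 degrees it yields 23129 where an ideal sine scaled to 32767 would give about 23170. A precomputed lookup table would be the usual alternative if that error matters.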
The two parameters, tone_duration and tone_frequency, are integers now. The tone_duration parameter is changed from seconds to milliseconds. I have updated the documentation to reflect that. I moved the tone generation to an initialization function and fill a buffer that exists for the duration of the filter instead of needlessly generating the tone on the fly. Thanks, Greg diff --git a/Changelog b/Changelog index 0da009c..86e031c 100644 --- a/Changelog +++ b/Changelog @@ -2,6 +2,7 @@ Entries are sorted chronologically from oldest to youngest within each release, releases are sorted from youngest to oldest. version : +- Added optional tone insertion in af_silenceremove - libopenmpt demuxer - tee protocol - Changed metadata print option to accept general urls diff --git a/doc/filters.texi b/doc/filters.texi index 4b2f7bf..e09a303 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -3340,7 +3340,8 @@ ffmpeg -i silence.mp3 -af silencedetect=noise=0.0001 -f null - @section silenceremove -Remove silence from the beginning, middle or end of the audio. +Remove silence from the beginning, middle or end of the audio while +optionally inserting a tone where silence was removed. The filter accepts the following options: @@ -3401,6 +3402,14 @@ Default value is @code{rms}. @item window Set ratio used to calculate size of window for detecting silence. Default value is @code{0.02}. Allowed range is from @code{0} to @code{10}. + +@item tone_duration +Set the duration of the tone inserted in the stream when silence is removed. A value of @code{0} disables tone insertion. +Default value is @code{0.0}. + +@item tone_hz +Set the frequency of the tone inserted in the stream when silence is removed. +Default value is @code{1000.0}. 
@end table @subsection Examples diff --git a/libavfilter/af_silenceremove.c b/libavfilter/af_silenceremove.c index f156d18..07cf428 100644 --- a/libavfilter/af_silenceremove.c +++ b/libavfilter/af_silenceremove.c @@ -3,6 +3,7 @@ * Copyright (c) 2001 Chris Bagwell * Copyright (c) 2003 Donnie Smith * Copyright (c) 2014 Paul B Mahol + * Copyright (c) 2016 Shoretel * * This file is part of FFmpeg. * @@ -31,11 +32,20 @@ #include "internal.h" enum SilenceMode { - SILENCE_TRIM, + SILENCE_TRIM = 0, SILENCE_TRIM_FLUSH, SILENCE_COPY, SILENCE_COPY_FLUSH, - SILENCE_STOP + SILENCE_STOP, + SILENCE_END_MARKER +}; + +static const char* SILENCE_MODE_NAMES[] = { + NULL_IF_CONFIG_SMALL("TRIM"), + NULL_IF_CONFIG_SMALL("TRIM_FLUSH"), + NULL_IF_CONFIG_SMALL("COPY"), + NULL_IF_CONFIG_SMALL("COPY_FLUSH"), + NULL_IF_CONFIG_SMALL("STOP") }; typedef struct SilenceRemoveContext { @@ -75,6 +85,10 @@ typedef struct SilenceRemoveContext { int detection; void (*update)(struct SilenceRemoveContext *s, double sample); double(*compute)(struct SilenceRemoveContext *s, double sample); + + double last_pts_seconds; + double tone_duration; + double tone_hz; } SilenceRemoveContext; #define OFFSET(x) offsetof(SilenceRemoveContext, x) @@ -91,11 +105,51 @@ static const AVOption silenceremove_options[] = { { "peak", 0, 0, AV_OPT_TYPE_CONST, {.i64=0}, 0, 0, FLAGS, "detection" }, { "rms", 0, 0, AV_OPT_TYPE_CONST, {.i64=1}, 0, 0, FLAGS, "detection" }, { "window", NULL, OFFSET(window_ratio), AV_OPT_TYPE_DOUBLE, {.dbl=0.02}, 0, 10, FLAGS }, - { NULL } + { + .name = "tone_duration", + .help = "length of tone inserted when silence is detected (0 to disable)", + .offset = OFFSET(tone_duration), + .type = AV_OPT_TYPE_DOUBLE, + .default_val = {.dbl=0.0}, + .min = 0.0, + .max = DBL_MAX, + .flags = FLAGS, + .unit = "tone", + }, + { + .name = "tone_hz", + .help = "frequency of tone inserted when silence is removed, 1 kHz default", + .offset = OFFSET(tone_hz), + .type = AV_OPT_TYPE_DOUBLE, + .default_val = {.dbl=1000.0}, + 
.min = 0.0, + .max = DBL_MAX, + .flags = FLAGS, + .unit = "tone", + }, + {NULL} }; AVFILTER_DEFINE_CLASS(silenceremove); +static const char* mode_to_string(enum SilenceMode mode) +{ + if (mode >= SILENCE_END_MARKER) { + return ""; + } + /* This can be null if the config is small. */ + return SILENCE_MODE_NAMES[mode] ? SILENCE_MODE_NAMES[mode]:""; +} + + +static void set_mode(AVFilterContext *ctx, enum SilenceMode new) +{ + SilenceRemoveContext *s = ctx->priv; + av_log(ctx, AV_LOG_DEBUG, "changing state %s=>%s\n", + mode_to_string(s->mode), mode_to_string(new)); + s->mode = new; +} + static double compute_peak(SilenceRemoveContext *s, double sample) { double new_sum; @@ -209,14 +263,46 @@ static int config_input(AVFilterLink *inlink) s->stop_holdoff_end = 0; s->stop_found_periods = 0; - if (s->start_periods) - s->mode = SILENCE_TRIM; - else - s->mode = SILENCE_COPY; + set_mode(ctx, s->start_periods ? SILENCE_TRIM:SILENCE_COPY); return 0; } +static int insert_tone(AVFilterLink *inlink, + AVFilterLink *outlink, + double tone_hz, + double duration) +{ + AVFilterContext *ctx = inlink->dst; + int sample_count = duration * inlink->sample_rate; + double twopi = 2.0 * M_PI; + int i = 0; + AVFrame *out = NULL; + double *obuf = NULL; + double step = 0.0; + double s = 0.0; + + out = ff_get_audio_buffer(inlink, sample_count / inlink->channels); + if (!out) { + return AVERROR(ENOMEM); + } + obuf = (double *)out->data[0]; + step = tone_hz / (double)out->sample_rate; + s = step; + + av_log(ctx, AV_LOG_DEBUG, + "insert beep tone=%fhz duration=%f seconds\n", + tone_hz, duration); + + + for (i=0; idst; + AVFilterLink *outlink = ctx->outputs[0]; + SilenceRemoveContext *s = ctx->priv; + /* current_pts_us is already in AV_TIME_BASE (microsecond) units. */ + pts_seconds = inlink->current_pts_us / (double)AV_TIME_BASE; + + /* Check to be certain that we don't flood the stream with + * annoying tones. 
*/ + if ((s->last_pts_seconds == 0.0) + || (pts_seconds - s->last_pts_seconds) > (s->tone_duration * 2.0)) { + + ret = insert_tone(inlink, outlink, s->tone_hz, s->tone_duration); + s->last_pts_seconds = pts_seconds; + } + + return ret; +} + static int filter_frame(AVFilterLink *inlink, AVFrame *in) { AVFilterContext *ctx = inlink->dst; @@ -243,7 +351,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in) switch (s->mode) { case SILENCE_TRIM: -silence_trim: + silence_trim: nbs = in->nb_samples - nb_samples_read / inlink->channels; if (!nbs) break; @@ -263,7 +371,7 @@ silence_trim: if (s->start_holdoff_end >= s->start_duration * inlink->channels) { if (++s->start_found_periods >= s->start_periods) { - s->mode = SILENCE_TRIM_FLUSH; + set_mode(ctx, SILENCE_TRIM_FLUSH); goto silence_trim_flush; } @@ -283,7 +391,7 @@ silence_trim: break; case SILENCE_TRIM_FLUSH: -silence_trim_flush: + silence_trim_flush: nbs = s->start_holdoff_end - s->start_holdoff_offset; nbs -= nbs % inlink->channels; if (!nbs) @@ -304,13 +412,13 @@ silence_trim_flush: if (s->start_holdoff_offset == s->start_holdoff_end) { s->start_holdoff_offset = 0; s->start_holdoff_end = 0; - s->mode = SILENCE_COPY; + set_mode(ctx, SILENCE_COPY); goto silence_copy; } break; case SILENCE_COPY: -silence_copy: + silence_copy: nbs = in->nb_samples - nb_samples_read / inlink->channels; if (!nbs) break; @@ -329,7 +437,7 @@ silence_copy: threshold &= s->compute(s, ibuf[j]) > s->stop_threshold; if (threshold && s->stop_holdoff_end && !s->leave_silence) { - s->mode = SILENCE_COPY_FLUSH; + set_mode(ctx, SILENCE_COPY_FLUSH); flush(out, outlink, &nb_samples_written, &ret); goto silence_copy_flush; } else if (threshold) { @@ -357,7 +465,7 @@ silence_copy: s->stop_holdoff_end = 0; if (!s->restart) { - s->mode = SILENCE_STOP; + set_mode(ctx, SILENCE_STOP); flush(out, outlink, &nb_samples_written, &ret); goto silence_stop; } else { @@ -366,12 +474,19 @@ silence_copy: s->start_holdoff_offset = 0; s->start_holdoff_end = 0; 
clear_window(s); - s->mode = SILENCE_TRIM; - flush(out, outlink, &nb_samples_written, &ret); - goto silence_trim; + set_mode(ctx, SILENCE_TRIM); + + if (s->tone_duration > 0.0) { + ret = process_tone(inlink); + } + if (!ret) { + flush(out, outlink, + &nb_samples_written, &ret); + goto silence_trim; + } } } - s->mode = SILENCE_COPY_FLUSH; + set_mode(ctx, SILENCE_COPY_FLUSH); flush(out, outlink, &nb_samples_written, &ret); goto silence_copy_flush; } @@ -385,7 +500,7 @@ silence_copy: break; case SILENCE_COPY_FLUSH: -silence_copy_flush: + silence_copy_flush: nbs = s->stop_holdoff_end - s->stop_holdoff_offset; nbs -= nbs % inlink->channels; if (!nbs) @@ -406,12 +521,12 @@ silence_copy_flush: if (s->stop_holdoff_offset == s->stop_holdoff_end) { s->stop_holdoff_offset = 0; s->stop_holdoff_end = 0; - s->mode = SILENCE_COPY; + set_mode(ctx, SILENCE_COPY); goto silence_copy; } break; case SILENCE_STOP: -silence_stop: + silence_stop: break; } @@ -427,6 +542,8 @@ static int request_frame(AVFilterLink *outlink) int ret; ret = ff_request_frame(ctx->inputs[0]); + /* If there is no more data but the holdoff buffer still has data + * then copy the holdoff buffer out */ if (ret == AVERROR_EOF && (s->mode == SILENCE_COPY_FLUSH || s->mode == SILENCE_COPY)) { int nbs = s->stop_holdoff_end - s->stop_holdoff_offset; @@ -441,7 +558,7 @@ static int request_frame(AVFilterLink *outlink) nbs * sizeof(double)); ret = ff_filter_frame(ctx->inputs[0], frame); } - s->mode = SILENCE_STOP; + set_mode(ctx, SILENCE_STOP); } return ret; } diff --git a/libavfilter/version.h b/libavfilter/version.h index 93d249b..4626ca4 100644 --- a/libavfilter/version.h +++ b/libavfilter/version.h @@ -31,7 +31,7 @@ #define LIBAVFILTER_VERSION_MAJOR 6 #define LIBAVFILTER_VERSION_MINOR 63 -#define LIBAVFILTER_VERSION_MICRO 100 +#define LIBAVFILTER_VERSION_MICRO 101 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \ LIBAVFILTER_VERSION_MINOR, \