From patchwork Thu Oct 13 20:48:11 2016
X-Patchwork-Submitter: Greg Rowe
X-Patchwork-Id: 995
From: Greg Rowe
To: "ffmpeg-devel@ffmpeg.org"
Date: Thu, 13 Oct 2016 20:48:11 +0000
Subject: [FFmpeg-devel] [PATCH] avfilter/af_silenceremove: add optional tone when silence is removed
List-Id: FFmpeg development discussions and patches

The attached patch adds two optional parameters to af_silenceremove for the purpose of inserting a tone in place of where silence was removed. This alerts the user that silence has been trimmed from the original stream.
The parameters are tone_duration, which defaults to 0.0 (tone insertion disabled), and tone_hz, which sets the frequency of the tone.

Thanks,
Greg
---
Greg Rowe
www.shoretel.com

From 41405e90cb2fb41441a6cf29c7a0d14362fd1b1f Mon Sep 17 00:00:00 2001
From: Greg Rowe
Date: Fri, 7 Oct 2016 13:39:58 -0400
Subject: [PATCH] avfilter/af_silenceremove: add optional tone when silence is removed

This commit adds two options to the af_silenceremove filter. It adds
tone_duration and tone_hz making it possible to insert a tone when
silence is removed. Tone insertion is disabled by default (by using a
tone_duration of 0.0 seconds).

Signed-off-by: Greg Rowe
---
 Changelog                      |   1 +
 doc/filters.texi               |  11 ++-
 libavfilter/af_silenceremove.c | 161 +++++++++++++++++++++++++++++++++++------
 libavfilter/version.h          |   2 +-
 4 files changed, 151 insertions(+), 24 deletions(-)

diff --git a/Changelog b/Changelog
index 0da009c..86e031c 100644
--- a/Changelog
+++ b/Changelog
@@ -2,6 +2,7 @@ Entries are sorted chronologically from oldest to youngest within each release,
 releases are sorted from youngest to oldest.
 
 version :
+- Added optional tone insertion in af_silenceremove
 - libopenmpt demuxer
 - tee protocol
 - Changed metadata print option to accept general urls
diff --git a/doc/filters.texi b/doc/filters.texi
index 4b2f7bf..e09a303 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -3340,7 +3340,8 @@ ffmpeg -i silence.mp3 -af silencedetect=noise=0.0001 -f null -
 
 @section silenceremove
 
-Remove silence from the beginning, middle or end of the audio.
+Remove silence from the beginning, middle or end of the audio while
+optionally inserting a tone where silence was removed.
 
 The filter accepts the following options:
 
@@ -3401,6 +3402,14 @@ Default value is @code{rms}.
 @item window
 Set ratio used to calculate size of window for detecting silence.
 Default value is @code{0.02}. Allowed range is from @code{0} to @code{10}.
+
+@item tone_duration
+Set the duration of the tone inserted in the stream when silence is removed. A value of @code{0} disables tone insertion.
+Default value is @code{0.0}.
+
+@item tone_hz
+Set the frequency of the tone inserted in the stream when silence is removed.
+Default value is @code{1000.0}.
 @end table
 
 @subsection Examples
diff --git a/libavfilter/af_silenceremove.c b/libavfilter/af_silenceremove.c
index f156d18..07cf428 100644
--- a/libavfilter/af_silenceremove.c
+++ b/libavfilter/af_silenceremove.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2001 Chris Bagwell
  * Copyright (c) 2003 Donnie Smith
  * Copyright (c) 2014 Paul B Mahol
+ * Copyright (c) 2016 Shoretel
  *
  * This file is part of FFmpeg.
  *
@@ -31,11 +32,20 @@
 #include "internal.h"
 
 enum SilenceMode {
-    SILENCE_TRIM,
+    SILENCE_TRIM = 0,
     SILENCE_TRIM_FLUSH,
     SILENCE_COPY,
     SILENCE_COPY_FLUSH,
-    SILENCE_STOP
+    SILENCE_STOP,
+    SILENCE_END_MARKER
+};
+
+static const char* SILENCE_MODE_NAMES[] = {
+    NULL_IF_CONFIG_SMALL("TRIM"),
+    NULL_IF_CONFIG_SMALL("TRIM_FLUSH"),
+    NULL_IF_CONFIG_SMALL("COPY"),
+    NULL_IF_CONFIG_SMALL("COPY_FLUSH"),
+    NULL_IF_CONFIG_SMALL("STOP")
 };
 
 typedef struct SilenceRemoveContext {
@@ -75,6 +85,10 @@ typedef struct SilenceRemoveContext {
     int detection;
     void (*update)(struct SilenceRemoveContext *s, double sample);
     double(*compute)(struct SilenceRemoveContext *s, double sample);
+
+    double last_pts_seconds;
+    double tone_duration;
+    double tone_hz;
 } SilenceRemoveContext;
 
 #define OFFSET(x) offsetof(SilenceRemoveContext, x)
@@ -91,11 +105,51 @@ static const AVOption silenceremove_options[] = {
     { "peak", 0, 0, AV_OPT_TYPE_CONST, {.i64=0}, 0, 0, FLAGS, "detection" },
     { "rms", 0, 0, AV_OPT_TYPE_CONST, {.i64=1}, 0, 0, FLAGS, "detection" },
     { "window", NULL, OFFSET(window_ratio), AV_OPT_TYPE_DOUBLE, {.dbl=0.02}, 0, 10, FLAGS },
-    { NULL }
+    {
+        .name = "tone_duration",
+        .help = "length of tone inserted when silence is detected (0 to disable)",
+        .offset = OFFSET(tone_duration),
+        .type = AV_OPT_TYPE_DOUBLE,
+        .default_val = {.dbl=0.0},
+        .min = 0.0,
+        .max = DBL_MAX,
+        .flags = FLAGS,
+        .unit = "tone",
+    },
+    {
+        .name = "tone_hz",
+        .help = "frequency of tone inserted when silence is removed, 1 kHz default",
+        .offset = OFFSET(tone_hz),
+        .type = AV_OPT_TYPE_DOUBLE,
+        .default_val = {.dbl=1000.0},
+        .min = 0.0,
+        .max = DBL_MAX,
+        .flags = FLAGS,
+        .unit = "tone",
+    },
+    {NULL}
 };
 
 AVFILTER_DEFINE_CLASS(silenceremove);
 
+static const char* mode_to_string(enum SilenceMode mode)
+{
+    if (mode >= SILENCE_END_MARKER) {
+        return "";
+    }
+    /* This can be null if the config is small. */
+    return SILENCE_MODE_NAMES[mode] ? SILENCE_MODE_NAMES[mode]:"";
+}
+
+
+static void set_mode(AVFilterContext *ctx, enum SilenceMode new)
+{
+    SilenceRemoveContext *s = ctx->priv;
+    av_log(ctx, AV_LOG_DEBUG, "changing state %s=>%s\n",
+           mode_to_string(s->mode), mode_to_string(new));
+    s->mode = new;
+}
+
 static double compute_peak(SilenceRemoveContext *s, double sample)
 {
     double new_sum;
@@ -209,14 +263,46 @@ static int config_input(AVFilterLink *inlink)
     s->stop_holdoff_end = 0;
     s->stop_found_periods = 0;
 
-    if (s->start_periods)
-        s->mode = SILENCE_TRIM;
-    else
-        s->mode = SILENCE_COPY;
+    set_mode(ctx, s->start_periods ? SILENCE_TRIM:SILENCE_COPY);
 
     return 0;
 }
 
+static int insert_tone(AVFilterLink *inlink,
+                       AVFilterLink *outlink,
+                       double tone_hz,
+                       double duration)
+{
+    AVFilterContext *ctx = inlink->dst;
+    int sample_count = duration * inlink->sample_rate;
+    double twopi = 2.0 * M_PI;
+    int i = 0;
+    AVFrame *out = NULL;
+    double *obuf = NULL;
+    double step = 0.0;
+    double s = 0.0;
+
+    out = ff_get_audio_buffer(inlink, sample_count / inlink->channels);
+    if (!out) {
+        return AVERROR(ENOMEM);
+    }
+    obuf = (double *)out->data[0];
+    step = tone_hz / (double)out->sample_rate;
+    s = step;
+
+    av_log(ctx, AV_LOG_DEBUG,
+           "insert beep tone=%fhz duration=%f seconds\n",
+           tone_hz, duration);
+
+
+    for (i=0; idst;
+    AVFilterLink *outlink = ctx->outputs[0];
+    SilenceRemoveContext *s = ctx->priv;
+    pts_seconds = (inlink->current_pts_us / 1000000.0) / AV_TIME_BASE;
+
+    /* Check to be certain that we don't flood the stream with
+     * annoying tones. */
+    if ((s->last_pts_seconds == 0.0)
+        || (pts_seconds - s->last_pts_seconds) > (s->tone_duration * 2.0)) {
+
+        ret = insert_tone(inlink, outlink, s->tone_hz, s->tone_duration);
+        s->last_pts_seconds = pts_seconds;
+    }
+
+    return ret;
+}
+
 static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
     AVFilterContext *ctx = inlink->dst;
@@ -243,7 +351,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 
     switch (s->mode) {
     case SILENCE_TRIM:
-silence_trim:
+    silence_trim:
         nbs = in->nb_samples - nb_samples_read / inlink->channels;
         if (!nbs)
             break;
@@ -263,7 +371,7 @@ silence_trim:
 
                 if (s->start_holdoff_end >= s->start_duration * inlink->channels) {
                     if (++s->start_found_periods >= s->start_periods) {
-                        s->mode = SILENCE_TRIM_FLUSH;
+                        set_mode(ctx, SILENCE_TRIM_FLUSH);
                         goto silence_trim_flush;
                     }
 
@@ -283,7 +391,7 @@ silence_trim:
         break;
 
     case SILENCE_TRIM_FLUSH:
-silence_trim_flush:
+    silence_trim_flush:
         nbs = s->start_holdoff_end - s->start_holdoff_offset;
         nbs -= nbs % inlink->channels;
         if (!nbs)
@@ -304,13 +412,13 @@ silence_trim_flush:
         if (s->start_holdoff_offset == s->start_holdoff_end) {
             s->start_holdoff_offset = 0;
             s->start_holdoff_end = 0;
-            s->mode = SILENCE_COPY;
+            set_mode(ctx, SILENCE_COPY);
             goto silence_copy;
         }
         break;
 
     case SILENCE_COPY:
-silence_copy:
+    silence_copy:
         nbs = in->nb_samples - nb_samples_read / inlink->channels;
         if (!nbs)
             break;
@@ -329,7 +437,7 @@ silence_copy:
                     threshold &= s->compute(s, ibuf[j]) > s->stop_threshold;
 
                 if (threshold && s->stop_holdoff_end && !s->leave_silence) {
-                    s->mode = SILENCE_COPY_FLUSH;
+                    set_mode(ctx, SILENCE_COPY_FLUSH);
                     flush(out, outlink, &nb_samples_written, &ret);
                     goto silence_copy_flush;
                 } else if (threshold) {
@@ -357,7 +465,7 @@ silence_copy:
                             s->stop_holdoff_end = 0;
 
                             if (!s->restart) {
-                                s->mode = SILENCE_STOP;
+                                set_mode(ctx, SILENCE_STOP);
                                 flush(out, outlink, &nb_samples_written, &ret);
                                 goto silence_stop;
                             } else {
@@ -366,12 +474,19 @@ silence_copy:
                                 s->start_holdoff_offset = 0;
                                 s->start_holdoff_end = 0;
                                 clear_window(s);
-                                s->mode = SILENCE_TRIM;
-                                flush(out, outlink, &nb_samples_written, &ret);
-                                goto silence_trim;
+                                set_mode(ctx, SILENCE_TRIM);
+
+                                if (s->tone_duration > 0.0) {
+                                    ret = process_tone(inlink);
+                                }
+                                if (!ret) {
+                                    flush(out, outlink,
+                                          &nb_samples_written, &ret);
+                                    goto silence_trim;
+                                }
                             }
                         }
-                        s->mode = SILENCE_COPY_FLUSH;
+                        set_mode(ctx, SILENCE_COPY_FLUSH);
                         flush(out, outlink, &nb_samples_written, &ret);
                         goto silence_copy_flush;
                     }
@@ -385,7 +500,7 @@ silence_copy:
         break;
 
     case SILENCE_COPY_FLUSH:
-silence_copy_flush:
+    silence_copy_flush:
         nbs = s->stop_holdoff_end - s->stop_holdoff_offset;
         nbs -= nbs % inlink->channels;
         if (!nbs)
@@ -406,12 +521,12 @@ silence_copy_flush:
         if (s->stop_holdoff_offset == s->stop_holdoff_end) {
             s->stop_holdoff_offset = 0;
             s->stop_holdoff_end = 0;
-            s->mode = SILENCE_COPY;
+            set_mode(ctx, SILENCE_COPY);
             goto silence_copy;
         }
         break;
     case SILENCE_STOP:
-silence_stop:
+    silence_stop:
         break;
     }
 
@@ -427,6 +542,8 @@ static int request_frame(AVFilterLink *outlink)
     int ret;
 
     ret = ff_request_frame(ctx->inputs[0]);
+    /* If there is no more data but the holdoff buffer still has data
+     * then copy the holdoff buffer out */
     if (ret == AVERROR_EOF && (s->mode == SILENCE_COPY_FLUSH ||
                                s->mode == SILENCE_COPY)) {
         int nbs = s->stop_holdoff_end - s->stop_holdoff_offset;
@@ -441,7 +558,7 @@ static int request_frame(AVFilterLink *outlink)
                    nbs * sizeof(double));
             ret = ff_filter_frame(ctx->inputs[0], frame);
         }
-        s->mode = SILENCE_STOP;
+        set_mode(ctx, SILENCE_STOP);
     }
     return ret;
 }
diff --git a/libavfilter/version.h b/libavfilter/version.h
index 93d249b..4626ca4 100644
--- a/libavfilter/version.h
+++ b/libavfilter/version.h
@@ -31,7 +31,7 @@
 
 #define LIBAVFILTER_VERSION_MAJOR 6
 #define LIBAVFILTER_VERSION_MINOR 63
-#define LIBAVFILTER_VERSION_MICRO 100
+#define LIBAVFILTER_VERSION_MICRO 101
 
 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \
                                                LIBAVFILTER_VERSION_MINOR, \