[FFmpeg-devel] add dumpwave filter

Message ID 20180109234951.50933-1-dmitry.gumenyuk@gmail.com
State Withdrawn

Commit Message

dmitry.gumenyuk@gmail.com Jan. 9, 2018, 11:49 p.m. UTC
From: Dmytro Humeniuk <dmitry.gumenyuk@gmail.com>

Signed-off-by: Dmytro Humeniuk <dmitry.gumenyuk@gmail.com>
---
 Changelog                      |   1 +
 doc/filters.texi               |  23 ++++
 libavfilter/Makefile           |   1 +
 libavfilter/af_dumpwave.c      | 285 +++++++++++++++++++++++++++++++++++++++++
 libavfilter/allfilters.c       |   1 +
 libavfilter/version.h          |   4 +-
 tests/fate/filter-audio.mak    |   5 +
 tests/ref/fate/filter-dumpwave |   1 +
 8 files changed, 319 insertions(+), 2 deletions(-)
 create mode 100644 libavfilter/af_dumpwave.c
 create mode 100644 tests/ref/fate/filter-dumpwave

Comments

Kyle Swanson Jan. 10, 2018, 7:43 a.m. UTC | #1
Hi,

On Tue, Jan 9, 2018 at 3:49 PM,  <dmitry.gumenyuk@gmail.com> wrote:
> From: Dmytro Humeniuk <dmitry.gumenyuk@gmail.com>
>
> Signed-off-by: Dmytro Humeniuk <dmitry.gumenyuk@gmail.com>
> ---
>  Changelog                      |   1 +
>  doc/filters.texi               |  23 ++++
>  libavfilter/Makefile           |   1 +
>  libavfilter/af_dumpwave.c      | 285 +++++++++++++++++++++++++++++++++++++++++
>  libavfilter/allfilters.c       |   1 +
>  libavfilter/version.h          |   4 +-
>  tests/fate/filter-audio.mak    |   5 +
>  tests/ref/fate/filter-dumpwave |   1 +
>  8 files changed, 319 insertions(+), 2 deletions(-)
>  create mode 100644 libavfilter/af_dumpwave.c
>  create mode 100644 tests/ref/fate/filter-dumpwave

I could see this possibly being a useful filter, but I'm confused about
where the JSON schema came from. The two JS libraries that do this type of
thing (waveform.js and wavesurfer.js) both just load waveform data as an
array of floats. If we're going to add something like this to libavfilter,
it should be as generic and extensible as possible. I'm not wild about the
string stuff, and the big sample format switch isn't necessary. I could do
a code review, but it might just be faster if I rewrite it and send another
patch. Is that OK with you?

Thanks,
Kyle
dmitry.gumenyuk@gmail.com Jan. 10, 2018, 7:51 a.m. UTC | #2
There is no rush on this. Could you please do a code review so I can see how to do things properly?
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
dmitry.gumenyuk@gmail.com Jan. 10, 2018, 8:04 a.m. UTC | #3
Hi, 
While Waveform.js converts old SoundCloud PNGs, wavesurfer.js uses the Web Audio API, which is limited or unsupported in some browsers.

dmitry.gumenyuk@gmail.com Jan. 10, 2018, 9:16 a.m. UTC | #4
> wavesurfer.js  - Web Audio API
I mean it would be hard to do the same for large files:
https://developer.mozilla.org/en-US/docs/Web/API/BaseAudioContext/decodeAudioData

dmitry.gumenyuk@gmail.com Jan. 10, 2018, 9:21 a.m. UTC | #5
This is the same JSON schema that SoundCloud uses.

Kyle Swanson Jan. 10, 2018, 5:18 p.m. UTC | #6
Hi,

For this to be a part of libavfilter, the output needs to be more generic
than just the SoundCloud format. If we want this to be generally useful,
it should probably just output an array of floats between 0.0 and 1.0. The
consumer of this data (JS library, or whatever) can use this in whatever
way it wants. If you send another patch, just reply to this thread because
that makes it easier to follow (sending a patch as an attachment is OK).
Here are some critiques:

> +typedef struct DumpWaveContext {
> +    const AVClass *class;   /**< class for AVOptions */
> +    int w;                  /**< number of data values in json */
> +    int h;                  /**< values will be scaled according to provided */
> +    int is_disabled;        /**< disable filter in case it's misconfigured */
> +    int i;                  /**< index of value */
> +    char *json;             /**< path to json */
> +    char *str;              /**< comma separated values */
> +    double *values;         /**< scaling factors */
> +    int64_t s;              /**< samples per value per channel */
> +    int64_t n;              /**< current number of samples counted */
> +    int64_t max_samples;    /**< samples per value */
> +    double sum;             /**< sum of the squared samples per value */
> +} DumpWaveContext;

Use more descriptive variable names for your struct members. They don't
have to be just one letter.


> +    { "d", "set width and height", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = "640x480"}, 0, 0, FLAGS },

Get rid of this. We shouldn't care how this data is used/rendered. Our only
job should be to output an array of floats.


> +    { "s", "set number of samples per value per channel",  OFFSET(s), AV_OPT_TYPE_INT64,  {.i64 = 0}, 0, INT64_MAX, FLAGS },

Maybe you can call this frame_size? 0 is not a useful value here, it
shouldn't be the lower bound or the default value.



> +static av_cold int init(AVFilterContext *ctx)
> +{
> +    DumpWaveContext *dumpwave = ctx->priv;
> +    if(!dumpwave->s) {

The filter should just fail if it's not configured correctly. You'll get
this behavior for free with better default values.


> +static int config_output(AVFilterLink *outlink)
> +{
> +    AVFilterContext *ctx = outlink->src;
> +    DumpWaveContext *dumpwave = ctx->priv;
> +    const int width = dumpwave->w;
> +    dumpwave->values = av_realloc(NULL, width * sizeof(double));
> +    dumpwave->str = av_realloc(NULL, width * sizeof(int));

You don't need a giant buffer to hold the entire string. Just keep a file
open and write to it every frame. Maybe we could just write it to stdout
instead?
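
As a rough illustration of the streaming approach suggested here, the sketch below appends one value at a time to an already open stream instead of building the whole string in memory. The helper name `write_value`, the `%.4f` precision, and the comma-separated layout are assumptions made for this example; they are not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the patch): append one value to an already
 * open stream, emitting a comma before every value except the first.
 * Returns 0 on success, -1 on a write error. */
static int write_value(FILE *out, double value, int *is_first)
{
    assert(out != NULL && is_first != NULL);
    if (!*is_first && fputc(',', out) == EOF)
        return -1;
    *is_first = 0;
    return fprintf(out, "%.4f", value) < 0 ? -1 : 0;
}
```

A filter could call such a helper once per completed window, passing either a file opened at init time or `stdout`, so memory use stays constant regardless of input length.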


> +
> +/**
> + * Converts sample to dB and calculates root mean squared value
> + */
> +static inline void dbRms(DumpWaveContext *dumpwave, double smpl)
>
Just call this RMS and spit something out between 0.0 and 1.0.  Avoid
camelcase for function names.


>
> +    switch (inlink->format) {
> +        case AV_SAMPLE_FMT_DBLP:
> +            for (c = 0; c < channels; c++) {
> +                const double *src = (const double *)buf->extended_data[c];
> +
> +                for (i = 0; i < buf->nb_samples; i++, src++)
> +                    dbRms(dumpwave, *src);
> +            }
> +            break;
> +        case AV_SAMPLE_FMT_DBL: {
> +            const double *src = (const double *)buf->extended_data[0];
> +
> +            for (i = 0; i < buf->nb_samples; i++) {
> +                for (c = 0; c < channels; c++, src++)
> +                    dbRms(dumpwave, *src);
> +            }}
> +            break;
> +        case AV_SAMPLE_FMT_FLTP:
> +            for (c = 0; c < channels; c++) {
> +                const float *src = (const float *)buf->extended_data[c];
> +
> +                for (i = 0; i < buf->nb_samples; i++, src++)
> +                    dbRms(dumpwave, *src);
> +            }
> +            break;
> +        case AV_SAMPLE_FMT_FLT: {
> +            const float *src = (const float *)buf->extended_data[0];
> +
> +            for (i = 0; i < buf->nb_samples; i++) {
> +                for (c = 0; c < channels; c++, src++)
> +                    dbRms(dumpwave, *src);
> +            }}
> +            break;
> +        case AV_SAMPLE_FMT_S64P:
> +            for (c = 0; c < channels; c++) {
> +                const int64_t *src = (const int64_t *)buf->extended_data[c];
> +
> +                for (i = 0; i < buf->nb_samples; i++, src++)
> +                    dbRms(dumpwave, *src / (double)INT64_MAX);
> +            }
> +            break;
> +        case AV_SAMPLE_FMT_S64: {
> +            const int64_t *src = (const int64_t *)buf->extended_data[0];
> +
> +            for (i = 0; i < buf->nb_samples; i++) {
> +                for (c = 0; c < channels; c++, src++)
> +                    dbRms(dumpwave, *src / (double)INT64_MAX);
> +            }}
> +            break;
> +        case AV_SAMPLE_FMT_S32P:
> +            for (c = 0; c < channels; c++) {
> +                const int32_t *src = (const int32_t *)buf->extended_data[c];
> +
> +                for (i = 0; i < buf->nb_samples; i++, src++)
> +                    dbRms(dumpwave, *src / (double)INT32_MAX);
> +            }
> +            break;
> +        case AV_SAMPLE_FMT_S32: {
> +            const int32_t *src = (const int32_t *)buf->extended_data[0];
> +
> +            for (i = 0; i < buf->nb_samples; i++) {
> +                for (c = 0; c < channels; c++, src++)
> +                    dbRms(dumpwave, *src / (double)INT32_MAX);
> +            }}
> +            break;
> +        case AV_SAMPLE_FMT_S16P:
> +            for (c = 0; c < channels; c++) {
> +                const int16_t *src = (const int16_t *)buf->extended_data[c];
> +
> +                for (i = 0; i < buf->nb_samples; i++, src++)
> +                    dbRms(dumpwave, *src / (double)INT16_MAX);
> +            }
> +            break;
> +        case AV_SAMPLE_FMT_S16: {
> +            const int16_t *src = (const int16_t *)buf->extended_data[0];
> +
> +            for (i = 0; i < buf->nb_samples; i++) {
> +                for (c = 0; c < channels; c++, src++)
> +                    dbRms(dumpwave, *src / (double)INT16_MAX);
> +            }}
> +            break;
> +        case AV_SAMPLE_FMT_U8P:
> +            for (c = 0; c < channels; c++) {
> +                const int8_t *src = (const int8_t *)buf->extended_data[c];
> +
> +                for (i = 0; i < buf->nb_samples; i++, src++)
> +                    dbRms(dumpwave, *src / (double)INT8_MAX);
> +            }
> +            break;
> +        case AV_SAMPLE_FMT_U8: {
> +            const int8_t *src = (const int8_t *)buf->extended_data[0];
> +
> +            for (i = 0; i < buf->nb_samples; i++) {
> +                for (c = 0; c < channels; c++, src++)
> +                    dbRms(dumpwave, *src / (double)INT8_MAX);
> +            }}
> +            break;
> +        default:
> +            break;
> +    }
> +end:
> +    return ff_filter_frame(ctx->outputs[0], buf);
> +}
>
In some filters this might make sense, but not this one. Just force
something reasonable in query_formats. See one of many audio filters for an
example.


>
> +
> +AVFilter ff_af_dumpwave = {
> +    .name          = "dumpwave",
> +    .description   = NULL_IF_CONFIG_SMALL("Dumps RMS amplitude to JSON file"),
> +    .init          = init,
> +    .uninit        = uninit,
> +    .priv_size     = sizeof(DumpWaveContext),
> +    .inputs        = dumpwave_inputs,
> +    .outputs       = dumpwave_outputs,
> +    .priv_class    = &dumpwave_class,
> +};
>
You can get rid of the `dumpwave_` prefixes here.

Thanks,
Kyle
Tobias Rapp Jan. 11, 2018, 8:20 a.m. UTC | #7
On 10.01.2018 18:18, Kyle Swanson wrote:
> Hi,
> 
> For this to be a part of libavfilter the output needs to be more generic
> than the just the Soundcloud format. If we want this to be generally useful
> it should probably just output an array of floats between 0.0 and 1.0. The
> consumer of this data (JS library, or whatever) can use this in whatever
> way it wants.

I agree. If the BWF Peak Envelope output which was suggested in the 
other thread does not match your demands and filter implementation is 
actually necessary I would prefer if the filter would attach the RMS 
value(s) as frame metadata instead of directly dumping to file. Frame 
metadata can then be re-used by other filters or dumped into file by 
using the existing "ametadata" filter.

This would be similar to:

ffmpeg -i input-file -f null -filter:a "asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file.dat" /dev/null

BTW: The "astats" filter already provides some RMS values.

> If you send another patch, just reply to this thread because
> that makes it easier to follow (sending a patch as an attachment is OK).
> Here are some critiques:
> 
> [...]

Also when sending patches adding an increased version number helps 
sorting out which is the latest one (git format-patch -v2 ...).

Regards,
Tobias
dmitry.gumenyuk@gmail.com Jan. 12, 2018, 11:16 a.m. UTC | #8
Hi
> On 11 Jan 2018, at 09:20, Tobias Rapp <t.rapp@noa-archive.com> wrote:
> 
> On 10.01.2018 18:18, Kyle Swanson wrote:
>> Hi,
>> For this to be a part of libavfilter the output needs to be more generic
>> than the just the Soundcloud format. If we want this to be generally useful
>> it should probably just output an array of floats between 0.0 and 1.0. The
>> consumer of this data (JS library, or whatever) can use this in whatever
>> way it wants.
> 
> I agree. If the BWF Peak Envelope output which was suggested in the other thread does not match your demands and filter implementation is actually necessary I would prefer if the filter would attach the RMS value(s) as frame metadata instead of directly dumping to file. Frame metadata can then be re-used by other filters or dumped into file by using the existing "ametadata" filter.
RMS values may be computed over several frames or over only half of a frame.
> 
> This would be similar to:
> 
> ffmpeg -i input-file -f null -filter:a "asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file.dat" /dev/null
I like this idea, but won't asetnsamples affect performance by creating a FIFO queue? It may also take some effort to parse the long output.
Tobias Rapp Jan. 12, 2018, 12:17 p.m. UTC | #9
On 12.01.2018 12:16, Дмитрий Гуменюк wrote:
> I like this idea, but won’t asetnsamples affect performance by creating fifo queue? And it may require some effort to parse long output

I added asetnsamples to define the audio frame size (interval of values 
from astats). You can reduce the number of lines printed by ametadata by 
using the "key=lavfi.astats.foo" option.

Regards,
Tobias
dmitry.gumenyuk@gmail.com Jan. 12, 2018, 12:32 p.m. UTC | #10
> On 12 Jan 2018, at 13:17, Tobias Rapp <t.rapp@noa-archive.com> wrote:
> 
> I added asetnsamples to define the audio frame size (interval of values from astats). You can reduce the number of lines printed by ametadata by using the "key=lavfi.astats.foo" option.
I used asetnsamples as well, and I measured performance while transcoding. It appears to be slightly slower.
dmitry.gumenyuk@gmail.com Jan. 13, 2018, 12:37 a.m. UTC | #11
Hi

> On 12 Jan 2018, at 13:32, Дмитрий Гуменюк <dmitry.gumenyuk@gmail.com> wrote:
> 
> I used asetnsamples as well, and I measured performance while transcoding - it appears to be slight slower
I think the output is now more generic, and I got rid of the long switch/case. Thanks for the support.

dmitry.gumenyuk@gmail.com Jan. 13, 2018, 10:52 p.m. UTC | #12
Hi, 
> On 13 Jan 2018, at 01:37, Дмитрий Гуменюк <dmitry.gumenyuk@gmail.com> wrote:
> 
> I think output is now more generic and I got rid of long switch/case, thanks for support 
Here is the most recent patch. It seems all comments are addressed; did I miss something?
> <0001-avfilter-add-dumpwave-filter.patch.txt>
Tobias Rapp Jan. 15, 2018, 8:14 a.m. UTC | #13
On 13.01.2018 23:52, Дмитрий Гуменюк wrote:
> Here is most recent patch, seems like all comments are addressed, did I miss something?

I still would prefer to have the value attached as frame metadata, then 
dumped into file via the existing "ametadata" filter. Even better would 
be to integrate the statistic value (if missing) into the "astats" filter.

If your concern is the output format of "ametadata" then some output 
format extension (CSV/JSON) needs to be discussed for ametadata/metadata.
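
For reference, a consumer can reduce the current print output to a plain
list of numbers with little code. The following is only a sketch under
assumptions: it assumes the file written by "ametadata=print" consists of
"key=value" lines interleaved with "frame:..." header lines, and that the
key of interest is "lavfi.astats.Overall.RMS_level". The helper name
parse_metadata_value is hypothetical and not part of FFmpeg.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if `line` has the form "<key>=<number>", store the
 * number in *out and return 1; otherwise leave *out alone and return 0. */
int parse_metadata_value(const char *line, const char *key, double *out)
{
    size_t klen;
    const char *start;
    char *end;
    double v;

    assert(line && key && out);

    klen = strlen(key);
    if (strncmp(line, key, klen) != 0 || line[klen] != '=')
        return 0;               /* a "frame:..." header or another key */

    start = line + klen + 1;
    v = strtod(start, &end);
    if (end == start)
        return 0;               /* nothing numeric after '=' */

    *out = v;
    return 1;
}
```

Calling this once per input line and collecting the accepted values gives
one number per audio frame, which can then be written out as CSV or JSON
by the consumer.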

If your concern is performance, then please add some numbers. In my tests
using an approx. 5-minute input WAV file (48 kHz, stereo), the run with
"asetnsamples" was considerably faster than the run without (1.7s vs. 13.9s).

Regards,
Tobias
dmitry.gumenyuk@gmail.com Jan. 15, 2018, 12:48 p.m. UTC | #14
> On 15 Jan 2018, at 09:14, Tobias Rapp <t.rapp@noa-archive.com> wrote:
> 
> I still would prefer to have the value attached as frame metadata, then dumped into file via the existing "ametadata" filter. Even better would be to integrate the statistic value (if missing) into the "astats" filter.
> 
> If your concern is the output format of "ametadata" then some output format extension (CSV/JSON) needs to be discussed for ametadata/metadata.
> 
> If your concern is performance then please add some numbers. In my tests using an approx. 5 minutes input WAV file (48kHz, stereo) the run with "asetnsamples" was considerably faster than the run without (1.7s vs. 13.9s)
Hi,
As I mentioned previously, attaching the value as metadata to each frame is not possible,
because a value may be computed over several frames or over only half of a frame.
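
To illustrate the mismatch, here is a minimal sketch (not the filter's
actual code) of an accumulator that emits one value per fixed number of
samples, regardless of how the samples arrive in frames. The names
WindowAcc and window_acc_feed are hypothetical. It stores the mean square
of each window; the RMS is the square root of that value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator: one output value per `window` input samples,
 * independent of how the samples are split into frames. */
typedef struct WindowAcc {
    size_t window;   /* samples per output value */
    size_t count;    /* samples accumulated in the current window */
    double sum_sq;   /* sum of squared samples in the current window */
} WindowAcc;

/* Feed one frame of `n` samples. Each completed window is written to `out`
 * as a mean-square value; values beyond `out_cap` are dropped.
 * Returns the number of values written for this frame. */
size_t window_acc_feed(WindowAcc *acc, const float *frame, size_t n,
                       double *out, size_t out_cap)
{
    size_t written = 0;

    assert(acc && acc->window > 0);

    for (size_t i = 0; i < n; i++) {
        acc->sum_sq += (double)frame[i] * frame[i];
        if (++acc->count == acc->window) {
            if (written < out_cap)
                out[written++] = acc->sum_sq / (double)acc->window;
            acc->sum_sq = 0.0;
            acc->count  = 0;
        }
    }
    return written;
}
```

With a window of 3 samples and frames of 2 samples, the first frame
produces no value and the second produces one. With a window of 1 sample,
a single 2-sample frame produces two values. In both cases there is no
one-to-one mapping between frames and values unless the frames are
resized first.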

I used a 2-hour 48 kHz MP3: https://s3-eu-west-1.amazonaws.com/balamii/SynthSystemSystersJAN2018.mp3
For this purpose I set up a CentOS AWS EC2 nano instance.
Then I transcoded it while filtering as follows (to recreate a real-world situation):
1. -filter:a "asetnsamples=192197,astats=metadata=on,ametadata=print:file=stats-file.dat" out.mp3
2. -filter:a "dumpwave=n=192197:f=-" out.mp3
Results:
1. 244810550046 nanoseconds
2. 87494286740 nanoseconds

One possible use case is to set up 2 chains of asetnsamples->metadata, for example:
"asetnsamples=192197,astats=metadata=on,ametadata=print:file=stats-file.dat,asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file1.dat"
This will certainly affect performance compared with:
"dumpwave=n=192197:f=out1,dumpwave=n=22050:f=out2"
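
For readability, the two timings above can be converted to seconds and
compared. The helper names below are illustrative only. Run 1 comes to
roughly 244.8 s and run 2 to roughly 87.5 s, so run 1 took about 2.8
times as long.

```c
#include <assert.h>

/* Convert a duration in nanoseconds to seconds. */
double ns_to_seconds(long long ns)
{
    return (double)ns / 1e9;
}

/* How many times longer run `a` took than run `b`. */
double slowdown(long long a_ns, long long b_ns)
{
    assert(b_ns > 0);
    return (double)a_ns / (double)b_ns;
}
```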

> Regards,
> Tobias
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Tobias Rapp Jan. 18, 2018, 7:56 a.m. UTC | #15
On 15.01.2018 13:48, Dmytro Humeniuk wrote:
> 
> Hi
> As I mentioned previously adding metadata to each frame is not possible
> as value may be counted for several frames or only for a half of a frame
> 
> I used 2 hours long 48kHz mp3 https://s3-eu-west-1.amazonaws.com/balamii/SynthSystemSystersJAN2018.mp3
> For this purposes I set up CentOS AWS EC2 nano instance
> Then I transcoded it while filtering like following (just to recreate real situation):
> 1. -filter:a "asetnsamples=192197,astats=metadata=on,ametadata=print:file=stats-file.dat" out.mp3
> 2. -filter:a "dumpwave=n=192197:f=-" out.mp3
> Results:
> 1. 244810550046 nanoseconds
> 2. 87494286740 nanoseconds
> 
> One of the possible use cases - to set up 2 chains of asetnsamples->metadata - for example:
> "asetnsamples=192197,astats=metadata=on,ametadata=print:file=stats-file.dat,asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file1.dat” for sure it will affect performance
> Comparing with "dumpwave=n=192197:f=out1,dumpwave=n= 22050:f=out2"

Sorry, I misunderstood your concerns regarding asetnsamples filter
performance. The numbers I provided were for

"asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file.dat"

versus

"astats=metadata=on,ametadata=print:file=stats-file.dat"

When comparing astats+ametadata versus dumpwave it is obvious that a 
specialized filter which only calculates one statistic value is faster 
than a filter that calculates multiple statistics. But still my opinion 
is that if the dumpwave filter is to be added to the codebase it should 
be more generic (i.e. output frame metadata similar to the psnr/ssim 
filters for video).

Regards,
Tobias
dmitry.gumenyuk@gmail.com Jan. 18, 2018, 4:32 p.m. UTC | #16
> On 18 Jan 2018, at 08:56, Tobias Rapp <t.rapp@noa-archive.com> wrote:
> 
> Sorry, I misunderstood your concerns regarding asetnsamples filter performance. The numbers I provided have been for
> 
> "asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file.dat"
> 
> versus
> 
> "astats=metadata=on,ametadata=print:file=stats-file.dat"
> 
> When comparing astats+ametadata versus dumpwave it is obvious that a specialized filter which only calculates one statistic value is faster than a filter that calculates multiple statistics. But still my opinion is that if the dumpwave filter is to be added to the codebase it should be more generic (i.e. output frame metadata similar to the psnr/ssim filters for video).

Actually, the current output (normalised float values in the range 0...1) was proposed by Kyle as more generic.
dmitry.gumenyuk@gmail.com Jan. 20, 2018, 7:17 p.m. UTC | #17
> On 18 Jan 2018, at 17:32, Dmytro Humeniuk <dmitry.gumenyuk@gmail.com> wrote:
> 
> Actually current output(normalised float values in range 0...1) was proposed by Kyle as more generic.
Ping
Tobias Rapp Jan. 22, 2018, 1:23 p.m. UTC | #18
On 20.01.2018 20:17, Dmytro Humeniuk wrote:
> 
>> On 18 Jan 2018, at 17:32, Dmytro Humeniuk <dmitry.gumenyuk@gmail.com> wrote:
>> Actually current output(normalised float values in range 0...1) was proposed by Kyle as more generic.
> Ping

What I wrote is my personal opinion. I acknowledge that you have put
good effort into implementing the patch and even added FATE tests -- so
my words must sound disappointing to you. Rest assured that almost all
non-trivial patches need multiple iterations.

From my side, improving the existing astats+ametadata code would be the
preferred way to continue. If that is absolutely unacceptable to you, I
suggest taking a look at the FFmpeg (public) API; the code in
doc/examples/filtering_audio.c might be a good starting point.

Best regards,
Tobias
Patch

diff --git a/Changelog b/Changelog
index 61075b3392..40fd624449 100644
--- a/Changelog
+++ b/Changelog
@@ -38,6 +38,7 @@  version <next>:
 - Removed the ffserver program
 - Removed the ffmenc and ffmdec muxer and demuxer
 - VideoToolbox HEVC encoder and hwaccel
+- dumpwave audio filter
 
 
 version 3.4:
diff --git a/doc/filters.texi b/doc/filters.texi
index d29c40080f..98e54aec6e 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -2529,6 +2529,29 @@  Optional. It should have a value much less than 1 (e.g. 0.05 or 0.02) and is
 used to prevent clipping.
 @end table
 
+@section dumpwave
+Dumps RMS amplitude to JSON file.
+Converts samples to decibels and calculates RMS (Root-Mean-Square) audio power scaled to desired values.
+
+@table @option
+@item d
+Dimensions @code{WxH}.
+@code{W} - number of data values in json, values will be scaled according to @code{H}.
+The default value is @var{640x480}
+
+@item s
+Samples count per value per channel
+
+@item json
+Path to json file
+@end table
+
+For example, to generate RMS amplitude for 44.1 kHz 6 seconds length audio
+with dimensions @var{1800x140}, samples count @code{44100*6/1800=147} and store it to @var{/tmp/out.json}, you might use:
+@example
+dumpwave=d=1800x140:s=147:json=/tmp/out.json
+@end example
+
 @section dynaudnorm
 Dynamic Audio Normalizer.
 
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index ef4729dd3f..2ffbc9497a 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -87,6 +87,7 @@  OBJS-$(CONFIG_COMPENSATIONDELAY_FILTER)      += af_compensationdelay.o
 OBJS-$(CONFIG_CROSSFEED_FILTER)              += af_crossfeed.o
 OBJS-$(CONFIG_CRYSTALIZER_FILTER)            += af_crystalizer.o
 OBJS-$(CONFIG_DCSHIFT_FILTER)                += af_dcshift.o
+OBJS-$(CONFIG_DUMPWAVE_FILTER)               += af_dumpwave.o
 OBJS-$(CONFIG_DYNAUDNORM_FILTER)             += af_dynaudnorm.o
 OBJS-$(CONFIG_EARWAX_FILTER)                 += af_earwax.o
 OBJS-$(CONFIG_EBUR128_FILTER)                += f_ebur128.o
diff --git a/libavfilter/af_dumpwave.c b/libavfilter/af_dumpwave.c
new file mode 100644
index 0000000000..a1aa33d090
--- /dev/null
+++ b/libavfilter/af_dumpwave.c
@@ -0,0 +1,285 @@ 
+/*
+ * Copyright (c) 2017 Dmytro Humeniuk
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * waveform audio filter – dumps RMS amplitude to JSON file like SoundCloud does
+ */
+
+#include "libavutil/avassert.h"
+#include "libavutil/avstring.h"
+#include "libavutil/channel_layout.h"
+#include "libavutil/opt.h"
+#include "libavutil/parseutils.h"
+#include "avfilter.h"
+#include "formats.h"
+#include "audio.h"
+#include "internal.h"
+
+typedef struct DumpWaveContext {
+    const AVClass *class;   /**< class for AVOptions */
+    int w;                  /**< number of data values in json */
+    int h;                  /**< values will be scaled according to provided */
+    int is_disabled;        /**< disable filter in case it's misconfigured */
+    int i;                  /**< index of value */
+    char *json;             /**< path to json */
+    char *str;              /**< comma separated values */
+    double *values;         /**< scaling factors */
+    int64_t s;              /**< samples per value per channel */
+    int64_t n;              /**< current number of samples counted */
+    int64_t max_samples;    /**< samples per value */
+    double sum;             /**< sum of the squared samples per value */
+} DumpWaveContext;
+
+#define OFFSET(x) offsetof(DumpWaveContext, x)
+#define FLAGS AV_OPT_FLAG_AUDIO_PARAM|AV_OPT_FLAG_FILTERING_PARAM
+
+static const AVOption dumpwave_options[] = {
+    { "d", "set width and height", OFFSET(w), AV_OPT_TYPE_IMAGE_SIZE, {.str = "640x480"}, 0, 0, FLAGS },
+    { "s", "set number of samples per value per channel",  OFFSET(s), AV_OPT_TYPE_INT64,  {.i64 = 0}, 0, INT64_MAX, FLAGS },
+    { "json", "set json dump file", OFFSET(json), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },
+    { NULL }
+};
+
+AVFILTER_DEFINE_CLASS(dumpwave);
+
+static av_cold int init(AVFilterContext *ctx)
+{
+    DumpWaveContext *dumpwave = ctx->priv;
+    if(!dumpwave->s) {
+        dumpwave->is_disabled = 1;
+        av_log(ctx, AV_LOG_ERROR, "Invalid samples per value config\n");
+    }
+    return 0;
+}
+
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    DumpWaveContext *dumpwave = ctx->priv;
+    av_freep(&dumpwave->str);
+    av_freep(&dumpwave->values);
+}
+
+static int config_output(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    DumpWaveContext *dumpwave = ctx->priv;
+    const int width = dumpwave->w;
+    dumpwave->values = av_realloc(NULL, width * sizeof(double));
+    dumpwave->str = av_realloc(NULL, width * sizeof(int));
+    dumpwave->max_samples = dumpwave->s * outlink->channels;
+    
+    return 0;
+}
+
+static int dumpwave_request_frame(AVFilterLink *outlink)
+{
+    AVFilterContext *ctx = outlink->src;
+    DumpWaveContext *dumpwave = ctx->priv;
+    const int width = dumpwave->w;
+    const int height = dumpwave->h;
+    char *p, *result = dumpwave->str;
+    FILE *dump_fp = NULL;
+    
+    AVFilterLink *inlink = ctx->inputs[0];
+    int ret;
+
+    ret = ff_request_frame(inlink);
+
+    if (ret == AVERROR_EOF && !dumpwave->is_disabled) {
+        p = result;
+
+        for(int i = 0; i < width; i++)
+            p += sprintf(p, "%d,", av_clip(dumpwave->h * dumpwave->values[i], 0, dumpwave->h));
+        
+        p[-1] = '\0'; //removing trailing comma
+        
+        if (dumpwave->json && !(dump_fp = av_fopen_utf8(dumpwave->json, "w")))
+            av_log(ctx, AV_LOG_WARNING, "Flushing dump failed\n");
+        
+        if (dump_fp) {
+            fprintf(dump_fp, "{\"width\":%d,\"height\":%d,\"samples\":[%s]}", width, height, result);
+            fclose(dump_fp);
+        }
+    }
+
+    return ret;
+}
+
+/**
+ * Converts sample to dB and calculates root mean squared value
+ */
+static inline void dbRms(DumpWaveContext *dumpwave, double smpl)
+{
+    if (smpl != 0)
+        smpl = (20 * log10(fabs(smpl)) + 60) / 60;
+
+    dumpwave->sum += smpl * smpl;
+    
+    if (dumpwave->n++ == dumpwave->max_samples) {
+        dumpwave->values[dumpwave->i++] = av_clipd(sqrt(dumpwave->sum / dumpwave->max_samples), 0, 1.0);
+        dumpwave->sum = dumpwave->n = 0;
+    }
+}
+
+static int dumpwave_filter_frame(AVFilterLink *inlink, AVFrame *buf)
+{
+    AVFilterContext *ctx = inlink->dst;
+    DumpWaveContext *dumpwave = ctx->priv;
+    const int channels = inlink->channels;
+
+    int i, c;
+    
+    if (dumpwave->is_disabled)
+        goto end;
+    
+    switch (inlink->format) {
+        case AV_SAMPLE_FMT_DBLP:
+            for (c = 0; c < channels; c++) {
+                const double *src = (const double *)buf->extended_data[c];
+                
+                for (i = 0; i < buf->nb_samples; i++, src++)
+                    dbRms(dumpwave, *src);
+            }
+            break;
+        case AV_SAMPLE_FMT_DBL: {
+            const double *src = (const double *)buf->extended_data[0];
+            
+            for (i = 0; i < buf->nb_samples; i++) {
+                for (c = 0; c < channels; c++, src++)
+                    dbRms(dumpwave, *src);
+            }}
+            break;
+        case AV_SAMPLE_FMT_FLTP:
+            for (c = 0; c < channels; c++) {
+                const float *src = (const float *)buf->extended_data[c];
+                
+                for (i = 0; i < buf->nb_samples; i++, src++)
+                    dbRms(dumpwave, *src);
+            }
+            break;
+        case AV_SAMPLE_FMT_FLT: {
+            const float *src = (const float *)buf->extended_data[0];
+            
+            for (i = 0; i < buf->nb_samples; i++) {
+                for (c = 0; c < channels; c++, src++)
+                    dbRms(dumpwave, *src);
+            }}
+            break;
+        case AV_SAMPLE_FMT_S64P:
+            for (c = 0; c < channels; c++) {
+                const int64_t *src = (const int64_t *)buf->extended_data[c];
+                
+                for (i = 0; i < buf->nb_samples; i++, src++)
+                    dbRms(dumpwave, *src / (double)INT64_MAX);
+            }
+            break;
+        case AV_SAMPLE_FMT_S64: {
+            const int64_t *src = (const int64_t *)buf->extended_data[0];
+            
+            for (i = 0; i < buf->nb_samples; i++) {
+                for (c = 0; c < channels; c++, src++)
+                    dbRms(dumpwave, *src / (double)INT64_MAX);
+            }}
+            break;
+        case AV_SAMPLE_FMT_S32P:
+            for (c = 0; c < channels; c++) {
+                const int32_t *src = (const int32_t *)buf->extended_data[c];
+                
+                for (i = 0; i < buf->nb_samples; i++, src++)
+                    dbRms(dumpwave, *src / (double)INT32_MAX);
+            }
+            break;
+        case AV_SAMPLE_FMT_S32: {
+            const int32_t *src = (const int32_t *)buf->extended_data[0];
+            
+            for (i = 0; i < buf->nb_samples; i++) {
+                for (c = 0; c < channels; c++, src++)
+                    dbRms(dumpwave, *src / (double)INT32_MAX);
+            }}
+            break;
+        case AV_SAMPLE_FMT_S16P:
+            for (c = 0; c < channels; c++) {
+                const int16_t *src = (const int16_t *)buf->extended_data[c];
+                
+                for (i = 0; i < buf->nb_samples; i++, src++)
+                    dbRms(dumpwave, *src / (double)INT16_MAX);
+            }
+            break;
+        case AV_SAMPLE_FMT_S16: {
+            const int16_t *src = (const int16_t *)buf->extended_data[0];
+            
+            for (i = 0; i < buf->nb_samples; i++) {
+                for (c = 0; c < channels; c++, src++)
+                    dbRms(dumpwave, *src / (double)INT16_MAX);
+            }}
+            break;
+        case AV_SAMPLE_FMT_U8P:
+            for (c = 0; c < channels; c++) {
+                const uint8_t *src = (const uint8_t *)buf->extended_data[c];
+                /* U8 samples are unsigned with a bias of 128 */
+                for (i = 0; i < buf->nb_samples; i++, src++)
+                    dbRms(dumpwave, (*src - 128) / (double)INT8_MAX);
+            }
+            break;
+        case AV_SAMPLE_FMT_U8: {
+            const uint8_t *src = (const uint8_t *)buf->extended_data[0];
+
+            for (i = 0; i < buf->nb_samples; i++) {
+                for (c = 0; c < channels; c++, src++)
+                    dbRms(dumpwave, (*src - 128) / (double)INT8_MAX);
+            }}
+            break;
+        default:
+            break;
+    }
+end:
+    return ff_filter_frame(ctx->outputs[0], buf);
+}
+
+static const AVFilterPad dumpwave_inputs[] = {
+    {
+        .name         = "default",
+        .type         = AVMEDIA_TYPE_AUDIO,
+        .filter_frame = dumpwave_filter_frame,
+    },
+    { NULL }
+};
+
+static const AVFilterPad dumpwave_outputs[] = {
+    {
+        .name          = "default",
+        .type          = AVMEDIA_TYPE_AUDIO,
+        .request_frame = dumpwave_request_frame,
+        .config_props  = config_output,
+    },
+    { NULL }
+};
+
+AVFilter ff_af_dumpwave = {
+    .name          = "dumpwave",
+    .description   = NULL_IF_CONFIG_SMALL("Dump RMS amplitude to a JSON file."),
+    .init          = init,
+    .uninit        = uninit,
+    .priv_size     = sizeof(DumpWaveContext),
+    .inputs        = dumpwave_inputs,
+    .outputs       = dumpwave_outputs,
+    .priv_class    = &dumpwave_class,
+};
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 42516bbdf9..2539ee9e9a 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -98,6 +98,7 @@  static void register_all(void)
     REGISTER_FILTER(CROSSFEED,      crossfeed,      af);
     REGISTER_FILTER(CRYSTALIZER,    crystalizer,    af);
     REGISTER_FILTER(DCSHIFT,        dcshift,        af);
+    REGISTER_FILTER(DUMPWAVE,       dumpwave,       af);
     REGISTER_FILTER(DYNAUDNORM,     dynaudnorm,     af);
     REGISTER_FILTER(EARWAX,         earwax,         af);
     REGISTER_FILTER(EBUR128,        ebur128,        af);
diff --git a/libavfilter/version.h b/libavfilter/version.h
index 0f11721822..ca096962bb 100644
--- a/libavfilter/version.h
+++ b/libavfilter/version.h
@@ -30,8 +30,8 @@ 
 #include "libavutil/version.h"
 
 #define LIBAVFILTER_VERSION_MAJOR   7
-#define LIBAVFILTER_VERSION_MINOR  11
-#define LIBAVFILTER_VERSION_MICRO 101
+#define LIBAVFILTER_VERSION_MINOR  12
+#define LIBAVFILTER_VERSION_MICRO 100
 
 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \
                                                LIBAVFILTER_VERSION_MINOR, \
diff --git a/tests/fate/filter-audio.mak b/tests/fate/filter-audio.mak
index bd8b3d3c35..9896527122 100644
--- a/tests/fate/filter-audio.mak
+++ b/tests/fate/filter-audio.mak
@@ -340,6 +340,10 @@  fate-filter-hdcd-s32p: CMD = md5 -i $(SRC) -af hdcd -f s32le
 fate-filter-hdcd-s32p: CMP = oneline
 fate-filter-hdcd-s32p: REF = 0c5513e83eedaa10ab6fac9ddc173cf5
 
+FATE_AFILTER-$(call FILTERDEMDEC, DUMPWAVE, WAV, PCM_S16LE) += fate-filter-dumpwave
+fate-filter-dumpwave: SRC = $(TARGET_PATH)/tests/data/asynth-44100-2.wav
+fate-filter-dumpwave: CMD = ffmpeg -i $(SRC) -af dumpwave=d=1800x140:s=147:json=$(TARGET_PATH)/tests/data/fate/filter-dumpwave.out -f null - && cat $(TARGET_PATH)/tests/data/fate/filter-dumpwave.out
+
 FATE_AFILTER-yes += fate-filter-formats
 fate-filter-formats: libavfilter/tests/formats$(EXESUF)
 fate-filter-formats: CMD = run libavfilter/tests/formats
@@ -347,3 +351,4 @@  fate-filter-formats: CMD = run libavfilter/tests/formats
 FATE_SAMPLES_AVCONV += $(FATE_AFILTER_SAMPLES-yes)
 FATE_FFMPEG += $(FATE_AFILTER-yes)
 fate-afilter: $(FATE_AFILTER-yes) $(FATE_AFILTER_SAMPLES-yes)
+
diff --git a/tests/ref/fate/filter-dumpwave b/tests/ref/fate/filter-dumpwave
new file mode 100644
index 0000000000..bd07098ef8
--- /dev/null
+++ b/tests/ref/fate/filter-dumpwave
@@ -0,0 +1 @@ 
+{"width":1800,"height":140,"samples":[103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,102,104,103,102,104,104,102,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,102,104,103,102,104,104,102,103,104,102,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,102,104,104,102,103,104,102,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,102,103,104,102,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,102,104,103,103,103,104,103,103,104,103,103,104,103,103,104,103,103,103,104,103,103,104,103,103,104,103,102,104,103,102,104,103,102,103,104,102,103,104,103,101,104,105,104,103,102,104,104,102,102,103,104,104,103,103,104,104,102,103,103,103,104,103,103,103,104,102,104,103,103,104,103,103,103,103,103,103,103,103,103,104,103,103,103,104,103,103,103,103,104,103,103,103,103,103,103,103,104,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,104,104,103,103,103,103,103,103,103,103,103,103,103,102,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,104,103,103,105,103,103,103,103,103,103,104,103,103,103,104,103,103,103,104,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,106,106,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,100,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,96,98,100,97,99,98,97,97,97,96,98,97,98,97,99,98,97,95,98,97,98,101,96,95,96,94,97,99,99,95,98,96,98,98,98,98,98,98,99,98,96,97,99,97,98,99,99,99,96,98,97,98,98,99,99,97,100,95,98,97,99,94,98,96,99,98,97,99,98,97,96,98,95,96,97,100,99,96,99,97,97,97,99,98,97,96,97,97,99,99,100,95,99,98,95,96,99,97,99,99,95,98,96,97,96,99,96,97,98,96,97,95,97,99,99,96,99,96,98,98,96,96,97,96,99,98,97,98,100,98,100,96,98,98,99,97,99,99,99,97,99,97,99,99,98,96,100,97,95,112,119,121,121,118,122,122,120,121,123,121,120,118,121,120,121,123,124,122,120,124,121,122,121,120,121,122,122,120,119,118,122,121,122,121,120,123,120,121,122,121,121,119,120,119,120,121,121,120,122,120,122,123,122,124,122,120,122,121,121,119,122,123,123,122,120,121,119,123,121,125,119,121,119,120,121,121,121,123,122,120,122,123,120,120,123,121,119,122,120,122,123,121,123,121,121,123,120,120,121,123,123,120,122,122,119,120,122,122,120,122,122,120,123,120,121,120,121,121,123,120,121,119,122,117,124,122,122,119,120,122,121,121,123,121,120,121,122,120,121,123,121,119,123,120,123,125,121,120,121,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,104,103,104,103,103,104,103,104,103,104,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,104,103,104,102,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,104,103,103,103,104,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,104,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,103,104,103,103,103,103,103,103,103,102,81,75,74,74,75,76,78,78,79,81,81,82,84,84,84,86,86,86,87,88,88,88,90,88,89,90,89,89,90,89,89,89,89,88,88,89,87,87,87,85,85,85,83,83,82,81,80,79,78,76,75,75,73,75,80,76,79,75,74,74,76,77,77,79,81,80,82,84,83,84,86,86,86,88,87,87,89,89,88,89,90,89,89,90,89,89,90,89,88,89,88,87,87,87,86,85,86,84,83,83,81,80,80,78,77,76,75,73,74,76,81,81,76,73,74,75,76,76,79,79,80,82,83,83,84,85,85,86,87,87,87,89,88,88,90,89,89,89,90,89,89,90,89,89,89,88,87,88,87,86,86,86,84,84,84,82,81,81,78,77,77,75,74,74,74,79,76,79,74,74,75,75,76,78,78,80,81,82,82,84,84,85,86,87,86,87,89,88,88,90,89,89,90,89,89,90,89,89,89,89,88,88,89,87,86,87,85,84,85,83,82,82,81,79,79,78,75,75,75,73,75,81,80,76,74,74,75,76,77,78,79,81,81,82,84,84,84,86,86,86,88,88,87,88,89,88,89,90,89,89,90,89,89,90,89,88,89,88,87,87,87,85,85,85,83,83,83,81,80,80,78,76,75,75,73,74,78,76,79,75,74,74,75,77,77,79,80,80,82,84,83,84,86,85,86,88,87,87,89,89,88,89,90,89,89,90,89,89,90,89,88,89,88,87,88,87,86,86,85,84,83,83,81,80,80,78,77,76,75,74,74,75,81,80,77,73,74,75,75,76,79,79,80,82,82,83,84,85,85,86,87,87,87,89,88,88,90,89,89,90,90,89,89,90,89,89,89,88,88,88,87,86,86,85,84,84,83,82,81,81,79,78,78,75,74,75,74,77,76,80,75,74,74,75,76,78,78,80,81,81,82,84,84,85,86,87,86,87,88,88,88,90,89,89,90,89,89,90,89,89,89,89,88,88,89,87,87,87,85,84,85,83,82,82,81,79,79,78,76,75,75,73,75,80,81,76,75,74,75,76,77,78,79,81,81,82,84,84,84,86,86,86,88,88,87,89,89,88,89,90,89,89,90,89,89,90,89,88,89,88,87,87,87,85,85,86,83,83,83,81,80,80,78,77,76,75,73,74,77,78,79,76,74,74,75,76,77,79,80,80,82,83,83,84,86,85,86,88,87,87,89,88,88,90,89,89,90,90,89,89,90,89,88,89,88,87,88,87,86,86,85,84,84,83,81,81,80,78,77,77,75,74,74,75,80,81,78,74,74,75,75,76,79,79,80,82,82,82,85,85,85,86,87,86,87,89,88,88,90,89,89,90,89,89,90,90,89,89,89,88,88,88,87,86,87,85,84,0,0,0,0,0,0,0]}