[FFmpeg-devel] avfilter: add arbitrary audio FIR filter

Message ID	20170508115948.3597-1-onemda@gmail.com
State	Superseded
Headers	show Delivered-To: ffmpegpatchwork@gmail.com Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; From: Paul B Mahol <onemda@gmail.com> To: ffmpeg-devel@ffmpeg.org Date: Mon, 8 May 2017 13:59:48 +0200 Message-Id: <20170508115948.3597-1-onemda@gmail.com> In-Reply-To: <20170507182200.14728-1-onemda@gmail.com> References: <20170507182200.14728-1-onemda@gmail.com> Subject: [FFmpeg-devel] [PATCH] avfilter: add arbitrary audio FIR filter Precedence: list Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

Paul B Mahol May 8, 2017, 11:59 a.m. UTC

Signed-off-by: Paul B Mahol <onemda@gmail.com>
---
 configure                |   2 +
 doc/filters.texi         |  23 ++
 libavfilter/Makefile     |   1 +
 libavfilter/af_afir.c    | 544 +++++++++++++++++++++++++++++++++++++++++++++++
 libavfilter/allfilters.c |   1 +
 5 files changed, 571 insertions(+)
 create mode 100644 libavfilter/af_afir.c

Muhammad Faiz May 8, 2017, 3:07 p.m. UTC | #1

On Mon, May 8, 2017 at 6:59 PM, Paul B Mahol <onemda@gmail.com> wrote:
> Signed-off-by: Paul B Mahol <onemda@gmail.com>
> ---
>  configure                |   2 +
>  doc/filters.texi         |  23 ++
>  libavfilter/Makefile     |   1 +
>  libavfilter/af_afir.c    | 544 +++++++++++++++++++++++++++++++++++++++++++++++
>  libavfilter/allfilters.c |   1 +
>  5 files changed, 571 insertions(+)
>  create mode 100644 libavfilter/af_afir.c
>
> diff --git a/configure b/configure
> index 2e1786a..a46c375 100755
> --- a/configure
> +++ b/configure
> @@ -3081,6 +3081,8 @@ unix_protocol_select="network"
>  # filters
>  afftfilt_filter_deps="avcodec"
>  afftfilt_filter_select="fft"
> +afir_filter_deps="avcodec"
> +afir_filter_select="fft"
>  amovie_filter_deps="avcodec avformat"
>  aresample_filter_deps="swresample"
>  ass_filter_deps="libass"
> diff --git a/doc/filters.texi b/doc/filters.texi
> index f431274..0efce9a 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -878,6 +878,29 @@ afftfilt="1-clip((b/nb)*b,0,1)"
>  @end example
>  @end itemize
>
> +@section afir
> +
> +Apply an Arbitary Frequency Impulse Response filter.
> +
> +This filter uses second stream as FIR coefficients.
> +If second stream holds single channel, it will be used
> +for all input channels in first stream, otherwise
> +number of channels in second stream must be same as
> +number of channels in first stream.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item dry
> +Set dry gain. This sets input gain.
> +
> +@item wet
> +Set wet gain. This sets final output gain.
> +
> +@item length
> +Set Impulse Response filter length. Default is 1, which means whole IR is processed.
> +@end table
> +
>  @anchor{aformat}
>  @section aformat
>
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 0f99086..de5f992 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -37,6 +37,7 @@ OBJS-$(CONFIG_AEMPHASIS_FILTER)              += af_aemphasis.o
>  OBJS-$(CONFIG_AEVAL_FILTER)                  += aeval.o
>  OBJS-$(CONFIG_AFADE_FILTER)                  += af_afade.o
>  OBJS-$(CONFIG_AFFTFILT_FILTER)               += af_afftfilt.o window_func.o
> +OBJS-$(CONFIG_AFIR_FILTER)                   += af_afir.o
>  OBJS-$(CONFIG_AFORMAT_FILTER)                += af_aformat.o
>  OBJS-$(CONFIG_AGATE_FILTER)                  += af_agate.o
>  OBJS-$(CONFIG_AINTERLEAVE_FILTER)            += f_interleave.o
> diff --git a/libavfilter/af_afir.c b/libavfilter/af_afir.c
> new file mode 100644
> index 0000000..bc1b6a4
> --- /dev/null
> +++ b/libavfilter/af_afir.c
> @@ -0,0 +1,544 @@
> +/*
> + * Copyright (c) 2017 Paul B Mahol
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * An arbitrary audio FIR filter
> + */
> +
> +#include "libavutil/audio_fifo.h"
> +#include "libavutil/common.h"
> +#include "libavutil/opt.h"
> +#include "libavcodec/avfft.h"
> +
> +#include "audio.h"
> +#include "avfilter.h"
> +#include "formats.h"
> +#include "internal.h"
> +
> +#define MAX_IR_DURATION 30
> +
> +typedef struct AudioFIRContext {
> +    const AVClass *class;
> +
> +    float wet_gain;
> +    float dry_gain;
> +    float length;
> +
> +    float gain;
> +
> +    int eof_coeffs;
> +    int have_coeffs;
> +    int nb_coeffs;
> +    int nb_taps;
> +    int part_size;
> +    int part_index;
> +    int block_length;
> +    int nb_partitions;
> +    int nb_channels;
> +    int ir_length;
> +    int fft_length;
> +    int nb_coef_channels;
> +    int one2many;
> +    int nb_samples;
> +    int want_skip;
> +    int need_padding;
> +
> +    RDFTContext **rdft, **irdft;
> +    float **sum;
> +    float **block;
> +    FFTComplex **coeff;
> +
> +    AVAudioFifo *fifo[2];
> +    AVFrame *in[2];
> +    AVFrame *buffer;
> +    int64_t pts;
> +    int index;
> +} AudioFIRContext;
> +
> +static int fir_channel(AVFilterContext *ctx, void *arg, int ch, int nb_jobs)
> +{
> +    AudioFIRContext *s = ctx->priv;
> +    const FFTComplex *coeff = s->coeff[ch * !s->one2many];
> +    const float *src = (const float *)s->in[0]->extended_data[ch];
> +    int index1 = (s->index + 1) % 3;
> +    int index2 = (s->index + 2) % 3;
> +    float *sum = s->sum[ch];
> +    AVFrame *out = arg;
> +    float *block;
> +    float *dst;
> +    int n, i, j;
> +
> +    memset(sum, 0, sizeof(*sum) * s->fft_length);
> +    block = s->block[ch] + s->part_index * s->block_length;
> +    memset(block, 0, sizeof(*block) * s->fft_length);
> +    for (n = 0; n < s->nb_samples; n++) {
> +        block[s->part_size + n] = src[n] * s->dry_gain;
> +    }
> +
> +    av_rdft_calc(s->rdft[ch], block);
> +    block[2 * s->part_size] = block[1];
> +    block[1] = 0;
> +
> +    j = s->part_index;
> +
> +    for (i = 0; i < s->nb_partitions; i++) {
> +        const int coffset = i * (s->part_size + 1);
> +
> +        block = s->block[ch] + j * s->block_length;
> +        for (n = 0; n < s->part_size; n++) {
> +            const float cre = coeff[coffset + n].re;
> +            const float cim = coeff[coffset + n].im;
> +            const float tre = block[2 * n    ];
> +            const float tim = block[2 * n + 1];
> +
> +            sum[2 * n    ] += tre * cre - tim * cim;
> +            sum[2 * n + 1] += tre * cim + tim * cre;
> +        }
> +        sum[2 * n] += block[2 * n] * coeff[coffset + n].re;
> +
> +        if (j == 0)
> +            j = s->nb_partitions;
> +        j--;
> +    }
> +
> +    sum[1] = sum[2 * n];
> +    av_rdft_calc(s->irdft[ch], sum);
> +
> +    dst = (float *)s->buffer->extended_data[ch] + index1 * s->part_size;
> +    for (n = 0; n < s->part_size; n++) {
> +        dst[n] += sum[n];
> +    }
> +
> +    dst = (float *)s->buffer->extended_data[ch] + index2 * s->part_size;
> +
> +    memcpy(dst, sum + s->part_size, s->part_size * sizeof(*dst));
> +
> +    dst = (float *)s->buffer->extended_data[ch] + s->index * s->part_size;
> +
> +    if (out) {
> +        float *ptr = (float *)out->extended_data[ch];
> +        for (n = 0; n < out->nb_samples; n++) {
> +            ptr[n] = dst[n] * s->gain * s->wet_gain;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +static int fir_frame(AudioFIRContext *s, AVFilterLink *outlink)
> +{
> +    AVFilterContext *ctx = outlink->src;
> +    AVFrame *out = NULL;
> +    int ret;
> +
> +    s->nb_samples = FFMIN(s->part_size, av_audio_fifo_size(s->fifo[0]));
> +
> +    if (!s->want_skip) {
> +        out = ff_get_audio_buffer(outlink, s->nb_samples);
> +        if (!out)
> +            return AVERROR(ENOMEM);
> +    }
> +
> +    s->in[0] = ff_get_audio_buffer(ctx->inputs[0], s->nb_samples);
> +    if (!s->in[0]) {
> +        av_frame_free(&out);
> +        return AVERROR(ENOMEM);
> +    }
> +
> +    av_audio_fifo_peek(s->fifo[0], (void **)s->in[0]->extended_data, s->nb_samples);
> +
> +    ctx->internal->execute(ctx, fir_channel, out, NULL, outlink->channels);
> +
> +    s->part_index = (s->part_index + 1) % s->nb_partitions;
> +
> +    av_audio_fifo_drain(s->fifo[0], s->nb_samples);
> +
> +    if (!s->want_skip) {
> +        out->pts = s->pts;
> +        if (s->pts != AV_NOPTS_VALUE)
> +            s->pts += av_rescale_q(out->nb_samples, (AVRational){1, outlink->sample_rate}, outlink->time_base);
> +    }
> +
> +    s->index++;
> +    if (s->index == 3)
> +        s->index = 0;
> +
> +    av_frame_free(&s->in[0]);
> +
> +    if (s->want_skip == 1) {
> +        s->want_skip = 0;
> +        ret = 0;
> +    } else {
> +        ret = ff_filter_frame(outlink, out);
> +    }
> +
> +    return ret;
> +}
> +
> +static int convert_coeffs(AVFilterContext *ctx)
> +{
> +    AudioFIRContext *s = ctx->priv;
> +    int i, ch, n, N;
> +    float power = 0;
> +
> +    s->nb_taps = av_audio_fifo_size(s->fifo[1]);
> +
> +    for (n = 4; (1 << n) < s->nb_taps; n++);
> +    N = FFMIN(n, 16);

It is nice to allow user set maximum N e.g. for low latency app, user
can set low N with higher nb_partitions.


> +    s->ir_length = 1 << n;
> +    s->fft_length = (1 << (N + 1)) + 1;
> +    s->part_size = 1 << (N - 1);
> +    s->block_length = FFALIGN(s->fft_length, 16);
> +    s->nb_partitions = (s->nb_taps + s->part_size - 1) / s->part_size;
> +    s->nb_coeffs = s->ir_length + s->nb_partitions;
> +
> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
> +        s->sum[ch] = av_calloc(s->fft_length, sizeof(**s->sum));
> +        if (!s->sum[ch])
> +            return AVERROR(ENOMEM);
> +    }
> +
> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
> +        s->coeff[ch] = av_calloc(s->nb_coeffs, sizeof(**s->coeff));
> +        if (!s->coeff[ch])
> +            return AVERROR(ENOMEM);
> +    }
> +
> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
> +        s->block[ch] = av_calloc(s->nb_partitions * s->block_length, sizeof(**s->block));
> +        if (!s->block[ch])
> +            return AVERROR(ENOMEM);
> +    }
> +
> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
> +        s->rdft[ch]  = av_rdft_init(N, DFT_R2C);
> +        s->irdft[ch] = av_rdft_init(N, IDFT_C2R);
> +        if (!s->rdft[ch] || !s->irdft[ch])
> +            return AVERROR(ENOMEM);
> +    }
> +
> +    s->in[1] = ff_get_audio_buffer(ctx->inputs[1], s->nb_taps);
> +    if (!s->in[1])
> +        return AVERROR(ENOMEM);
> +
> +    s->buffer = ff_get_audio_buffer(ctx->inputs[0], s->part_size * 3);
> +    if (!s->buffer)
> +        return AVERROR(ENOMEM);
> +
> +    av_audio_fifo_read(s->fifo[1], (void **)s->in[1]->extended_data, s->nb_taps);
> +
> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
> +        float *time = (float *)s->in[1]->extended_data[!s->one2many * ch];
> +        float *block = s->block[ch];
> +        FFTComplex *coeff = s->coeff[ch];
> +
> +        for (i = FFMAX(1, s->length * s->nb_taps); i < s->nb_taps; i++)
> +            time[i] = 0;
> +
> +        for (i = 0; i < s->nb_partitions; i++) {
> +            const float scale = 1.f / s->part_size;
> +            const int toffset = i * s->part_size;
> +            const int coffset = i * (s->part_size + 1);
> +            const int boffset = s->part_size;
> +            const int remaining = s->nb_taps - (i * s->part_size);
> +            const int size = remaining >= s->part_size ? s->part_size : remaining;
> +
> +            memset(block, 0, sizeof(*block) * s->fft_length);
> +            for (n = 0; n < size; n++) {
> +                power += time[n + toffset] * time[n + toffset];
> +                block[n + boffset] = time[n + toffset];
> +            }
> +
> +            av_rdft_calc(s->rdft[0], block);
> +
> +            coeff[coffset].re = block[0] * scale;
> +            coeff[coffset].im = 0;
> +            for (n = 1; n < s->part_size; n++) {
> +                coeff[coffset + n].re = block[2 * n] * scale;
> +                coeff[coffset + n].im = block[2 * n + 1] * scale;
> +            }
> +            coeff[coffset + s->part_size].re = block[1] * scale;
> +            coeff[coffset + s->part_size].im = 0;
> +        }
> +    }
> +
> +    av_frame_free(&s->in[1]);
> +    s->gain = 1.f / sqrtf(power);

I think s->gain is not required at all. The coeffs are already scaled by scale.

Otherwise LGTM.

Thank's.

Paul B Mahol May 8, 2017, 4:06 p.m. UTC | #2

On 5/8/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
> On Mon, May 8, 2017 at 6:59 PM, Paul B Mahol <onemda@gmail.com> wrote:
>> Signed-off-by: Paul B Mahol <onemda@gmail.com>
>> ---
>>  configure                |   2 +
>>  doc/filters.texi         |  23 ++
>>  libavfilter/Makefile     |   1 +
>>  libavfilter/af_afir.c    | 544
>> +++++++++++++++++++++++++++++++++++++++++++++++
>>  libavfilter/allfilters.c |   1 +
>>  5 files changed, 571 insertions(+)
>>  create mode 100644 libavfilter/af_afir.c
>>
>> diff --git a/configure b/configure
>> index 2e1786a..a46c375 100755
>> --- a/configure
>> +++ b/configure
>> @@ -3081,6 +3081,8 @@ unix_protocol_select="network"
>>  # filters
>>  afftfilt_filter_deps="avcodec"
>>  afftfilt_filter_select="fft"
>> +afir_filter_deps="avcodec"
>> +afir_filter_select="fft"
>>  amovie_filter_deps="avcodec avformat"
>>  aresample_filter_deps="swresample"
>>  ass_filter_deps="libass"
>> diff --git a/doc/filters.texi b/doc/filters.texi
>> index f431274..0efce9a 100644
>> --- a/doc/filters.texi
>> +++ b/doc/filters.texi
>> @@ -878,6 +878,29 @@ afftfilt="1-clip((b/nb)*b,0,1)"
>>  @end example
>>  @end itemize
>>
>> +@section afir
>> +
>> +Apply an Arbitary Frequency Impulse Response filter.
>> +
>> +This filter uses second stream as FIR coefficients.
>> +If second stream holds single channel, it will be used
>> +for all input channels in first stream, otherwise
>> +number of channels in second stream must be same as
>> +number of channels in first stream.
>> +
>> +It accepts the following parameters:
>> +
>> +@table @option
>> +@item dry
>> +Set dry gain. This sets input gain.
>> +
>> +@item wet
>> +Set wet gain. This sets final output gain.
>> +
>> +@item length
>> +Set Impulse Response filter length. Default is 1, which means whole IR is
>> processed.
>> +@end table
>> +
>>  @anchor{aformat}
>>  @section aformat
>>
>> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
>> index 0f99086..de5f992 100644
>> --- a/libavfilter/Makefile
>> +++ b/libavfilter/Makefile
>> @@ -37,6 +37,7 @@ OBJS-$(CONFIG_AEMPHASIS_FILTER)              +=
>> af_aemphasis.o
>>  OBJS-$(CONFIG_AEVAL_FILTER)                  += aeval.o
>>  OBJS-$(CONFIG_AFADE_FILTER)                  += af_afade.o
>>  OBJS-$(CONFIG_AFFTFILT_FILTER)               += af_afftfilt.o
>> window_func.o
>> +OBJS-$(CONFIG_AFIR_FILTER)                   += af_afir.o
>>  OBJS-$(CONFIG_AFORMAT_FILTER)                += af_aformat.o
>>  OBJS-$(CONFIG_AGATE_FILTER)                  += af_agate.o
>>  OBJS-$(CONFIG_AINTERLEAVE_FILTER)            += f_interleave.o
>> diff --git a/libavfilter/af_afir.c b/libavfilter/af_afir.c
>> new file mode 100644
>> index 0000000..bc1b6a4
>> --- /dev/null
>> +++ b/libavfilter/af_afir.c
>> @@ -0,0 +1,544 @@
>> +/*
>> + * Copyright (c) 2017 Paul B Mahol
>> + *
>> + * This file is part of FFmpeg.
>> + *
>> + * FFmpeg is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2.1 of the License, or (at your option) any later version.
>> + *
>> + * FFmpeg is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with FFmpeg; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
>> 02110-1301 USA
>> + */
>> +
>> +/**
>> + * @file
>> + * An arbitrary audio FIR filter
>> + */
>> +
>> +#include "libavutil/audio_fifo.h"
>> +#include "libavutil/common.h"
>> +#include "libavutil/opt.h"
>> +#include "libavcodec/avfft.h"
>> +
>> +#include "audio.h"
>> +#include "avfilter.h"
>> +#include "formats.h"
>> +#include "internal.h"
>> +
>> +#define MAX_IR_DURATION 30
>> +
>> +typedef struct AudioFIRContext {
>> +    const AVClass *class;
>> +
>> +    float wet_gain;
>> +    float dry_gain;
>> +    float length;
>> +
>> +    float gain;
>> +
>> +    int eof_coeffs;
>> +    int have_coeffs;
>> +    int nb_coeffs;
>> +    int nb_taps;
>> +    int part_size;
>> +    int part_index;
>> +    int block_length;
>> +    int nb_partitions;
>> +    int nb_channels;
>> +    int ir_length;
>> +    int fft_length;
>> +    int nb_coef_channels;
>> +    int one2many;
>> +    int nb_samples;
>> +    int want_skip;
>> +    int need_padding;
>> +
>> +    RDFTContext **rdft, **irdft;
>> +    float **sum;
>> +    float **block;
>> +    FFTComplex **coeff;
>> +
>> +    AVAudioFifo *fifo[2];
>> +    AVFrame *in[2];
>> +    AVFrame *buffer;
>> +    int64_t pts;
>> +    int index;
>> +} AudioFIRContext;
>> +
>> +static int fir_channel(AVFilterContext *ctx, void *arg, int ch, int
>> nb_jobs)
>> +{
>> +    AudioFIRContext *s = ctx->priv;
>> +    const FFTComplex *coeff = s->coeff[ch * !s->one2many];
>> +    const float *src = (const float *)s->in[0]->extended_data[ch];
>> +    int index1 = (s->index + 1) % 3;
>> +    int index2 = (s->index + 2) % 3;
>> +    float *sum = s->sum[ch];
>> +    AVFrame *out = arg;
>> +    float *block;
>> +    float *dst;
>> +    int n, i, j;
>> +
>> +    memset(sum, 0, sizeof(*sum) * s->fft_length);
>> +    block = s->block[ch] + s->part_index * s->block_length;
>> +    memset(block, 0, sizeof(*block) * s->fft_length);
>> +    for (n = 0; n < s->nb_samples; n++) {
>> +        block[s->part_size + n] = src[n] * s->dry_gain;
>> +    }
>> +
>> +    av_rdft_calc(s->rdft[ch], block);
>> +    block[2 * s->part_size] = block[1];
>> +    block[1] = 0;
>> +
>> +    j = s->part_index;
>> +
>> +    for (i = 0; i < s->nb_partitions; i++) {
>> +        const int coffset = i * (s->part_size + 1);
>> +
>> +        block = s->block[ch] + j * s->block_length;
>> +        for (n = 0; n < s->part_size; n++) {
>> +            const float cre = coeff[coffset + n].re;
>> +            const float cim = coeff[coffset + n].im;
>> +            const float tre = block[2 * n    ];
>> +            const float tim = block[2 * n + 1];
>> +
>> +            sum[2 * n    ] += tre * cre - tim * cim;
>> +            sum[2 * n + 1] += tre * cim + tim * cre;
>> +        }
>> +        sum[2 * n] += block[2 * n] * coeff[coffset + n].re;
>> +
>> +        if (j == 0)
>> +            j = s->nb_partitions;
>> +        j--;
>> +    }
>> +
>> +    sum[1] = sum[2 * n];
>> +    av_rdft_calc(s->irdft[ch], sum);
>> +
>> +    dst = (float *)s->buffer->extended_data[ch] + index1 * s->part_size;
>> +    for (n = 0; n < s->part_size; n++) {
>> +        dst[n] += sum[n];
>> +    }
>> +
>> +    dst = (float *)s->buffer->extended_data[ch] + index2 * s->part_size;
>> +
>> +    memcpy(dst, sum + s->part_size, s->part_size * sizeof(*dst));
>> +
>> +    dst = (float *)s->buffer->extended_data[ch] + s->index *
>> s->part_size;
>> +
>> +    if (out) {
>> +        float *ptr = (float *)out->extended_data[ch];
>> +        for (n = 0; n < out->nb_samples; n++) {
>> +            ptr[n] = dst[n] * s->gain * s->wet_gain;
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int fir_frame(AudioFIRContext *s, AVFilterLink *outlink)
>> +{
>> +    AVFilterContext *ctx = outlink->src;
>> +    AVFrame *out = NULL;
>> +    int ret;
>> +
>> +    s->nb_samples = FFMIN(s->part_size, av_audio_fifo_size(s->fifo[0]));
>> +
>> +    if (!s->want_skip) {
>> +        out = ff_get_audio_buffer(outlink, s->nb_samples);
>> +        if (!out)
>> +            return AVERROR(ENOMEM);
>> +    }
>> +
>> +    s->in[0] = ff_get_audio_buffer(ctx->inputs[0], s->nb_samples);
>> +    if (!s->in[0]) {
>> +        av_frame_free(&out);
>> +        return AVERROR(ENOMEM);
>> +    }
>> +
>> +    av_audio_fifo_peek(s->fifo[0], (void **)s->in[0]->extended_data,
>> s->nb_samples);
>> +
>> +    ctx->internal->execute(ctx, fir_channel, out, NULL,
>> outlink->channels);
>> +
>> +    s->part_index = (s->part_index + 1) % s->nb_partitions;
>> +
>> +    av_audio_fifo_drain(s->fifo[0], s->nb_samples);
>> +
>> +    if (!s->want_skip) {
>> +        out->pts = s->pts;
>> +        if (s->pts != AV_NOPTS_VALUE)
>> +            s->pts += av_rescale_q(out->nb_samples, (AVRational){1,
>> outlink->sample_rate}, outlink->time_base);
>> +    }
>> +
>> +    s->index++;
>> +    if (s->index == 3)
>> +        s->index = 0;
>> +
>> +    av_frame_free(&s->in[0]);
>> +
>> +    if (s->want_skip == 1) {
>> +        s->want_skip = 0;
>> +        ret = 0;
>> +    } else {
>> +        ret = ff_filter_frame(outlink, out);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static int convert_coeffs(AVFilterContext *ctx)
>> +{
>> +    AudioFIRContext *s = ctx->priv;
>> +    int i, ch, n, N;
>> +    float power = 0;
>> +
>> +    s->nb_taps = av_audio_fifo_size(s->fifo[1]);
>> +
>> +    for (n = 4; (1 << n) < s->nb_taps; n++);
>> +    N = FFMIN(n, 16);
>
> It is nice to allow user set maximum N e.g. for low latency app, user
> can set low N with higher nb_partitions.

Could be later added, but for low latency, one uses NUPOLS or first
partition is done in time domain.
Using small N drastically reduces speed.

>
>
>> +    s->ir_length = 1 << n;
>> +    s->fft_length = (1 << (N + 1)) + 1;
>> +    s->part_size = 1 << (N - 1);
>> +    s->block_length = FFALIGN(s->fft_length, 16);
>> +    s->nb_partitions = (s->nb_taps + s->part_size - 1) / s->part_size;
>> +    s->nb_coeffs = s->ir_length + s->nb_partitions;
>> +
>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>> +        s->sum[ch] = av_calloc(s->fft_length, sizeof(**s->sum));
>> +        if (!s->sum[ch])
>> +            return AVERROR(ENOMEM);
>> +    }
>> +
>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>> +        s->coeff[ch] = av_calloc(s->nb_coeffs, sizeof(**s->coeff));
>> +        if (!s->coeff[ch])
>> +            return AVERROR(ENOMEM);
>> +    }
>> +
>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>> +        s->block[ch] = av_calloc(s->nb_partitions * s->block_length,
>> sizeof(**s->block));
>> +        if (!s->block[ch])
>> +            return AVERROR(ENOMEM);
>> +    }
>> +
>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>> +        s->rdft[ch]  = av_rdft_init(N, DFT_R2C);
>> +        s->irdft[ch] = av_rdft_init(N, IDFT_C2R);
>> +        if (!s->rdft[ch] || !s->irdft[ch])
>> +            return AVERROR(ENOMEM);
>> +    }
>> +
>> +    s->in[1] = ff_get_audio_buffer(ctx->inputs[1], s->nb_taps);
>> +    if (!s->in[1])
>> +        return AVERROR(ENOMEM);
>> +
>> +    s->buffer = ff_get_audio_buffer(ctx->inputs[0], s->part_size * 3);
>> +    if (!s->buffer)
>> +        return AVERROR(ENOMEM);
>> +
>> +    av_audio_fifo_read(s->fifo[1], (void **)s->in[1]->extended_data,
>> s->nb_taps);
>> +
>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>> +        float *time = (float *)s->in[1]->extended_data[!s->one2many *
>> ch];
>> +        float *block = s->block[ch];
>> +        FFTComplex *coeff = s->coeff[ch];
>> +
>> +        for (i = FFMAX(1, s->length * s->nb_taps); i < s->nb_taps; i++)
>> +            time[i] = 0;
>> +
>> +        for (i = 0; i < s->nb_partitions; i++) {
>> +            const float scale = 1.f / s->part_size;
>> +            const int toffset = i * s->part_size;
>> +            const int coffset = i * (s->part_size + 1);
>> +            const int boffset = s->part_size;
>> +            const int remaining = s->nb_taps - (i * s->part_size);
>> +            const int size = remaining >= s->part_size ? s->part_size :
>> remaining;
>> +
>> +            memset(block, 0, sizeof(*block) * s->fft_length);
>> +            for (n = 0; n < size; n++) {
>> +                power += time[n + toffset] * time[n + toffset];
>> +                block[n + boffset] = time[n + toffset];
>> +            }
>> +
>> +            av_rdft_calc(s->rdft[0], block);
>> +
>> +            coeff[coffset].re = block[0] * scale;
>> +            coeff[coffset].im = 0;
>> +            for (n = 1; n < s->part_size; n++) {
>> +                coeff[coffset + n].re = block[2 * n] * scale;
>> +                coeff[coffset + n].im = block[2 * n + 1] * scale;
>> +            }
>> +            coeff[coffset + s->part_size].re = block[1] * scale;
>> +            coeff[coffset + s->part_size].im = 0;
>> +        }
>> +    }
>> +
>> +    av_frame_free(&s->in[1]);
>> +    s->gain = 1.f / sqrtf(power);
>
> I think s->gain is not required at all. The coeffs are already scaled by
> scale.

Its needed. Various IRs gives different peak values.
The calculation is not perfect but it helps.

>
> Otherwise LGTM.
>
> Thank's.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

Muhammad Faiz May 8, 2017, 5:56 p.m. UTC | #3

On Mon, May 8, 2017 at 11:06 PM, Paul B Mahol <onemda@gmail.com> wrote:
> On 5/8/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
>> On Mon, May 8, 2017 at 6:59 PM, Paul B Mahol <onemda@gmail.com> wrote:
>>> Signed-off-by: Paul B Mahol <onemda@gmail.com>
>>> ---
>>>  configure                |   2 +
>>>  doc/filters.texi         |  23 ++
>>>  libavfilter/Makefile     |   1 +
>>>  libavfilter/af_afir.c    | 544
>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>  libavfilter/allfilters.c |   1 +
>>>  5 files changed, 571 insertions(+)
>>>  create mode 100644 libavfilter/af_afir.c
>>>
>>> diff --git a/configure b/configure
>>> index 2e1786a..a46c375 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -3081,6 +3081,8 @@ unix_protocol_select="network"
>>>  # filters
>>>  afftfilt_filter_deps="avcodec"
>>>  afftfilt_filter_select="fft"
>>> +afir_filter_deps="avcodec"
>>> +afir_filter_select="fft"
>>>  amovie_filter_deps="avcodec avformat"
>>>  aresample_filter_deps="swresample"
>>>  ass_filter_deps="libass"
>>> diff --git a/doc/filters.texi b/doc/filters.texi
>>> index f431274..0efce9a 100644
>>> --- a/doc/filters.texi
>>> +++ b/doc/filters.texi
>>> @@ -878,6 +878,29 @@ afftfilt="1-clip((b/nb)*b,0,1)"
>>>  @end example
>>>  @end itemize
>>>
>>> +@section afir
>>> +
>>> +Apply an Arbitary Frequency Impulse Response filter.
>>> +
>>> +This filter uses second stream as FIR coefficients.
>>> +If second stream holds single channel, it will be used
>>> +for all input channels in first stream, otherwise
>>> +number of channels in second stream must be same as
>>> +number of channels in first stream.
>>> +
>>> +It accepts the following parameters:
>>> +
>>> +@table @option
>>> +@item dry
>>> +Set dry gain. This sets input gain.
>>> +
>>> +@item wet
>>> +Set wet gain. This sets final output gain.
>>> +
>>> +@item length
>>> +Set Impulse Response filter length. Default is 1, which means whole IR is
>>> processed.
>>> +@end table
>>> +
>>>  @anchor{aformat}
>>>  @section aformat
>>>
>>> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
>>> index 0f99086..de5f992 100644
>>> --- a/libavfilter/Makefile
>>> +++ b/libavfilter/Makefile
>>> @@ -37,6 +37,7 @@ OBJS-$(CONFIG_AEMPHASIS_FILTER)              +=
>>> af_aemphasis.o
>>>  OBJS-$(CONFIG_AEVAL_FILTER)                  += aeval.o
>>>  OBJS-$(CONFIG_AFADE_FILTER)                  += af_afade.o
>>>  OBJS-$(CONFIG_AFFTFILT_FILTER)               += af_afftfilt.o
>>> window_func.o
>>> +OBJS-$(CONFIG_AFIR_FILTER)                   += af_afir.o
>>>  OBJS-$(CONFIG_AFORMAT_FILTER)                += af_aformat.o
>>>  OBJS-$(CONFIG_AGATE_FILTER)                  += af_agate.o
>>>  OBJS-$(CONFIG_AINTERLEAVE_FILTER)            += f_interleave.o
>>> diff --git a/libavfilter/af_afir.c b/libavfilter/af_afir.c
>>> new file mode 100644
>>> index 0000000..bc1b6a4
>>> --- /dev/null
>>> +++ b/libavfilter/af_afir.c
>>> @@ -0,0 +1,544 @@
>>> +/*
>>> + * Copyright (c) 2017 Paul B Mahol
>>> + *
>>> + * This file is part of FFmpeg.
>>> + *
>>> + * FFmpeg is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License as published by the Free Software Foundation; either
>>> + * version 2.1 of the License, or (at your option) any later version.
>>> + *
>>> + * FFmpeg is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public
>>> + * License along with FFmpeg; if not, write to the Free Software
>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
>>> 02110-1301 USA
>>> + */
>>> +
>>> +/**
>>> + * @file
>>> + * An arbitrary audio FIR filter
>>> + */
>>> +
>>> +#include "libavutil/audio_fifo.h"
>>> +#include "libavutil/common.h"
>>> +#include "libavutil/opt.h"
>>> +#include "libavcodec/avfft.h"
>>> +
>>> +#include "audio.h"
>>> +#include "avfilter.h"
>>> +#include "formats.h"
>>> +#include "internal.h"
>>> +
>>> +#define MAX_IR_DURATION 30
>>> +
>>> +typedef struct AudioFIRContext {
>>> +    const AVClass *class;
>>> +
>>> +    float wet_gain;
>>> +    float dry_gain;
>>> +    float length;
>>> +
>>> +    float gain;
>>> +
>>> +    int eof_coeffs;
>>> +    int have_coeffs;
>>> +    int nb_coeffs;
>>> +    int nb_taps;
>>> +    int part_size;
>>> +    int part_index;
>>> +    int block_length;
>>> +    int nb_partitions;
>>> +    int nb_channels;
>>> +    int ir_length;
>>> +    int fft_length;
>>> +    int nb_coef_channels;
>>> +    int one2many;
>>> +    int nb_samples;
>>> +    int want_skip;
>>> +    int need_padding;
>>> +
>>> +    RDFTContext **rdft, **irdft;
>>> +    float **sum;
>>> +    float **block;
>>> +    FFTComplex **coeff;
>>> +
>>> +    AVAudioFifo *fifo[2];
>>> +    AVFrame *in[2];
>>> +    AVFrame *buffer;
>>> +    int64_t pts;
>>> +    int index;
>>> +} AudioFIRContext;
>>> +
>>> +static int fir_channel(AVFilterContext *ctx, void *arg, int ch, int
>>> nb_jobs)
>>> +{
>>> +    AudioFIRContext *s = ctx->priv;
>>> +    const FFTComplex *coeff = s->coeff[ch * !s->one2many];
>>> +    const float *src = (const float *)s->in[0]->extended_data[ch];
>>> +    int index1 = (s->index + 1) % 3;
>>> +    int index2 = (s->index + 2) % 3;
>>> +    float *sum = s->sum[ch];
>>> +    AVFrame *out = arg;
>>> +    float *block;
>>> +    float *dst;
>>> +    int n, i, j;
>>> +
>>> +    memset(sum, 0, sizeof(*sum) * s->fft_length);
>>> +    block = s->block[ch] + s->part_index * s->block_length;
>>> +    memset(block, 0, sizeof(*block) * s->fft_length);
>>> +    for (n = 0; n < s->nb_samples; n++) {
>>> +        block[s->part_size + n] = src[n] * s->dry_gain;
>>> +    }
>>> +
>>> +    av_rdft_calc(s->rdft[ch], block);
>>> +    block[2 * s->part_size] = block[1];
>>> +    block[1] = 0;
>>> +
>>> +    j = s->part_index;
>>> +
>>> +    for (i = 0; i < s->nb_partitions; i++) {
>>> +        const int coffset = i * (s->part_size + 1);
>>> +
>>> +        block = s->block[ch] + j * s->block_length;
>>> +        for (n = 0; n < s->part_size; n++) {
>>> +            const float cre = coeff[coffset + n].re;
>>> +            const float cim = coeff[coffset + n].im;
>>> +            const float tre = block[2 * n    ];
>>> +            const float tim = block[2 * n + 1];
>>> +
>>> +            sum[2 * n    ] += tre * cre - tim * cim;
>>> +            sum[2 * n + 1] += tre * cim + tim * cre;
>>> +        }
>>> +        sum[2 * n] += block[2 * n] * coeff[coffset + n].re;
>>> +
>>> +        if (j == 0)
>>> +            j = s->nb_partitions;
>>> +        j--;
>>> +    }
>>> +
>>> +    sum[1] = sum[2 * n];
>>> +    av_rdft_calc(s->irdft[ch], sum);
>>> +
>>> +    dst = (float *)s->buffer->extended_data[ch] + index1 * s->part_size;
>>> +    for (n = 0; n < s->part_size; n++) {
>>> +        dst[n] += sum[n];
>>> +    }
>>> +
>>> +    dst = (float *)s->buffer->extended_data[ch] + index2 * s->part_size;
>>> +
>>> +    memcpy(dst, sum + s->part_size, s->part_size * sizeof(*dst));
>>> +
>>> +    dst = (float *)s->buffer->extended_data[ch] + s->index *
>>> s->part_size;
>>> +
>>> +    if (out) {
>>> +        float *ptr = (float *)out->extended_data[ch];
>>> +        for (n = 0; n < out->nb_samples; n++) {
>>> +            ptr[n] = dst[n] * s->gain * s->wet_gain;
>>> +        }
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int fir_frame(AudioFIRContext *s, AVFilterLink *outlink)
>>> +{
>>> +    AVFilterContext *ctx = outlink->src;
>>> +    AVFrame *out = NULL;
>>> +    int ret;
>>> +
>>> +    s->nb_samples = FFMIN(s->part_size, av_audio_fifo_size(s->fifo[0]));
>>> +
>>> +    if (!s->want_skip) {
>>> +        out = ff_get_audio_buffer(outlink, s->nb_samples);
>>> +        if (!out)
>>> +            return AVERROR(ENOMEM);
>>> +    }
>>> +
>>> +    s->in[0] = ff_get_audio_buffer(ctx->inputs[0], s->nb_samples);
>>> +    if (!s->in[0]) {
>>> +        av_frame_free(&out);
>>> +        return AVERROR(ENOMEM);
>>> +    }
>>> +
>>> +    av_audio_fifo_peek(s->fifo[0], (void **)s->in[0]->extended_data,
>>> s->nb_samples);
>>> +
>>> +    ctx->internal->execute(ctx, fir_channel, out, NULL,
>>> outlink->channels);
>>> +
>>> +    s->part_index = (s->part_index + 1) % s->nb_partitions;
>>> +
>>> +    av_audio_fifo_drain(s->fifo[0], s->nb_samples);
>>> +
>>> +    if (!s->want_skip) {
>>> +        out->pts = s->pts;
>>> +        if (s->pts != AV_NOPTS_VALUE)
>>> +            s->pts += av_rescale_q(out->nb_samples, (AVRational){1,
>>> outlink->sample_rate}, outlink->time_base);
>>> +    }
>>> +
>>> +    s->index++;
>>> +    if (s->index == 3)
>>> +        s->index = 0;
>>> +
>>> +    av_frame_free(&s->in[0]);
>>> +
>>> +    if (s->want_skip == 1) {
>>> +        s->want_skip = 0;
>>> +        ret = 0;
>>> +    } else {
>>> +        ret = ff_filter_frame(outlink, out);
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> +static int convert_coeffs(AVFilterContext *ctx)
>>> +{
>>> +    AudioFIRContext *s = ctx->priv;
>>> +    int i, ch, n, N;
>>> +    float power = 0;
>>> +
>>> +    s->nb_taps = av_audio_fifo_size(s->fifo[1]);
>>> +
>>> +    for (n = 4; (1 << n) < s->nb_taps; n++);
>>> +    N = FFMIN(n, 16);
>>
>> It is nice to allow user set maximum N e.g. for low latency app, user
>> can set low N with higher nb_partitions.
>
> Could be later added, but for low latency, one uses NUPOLS or first
> partition is done in time domain.
> Using small N drastically reduces speed.
>
>>
>>
>>> +    s->ir_length = 1 << n;
>>> +    s->fft_length = (1 << (N + 1)) + 1;
>>> +    s->part_size = 1 << (N - 1);
>>> +    s->block_length = FFALIGN(s->fft_length, 16);
>>> +    s->nb_partitions = (s->nb_taps + s->part_size - 1) / s->part_size;
>>> +    s->nb_coeffs = s->ir_length + s->nb_partitions;
>>> +
>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>> +        s->sum[ch] = av_calloc(s->fft_length, sizeof(**s->sum));
>>> +        if (!s->sum[ch])
>>> +            return AVERROR(ENOMEM);
>>> +    }
>>> +
>>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>>> +        s->coeff[ch] = av_calloc(s->nb_coeffs, sizeof(**s->coeff));
>>> +        if (!s->coeff[ch])
>>> +            return AVERROR(ENOMEM);
>>> +    }
>>> +
>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>> +        s->block[ch] = av_calloc(s->nb_partitions * s->block_length,
>>> sizeof(**s->block));
>>> +        if (!s->block[ch])
>>> +            return AVERROR(ENOMEM);
>>> +    }
>>> +
>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>> +        s->rdft[ch]  = av_rdft_init(N, DFT_R2C);
>>> +        s->irdft[ch] = av_rdft_init(N, IDFT_C2R);
>>> +        if (!s->rdft[ch] || !s->irdft[ch])
>>> +            return AVERROR(ENOMEM);
>>> +    }
>>> +
>>> +    s->in[1] = ff_get_audio_buffer(ctx->inputs[1], s->nb_taps);
>>> +    if (!s->in[1])
>>> +        return AVERROR(ENOMEM);
>>> +
>>> +    s->buffer = ff_get_audio_buffer(ctx->inputs[0], s->part_size * 3);
>>> +    if (!s->buffer)
>>> +        return AVERROR(ENOMEM);
>>> +
>>> +    av_audio_fifo_read(s->fifo[1], (void **)s->in[1]->extended_data,
>>> s->nb_taps);
>>> +
>>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>>> +        float *time = (float *)s->in[1]->extended_data[!s->one2many *
>>> ch];
>>> +        float *block = s->block[ch];
>>> +        FFTComplex *coeff = s->coeff[ch];
>>> +
>>> +        for (i = FFMAX(1, s->length * s->nb_taps); i < s->nb_taps; i++)
>>> +            time[i] = 0;
>>> +
>>> +        for (i = 0; i < s->nb_partitions; i++) {
>>> +            const float scale = 1.f / s->part_size;
>>> +            const int toffset = i * s->part_size;
>>> +            const int coffset = i * (s->part_size + 1);
>>> +            const int boffset = s->part_size;
>>> +            const int remaining = s->nb_taps - (i * s->part_size);
>>> +            const int size = remaining >= s->part_size ? s->part_size :
>>> remaining;
>>> +
>>> +            memset(block, 0, sizeof(*block) * s->fft_length);
>>> +            for (n = 0; n < size; n++) {
>>> +                power += time[n + toffset] * time[n + toffset];
>>> +                block[n + boffset] = time[n + toffset];
>>> +            }
>>> +
>>> +            av_rdft_calc(s->rdft[0], block);
>>> +
>>> +            coeff[coffset].re = block[0] * scale;
>>> +            coeff[coffset].im = 0;
>>> +            for (n = 1; n < s->part_size; n++) {
>>> +                coeff[coffset + n].re = block[2 * n] * scale;
>>> +                coeff[coffset + n].im = block[2 * n + 1] * scale;
>>> +            }
>>> +            coeff[coffset + s->part_size].re = block[1] * scale;
>>> +            coeff[coffset + s->part_size].im = 0;
>>> +        }
>>> +    }
>>> +
>>> +    av_frame_free(&s->in[1]);
>>> +    s->gain = 1.f / sqrtf(power);
>>
>> I think s->gain is not required at all. The coeffs are already scaled by
>> scale.
>
> Its needed. Various IRs gives different peak values.
> The calculation is not perfect but it helps.

OK. So, make it optional again (e.g using auto option).

Thank's.

Paul B Mahol May 8, 2017, 10:03 p.m. UTC | #4

On 5/8/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
> On Mon, May 8, 2017 at 11:06 PM, Paul B Mahol <onemda@gmail.com> wrote:
>> On 5/8/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
>>> On Mon, May 8, 2017 at 6:59 PM, Paul B Mahol <onemda@gmail.com> wrote:
>>>> Signed-off-by: Paul B Mahol <onemda@gmail.com>
>>>> ---
>>>>  configure                |   2 +
>>>>  doc/filters.texi         |  23 ++
>>>>  libavfilter/Makefile     |   1 +
>>>>  libavfilter/af_afir.c    | 544
>>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>>  libavfilter/allfilters.c |   1 +
>>>>  5 files changed, 571 insertions(+)
>>>>  create mode 100644 libavfilter/af_afir.c
>>>>
>>>> diff --git a/configure b/configure
>>>> index 2e1786a..a46c375 100755
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -3081,6 +3081,8 @@ unix_protocol_select="network"
>>>>  # filters
>>>>  afftfilt_filter_deps="avcodec"
>>>>  afftfilt_filter_select="fft"
>>>> +afir_filter_deps="avcodec"
>>>> +afir_filter_select="fft"
>>>>  amovie_filter_deps="avcodec avformat"
>>>>  aresample_filter_deps="swresample"
>>>>  ass_filter_deps="libass"
>>>> diff --git a/doc/filters.texi b/doc/filters.texi
>>>> index f431274..0efce9a 100644
>>>> --- a/doc/filters.texi
>>>> +++ b/doc/filters.texi
>>>> @@ -878,6 +878,29 @@ afftfilt="1-clip((b/nb)*b,0,1)"
>>>>  @end example
>>>>  @end itemize
>>>>
>>>> +@section afir
>>>> +
>>>> +Apply an Arbitary Frequency Impulse Response filter.
>>>> +
>>>> +This filter uses second stream as FIR coefficients.
>>>> +If second stream holds single channel, it will be used
>>>> +for all input channels in first stream, otherwise
>>>> +number of channels in second stream must be same as
>>>> +number of channels in first stream.
>>>> +
>>>> +It accepts the following parameters:
>>>> +
>>>> +@table @option
>>>> +@item dry
>>>> +Set dry gain. This sets input gain.
>>>> +
>>>> +@item wet
>>>> +Set wet gain. This sets final output gain.
>>>> +
>>>> +@item length
>>>> +Set Impulse Response filter length. Default is 1, which means whole IR
>>>> is
>>>> processed.
>>>> +@end table
>>>> +
>>>>  @anchor{aformat}
>>>>  @section aformat
>>>>
>>>> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
>>>> index 0f99086..de5f992 100644
>>>> --- a/libavfilter/Makefile
>>>> +++ b/libavfilter/Makefile
>>>> @@ -37,6 +37,7 @@ OBJS-$(CONFIG_AEMPHASIS_FILTER)              +=
>>>> af_aemphasis.o
>>>>  OBJS-$(CONFIG_AEVAL_FILTER)                  += aeval.o
>>>>  OBJS-$(CONFIG_AFADE_FILTER)                  += af_afade.o
>>>>  OBJS-$(CONFIG_AFFTFILT_FILTER)               += af_afftfilt.o
>>>> window_func.o
>>>> +OBJS-$(CONFIG_AFIR_FILTER)                   += af_afir.o
>>>>  OBJS-$(CONFIG_AFORMAT_FILTER)                += af_aformat.o
>>>>  OBJS-$(CONFIG_AGATE_FILTER)                  += af_agate.o
>>>>  OBJS-$(CONFIG_AINTERLEAVE_FILTER)            += f_interleave.o
>>>> diff --git a/libavfilter/af_afir.c b/libavfilter/af_afir.c
>>>> new file mode 100644
>>>> index 0000000..bc1b6a4
>>>> --- /dev/null
>>>> +++ b/libavfilter/af_afir.c
>>>> @@ -0,0 +1,544 @@
>>>> +/*
>>>> + * Copyright (c) 2017 Paul B Mahol
>>>> + *
>>>> + * This file is part of FFmpeg.
>>>> + *
>>>> + * FFmpeg is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU Lesser General Public
>>>> + * License as published by the Free Software Foundation; either
>>>> + * version 2.1 of the License, or (at your option) any later version.
>>>> + *
>>>> + * FFmpeg is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> + * Lesser General Public License for more details.
>>>> + *
>>>> + * You should have received a copy of the GNU Lesser General Public
>>>> + * License along with FFmpeg; if not, write to the Free Software
>>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
>>>> 02110-1301 USA
>>>> + */
>>>> +
>>>> +/**
>>>> + * @file
>>>> + * An arbitrary audio FIR filter
>>>> + */
>>>> +
>>>> +#include "libavutil/audio_fifo.h"
>>>> +#include "libavutil/common.h"
>>>> +#include "libavutil/opt.h"
>>>> +#include "libavcodec/avfft.h"
>>>> +
>>>> +#include "audio.h"
>>>> +#include "avfilter.h"
>>>> +#include "formats.h"
>>>> +#include "internal.h"
>>>> +
>>>> +#define MAX_IR_DURATION 30
>>>> +
>>>> +typedef struct AudioFIRContext {
>>>> +    const AVClass *class;
>>>> +
>>>> +    float wet_gain;
>>>> +    float dry_gain;
>>>> +    float length;
>>>> +
>>>> +    float gain;
>>>> +
>>>> +    int eof_coeffs;
>>>> +    int have_coeffs;
>>>> +    int nb_coeffs;
>>>> +    int nb_taps;
>>>> +    int part_size;
>>>> +    int part_index;
>>>> +    int block_length;
>>>> +    int nb_partitions;
>>>> +    int nb_channels;
>>>> +    int ir_length;
>>>> +    int fft_length;
>>>> +    int nb_coef_channels;
>>>> +    int one2many;
>>>> +    int nb_samples;
>>>> +    int want_skip;
>>>> +    int need_padding;
>>>> +
>>>> +    RDFTContext **rdft, **irdft;
>>>> +    float **sum;
>>>> +    float **block;
>>>> +    FFTComplex **coeff;
>>>> +
>>>> +    AVAudioFifo *fifo[2];
>>>> +    AVFrame *in[2];
>>>> +    AVFrame *buffer;
>>>> +    int64_t pts;
>>>> +    int index;
>>>> +} AudioFIRContext;
>>>> +
>>>> +static int fir_channel(AVFilterContext *ctx, void *arg, int ch, int
>>>> nb_jobs)
>>>> +{
>>>> +    AudioFIRContext *s = ctx->priv;
>>>> +    const FFTComplex *coeff = s->coeff[ch * !s->one2many];
>>>> +    const float *src = (const float *)s->in[0]->extended_data[ch];
>>>> +    int index1 = (s->index + 1) % 3;
>>>> +    int index2 = (s->index + 2) % 3;
>>>> +    float *sum = s->sum[ch];
>>>> +    AVFrame *out = arg;
>>>> +    float *block;
>>>> +    float *dst;
>>>> +    int n, i, j;
>>>> +
>>>> +    memset(sum, 0, sizeof(*sum) * s->fft_length);
>>>> +    block = s->block[ch] + s->part_index * s->block_length;
>>>> +    memset(block, 0, sizeof(*block) * s->fft_length);
>>>> +    for (n = 0; n < s->nb_samples; n++) {
>>>> +        block[s->part_size + n] = src[n] * s->dry_gain;
>>>> +    }
>>>> +
>>>> +    av_rdft_calc(s->rdft[ch], block);
>>>> +    block[2 * s->part_size] = block[1];
>>>> +    block[1] = 0;
>>>> +
>>>> +    j = s->part_index;
>>>> +
>>>> +    for (i = 0; i < s->nb_partitions; i++) {
>>>> +        const int coffset = i * (s->part_size + 1);
>>>> +
>>>> +        block = s->block[ch] + j * s->block_length;
>>>> +        for (n = 0; n < s->part_size; n++) {
>>>> +            const float cre = coeff[coffset + n].re;
>>>> +            const float cim = coeff[coffset + n].im;
>>>> +            const float tre = block[2 * n    ];
>>>> +            const float tim = block[2 * n + 1];
>>>> +
>>>> +            sum[2 * n    ] += tre * cre - tim * cim;
>>>> +            sum[2 * n + 1] += tre * cim + tim * cre;
>>>> +        }
>>>> +        sum[2 * n] += block[2 * n] * coeff[coffset + n].re;
>>>> +
>>>> +        if (j == 0)
>>>> +            j = s->nb_partitions;
>>>> +        j--;
>>>> +    }
>>>> +
>>>> +    sum[1] = sum[2 * n];
>>>> +    av_rdft_calc(s->irdft[ch], sum);
>>>> +
>>>> +    dst = (float *)s->buffer->extended_data[ch] + index1 *
>>>> s->part_size;
>>>> +    for (n = 0; n < s->part_size; n++) {
>>>> +        dst[n] += sum[n];
>>>> +    }
>>>> +
>>>> +    dst = (float *)s->buffer->extended_data[ch] + index2 *
>>>> s->part_size;
>>>> +
>>>> +    memcpy(dst, sum + s->part_size, s->part_size * sizeof(*dst));
>>>> +
>>>> +    dst = (float *)s->buffer->extended_data[ch] + s->index *
>>>> s->part_size;
>>>> +
>>>> +    if (out) {
>>>> +        float *ptr = (float *)out->extended_data[ch];
>>>> +        for (n = 0; n < out->nb_samples; n++) {
>>>> +            ptr[n] = dst[n] * s->gain * s->wet_gain;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int fir_frame(AudioFIRContext *s, AVFilterLink *outlink)
>>>> +{
>>>> +    AVFilterContext *ctx = outlink->src;
>>>> +    AVFrame *out = NULL;
>>>> +    int ret;
>>>> +
>>>> +    s->nb_samples = FFMIN(s->part_size,
>>>> av_audio_fifo_size(s->fifo[0]));
>>>> +
>>>> +    if (!s->want_skip) {
>>>> +        out = ff_get_audio_buffer(outlink, s->nb_samples);
>>>> +        if (!out)
>>>> +            return AVERROR(ENOMEM);
>>>> +    }
>>>> +
>>>> +    s->in[0] = ff_get_audio_buffer(ctx->inputs[0], s->nb_samples);
>>>> +    if (!s->in[0]) {
>>>> +        av_frame_free(&out);
>>>> +        return AVERROR(ENOMEM);
>>>> +    }
>>>> +
>>>> +    av_audio_fifo_peek(s->fifo[0], (void **)s->in[0]->extended_data,
>>>> s->nb_samples);
>>>> +
>>>> +    ctx->internal->execute(ctx, fir_channel, out, NULL,
>>>> outlink->channels);
>>>> +
>>>> +    s->part_index = (s->part_index + 1) % s->nb_partitions;
>>>> +
>>>> +    av_audio_fifo_drain(s->fifo[0], s->nb_samples);
>>>> +
>>>> +    if (!s->want_skip) {
>>>> +        out->pts = s->pts;
>>>> +        if (s->pts != AV_NOPTS_VALUE)
>>>> +            s->pts += av_rescale_q(out->nb_samples, (AVRational){1,
>>>> outlink->sample_rate}, outlink->time_base);
>>>> +    }
>>>> +
>>>> +    s->index++;
>>>> +    if (s->index == 3)
>>>> +        s->index = 0;
>>>> +
>>>> +    av_frame_free(&s->in[0]);
>>>> +
>>>> +    if (s->want_skip == 1) {
>>>> +        s->want_skip = 0;
>>>> +        ret = 0;
>>>> +    } else {
>>>> +        ret = ff_filter_frame(outlink, out);
>>>> +    }
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static int convert_coeffs(AVFilterContext *ctx)
>>>> +{
>>>> +    AudioFIRContext *s = ctx->priv;
>>>> +    int i, ch, n, N;
>>>> +    float power = 0;
>>>> +
>>>> +    s->nb_taps = av_audio_fifo_size(s->fifo[1]);
>>>> +
>>>> +    for (n = 4; (1 << n) < s->nb_taps; n++);
>>>> +    N = FFMIN(n, 16);
>>>
>>> It is nice to allow user set maximum N e.g. for low latency app, user
>>> can set low N with higher nb_partitions.
>>
>> Could be later added, but for low latency, one uses NUPOLS or first
>> partition is done in time domain.
>> Using small N drastically reduces speed.
>>
>>>
>>>
>>>> +    s->ir_length = 1 << n;
>>>> +    s->fft_length = (1 << (N + 1)) + 1;
>>>> +    s->part_size = 1 << (N - 1);
>>>> +    s->block_length = FFALIGN(s->fft_length, 16);
>>>> +    s->nb_partitions = (s->nb_taps + s->part_size - 1) / s->part_size;
>>>> +    s->nb_coeffs = s->ir_length + s->nb_partitions;
>>>> +
>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>> +        s->sum[ch] = av_calloc(s->fft_length, sizeof(**s->sum));
>>>> +        if (!s->sum[ch])
>>>> +            return AVERROR(ENOMEM);
>>>> +    }
>>>> +
>>>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>>>> +        s->coeff[ch] = av_calloc(s->nb_coeffs, sizeof(**s->coeff));
>>>> +        if (!s->coeff[ch])
>>>> +            return AVERROR(ENOMEM);
>>>> +    }
>>>> +
>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>> +        s->block[ch] = av_calloc(s->nb_partitions * s->block_length,
>>>> sizeof(**s->block));
>>>> +        if (!s->block[ch])
>>>> +            return AVERROR(ENOMEM);
>>>> +    }
>>>> +
>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>> +        s->rdft[ch]  = av_rdft_init(N, DFT_R2C);
>>>> +        s->irdft[ch] = av_rdft_init(N, IDFT_C2R);
>>>> +        if (!s->rdft[ch] || !s->irdft[ch])
>>>> +            return AVERROR(ENOMEM);
>>>> +    }
>>>> +
>>>> +    s->in[1] = ff_get_audio_buffer(ctx->inputs[1], s->nb_taps);
>>>> +    if (!s->in[1])
>>>> +        return AVERROR(ENOMEM);
>>>> +
>>>> +    s->buffer = ff_get_audio_buffer(ctx->inputs[0], s->part_size * 3);
>>>> +    if (!s->buffer)
>>>> +        return AVERROR(ENOMEM);
>>>> +
>>>> +    av_audio_fifo_read(s->fifo[1], (void **)s->in[1]->extended_data,
>>>> s->nb_taps);
>>>> +
>>>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>>>> +        float *time = (float *)s->in[1]->extended_data[!s->one2many *
>>>> ch];
>>>> +        float *block = s->block[ch];
>>>> +        FFTComplex *coeff = s->coeff[ch];
>>>> +
>>>> +        for (i = FFMAX(1, s->length * s->nb_taps); i < s->nb_taps; i++)
>>>> +            time[i] = 0;
>>>> +
>>>> +        for (i = 0; i < s->nb_partitions; i++) {
>>>> +            const float scale = 1.f / s->part_size;
>>>> +            const int toffset = i * s->part_size;
>>>> +            const int coffset = i * (s->part_size + 1);
>>>> +            const int boffset = s->part_size;
>>>> +            const int remaining = s->nb_taps - (i * s->part_size);
>>>> +            const int size = remaining >= s->part_size ? s->part_size :
>>>> remaining;
>>>> +
>>>> +            memset(block, 0, sizeof(*block) * s->fft_length);
>>>> +            for (n = 0; n < size; n++) {
>>>> +                power += time[n + toffset] * time[n + toffset];
>>>> +                block[n + boffset] = time[n + toffset];
>>>> +            }
>>>> +
>>>> +            av_rdft_calc(s->rdft[0], block);
>>>> +
>>>> +            coeff[coffset].re = block[0] * scale;
>>>> +            coeff[coffset].im = 0;
>>>> +            for (n = 1; n < s->part_size; n++) {
>>>> +                coeff[coffset + n].re = block[2 * n] * scale;
>>>> +                coeff[coffset + n].im = block[2 * n + 1] * scale;
>>>> +            }
>>>> +            coeff[coffset + s->part_size].re = block[1] * scale;
>>>> +            coeff[coffset + s->part_size].im = 0;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    av_frame_free(&s->in[1]);
>>>> +    s->gain = 1.f / sqrtf(power);
>>>
>>> I think s->gain is not required at all. The coeffs are already scaled by
>>> scale.
>>
>> Its needed. Various IRs gives different peak values.
>> The calculation is not perfect but it helps.
>
> OK. So, make it optional again (e.g using auto option).

I don't see need for it, without it its always worse.

I updated patch with added SIMD for trivial complex multiplication.

It is faster (not much) then what gcc generates.

Muhammad Faiz May 9, 2017, 5:44 a.m. UTC | #5

On Tue, May 9, 2017 at 5:03 AM, Paul B Mahol <onemda@gmail.com> wrote:
> On 5/8/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
>> On Mon, May 8, 2017 at 11:06 PM, Paul B Mahol <onemda@gmail.com> wrote:
>>> On 5/8/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
>>>> On Mon, May 8, 2017 at 6:59 PM, Paul B Mahol <onemda@gmail.com> wrote:
>>>>> Signed-off-by: Paul B Mahol <onemda@gmail.com>
>>>>> ---
>>>>>  configure                |   2 +
>>>>>  doc/filters.texi         |  23 ++
>>>>>  libavfilter/Makefile     |   1 +
>>>>>  libavfilter/af_afir.c    | 544
>>>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  libavfilter/allfilters.c |   1 +
>>>>>  5 files changed, 571 insertions(+)
>>>>>  create mode 100644 libavfilter/af_afir.c
>>>>>
>>>>> diff --git a/configure b/configure
>>>>> index 2e1786a..a46c375 100755
>>>>> --- a/configure
>>>>> +++ b/configure
>>>>> @@ -3081,6 +3081,8 @@ unix_protocol_select="network"
>>>>>  # filters
>>>>>  afftfilt_filter_deps="avcodec"
>>>>>  afftfilt_filter_select="fft"
>>>>> +afir_filter_deps="avcodec"
>>>>> +afir_filter_select="fft"
>>>>>  amovie_filter_deps="avcodec avformat"
>>>>>  aresample_filter_deps="swresample"
>>>>>  ass_filter_deps="libass"
>>>>> diff --git a/doc/filters.texi b/doc/filters.texi
>>>>> index f431274..0efce9a 100644
>>>>> --- a/doc/filters.texi
>>>>> +++ b/doc/filters.texi
>>>>> @@ -878,6 +878,29 @@ afftfilt="1-clip((b/nb)*b,0,1)"
>>>>>  @end example
>>>>>  @end itemize
>>>>>
>>>>> +@section afir
>>>>> +
>>>>> +Apply an Arbitary Frequency Impulse Response filter.
>>>>> +
>>>>> +This filter uses second stream as FIR coefficients.
>>>>> +If second stream holds single channel, it will be used
>>>>> +for all input channels in first stream, otherwise
>>>>> +number of channels in second stream must be same as
>>>>> +number of channels in first stream.
>>>>> +
>>>>> +It accepts the following parameters:
>>>>> +
>>>>> +@table @option
>>>>> +@item dry
>>>>> +Set dry gain. This sets input gain.
>>>>> +
>>>>> +@item wet
>>>>> +Set wet gain. This sets final output gain.
>>>>> +
>>>>> +@item length
>>>>> +Set Impulse Response filter length. Default is 1, which means whole IR
>>>>> is
>>>>> processed.
>>>>> +@end table
>>>>> +
>>>>>  @anchor{aformat}
>>>>>  @section aformat
>>>>>
>>>>> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
>>>>> index 0f99086..de5f992 100644
>>>>> --- a/libavfilter/Makefile
>>>>> +++ b/libavfilter/Makefile
>>>>> @@ -37,6 +37,7 @@ OBJS-$(CONFIG_AEMPHASIS_FILTER)              +=
>>>>> af_aemphasis.o
>>>>>  OBJS-$(CONFIG_AEVAL_FILTER)                  += aeval.o
>>>>>  OBJS-$(CONFIG_AFADE_FILTER)                  += af_afade.o
>>>>>  OBJS-$(CONFIG_AFFTFILT_FILTER)               += af_afftfilt.o
>>>>> window_func.o
>>>>> +OBJS-$(CONFIG_AFIR_FILTER)                   += af_afir.o
>>>>>  OBJS-$(CONFIG_AFORMAT_FILTER)                += af_aformat.o
>>>>>  OBJS-$(CONFIG_AGATE_FILTER)                  += af_agate.o
>>>>>  OBJS-$(CONFIG_AINTERLEAVE_FILTER)            += f_interleave.o
>>>>> diff --git a/libavfilter/af_afir.c b/libavfilter/af_afir.c
>>>>> new file mode 100644
>>>>> index 0000000..bc1b6a4
>>>>> --- /dev/null
>>>>> +++ b/libavfilter/af_afir.c
>>>>> @@ -0,0 +1,544 @@
>>>>> +/*
>>>>> + * Copyright (c) 2017 Paul B Mahol
>>>>> + *
>>>>> + * This file is part of FFmpeg.
>>>>> + *
>>>>> + * FFmpeg is free software; you can redistribute it and/or
>>>>> + * modify it under the terms of the GNU Lesser General Public
>>>>> + * License as published by the Free Software Foundation; either
>>>>> + * version 2.1 of the License, or (at your option) any later version.
>>>>> + *
>>>>> + * FFmpeg is distributed in the hope that it will be useful,
>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>>> + * Lesser General Public License for more details.
>>>>> + *
>>>>> + * You should have received a copy of the GNU Lesser General Public
>>>>> + * License along with FFmpeg; if not, write to the Free Software
>>>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
>>>>> 02110-1301 USA
>>>>> + */
>>>>> +
>>>>> +/**
>>>>> + * @file
>>>>> + * An arbitrary audio FIR filter
>>>>> + */
>>>>> +
>>>>> +#include "libavutil/audio_fifo.h"
>>>>> +#include "libavutil/common.h"
>>>>> +#include "libavutil/opt.h"
>>>>> +#include "libavcodec/avfft.h"
>>>>> +
>>>>> +#include "audio.h"
>>>>> +#include "avfilter.h"
>>>>> +#include "formats.h"
>>>>> +#include "internal.h"
>>>>> +
>>>>> +#define MAX_IR_DURATION 30
>>>>> +
>>>>> +typedef struct AudioFIRContext {
>>>>> +    const AVClass *class;
>>>>> +
>>>>> +    float wet_gain;
>>>>> +    float dry_gain;
>>>>> +    float length;
>>>>> +
>>>>> +    float gain;
>>>>> +
>>>>> +    int eof_coeffs;
>>>>> +    int have_coeffs;
>>>>> +    int nb_coeffs;
>>>>> +    int nb_taps;
>>>>> +    int part_size;
>>>>> +    int part_index;
>>>>> +    int block_length;
>>>>> +    int nb_partitions;
>>>>> +    int nb_channels;
>>>>> +    int ir_length;
>>>>> +    int fft_length;
>>>>> +    int nb_coef_channels;
>>>>> +    int one2many;
>>>>> +    int nb_samples;
>>>>> +    int want_skip;
>>>>> +    int need_padding;
>>>>> +
>>>>> +    RDFTContext **rdft, **irdft;
>>>>> +    float **sum;
>>>>> +    float **block;
>>>>> +    FFTComplex **coeff;
>>>>> +
>>>>> +    AVAudioFifo *fifo[2];
>>>>> +    AVFrame *in[2];
>>>>> +    AVFrame *buffer;
>>>>> +    int64_t pts;
>>>>> +    int index;
>>>>> +} AudioFIRContext;
>>>>> +
>>>>> +static int fir_channel(AVFilterContext *ctx, void *arg, int ch, int
>>>>> nb_jobs)
>>>>> +{
>>>>> +    AudioFIRContext *s = ctx->priv;
>>>>> +    const FFTComplex *coeff = s->coeff[ch * !s->one2many];
>>>>> +    const float *src = (const float *)s->in[0]->extended_data[ch];
>>>>> +    int index1 = (s->index + 1) % 3;
>>>>> +    int index2 = (s->index + 2) % 3;
>>>>> +    float *sum = s->sum[ch];
>>>>> +    AVFrame *out = arg;
>>>>> +    float *block;
>>>>> +    float *dst;
>>>>> +    int n, i, j;
>>>>> +
>>>>> +    memset(sum, 0, sizeof(*sum) * s->fft_length);
>>>>> +    block = s->block[ch] + s->part_index * s->block_length;
>>>>> +    memset(block, 0, sizeof(*block) * s->fft_length);
>>>>> +    for (n = 0; n < s->nb_samples; n++) {
>>>>> +        block[s->part_size + n] = src[n] * s->dry_gain;
>>>>> +    }
>>>>> +
>>>>> +    av_rdft_calc(s->rdft[ch], block);
>>>>> +    block[2 * s->part_size] = block[1];
>>>>> +    block[1] = 0;
>>>>> +
>>>>> +    j = s->part_index;
>>>>> +
>>>>> +    for (i = 0; i < s->nb_partitions; i++) {
>>>>> +        const int coffset = i * (s->part_size + 1);
>>>>> +
>>>>> +        block = s->block[ch] + j * s->block_length;
>>>>> +        for (n = 0; n < s->part_size; n++) {
>>>>> +            const float cre = coeff[coffset + n].re;
>>>>> +            const float cim = coeff[coffset + n].im;
>>>>> +            const float tre = block[2 * n    ];
>>>>> +            const float tim = block[2 * n + 1];
>>>>> +
>>>>> +            sum[2 * n    ] += tre * cre - tim * cim;
>>>>> +            sum[2 * n + 1] += tre * cim + tim * cre;
>>>>> +        }
>>>>> +        sum[2 * n] += block[2 * n] * coeff[coffset + n].re;
>>>>> +
>>>>> +        if (j == 0)
>>>>> +            j = s->nb_partitions;
>>>>> +        j--;
>>>>> +    }
>>>>> +
>>>>> +    sum[1] = sum[2 * n];
>>>>> +    av_rdft_calc(s->irdft[ch], sum);
>>>>> +
>>>>> +    dst = (float *)s->buffer->extended_data[ch] + index1 *
>>>>> s->part_size;
>>>>> +    for (n = 0; n < s->part_size; n++) {
>>>>> +        dst[n] += sum[n];
>>>>> +    }
>>>>> +
>>>>> +    dst = (float *)s->buffer->extended_data[ch] + index2 *
>>>>> s->part_size;
>>>>> +
>>>>> +    memcpy(dst, sum + s->part_size, s->part_size * sizeof(*dst));
>>>>> +
>>>>> +    dst = (float *)s->buffer->extended_data[ch] + s->index *
>>>>> s->part_size;
>>>>> +
>>>>> +    if (out) {
>>>>> +        float *ptr = (float *)out->extended_data[ch];
>>>>> +        for (n = 0; n < out->nb_samples; n++) {
>>>>> +            ptr[n] = dst[n] * s->gain * s->wet_gain;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int fir_frame(AudioFIRContext *s, AVFilterLink *outlink)
>>>>> +{
>>>>> +    AVFilterContext *ctx = outlink->src;
>>>>> +    AVFrame *out = NULL;
>>>>> +    int ret;
>>>>> +
>>>>> +    s->nb_samples = FFMIN(s->part_size,
>>>>> av_audio_fifo_size(s->fifo[0]));
>>>>> +
>>>>> +    if (!s->want_skip) {
>>>>> +        out = ff_get_audio_buffer(outlink, s->nb_samples);
>>>>> +        if (!out)
>>>>> +            return AVERROR(ENOMEM);
>>>>> +    }
>>>>> +
>>>>> +    s->in[0] = ff_get_audio_buffer(ctx->inputs[0], s->nb_samples);
>>>>> +    if (!s->in[0]) {
>>>>> +        av_frame_free(&out);
>>>>> +        return AVERROR(ENOMEM);
>>>>> +    }
>>>>> +
>>>>> +    av_audio_fifo_peek(s->fifo[0], (void **)s->in[0]->extended_data,
>>>>> s->nb_samples);
>>>>> +
>>>>> +    ctx->internal->execute(ctx, fir_channel, out, NULL,
>>>>> outlink->channels);
>>>>> +
>>>>> +    s->part_index = (s->part_index + 1) % s->nb_partitions;
>>>>> +
>>>>> +    av_audio_fifo_drain(s->fifo[0], s->nb_samples);
>>>>> +
>>>>> +    if (!s->want_skip) {
>>>>> +        out->pts = s->pts;
>>>>> +        if (s->pts != AV_NOPTS_VALUE)
>>>>> +            s->pts += av_rescale_q(out->nb_samples, (AVRational){1,
>>>>> outlink->sample_rate}, outlink->time_base);
>>>>> +    }
>>>>> +
>>>>> +    s->index++;
>>>>> +    if (s->index == 3)
>>>>> +        s->index = 0;
>>>>> +
>>>>> +    av_frame_free(&s->in[0]);
>>>>> +
>>>>> +    if (s->want_skip == 1) {
>>>>> +        s->want_skip = 0;
>>>>> +        ret = 0;
>>>>> +    } else {
>>>>> +        ret = ff_filter_frame(outlink, out);
>>>>> +    }
>>>>> +
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>> +static int convert_coeffs(AVFilterContext *ctx)
>>>>> +{
>>>>> +    AudioFIRContext *s = ctx->priv;
>>>>> +    int i, ch, n, N;
>>>>> +    float power = 0;
>>>>> +
>>>>> +    s->nb_taps = av_audio_fifo_size(s->fifo[1]);
>>>>> +
>>>>> +    for (n = 4; (1 << n) < s->nb_taps; n++);
>>>>> +    N = FFMIN(n, 16);
>>>>
>>>> It is nice to allow user set maximum N e.g. for low latency app, user
>>>> can set low N with higher nb_partitions.
>>>
>>> Could be later added, but for low latency, one uses NUPOLS or first
>>> partition is done in time domain.
>>> Using small N drastically reduces speed.
>>>
>>>>
>>>>
>>>>> +    s->ir_length = 1 << n;
>>>>> +    s->fft_length = (1 << (N + 1)) + 1;
>>>>> +    s->part_size = 1 << (N - 1);
>>>>> +    s->block_length = FFALIGN(s->fft_length, 16);
>>>>> +    s->nb_partitions = (s->nb_taps + s->part_size - 1) / s->part_size;
>>>>> +    s->nb_coeffs = s->ir_length + s->nb_partitions;
>>>>> +
>>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>>> +        s->sum[ch] = av_calloc(s->fft_length, sizeof(**s->sum));
>>>>> +        if (!s->sum[ch])
>>>>> +            return AVERROR(ENOMEM);
>>>>> +    }
>>>>> +
>>>>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>>>>> +        s->coeff[ch] = av_calloc(s->nb_coeffs, sizeof(**s->coeff));
>>>>> +        if (!s->coeff[ch])
>>>>> +            return AVERROR(ENOMEM);
>>>>> +    }
>>>>> +
>>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>>> +        s->block[ch] = av_calloc(s->nb_partitions * s->block_length,
>>>>> sizeof(**s->block));
>>>>> +        if (!s->block[ch])
>>>>> +            return AVERROR(ENOMEM);
>>>>> +    }
>>>>> +
>>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>>> +        s->rdft[ch]  = av_rdft_init(N, DFT_R2C);
>>>>> +        s->irdft[ch] = av_rdft_init(N, IDFT_C2R);
>>>>> +        if (!s->rdft[ch] || !s->irdft[ch])
>>>>> +            return AVERROR(ENOMEM);
>>>>> +    }
>>>>> +
>>>>> +    s->in[1] = ff_get_audio_buffer(ctx->inputs[1], s->nb_taps);
>>>>> +    if (!s->in[1])
>>>>> +        return AVERROR(ENOMEM);
>>>>> +
>>>>> +    s->buffer = ff_get_audio_buffer(ctx->inputs[0], s->part_size * 3);
>>>>> +    if (!s->buffer)
>>>>> +        return AVERROR(ENOMEM);
>>>>> +
>>>>> +    av_audio_fifo_read(s->fifo[1], (void **)s->in[1]->extended_data,
>>>>> s->nb_taps);
>>>>> +
>>>>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>>>>> +        float *time = (float *)s->in[1]->extended_data[!s->one2many *
>>>>> ch];
>>>>> +        float *block = s->block[ch];
>>>>> +        FFTComplex *coeff = s->coeff[ch];
>>>>> +
>>>>> +        for (i = FFMAX(1, s->length * s->nb_taps); i < s->nb_taps; i++)
>>>>> +            time[i] = 0;
>>>>> +
>>>>> +        for (i = 0; i < s->nb_partitions; i++) {
>>>>> +            const float scale = 1.f / s->part_size;
>>>>> +            const int toffset = i * s->part_size;
>>>>> +            const int coffset = i * (s->part_size + 1);
>>>>> +            const int boffset = s->part_size;
>>>>> +            const int remaining = s->nb_taps - (i * s->part_size);
>>>>> +            const int size = remaining >= s->part_size ? s->part_size :
>>>>> remaining;
>>>>> +
>>>>> +            memset(block, 0, sizeof(*block) * s->fft_length);
>>>>> +            for (n = 0; n < size; n++) {
>>>>> +                power += time[n + toffset] * time[n + toffset];
>>>>> +                block[n + boffset] = time[n + toffset];
>>>>> +            }
>>>>> +
>>>>> +            av_rdft_calc(s->rdft[0], block);
>>>>> +
>>>>> +            coeff[coffset].re = block[0] * scale;
>>>>> +            coeff[coffset].im = 0;
>>>>> +            for (n = 1; n < s->part_size; n++) {
>>>>> +                coeff[coffset + n].re = block[2 * n] * scale;
>>>>> +                coeff[coffset + n].im = block[2 * n + 1] * scale;
>>>>> +            }
>>>>> +            coeff[coffset + s->part_size].re = block[1] * scale;
>>>>> +            coeff[coffset + s->part_size].im = 0;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    av_frame_free(&s->in[1]);
>>>>> +    s->gain = 1.f / sqrtf(power);

sqrtf(power/ctx->inputs[1]->channels)


>>>>
>>>> I think s->gain is not required at all. The coeffs are already scaled by
>>>> scale.
>>>
>>> Its needed. Various IRs gives different peak values.
>>> The calculation is not perfect but it helps.
>>
>> OK. So, make it optional again (e.g using auto option).
>
> I don't see need for it, without it its always worse.

Is it bad to preserve the actual frequency response.
I mean here s->gain = 1.0f;
not s->gain = 1.0f / s->part_size;


>
> I updated patch with added SIMD for trivial complex multiplication.
>
> It is faster (not much) then what gcc generates.

Paul B Mahol May 9, 2017, 10:16 a.m. UTC | #6

On 5/9/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
> On Tue, May 9, 2017 at 5:03 AM, Paul B Mahol <onemda@gmail.com> wrote:
>> On 5/8/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
>>> On Mon, May 8, 2017 at 11:06 PM, Paul B Mahol <onemda@gmail.com> wrote:
>>>> On 5/8/17, Muhammad Faiz <mfcc64@gmail.com> wrote:
>>>>> On Mon, May 8, 2017 at 6:59 PM, Paul B Mahol <onemda@gmail.com> wrote:
>>>>>> Signed-off-by: Paul B Mahol <onemda@gmail.com>
>>>>>> ---
>>>>>>  configure                |   2 +
>>>>>>  doc/filters.texi         |  23 ++
>>>>>>  libavfilter/Makefile     |   1 +
>>>>>>  libavfilter/af_afir.c    | 544
>>>>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>  libavfilter/allfilters.c |   1 +
>>>>>>  5 files changed, 571 insertions(+)
>>>>>>  create mode 100644 libavfilter/af_afir.c
>>>>>>
>>>>>> diff --git a/configure b/configure
>>>>>> index 2e1786a..a46c375 100755
>>>>>> --- a/configure
>>>>>> +++ b/configure
>>>>>> @@ -3081,6 +3081,8 @@ unix_protocol_select="network"
>>>>>>  # filters
>>>>>>  afftfilt_filter_deps="avcodec"
>>>>>>  afftfilt_filter_select="fft"
>>>>>> +afir_filter_deps="avcodec"
>>>>>> +afir_filter_select="fft"
>>>>>>  amovie_filter_deps="avcodec avformat"
>>>>>>  aresample_filter_deps="swresample"
>>>>>>  ass_filter_deps="libass"
>>>>>> diff --git a/doc/filters.texi b/doc/filters.texi
>>>>>> index f431274..0efce9a 100644
>>>>>> --- a/doc/filters.texi
>>>>>> +++ b/doc/filters.texi
>>>>>> @@ -878,6 +878,29 @@ afftfilt="1-clip((b/nb)*b,0,1)"
>>>>>>  @end example
>>>>>>  @end itemize
>>>>>>
>>>>>> +@section afir
>>>>>> +
>>>>>> +Apply an Arbitary Frequency Impulse Response filter.
>>>>>> +
>>>>>> +This filter uses second stream as FIR coefficients.
>>>>>> +If second stream holds single channel, it will be used
>>>>>> +for all input channels in first stream, otherwise
>>>>>> +number of channels in second stream must be same as
>>>>>> +number of channels in first stream.
>>>>>> +
>>>>>> +It accepts the following parameters:
>>>>>> +
>>>>>> +@table @option
>>>>>> +@item dry
>>>>>> +Set dry gain. This sets input gain.
>>>>>> +
>>>>>> +@item wet
>>>>>> +Set wet gain. This sets final output gain.
>>>>>> +
>>>>>> +@item length
>>>>>> +Set Impulse Response filter length. Default is 1, which means whole
>>>>>> IR
>>>>>> is
>>>>>> processed.
>>>>>> +@end table
>>>>>> +
>>>>>>  @anchor{aformat}
>>>>>>  @section aformat
>>>>>>
>>>>>> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
>>>>>> index 0f99086..de5f992 100644
>>>>>> --- a/libavfilter/Makefile
>>>>>> +++ b/libavfilter/Makefile
>>>>>> @@ -37,6 +37,7 @@ OBJS-$(CONFIG_AEMPHASIS_FILTER)              +=
>>>>>> af_aemphasis.o
>>>>>>  OBJS-$(CONFIG_AEVAL_FILTER)                  += aeval.o
>>>>>>  OBJS-$(CONFIG_AFADE_FILTER)                  += af_afade.o
>>>>>>  OBJS-$(CONFIG_AFFTFILT_FILTER)               += af_afftfilt.o
>>>>>> window_func.o
>>>>>> +OBJS-$(CONFIG_AFIR_FILTER)                   += af_afir.o
>>>>>>  OBJS-$(CONFIG_AFORMAT_FILTER)                += af_aformat.o
>>>>>>  OBJS-$(CONFIG_AGATE_FILTER)                  += af_agate.o
>>>>>>  OBJS-$(CONFIG_AINTERLEAVE_FILTER)            += f_interleave.o
>>>>>> diff --git a/libavfilter/af_afir.c b/libavfilter/af_afir.c
>>>>>> new file mode 100644
>>>>>> index 0000000..bc1b6a4
>>>>>> --- /dev/null
>>>>>> +++ b/libavfilter/af_afir.c
>>>>>> @@ -0,0 +1,544 @@
>>>>>> +/*
>>>>>> + * Copyright (c) 2017 Paul B Mahol
>>>>>> + *
>>>>>> + * This file is part of FFmpeg.
>>>>>> + *
>>>>>> + * FFmpeg is free software; you can redistribute it and/or
>>>>>> + * modify it under the terms of the GNU Lesser General Public
>>>>>> + * License as published by the Free Software Foundation; either
>>>>>> + * version 2.1 of the License, or (at your option) any later version.
>>>>>> + *
>>>>>> + * FFmpeg is distributed in the hope that it will be useful,
>>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>>>> + * Lesser General Public License for more details.
>>>>>> + *
>>>>>> + * You should have received a copy of the GNU Lesser General Public
>>>>>> + * License along with FFmpeg; if not, write to the Free Software
>>>>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
>>>>>> 02110-1301 USA
>>>>>> + */
>>>>>> +
>>>>>> +/**
>>>>>> + * @file
>>>>>> + * An arbitrary audio FIR filter
>>>>>> + */
>>>>>> +
>>>>>> +#include "libavutil/audio_fifo.h"
>>>>>> +#include "libavutil/common.h"
>>>>>> +#include "libavutil/opt.h"
>>>>>> +#include "libavcodec/avfft.h"
>>>>>> +
>>>>>> +#include "audio.h"
>>>>>> +#include "avfilter.h"
>>>>>> +#include "formats.h"
>>>>>> +#include "internal.h"
>>>>>> +
>>>>>> +#define MAX_IR_DURATION 30
>>>>>> +
>>>>>> +typedef struct AudioFIRContext {
>>>>>> +    const AVClass *class;
>>>>>> +
>>>>>> +    float wet_gain;
>>>>>> +    float dry_gain;
>>>>>> +    float length;
>>>>>> +
>>>>>> +    float gain;
>>>>>> +
>>>>>> +    int eof_coeffs;
>>>>>> +    int have_coeffs;
>>>>>> +    int nb_coeffs;
>>>>>> +    int nb_taps;
>>>>>> +    int part_size;
>>>>>> +    int part_index;
>>>>>> +    int block_length;
>>>>>> +    int nb_partitions;
>>>>>> +    int nb_channels;
>>>>>> +    int ir_length;
>>>>>> +    int fft_length;
>>>>>> +    int nb_coef_channels;
>>>>>> +    int one2many;
>>>>>> +    int nb_samples;
>>>>>> +    int want_skip;
>>>>>> +    int need_padding;
>>>>>> +
>>>>>> +    RDFTContext **rdft, **irdft;
>>>>>> +    float **sum;
>>>>>> +    float **block;
>>>>>> +    FFTComplex **coeff;
>>>>>> +
>>>>>> +    AVAudioFifo *fifo[2];
>>>>>> +    AVFrame *in[2];
>>>>>> +    AVFrame *buffer;
>>>>>> +    int64_t pts;
>>>>>> +    int index;
>>>>>> +} AudioFIRContext;
>>>>>> +
>>>>>> +static int fir_channel(AVFilterContext *ctx, void *arg, int ch, int
>>>>>> nb_jobs)
>>>>>> +{
>>>>>> +    AudioFIRContext *s = ctx->priv;
>>>>>> +    const FFTComplex *coeff = s->coeff[ch * !s->one2many];
>>>>>> +    const float *src = (const float *)s->in[0]->extended_data[ch];
>>>>>> +    int index1 = (s->index + 1) % 3;
>>>>>> +    int index2 = (s->index + 2) % 3;
>>>>>> +    float *sum = s->sum[ch];
>>>>>> +    AVFrame *out = arg;
>>>>>> +    float *block;
>>>>>> +    float *dst;
>>>>>> +    int n, i, j;
>>>>>> +
>>>>>> +    memset(sum, 0, sizeof(*sum) * s->fft_length);
>>>>>> +    block = s->block[ch] + s->part_index * s->block_length;
>>>>>> +    memset(block, 0, sizeof(*block) * s->fft_length);
>>>>>> +    for (n = 0; n < s->nb_samples; n++) {
>>>>>> +        block[s->part_size + n] = src[n] * s->dry_gain;
>>>>>> +    }
>>>>>> +
>>>>>> +    av_rdft_calc(s->rdft[ch], block);
>>>>>> +    block[2 * s->part_size] = block[1];
>>>>>> +    block[1] = 0;
>>>>>> +
>>>>>> +    j = s->part_index;
>>>>>> +
>>>>>> +    for (i = 0; i < s->nb_partitions; i++) {
>>>>>> +        const int coffset = i * (s->part_size + 1);
>>>>>> +
>>>>>> +        block = s->block[ch] + j * s->block_length;
>>>>>> +        for (n = 0; n < s->part_size; n++) {
>>>>>> +            const float cre = coeff[coffset + n].re;
>>>>>> +            const float cim = coeff[coffset + n].im;
>>>>>> +            const float tre = block[2 * n    ];
>>>>>> +            const float tim = block[2 * n + 1];
>>>>>> +
>>>>>> +            sum[2 * n    ] += tre * cre - tim * cim;
>>>>>> +            sum[2 * n + 1] += tre * cim + tim * cre;
>>>>>> +        }
>>>>>> +        sum[2 * n] += block[2 * n] * coeff[coffset + n].re;
>>>>>> +
>>>>>> +        if (j == 0)
>>>>>> +            j = s->nb_partitions;
>>>>>> +        j--;
>>>>>> +    }
>>>>>> +
>>>>>> +    sum[1] = sum[2 * n];
>>>>>> +    av_rdft_calc(s->irdft[ch], sum);
>>>>>> +
>>>>>> +    dst = (float *)s->buffer->extended_data[ch] + index1 *
>>>>>> s->part_size;
>>>>>> +    for (n = 0; n < s->part_size; n++) {
>>>>>> +        dst[n] += sum[n];
>>>>>> +    }
>>>>>> +
>>>>>> +    dst = (float *)s->buffer->extended_data[ch] + index2 *
>>>>>> s->part_size;
>>>>>> +
>>>>>> +    memcpy(dst, sum + s->part_size, s->part_size * sizeof(*dst));
>>>>>> +
>>>>>> +    dst = (float *)s->buffer->extended_data[ch] + s->index *
>>>>>> s->part_size;
>>>>>> +
>>>>>> +    if (out) {
>>>>>> +        float *ptr = (float *)out->extended_data[ch];
>>>>>> +        for (n = 0; n < out->nb_samples; n++) {
>>>>>> +            ptr[n] = dst[n] * s->gain * s->wet_gain;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int fir_frame(AudioFIRContext *s, AVFilterLink *outlink)
>>>>>> +{
>>>>>> +    AVFilterContext *ctx = outlink->src;
>>>>>> +    AVFrame *out = NULL;
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    s->nb_samples = FFMIN(s->part_size,
>>>>>> av_audio_fifo_size(s->fifo[0]));
>>>>>> +
>>>>>> +    if (!s->want_skip) {
>>>>>> +        out = ff_get_audio_buffer(outlink, s->nb_samples);
>>>>>> +        if (!out)
>>>>>> +            return AVERROR(ENOMEM);
>>>>>> +    }
>>>>>> +
>>>>>> +    s->in[0] = ff_get_audio_buffer(ctx->inputs[0], s->nb_samples);
>>>>>> +    if (!s->in[0]) {
>>>>>> +        av_frame_free(&out);
>>>>>> +        return AVERROR(ENOMEM);
>>>>>> +    }
>>>>>> +
>>>>>> +    av_audio_fifo_peek(s->fifo[0], (void **)s->in[0]->extended_data,
>>>>>> s->nb_samples);
>>>>>> +
>>>>>> +    ctx->internal->execute(ctx, fir_channel, out, NULL,
>>>>>> outlink->channels);
>>>>>> +
>>>>>> +    s->part_index = (s->part_index + 1) % s->nb_partitions;
>>>>>> +
>>>>>> +    av_audio_fifo_drain(s->fifo[0], s->nb_samples);
>>>>>> +
>>>>>> +    if (!s->want_skip) {
>>>>>> +        out->pts = s->pts;
>>>>>> +        if (s->pts != AV_NOPTS_VALUE)
>>>>>> +            s->pts += av_rescale_q(out->nb_samples, (AVRational){1,
>>>>>> outlink->sample_rate}, outlink->time_base);
>>>>>> +    }
>>>>>> +
>>>>>> +    s->index++;
>>>>>> +    if (s->index == 3)
>>>>>> +        s->index = 0;
>>>>>> +
>>>>>> +    av_frame_free(&s->in[0]);
>>>>>> +
>>>>>> +    if (s->want_skip == 1) {
>>>>>> +        s->want_skip = 0;
>>>>>> +        ret = 0;
>>>>>> +    } else {
>>>>>> +        ret = ff_filter_frame(outlink, out);
>>>>>> +    }
>>>>>> +
>>>>>> +    return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static int convert_coeffs(AVFilterContext *ctx)
>>>>>> +{
>>>>>> +    AudioFIRContext *s = ctx->priv;
>>>>>> +    int i, ch, n, N;
>>>>>> +    float power = 0;
>>>>>> +
>>>>>> +    s->nb_taps = av_audio_fifo_size(s->fifo[1]);
>>>>>> +
>>>>>> +    for (n = 4; (1 << n) < s->nb_taps; n++);
>>>>>> +    N = FFMIN(n, 16);
>>>>>
>>>>> It is nice to allow user set maximum N e.g. for low latency app, user
>>>>> can set low N with higher nb_partitions.
>>>>
>>>> Could be later added, but for low latency, one uses NUPOLS or first
>>>> partition is done in time domain.
>>>> Using small N drastically reduces speed.
>>>>
>>>>>
>>>>>
>>>>>> +    s->ir_length = 1 << n;
>>>>>> +    s->fft_length = (1 << (N + 1)) + 1;
>>>>>> +    s->part_size = 1 << (N - 1);
>>>>>> +    s->block_length = FFALIGN(s->fft_length, 16);
>>>>>> +    s->nb_partitions = (s->nb_taps + s->part_size - 1) /
>>>>>> s->part_size;
>>>>>> +    s->nb_coeffs = s->ir_length + s->nb_partitions;
>>>>>> +
>>>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>>>> +        s->sum[ch] = av_calloc(s->fft_length, sizeof(**s->sum));
>>>>>> +        if (!s->sum[ch])
>>>>>> +            return AVERROR(ENOMEM);
>>>>>> +    }
>>>>>> +
>>>>>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>>>>>> +        s->coeff[ch] = av_calloc(s->nb_coeffs, sizeof(**s->coeff));
>>>>>> +        if (!s->coeff[ch])
>>>>>> +            return AVERROR(ENOMEM);
>>>>>> +    }
>>>>>> +
>>>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>>>> +        s->block[ch] = av_calloc(s->nb_partitions * s->block_length,
>>>>>> sizeof(**s->block));
>>>>>> +        if (!s->block[ch])
>>>>>> +            return AVERROR(ENOMEM);
>>>>>> +    }
>>>>>> +
>>>>>> +    for (ch = 0; ch < ctx->inputs[0]->channels; ch++) {
>>>>>> +        s->rdft[ch]  = av_rdft_init(N, DFT_R2C);
>>>>>> +        s->irdft[ch] = av_rdft_init(N, IDFT_C2R);
>>>>>> +        if (!s->rdft[ch] || !s->irdft[ch])
>>>>>> +            return AVERROR(ENOMEM);
>>>>>> +    }
>>>>>> +
>>>>>> +    s->in[1] = ff_get_audio_buffer(ctx->inputs[1], s->nb_taps);
>>>>>> +    if (!s->in[1])
>>>>>> +        return AVERROR(ENOMEM);
>>>>>> +
>>>>>> +    s->buffer = ff_get_audio_buffer(ctx->inputs[0], s->part_size *
>>>>>> 3);
>>>>>> +    if (!s->buffer)
>>>>>> +        return AVERROR(ENOMEM);
>>>>>> +
>>>>>> +    av_audio_fifo_read(s->fifo[1], (void **)s->in[1]->extended_data,
>>>>>> s->nb_taps);
>>>>>> +
>>>>>> +    for (ch = 0; ch < ctx->inputs[1]->channels; ch++) {
>>>>>> +        float *time = (float *)s->in[1]->extended_data[!s->one2many *
>>>>>> ch];
>>>>>> +        float *block = s->block[ch];
>>>>>> +        FFTComplex *coeff = s->coeff[ch];
>>>>>> +
>>>>>> +        for (i = FFMAX(1, s->length * s->nb_taps); i < s->nb_taps;
>>>>>> i++)
>>>>>> +            time[i] = 0;
>>>>>> +
>>>>>> +        for (i = 0; i < s->nb_partitions; i++) {
>>>>>> +            const float scale = 1.f / s->part_size;
>>>>>> +            const int toffset = i * s->part_size;
>>>>>> +            const int coffset = i * (s->part_size + 1);
>>>>>> +            const int boffset = s->part_size;
>>>>>> +            const int remaining = s->nb_taps - (i * s->part_size);
>>>>>> +            const int size = remaining >= s->part_size ? s->part_size
>>>>>> :
>>>>>> remaining;
>>>>>> +
>>>>>> +            memset(block, 0, sizeof(*block) * s->fft_length);
>>>>>> +            for (n = 0; n < size; n++) {
>>>>>> +                power += time[n + toffset] * time[n + toffset];
>>>>>> +                block[n + boffset] = time[n + toffset];
>>>>>> +            }
>>>>>> +
>>>>>> +            av_rdft_calc(s->rdft[0], block);
>>>>>> +
>>>>>> +            coeff[coffset].re = block[0] * scale;
>>>>>> +            coeff[coffset].im = 0;
>>>>>> +            for (n = 1; n < s->part_size; n++) {
>>>>>> +                coeff[coffset + n].re = block[2 * n] * scale;
>>>>>> +                coeff[coffset + n].im = block[2 * n + 1] * scale;
>>>>>> +            }
>>>>>> +            coeff[coffset + s->part_size].re = block[1] * scale;
>>>>>> +            coeff[coffset + s->part_size].im = 0;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    av_frame_free(&s->in[1]);
>>>>>> +    s->gain = 1.f / sqrtf(power);
>
> sqrtf(power/ctx->inputs[1]->channels)

done.

>
>
>>>>>
>>>>> I think s->gain is not required at all. The coeffs are already scaled
>>>>> by
>>>>> scale.
>>>>
>>>> Its needed. Various IRs gives different peak values.
>>>> The calculation is not perfect but it helps.
>>>
>>> OK. So, make it optional again (e.g using auto option).
>>
>> I don't see need for it, without it its always worse.
>
> Is it bad to preserve the actual frequency response.
> I mean here s->gain = 1.0f;
> not s->gain = 1.0f / s->part_size;

Added back.
Gonna apply soon.

[FFmpeg-devel] avfilter: add arbitrary audio FIR filter

Commit Message

Comments

Patch