diff mbox

[FFmpeg-devel] delogo filter: new "uglarm" interpolation mode added

Message ID a22056ae-643d-d5a8-30c9-3580f3aa28d6@gmx.de
State Superseded
Headers show

Commit Message

Uwe Freese Dec. 29, 2018, 8:38 p.m. UTC
Hello,

here's a new version of the patch.

Changes since the last version of the patch (mail from 27.12. 13:00), 
according the comments from Moritz and Carl Eugen:


- using av_assert0 instead of throwing an error if planes > MAX_PLANES
- using av_freep with the reference operator in uninit
- copyright year 2019 -> 2018
- add MODE_NB value in "mode" enum and limit mode parameter in 
delogo_options to MODE_NB-1
- changed commit message style: "avfilter/vf_delogo: add uglarm 
interpolation mode"
- bracket style, line 360
- changed documentation, using @var{...}, @code{...}, @table @option 
etc., explaining "uglarm" and using the term "raised to the power of" 
(see https://en.wikipedia.org/wiki/Exponentiation). I hope the use of 
the macros is correct as I used them... (@var referencing a parameter 
and @code referencing a value)
- changed parameter "power" (int) to "exponent" (float)


Some considerations about the parameter name "power" / "exponent":

I noticed that for the specific calculation of the weight in the new 
mode, the wording should be better named "exponent" instead of "power".
Wikipedia (https://en.wikipedia.org/wiki/Exponentiation) writes:

base ^ exponent = power

So I changed the value type to float and the calculation to

weight = distance ^ exponent
(float value exponent).

That is IMHO more consistent and easier to explain.

The downside is that the term "power" (or "strength") would be more 
generic and would probably be better reusable for other modes.

What do you think? Is "exponent" now ok, or should I change it back to 
"power", "strength" or something alike?


Again, I would be glad about comments and reviews. - Let me know what 
should be changed and I'll create a new patch in some days.

Regards,
Uwe


Am 28.12.18 um 00:08 schrieb Carl Eugen Hoyos:
> 2018-12-27 22:02 GMT+01:00, Uwe Freese <uwe.freese@gmx.de>:
>
>> Am 27.12.18 um 20:25 schrieb Carl Eugen Hoyos:
>>>> I have now added as error handling:
>>>>
>>>> av_log(inlink->src, AV_LOG_ERROR, "More planes in frame than
>>>> expected.\n");
>>>> return AVERROR(ENOMEM);
>>>>
>>>> Is this ok, or how should this be implemented instead?
>>> Not sure I understand: How can plane get >= MAX_PLANES?
>>> If this is impossible (as I believe), please use av_assert0().
>> I meant the use of "ENOMEM" and if there's a better error
>> constant to use here.
>>
>> At this line, the error is not about memory, but that the video input
>> format is unexpected.
> Is there a format with too many planes?
> If not, please use an assert, do not return an error.
>
> Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Comments

Nicolas George Dec. 30, 2018, 1:02 p.m. UTC | #1
Uwe Freese (2018-12-29):
> The downside is that the term "power" (or "strength") would be more generic
> and would probably be better reusable for other modes.
> 
> What do you think? Is "exponent" now ok, or should I change it back to
> "power", "strength" or something alike?

I would say: do not bother. This can be decided when a new mode is
added, and options can have several names.

By the way, you would not be interested in developing a logo-detection
filter, per chance? Something that detects is a certain rectangle is
likely to contain a logo and computes its bounding box.

> Again, I would be glad about comments and reviews. - Let me know what should
> be changed and I'll create a new patch in some days.

Here is a more in-depth review from me:

> >From 83e79abb3311eb2dd92c63e8e15e0476af2f2891 Mon Sep 17 00:00:00 2001
> From: breaker27 <mail@uwe-freese.de>
> Date: Wed, 26 Dec 2018 18:16:48 +0100
> Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode
> 
> ---
>  doc/filters.texi        |  28 +++++++
>  libavfilter/vf_delogo.c | 189 +++++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 208 insertions(+), 9 deletions(-)
> 
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 65ce25bc18..792560ad79 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the rectangle (added to
>  deprecated, setting higher values should no longer be necessary and
>  is not recommended.
>  
> +@item mode
> +Specify the interpolation mode used to remove the logo.
> +It accepts the following values:
> +@table @option
> +@item xy
> +Only the border pixels in straight x and y direction are considered
> +to replace any logo pixel. This mode tends to flicker and to show
> +horizontal and vertical lines.
> +@item uglarm
> +Consider the whole border for every logo pixel to replace. This mode
> +uses the distance raised to the power of the given @var{exponent} as
> +weight that decides to which amount every border pixel is taken into
> +account. This results in a more blurred area, which tends to be less
> +distracting. The uglarm mode was first implemented in VirtualDub's
> +LogoAway filter and is also known from ffdshow and is the
> +abbreviation for "Uwe's Great LogoAway Remove Mode".
> +@end table
> +The default value is @code{xy}.
> +
> +@item exponent
> +Specify the exponent used to calculate the weight out of the
> +distance in @code{uglarm} mode (weight = distance ^ @var{exponent}).
> +The value @code{0} results in a logo area which has
> +the same average color everywhere. The higher the value, the more
> +relevant a near border pixel will get, meaning that the borders of
> +the logo area are more similar to the surrounding pixels. The default
> +value is @code{3}.
> +
>  @item show
>  When set to 1, a green rectangle is drawn on the screen to simplify
>  finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters.
> diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c
> index 065d093641..dcb0366e63 100644
> --- a/libavfilter/vf_delogo.c
> +++ b/libavfilter/vf_delogo.c
> @@ -2,6 +2,7 @@
>   * Copyright (c) 2002 Jindrich Makovicka <makovick@gmail.com>
>   * Copyright (c) 2011 Stefano Sabatini
>   * Copyright (c) 2013, 2015 Jean Delvare <jdelvare@suse.com>
> + * Copyright (c) 2018 Uwe Freese <mail@uwe-freese.de>
>   *
>   * This file is part of FFmpeg.
>   *
> @@ -25,12 +26,16 @@
>   * A very simple tv station logo remover
>   * Originally imported from MPlayer libmpcodecs/vf_delogo.c,
>   * the algorithm was later improved.
> + * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual
> + * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's
> + * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg.
>   */
>  
>  #include "libavutil/common.h"
>  #include "libavutil/imgutils.h"
>  #include "libavutil/opt.h"
>  #include "libavutil/pixdesc.h"

> +#include "libavutil/avassert.h"

We try to keep the includes in alphabetical order.

>  #include "avfilter.h"
>  #include "formats.h"
>  #include "internal.h"
> @@ -50,6 +55,10 @@
>   * @param logo_w width of the logo
>   * @param logo_h height of the logo
>   * @param band   the size of the band around the processed area
> + * @param *uglarmtable      pointer to weight table in UGLARM interpolation mode,
> + *                          zero when x-y mode is used
> + * @param *uglarmweightsum  pointer to weight sum table in UGLARM interpolation mode,
> + *                          zero when x-y mode is used
>   * @param show   show a rectangle around the processed area, useful for
>   *               parameters tweaking
>   * @param direct if non-zero perform in-place processing
> @@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
>                           uint8_t *src, int src_linesize,
>                           int w, int h, AVRational sar,
>                           int logo_x, int logo_y, int logo_w, int logo_h,
> -                         unsigned int band, int show, int direct)
> +                         unsigned int band, double *uglarmtable,
> +                         double *uglarmweightsum, int show, int direct)
>  {
>      int x, y;
>      uint64_t interp, weightl, weightr, weightt, weightb, weight;
> @@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
>      dst += (logo_y1 + 1) * dst_linesize;
>      src += (logo_y1 + 1) * src_linesize;
>  
> +    if (!uglarmtable) {
>      for (y = logo_y1+1; y < logo_y2; y++) {
>          left_sample = topleft[src_linesize*(y-logo_y1)]   +
>                        topleft[src_linesize*(y-logo_y1-1)] +
> @@ -151,12 +162,125 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
>          dst += dst_linesize;
>          src += src_linesize;
>      }
> +    } else {
> +        int bx, by;
> +        double interpd;
> +

> +        for (y = logo_y1 + 1; y < logo_y2; y++) {
> +            for (x = logo_x1 + 1,
> +                xdst = dst + logo_x1 + 1,
> +                xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++, xsrc++) {

I think a line break after the semicolons would make this easier to
understand.

Also, since x and y always used subtracted to logo_x1 or logo_y1, I
think changing the bounds of the loop would make the rest of the code
more readable:

	for (x = 1; x < logo_w_minus_one; x++)

> +
> +                if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 ||
> +                            x == logo_x1 + 1 || x == logo_x2 - 1)) {

Nit: the alignment is off.

> +                    *xdst = 0;
> +                    continue;
> +                }
> +
> +                interpd = 0;
> +

> +                for (bx = 0; bx < logo_w; bx++) {
> +                    interpd += topleft[bx] *
> +                        uglarmtable[abs(bx - (x - logo_x1)) + (y - logo_y1) * (logo_w - 1)];
> +                    interpd += botleft[bx] *
> +                        uglarmtable[abs(bx - (x - logo_x1)) + (logo_h - (y - logo_y1) - 1) * (logo_w - 1)];
> +                }

This can be optimized, and since it is the inner loop of the filter, I
think it is worth it. You can declare a pointer that will stay the same
for the whole y loop:

	double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1);

and then use it in that loop:

	interpd += ... * xweight[abs(bx - (x - logo_x1))];

It avoids a lot of multiplications.

> +
> +                for (by = 1; by < logo_h - 1; by++) {
> +                    interpd += topleft[by * src_linesize] *
> +                        uglarmtable[(x - logo_x1) + abs(by - (y - logo_y1)) * (logo_w - 1)];
> +                    interpd += topleft[by * src_linesize + (logo_w - 1)] *
> +                        uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by - (y - logo_y1)) * (logo_w - 1)];
> +                }

This one is more tricky to optimize, because of the abs() within the
multiplication, but it can be done by splitting the loop in two:

	for (by = 1; by < y; by++) {
	    interpd += ... * yweight[x - logo_x1];
	    yweight -= logo_w_minus_one;
	}
	for (; by < logo_h_minus_one; by++) {
	    interpd += ... * yweight[x - logo_x1];
	    yweight += logo_w_minus_one;
	}
	av_assert2(yweight == the_correct_value);

In fact, I think even the x loop would be a little more readable if it
was split in two like that.

> +
> +                interp = (uint64_t)(interpd /
> +                    uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]);

The cast to uint64_t seems suspicious. Can you explain?

> +                *xdst = interp;
> +            }
> +
> +            dst += dst_linesize;
> +            src += src_linesize;
> +        }
> +    }
> +}
> +
> +/**
> + * Calculate the lookup tables to be used in UGLARM interpolation mode.
> + *
> + * @param *uglarmtable      Pointer to table containing weights for each possible
> + *                          diagonal distance between a border pixel and an inner
> + *                          logo pixel.
> + * @param *uglarmweightsum  Pointer to a table containing the weight sum to divide
> + *                          by for each pixel within the logo area.
> + * @param sar               The sar to take into account when calculating lookup
> + *                          tables.
> + * @param logo_w            width of the logo
> + * @param logo_h            height of the logo
> + * @param exponent          exponent used in uglarm interpolation
> + */
> +static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum,
> +                               AVRational sar, int logo_w, int logo_h,
> +                               float exponent)
> +{

> +    double aspect = (double)sar.num / sar.den;

Tiny optimization:

	double aspect2 = aspect * aspect;

for use later.

> +    int x, y;


> +
> +    /* uglarmtable will contain a weight for each possible diagonal distance
> +     * between a border pixel and an inner logo pixel. The maximum distance in
> +     * each direction between border and an inner pixel can be logo_w - 1. The
> +     * weight of a border pixel which is x,y pixels away is stored at position
> +     * x + y * (logo_w - 1). */
> +    for (y = 0; y < logo_h - 1; y++)
> +        for (x = 0; x < logo_w - 1; x++) {
> +            if (x + y != 0) {

> +                double d = pow(sqrt(x * x * aspect * aspect + y * y), exponent);

	pow(sqrt(x * x * aspect2 + y * y), exponent / 2);

> +                uglarmtable[x + y * (logo_w - 1)] = 1.0 / d;
> +            } else {
> +                uglarmtable[x + y * (logo_w - 1)] = 1.0;
> +            }
> +        }
> +
> +    /* uglarmweightsum will contain the sum of all weights which is used when
> +     * an inner pixel of the logo at position x,y is calculated out of the
> +     * border pixels. The aggregated value has to be divided by that. The value
> +     * to use for the inner 1-based logo position x,y is stored at
> +     * (x - 1) + (y - 1) * (logo_w - 2). */
> +    for (y = 1; y < logo_h - 1; y++)
> +        for (x = 1; x < logo_w - 1; x++) {
> +            double weightsum = 0;
> +

> +            for (int bx = 0; bx < logo_w; bx++) {

Our coding style forbids declaring variables on the fly like that. Same
below.

> +                /* top border */
> +                weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)];
> +                /* bottom border */
> +                weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)];
> +            }
> +
> +            for (int by = 1; by < logo_h - 1; by++) {
> +                /* left border */
> +                weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)];
> +                /* right border */
> +                weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * (logo_w - 1)];
> +            }
> +
> +            uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum;
> +        }
>  }
>  
> +enum mode {
> +    MODE_XY,
> +    MODE_UGLARM,
> +    MODE_NB
> +};
> +

> +#define MAX_PLANES 10

You could use FF_ARRAY_ELEMS(AVPixFmtDescriptor.comp), and the
consistency would be guaranteed. Well, that syntax is not valid, but it
can be expressed:

#define MAX_PLANES FF_ARRAY_ELEMS(((AVPixFmtDescriptor *)0)->comp)

But I have a better suggestion:

#define MAX_SUB 2

	double uglarmtable[MAX_SUB + 1][MAX_SUB + 1]

and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the
table will be computed only once for Y and A and once for U and V.

The assert is still needed with that solution, though.

> +
>  typedef struct DelogoContext {
>      const AVClass *class;
> -    int x, y, w, h, band, show;
> -}  DelogoContext;
> +    int x, y, w, h, band, mode, show;
> +    float exponent;
> +    double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES];
> +} DelogoContext;
>  
>  #define OFFSET(x) offsetof(DelogoContext, x)
>  #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
> @@ -171,6 +295,10 @@ static const AVOption delogo_options[]= {
>      { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 =  0 },  0, INT_MAX, FLAGS },
>      { "t",    "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 =  0 },  0, INT_MAX, FLAGS },
>  #endif
> +    { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" },
> +        { "xy",     "use pixels in straight x and y direction",  OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY },     0, 0, FLAGS, "mode" },
> +        { "uglarm", "UGLARM mode, use full border",              OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" },
> +    { "exponent","exponent used for UGLARM interpolation", OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS },
>      { "show", "show delogo area",          OFFSET(show), AV_OPT_TYPE_BOOL,{ .i64 =  0 },  0, 1,       FLAGS },
>      { NULL }
>  };
> @@ -215,8 +343,8 @@ static av_cold int init(AVFilterContext *ctx)
>  #else
>      s->band = 1;
>  #endif
> -    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n",
> -           s->x, s->y, s->w, s->h, s->band, s->show);
> +    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d exponent:%f show:%d\n",
> +           s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, s->show);
>  
>      s->w += s->band*2;
>      s->h += s->band*2;
> @@ -226,6 +354,18 @@ static av_cold int init(AVFilterContext *ctx)
>      return 0;
>  }
>  
> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> +    DelogoContext *s = ctx->priv;
> +

> +    if (s->mode == MODE_UGLARM) {

No need to test, we can free anyway.

> +        for (int plane = 0; plane < MAX_PLANES; plane++) {

Forbidden variable declaration.

> +            av_freep(&s->uglarmtable[plane]);
> +            av_freep(&s->uglarmweightsum[plane]);
> +        }
> +    }
> +}
> +
>  static int config_input(AVFilterLink *inlink)
>  {
>      DelogoContext *s = inlink->dst->priv;
> @@ -270,20 +410,50 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
>      if (!sar.num)
>          sar.num = sar.den = 1;
>  
> +    if (s->mode == MODE_UGLARM)
> +        av_assert0(desc->nb_components <= MAX_PLANES);
> +
>      for (plane = 0; plane < desc->nb_components; plane++) {
>          int hsub = plane == 1 || plane == 2 ? hsub0 : 0;
>          int vsub = plane == 1 || plane == 2 ? vsub0 : 0;
>  

> +        /* Up and left borders were rounded down, inject lost bits
> +        * into width and height to avoid error accumulation */

Not: the alignment is off.

> +        int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub);
> +        int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub);
> +
> +        /* Init lookup tables once */
> +        if (s->mode == MODE_UGLARM) {
> +            if (!s->uglarmtable[plane]) {
> +                s->uglarmtable[plane] =

> +                    (double*)av_malloc((logo_w - 1) * (logo_h - 1) * sizeof(double));

No casting of malloc returns in C, this is a c++ thing. Also, use
av_malloc_array().

> +
> +                if (!s->uglarmtable[plane]) {
> +                    return AVERROR(ENOMEM);
> +                }

The braces could be dropped.

> +
> +                s->uglarmweightsum[plane] =
> +                    (double*)av_malloc((logo_w - 2) * (logo_h - 2) * sizeof(double));
> +
> +                if (!s->uglarmweightsum[plane]) {
> +                    return AVERROR(ENOMEM);
> +                }

Same as above, of course.

> +
> +                calc_uglarm_tables(s->uglarmtable[plane],
> +                                s->uglarmweightsum[plane],
> +                                sar, logo_w, logo_h, s->exponent);
> +            }
> +        }
> +

I think this whole block could and should be moved into config_input().

>          apply_delogo(out->data[plane], out->linesize[plane],
>                       in ->data[plane], in ->linesize[plane],
>                       AV_CEIL_RSHIFT(inlink->w, hsub),
>                       AV_CEIL_RSHIFT(inlink->h, vsub),
>                       sar, s->x>>hsub, s->y>>vsub,
> -                     /* Up and left borders were rounded down, inject lost bits
> -                      * into width and height to avoid error accumulation */
> -                     AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub),
> -                     AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub),
> +                     logo_w, logo_h,
>                       s->band>>FFMIN(hsub, vsub),
> +                     s->uglarmtable[plane],
> +                     s->uglarmweightsum[plane],
>                       s->show, direct);
>      }
>  
> @@ -317,6 +487,7 @@ AVFilter ff_vf_delogo = {
>      .priv_size     = sizeof(DelogoContext),
>      .priv_class    = &delogo_class,
>      .init          = init,
> +    .uninit        = uninit,
>      .query_formats = query_formats,
>      .inputs        = avfilter_vf_delogo_inputs,
>      .outputs       = avfilter_vf_delogo_outputs,

Regards,
Helmut K. C. Tessarek Dec. 30, 2018, 1:50 p.m. UTC | #2
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512



On 2018-12-30 08:02, Nicolas George wrote:
>> +                double d = pow(sqrt(x * x * aspect * aspect + y
>> * y), exponent);
> pow(sqrt(x * x * aspect2 + y * y), exponent / 2);
> 

There seems to be a mistake. You are taking the square root twice.

Or am I missing something here?

Cheers,
  K. C.

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEE191csiqpm8f5Ln9WvgmFNJ1E3QAFAlwozUAACgkQvgmFNJ1E
3QBvzw/+I6NXL7R+ugiDAmqUxUPwzvv/yJ+Vkbbp7Pl/XGApTol1Odee1P6QYs2t
e0Rs93w66+GjPZIMk7dWzl7EdX2GfUl+xPkdGzFOjLvw12nenEf8EoSXuld2Ggx8
b7algcvHeUp1orG3/c98TAjUe2zLwN2oEfHq59v5lShwK9KeThtNOsxQjz08iKfx
41SNqHKYtuz5wo2Ez7BKHZ0iBPrRr4EBJ2BPINbvAUPsELfhblUtGFFX0pumRAaJ
skLoPpaJuW3W1uK1/j0DpCNvHQzqrU0rM4hecBT8FaC0jQk6Kbi5RfS3u7xs8Gbq
snvPrRUF3UEpp1BILWWPOfG+pnytvOoKdK3j0cv74ImTYBpO7mniUKl9IZbmN1rK
2AbDI/1EH0Vgt44koqOL1hn9/zEZRbnfNrLO4rp9J/SH8+AX1a17Ts7XP/K/G88M
+pVq0u11B0/DnwrmnP11W+h6PT2ab6MNm+f57isu047j3ug9f56XvxZh56MDRzBg
RXuFqj/qu8p7tymdXfcQOzFQfQQp6pJearzd/pKImfyI/ay1cyT8XjnGKr+VC3NV
olT9lEQ13Qb7XBG/ASg6VlDuxyJqY4VRGNVx89/P3ApE406lNRpzgnK8G6XWvTQV
swN8f1qxwARqiimhN2iSPCCYDu0svt0DOwymSVUPXhDZNDGqYOQ=
=HEMW
-----END PGP SIGNATURE-----
Paul B Mahol Dec. 30, 2018, 1:53 p.m. UTC | #3
On 12/30/18, Nicolas George <george@nsup.org> wrote:
> Uwe Freese (2018-12-29):
>> The downside is that the term "power" (or "strength") would be more
>> generic
>> and would probably be better reusable for other modes.
>>
>> What do you think? Is "exponent" now ok, or should I change it back to
>> "power", "strength" or something alike?
>
> I would say: do not bother. This can be decided when a new mode is
> added, and options can have several names.
>
> By the way, you would not be interested in developing a logo-detection
> filter, per chance? Something that detects is a certain rectangle is
> likely to contain a logo and computes its bounding box.
>
>> Again, I would be glad about comments and reviews. - Let me know what
>> should
>> be changed and I'll create a new patch in some days.
>
> Here is a more in-depth review from me:
>
>> >From 83e79abb3311eb2dd92c63e8e15e0476af2f2891 Mon Sep 17 00:00:00 2001
>> From: breaker27 <mail@uwe-freese.de>
>> Date: Wed, 26 Dec 2018 18:16:48 +0100
>> Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode
>>
>> ---
>>  doc/filters.texi        |  28 +++++++
>>  libavfilter/vf_delogo.c | 189
>> +++++++++++++++++++++++++++++++++++++++++++++---
>>  2 files changed, 208 insertions(+), 9 deletions(-)
>>
>> diff --git a/doc/filters.texi b/doc/filters.texi
>> index 65ce25bc18..792560ad79 100644
>> --- a/doc/filters.texi
>> +++ b/doc/filters.texi
>> @@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the
>> rectangle (added to
>>  deprecated, setting higher values should no longer be necessary and
>>  is not recommended.
>>
>> +@item mode
>> +Specify the interpolation mode used to remove the logo.
>> +It accepts the following values:
>> +@table @option
>> +@item xy
>> +Only the border pixels in straight x and y direction are considered
>> +to replace any logo pixel. This mode tends to flicker and to show
>> +horizontal and vertical lines.
>> +@item uglarm
>> +Consider the whole border for every logo pixel to replace. This mode
>> +uses the distance raised to the power of the given @var{exponent} as
>> +weight that decides to which amount every border pixel is taken into
>> +account. This results in a more blurred area, which tends to be less
>> +distracting. The uglarm mode was first implemented in VirtualDub's
>> +LogoAway filter and is also known from ffdshow and is the
>> +abbreviation for "Uwe's Great LogoAway Remove Mode".
>> +@end table
>> +The default value is @code{xy}.
>> +
>> +@item exponent
>> +Specify the exponent used to calculate the weight out of the
>> +distance in @code{uglarm} mode (weight = distance ^ @var{exponent}).
>> +The value @code{0} results in a logo area which has
>> +the same average color everywhere. The higher the value, the more
>> +relevant a near border pixel will get, meaning that the borders of
>> +the logo area are more similar to the surrounding pixels. The default
>> +value is @code{3}.
>> +
>>  @item show
>>  When set to 1, a green rectangle is drawn on the screen to simplify
>>  finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters.
>> diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c
>> index 065d093641..dcb0366e63 100644
>> --- a/libavfilter/vf_delogo.c
>> +++ b/libavfilter/vf_delogo.c
>> @@ -2,6 +2,7 @@
>>   * Copyright (c) 2002 Jindrich Makovicka <makovick@gmail.com>
>>   * Copyright (c) 2011 Stefano Sabatini
>>   * Copyright (c) 2013, 2015 Jean Delvare <jdelvare@suse.com>
>> + * Copyright (c) 2018 Uwe Freese <mail@uwe-freese.de>
>>   *
>>   * This file is part of FFmpeg.
>>   *
>> @@ -25,12 +26,16 @@
>>   * A very simple tv station logo remover
>>   * Originally imported from MPlayer libmpcodecs/vf_delogo.c,
>>   * the algorithm was later improved.
>> + * The "UGLARM" mode was first implemented 2001 by Uwe Freese for
>> Virtual
>> + * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into
>> ffdshow's
>> + * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg.
>>   */
>>
>>  #include "libavutil/common.h"
>>  #include "libavutil/imgutils.h"
>>  #include "libavutil/opt.h"
>>  #include "libavutil/pixdesc.h"
>
>> +#include "libavutil/avassert.h"
>
> We try to keep the includes in alphabetical order.
>
>>  #include "avfilter.h"
>>  #include "formats.h"
>>  #include "internal.h"
>> @@ -50,6 +55,10 @@
>>   * @param logo_w width of the logo
>>   * @param logo_h height of the logo
>>   * @param band   the size of the band around the processed area
>> + * @param *uglarmtable      pointer to weight table in UGLARM
>> interpolation mode,
>> + *                          zero when x-y mode is used
>> + * @param *uglarmweightsum  pointer to weight sum table in UGLARM
>> interpolation mode,
>> + *                          zero when x-y mode is used
>>   * @param show   show a rectangle around the processed area, useful for
>>   *               parameters tweaking
>>   * @param direct if non-zero perform in-place processing
>> @@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int
>> dst_linesize,
>>                           uint8_t *src, int src_linesize,
>>                           int w, int h, AVRational sar,
>>                           int logo_x, int logo_y, int logo_w, int logo_h,
>> -                         unsigned int band, int show, int direct)
>> +                         unsigned int band, double *uglarmtable,
>> +                         double *uglarmweightsum, int show, int direct)
>>  {
>>      int x, y;
>>      uint64_t interp, weightl, weightr, weightt, weightb, weight;
>> @@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int
>> dst_linesize,
>>      dst += (logo_y1 + 1) * dst_linesize;
>>      src += (logo_y1 + 1) * src_linesize;
>>
>> +    if (!uglarmtable) {
>>      for (y = logo_y1+1; y < logo_y2; y++) {
>>          left_sample = topleft[src_linesize*(y-logo_y1)]   +
>>                        topleft[src_linesize*(y-logo_y1-1)] +
>> @@ -151,12 +162,125 @@ static void apply_delogo(uint8_t *dst, int
>> dst_linesize,
>>          dst += dst_linesize;
>>          src += src_linesize;
>>      }
>> +    } else {
>> +        int bx, by;
>> +        double interpd;
>> +
>
>> +        for (y = logo_y1 + 1; y < logo_y2; y++) {
>> +            for (x = logo_x1 + 1,
>> +                xdst = dst + logo_x1 + 1,
>> +                xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++,
>> xsrc++) {
>
> I think a line break after the semicolons would make this easier to
> understand.
>
> Also, since x and y always used subtracted to logo_x1 or logo_y1, I
> think changing the bounds of the loop would make the rest of the code
> more readable:
>
> 	for (x = 1; x < logo_w_minus_one; x++)
>
>> +
>> +                if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 ||
>> +                            x == logo_x1 + 1 || x == logo_x2 - 1)) {
>
> Nit: the alignment is off.
>
>> +                    *xdst = 0;
>> +                    continue;
>> +                }
>> +
>> +                interpd = 0;
>> +
>
>> +                for (bx = 0; bx < logo_w; bx++) {
>> +                    interpd += topleft[bx] *
>> +                        uglarmtable[abs(bx - (x - logo_x1)) + (y -
>> logo_y1) * (logo_w - 1)];
>> +                    interpd += botleft[bx] *
>> +                        uglarmtable[abs(bx - (x - logo_x1)) + (logo_h -
>> (y - logo_y1) - 1) * (logo_w - 1)];
>> +                }
>
> This can be optimized, and since it is the inner loop of the filter, I
> think it is worth it. You can declare a pointer that will stay the same
> for the whole y loop:
>
> 	double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1);
>
> and then use it in that loop:
>
> 	interpd += ... * xweight[abs(bx - (x - logo_x1))];
>
> It avoids a lot of multiplications.
>
>> +
>> +                for (by = 1; by < logo_h - 1; by++) {
>> +                    interpd += topleft[by * src_linesize] *
>> +                        uglarmtable[(x - logo_x1) + abs(by - (y -
>> logo_y1)) * (logo_w - 1)];
>> +                    interpd += topleft[by * src_linesize + (logo_w - 1)]
>> *
>> +                        uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by -
>> (y - logo_y1)) * (logo_w - 1)];
>> +                }
>
> This one is more tricky to optimize, because of the abs() within the
> multiplication, but it can be done by splitting the loop in two:
>
> 	for (by = 1; by < y; by++) {
> 	    interpd += ... * yweight[x - logo_x1];
> 	    yweight -= logo_w_minus_one;
> 	}
> 	for (; by < logo_h_minus_one; by++) {
> 	    interpd += ... * yweight[x - logo_x1];
> 	    yweight += logo_w_minus_one;
> 	}
> 	av_assert2(yweight == the_correct_value);
>
> In fact, I think even the x loop would be a little more readable if it
> was split in two like that.
>
>> +
>> +                interp = (uint64_t)(interpd /
>> +                    uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1)
>> * (logo_w - 2)]);
>
> The cast to uint64_t seems suspicious. Can you explain?
>
>> +                *xdst = interp;
>> +            }
>> +
>> +            dst += dst_linesize;
>> +            src += src_linesize;
>> +        }
>> +    }
>> +}
>> +
>> +/**
>> + * Calculate the lookup tables to be used in UGLARM interpolation mode.
>> + *
>> + * @param *uglarmtable      Pointer to table containing weights for each
>> possible
>> + *                          diagonal distance between a border pixel and
>> an inner
>> + *                          logo pixel.
>> + * @param *uglarmweightsum  Pointer to a table containing the weight sum
>> to divide
>> + *                          by for each pixel within the logo area.
>> + * @param sar               The sar to take into account when calculating
>> lookup
>> + *                          tables.
>> + * @param logo_w            width of the logo
>> + * @param logo_h            height of the logo
>> + * @param exponent          exponent used in uglarm interpolation
>> + */
>> +static void calc_uglarm_tables(double *uglarmtable, double
>> *uglarmweightsum,
>> +                               AVRational sar, int logo_w, int logo_h,
>> +                               float exponent)
>> +{
>
>> +    double aspect = (double)sar.num / sar.den;
>
> Tiny optimization:
>
> 	double aspect2 = aspect * aspect;
>
> for use later.
>
>> +    int x, y;
>
>
>> +
>> +    /* uglarmtable will contain a weight for each possible diagonal
>> distance
>> +     * between a border pixel and an inner logo pixel. The maximum
>> distance in
>> +     * each direction between border and an inner pixel can be logo_w -
>> 1. The
>> +     * weight of a border pixel which is x,y pixels away is stored at
>> position
>> +     * x + y * (logo_w - 1). */
>> +    for (y = 0; y < logo_h - 1; y++)
>> +        for (x = 0; x < logo_w - 1; x++) {
>> +            if (x + y != 0) {
>
>> +                double d = pow(sqrt(x * x * aspect * aspect + y * y),
>> exponent);
>
> 	pow(sqrt(x * x * aspect2 + y * y), exponent / 2);
>
>> +                uglarmtable[x + y * (logo_w - 1)] = 1.0 / d;
>> +            } else {
>> +                uglarmtable[x + y * (logo_w - 1)] = 1.0;
>> +            }
>> +        }
>> +
>> +    /* uglarmweightsum will contain the sum of all weights which is used
>> when
>> +     * an inner pixel of the logo at position x,y is calculated out of
>> the
>> +     * border pixels. The aggregated value has to be divided by that. The
>> value
>> +     * to use for the inner 1-based logo position x,y is stored at
>> +     * (x - 1) + (y - 1) * (logo_w - 2). */
>> +    for (y = 1; y < logo_h - 1; y++)
>> +        for (x = 1; x < logo_w - 1; x++) {
>> +            double weightsum = 0;
>> +
>
>> +            for (int bx = 0; bx < logo_w; bx++) {
>
> Our coding style forbids declaring variables on the fly like that. Same
> below.

This is simply not more true.
Look at developer guidelines for this project.

>
>> +                /* top border */
>> +                weightsum += uglarmtable[abs(bx - x) + y * (logo_w -
>> 1)];
>> +                /* bottom border */
>> +                weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) *
>> (logo_w - 1)];
>> +            }
>> +
>> +            for (int by = 1; by < logo_h - 1; by++) {
>> +                /* left border */
>> +                weightsum += uglarmtable[x + abs(by - y) * (logo_w -
>> 1)];
>> +                /* right border */
>> +                weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) *
>> (logo_w - 1)];
>> +            }
>> +
>> +            uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] =
>> weightsum;
>> +        }
>>  }
>>
>> +enum mode {
>> +    MODE_XY,
>> +    MODE_UGLARM,
>> +    MODE_NB
>> +};
>> +
>
>> +#define MAX_PLANES 10
>
> You could use FF_ARRAY_ELEMS(AVPixFmtDescriptor.comp), and the
> consistency would be guaranteed. Well, that syntax is not valid, but it
> can be expressed:
>
> #define MAX_PLANES FF_ARRAY_ELEMS(((AVPixFmtDescriptor *)0)->comp)
>
> But I have a better suggestion:
>
> #define MAX_SUB 2
>
> 	double uglarmtable[MAX_SUB + 1][MAX_SUB + 1]
>
> and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the
> table will be computed only once for Y and A and once for U and V.
>
> The assert is still needed with that solution, though.
>
>> +
>>  typedef struct DelogoContext {
>>      const AVClass *class;
>> -    int x, y, w, h, band, show;
>> -}  DelogoContext;
>> +    int x, y, w, h, band, mode, show;
>> +    float exponent;
>> +    double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES];
>> +} DelogoContext;
>>
>>  #define OFFSET(x) offsetof(DelogoContext, x)
>>  #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
>> @@ -171,6 +295,10 @@ static const AVOption delogo_options[]= {
>>      { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT,
>> { .i64 =  0 },  0, INT_MAX, FLAGS },
>>      { "t",    "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT,
>> { .i64 =  0 },  0, INT_MAX, FLAGS },
>>  #endif
>> +    { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT,
>> { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" },
>> +        { "xy",     "use pixels in straight x and y direction",
>> OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY },     0, 0, FLAGS,
>> "mode" },
>> +        { "uglarm", "UGLARM mode, use full border",
>> OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS,
>> "mode" },
>> +    { "exponent","exponent used for UGLARM interpolation",
>> OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS },
>>      { "show", "show delogo area",          OFFSET(show),
>> AV_OPT_TYPE_BOOL,{ .i64 =  0 },  0, 1,       FLAGS },
>>      { NULL }
>>  };
>> @@ -215,8 +343,8 @@ static av_cold int init(AVFilterContext *ctx)
>>  #else
>>      s->band = 1;
>>  #endif
>> -    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d
>> show:%d\n",
>> -           s->x, s->y, s->w, s->h, s->band, s->show);
>> +    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d
>> exponent:%f show:%d\n",
>> +           s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent,
>> s->show);
>>
>>      s->w += s->band*2;
>>      s->h += s->band*2;
>> @@ -226,6 +354,18 @@ static av_cold int init(AVFilterContext *ctx)
>>      return 0;
>>  }
>>
>> +static av_cold void uninit(AVFilterContext *ctx)
>> +{
>> +    DelogoContext *s = ctx->priv;
>> +
>
>> +    if (s->mode == MODE_UGLARM) {
>
> No need to test, we can free anyway.
>
>> +        for (int plane = 0; plane < MAX_PLANES; plane++) {
>
> Forbidden variable declaration.
>
>> +            av_freep(&s->uglarmtable[plane]);
>> +            av_freep(&s->uglarmweightsum[plane]);
>> +        }
>> +    }
>> +}
>> +
>>  static int config_input(AVFilterLink *inlink)
>>  {
>>      DelogoContext *s = inlink->dst->priv;
>> @@ -270,20 +410,50 @@ static int filter_frame(AVFilterLink *inlink,
>> AVFrame *in)
>>      if (!sar.num)
>>          sar.num = sar.den = 1;
>>
>> +    if (s->mode == MODE_UGLARM)
>> +        av_assert0(desc->nb_components <= MAX_PLANES);
>> +
>>      for (plane = 0; plane < desc->nb_components; plane++) {
>>          int hsub = plane == 1 || plane == 2 ? hsub0 : 0;
>>          int vsub = plane == 1 || plane == 2 ? vsub0 : 0;
>>
>
>> +        /* Up and left borders were rounded down, inject lost bits
>> +        * into width and height to avoid error accumulation */
>
> Not: the alignment is off.
>
>> +        int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)),
>> hsub);
>> +        int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)),
>> vsub);
>> +
>> +        /* Init lookup tables once */
>> +        if (s->mode == MODE_UGLARM) {
>> +            if (!s->uglarmtable[plane]) {
>> +                s->uglarmtable[plane] =
>
>> +                    (double*)av_malloc((logo_w - 1) * (logo_h - 1) *
>> sizeof(double));
>
> No casting of malloc returns in C, this is a c++ thing. Also, use
> av_malloc_array().
>
>> +
>> +                if (!s->uglarmtable[plane]) {
>> +                    return AVERROR(ENOMEM);
>> +                }
>
> The braces could be dropped.
>
>> +
>> +                s->uglarmweightsum[plane] =
>> +                    (double*)av_malloc((logo_w - 2) * (logo_h - 2) *
>> sizeof(double));
>> +
>> +                if (!s->uglarmweightsum[plane]) {
>> +                    return AVERROR(ENOMEM);
>> +                }
>
> Same as above, of course.
>
>> +
>> +                calc_uglarm_tables(s->uglarmtable[plane],
>> +                                s->uglarmweightsum[plane],
>> +                                sar, logo_w, logo_h, s->exponent);
>> +            }
>> +        }
>> +
>
> I think this whole block could and should be moved into config_input().
>
>>          apply_delogo(out->data[plane], out->linesize[plane],
>>                       in ->data[plane], in ->linesize[plane],
>>                       AV_CEIL_RSHIFT(inlink->w, hsub),
>>                       AV_CEIL_RSHIFT(inlink->h, vsub),
>>                       sar, s->x>>hsub, s->y>>vsub,
>> -                     /* Up and left borders were rounded down, inject
>> lost bits
>> -                      * into width and height to avoid error accumulation
>> */
>> -                     AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)),
>> hsub),
>> -                     AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)),
>> vsub),
>> +                     logo_w, logo_h,
>>                       s->band>>FFMIN(hsub, vsub),
>> +                     s->uglarmtable[plane],
>> +                     s->uglarmweightsum[plane],
>>                       s->show, direct);
>>      }
>>
>> @@ -317,6 +487,7 @@ AVFilter ff_vf_delogo = {
>>      .priv_size     = sizeof(DelogoContext),
>>      .priv_class    = &delogo_class,
>>      .init          = init,
>> +    .uninit        = uninit,
>>      .query_formats = query_formats,
>>      .inputs        = avfilter_vf_delogo_inputs,
>>      .outputs       = avfilter_vf_delogo_outputs,
>
> Regards,
>
> --
>   Nicolas George
>
Nicolas George Dec. 30, 2018, 1:56 p.m. UTC | #4
Helmut K. C. Tessarek (2018-12-30):
> > pow(sqrt(x * x * aspect2 + y * y), exponent / 2);

> There seems to be a mistake. You are taking the square root twice.
> 
> Or am I missing something here?

Thanks, I wanted to remove the sqrt() and replace it with the /2, not
just add the /2.

Regards,
Carl Eugen Hoyos Dec. 30, 2018, 6:47 p.m. UTC | #5
2018-12-30 14:02 GMT+01:00, Nicolas George <george@nsup.org>:
> Uwe Freese (2018-12-29):

>> +                    (double*)av_malloc((logo_w - 1) * (logo_h - 1) *
>> sizeof(double));
>
> No casting of malloc returns in C, this is a c++ thing. Also, use
> av_malloc_array().

While there, please don't use sizeof(type) but sizeof(variable),
same below.

Carl Eugen
Uwe Freese Jan. 3, 2019, 8:03 p.m. UTC | #6
Hello,

> By the way, you would not be interested in developing a logo-detection
> filter, per chance? Something that detects is a certain rectangle is
> likely to contain a logo and computes its bounding box.

Currently, I don't need it so much, but nonetheless I though a bit about it:

1)

Is there an "average" filter (or some filter that can be used like 
this)? I mean, a filter which creates the average of every one pixel for 
the whole video? (The last picture would then be the average of the 
whole video.)

I assume that the average value is "some grey", with some noise in it. 
The logo should be visible.

Then use "edgedetect" and we have the logo. Afterwards, some "eq" to 
make everything else black. Then "cropdetect". :)

Could this work? I would play a bit with this, but don't know how to 
calculate the "average" easy. Any ideas?

2)

If something like this would work, is it intended to reuse existing 
ffmpeg filters in other ffmpeg filters? Do other filters combine logic 
from existing filters (making them available with one filter string in 
the filter graph)?

3)

Playing around with edgedetection a bit (I wanted to mask edges and use 
a denoiser only in the remaining areas), I saw there seems no filter to 
convert a grey-level picture into pure black and white and no "and", 
"or", "invert" etc. filters, right? Or didn't I find them?

4)

What should be the result of the filter? A text file with coordinates of 
the most likely position, or a list of positions and likelihood in 
percentage? I mean, the user has to check manually afterwards if the 
detection worked and call ffmpeg again with removing the logo at the 
given coordinates, right?

Regards, Uwe
diff mbox

Patch

From 83e79abb3311eb2dd92c63e8e15e0476af2f2891 Mon Sep 17 00:00:00 2001
From: breaker27 <mail@uwe-freese.de>
Date: Wed, 26 Dec 2018 18:16:48 +0100
Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode

---
 doc/filters.texi        |  28 +++++++
 libavfilter/vf_delogo.c | 189 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 208 insertions(+), 9 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 65ce25bc18..792560ad79 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -7952,6 +7952,34 @@  Specify the thickness of the fuzzy edge of the rectangle (added to
 deprecated, setting higher values should no longer be necessary and
 is not recommended.
 
+@item mode
+Specify the interpolation mode used to remove the logo.
+It accepts the following values:
+@table @option
+@item xy
+Only the border pixels in straight x and y direction are considered
+to replace any logo pixel. This mode tends to flicker and to show
+horizontal and vertical lines.
+@item uglarm
+Consider the whole border for every logo pixel to replace. This mode
+uses the distance raised to the power of the given @var{exponent} as
+weight that decides to which amount every border pixel is taken into
+account. This results in a more blurred area, which tends to be less
+distracting. The uglarm mode was first implemented in VirtualDub's
+LogoAway filter and is also known from ffdshow and is the
+abbreviation for "Uwe's Great LogoAway Remove Mode".
+@end table
+The default value is @code{xy}.
+
+@item exponent
+Specify the exponent used to calculate the weight out of the
+distance in @code{uglarm} mode (weight = distance ^ @var{exponent}).
+The value @code{0} results in a logo area which has
+the same average color everywhere. The higher the value, the more
+relevant a near border pixel will get, meaning that the borders of
+the logo area are more similar to the surrounding pixels. The default
+value is @code{3}.
+
 @item show
 When set to 1, a green rectangle is drawn on the screen to simplify
 finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters.
diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c
index 065d093641..dcb0366e63 100644
--- a/libavfilter/vf_delogo.c
+++ b/libavfilter/vf_delogo.c
@@ -2,6 +2,7 @@ 
  * Copyright (c) 2002 Jindrich Makovicka <makovick@gmail.com>
  * Copyright (c) 2011 Stefano Sabatini
  * Copyright (c) 2013, 2015 Jean Delvare <jdelvare@suse.com>
+ * Copyright (c) 2018 Uwe Freese <mail@uwe-freese.de>
  *
  * This file is part of FFmpeg.
  *
@@ -25,12 +26,16 @@ 
  * A very simple tv station logo remover
  * Originally imported from MPlayer libmpcodecs/vf_delogo.c,
  * the algorithm was later improved.
+ * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual
+ * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's
+ * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg.
  */
 
 #include "libavutil/common.h"
 #include "libavutil/imgutils.h"
 #include "libavutil/opt.h"
 #include "libavutil/pixdesc.h"
+#include "libavutil/avassert.h"
 #include "avfilter.h"
 #include "formats.h"
 #include "internal.h"
@@ -50,6 +55,10 @@ 
  * @param logo_w width of the logo
  * @param logo_h height of the logo
  * @param band   the size of the band around the processed area
+ * @param *uglarmtable      pointer to weight table in UGLARM interpolation mode,
+ *                          zero when x-y mode is used
+ * @param *uglarmweightsum  pointer to weight sum table in UGLARM interpolation mode,
+ *                          zero when x-y mode is used
  * @param show   show a rectangle around the processed area, useful for
  *               parameters tweaking
  * @param direct if non-zero perform in-place processing
@@ -58,7 +67,8 @@  static void apply_delogo(uint8_t *dst, int dst_linesize,
                          uint8_t *src, int src_linesize,
                          int w, int h, AVRational sar,
                          int logo_x, int logo_y, int logo_w, int logo_h,
-                         unsigned int band, int show, int direct)
+                         unsigned int band, double *uglarmtable,
+                         double *uglarmweightsum, int show, int direct)
 {
     int x, y;
     uint64_t interp, weightl, weightr, weightt, weightb, weight;
@@ -89,6 +99,7 @@  static void apply_delogo(uint8_t *dst, int dst_linesize,
     dst += (logo_y1 + 1) * dst_linesize;
     src += (logo_y1 + 1) * src_linesize;
 
+    if (!uglarmtable) {
     for (y = logo_y1+1; y < logo_y2; y++) {
         left_sample = topleft[src_linesize*(y-logo_y1)]   +
                       topleft[src_linesize*(y-logo_y1-1)] +
@@ -151,12 +162,125 @@  static void apply_delogo(uint8_t *dst, int dst_linesize,
         dst += dst_linesize;
         src += src_linesize;
     }
+    } else {
+        int bx, by;
+        double interpd;
+
+        for (y = logo_y1 + 1; y < logo_y2; y++) {
+            for (x = logo_x1 + 1,
+                xdst = dst + logo_x1 + 1,
+                xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++, xsrc++) {
+
+                if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 ||
+                            x == logo_x1 + 1 || x == logo_x2 - 1)) {
+                    *xdst = 0;
+                    continue;
+                }
+
+                interpd = 0;
+
+                for (bx = 0; bx < logo_w; bx++) {
+                    interpd += topleft[bx] *
+                        uglarmtable[abs(bx - (x - logo_x1)) + (y - logo_y1) * (logo_w - 1)];
+                    interpd += botleft[bx] *
+                        uglarmtable[abs(bx - (x - logo_x1)) + (logo_h - (y - logo_y1) - 1) * (logo_w - 1)];
+                }
+
+                for (by = 1; by < logo_h - 1; by++) {
+                    interpd += topleft[by * src_linesize] *
+                        uglarmtable[(x - logo_x1) + abs(by - (y - logo_y1)) * (logo_w - 1)];
+                    interpd += topleft[by * src_linesize + (logo_w - 1)] *
+                        uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by - (y - logo_y1)) * (logo_w - 1)];
+                }
+
+                interp = (uint64_t)(interpd /
+                    uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]);
+                *xdst = interp;
+            }
+
+            dst += dst_linesize;
+            src += src_linesize;
+        }
+    }
+}
+
+/**
+ * Calculate the lookup tables to be used in UGLARM interpolation mode.
+ *
+ * @param *uglarmtable      Pointer to table containing weights for each possible
+ *                          diagonal distance between a border pixel and an inner
+ *                          logo pixel.
+ * @param *uglarmweightsum  Pointer to a table containing the weight sum to divide
+ *                          by for each pixel within the logo area.
+ * @param sar               The sar to take into account when calculating lookup
+ *                          tables.
+ * @param logo_w            width of the logo
+ * @param logo_h            height of the logo
+ * @param exponent          exponent used in uglarm interpolation
+ */
+static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum,
+                               AVRational sar, int logo_w, int logo_h,
+                               float exponent)
+{
+    double aspect = (double)sar.num / sar.den;
+    int x, y;
+
+    /* uglarmtable will contain a weight for each possible diagonal distance
+     * between a border pixel and an inner logo pixel. The maximum distance in
+     * each direction between border and an inner pixel can be logo_w - 1. The
+     * weight of a border pixel which is x,y pixels away is stored at position
+     * x + y * (logo_w - 1). */
+    for (y = 0; y < logo_h - 1; y++)
+        for (x = 0; x < logo_w - 1; x++) {
+            if (x + y != 0) {
+                double d = pow(sqrt(x * x * aspect * aspect + y * y), exponent);
+                uglarmtable[x + y * (logo_w - 1)] = 1.0 / d;
+            } else {
+                uglarmtable[x + y * (logo_w - 1)] = 1.0;
+            }
+        }
+
+    /* uglarmweightsum will contain the sum of all weights which is used when
+     * an inner pixel of the logo at position x,y is calculated out of the
+     * border pixels. The aggregated value has to be divided by that. The value
+     * to use for the inner 1-based logo position x,y is stored at
+     * (x - 1) + (y - 1) * (logo_w - 2). */
+    for (y = 1; y < logo_h - 1; y++)
+        for (x = 1; x < logo_w - 1; x++) {
+            double weightsum = 0;
+
+            for (int bx = 0; bx < logo_w; bx++) {
+                /* top border */
+                weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)];
+                /* bottom border */
+                weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)];
+            }
+
+            for (int by = 1; by < logo_h - 1; by++) {
+                /* left border */
+                weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)];
+                /* right border */
+                weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * (logo_w - 1)];
+            }
+
+            uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum;
+        }
 }
 
+enum mode {
+    MODE_XY,
+    MODE_UGLARM,
+    MODE_NB
+};
+
+#define MAX_PLANES 10
+
 typedef struct DelogoContext {
     const AVClass *class;
-    int x, y, w, h, band, show;
-}  DelogoContext;
+    int x, y, w, h, band, mode, show;
+    float exponent;
+    double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES];
+} DelogoContext;
 
 #define OFFSET(x) offsetof(DelogoContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
@@ -171,6 +295,10 @@  static const AVOption delogo_options[]= {
     { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 =  0 },  0, INT_MAX, FLAGS },
     { "t",    "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 =  0 },  0, INT_MAX, FLAGS },
 #endif
+    { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" },
+        { "xy",     "use pixels in straight x and y direction",  OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY },     0, 0, FLAGS, "mode" },
+        { "uglarm", "UGLARM mode, use full border",              OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" },
+    { "exponent","exponent used for UGLARM interpolation", OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS },
     { "show", "show delogo area",          OFFSET(show), AV_OPT_TYPE_BOOL,{ .i64 =  0 },  0, 1,       FLAGS },
     { NULL }
 };
@@ -215,8 +343,8 @@  static av_cold int init(AVFilterContext *ctx)
 #else
     s->band = 1;
 #endif
-    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n",
-           s->x, s->y, s->w, s->h, s->band, s->show);
+    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d exponent:%f show:%d\n",
+           s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, s->show);
 
     s->w += s->band*2;
     s->h += s->band*2;
@@ -226,6 +354,18 @@  static av_cold int init(AVFilterContext *ctx)
     return 0;
 }
 
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    DelogoContext *s = ctx->priv;
+
+    if (s->mode == MODE_UGLARM) {
+        for (int plane = 0; plane < MAX_PLANES; plane++) {
+            av_freep(&s->uglarmtable[plane]);
+            av_freep(&s->uglarmweightsum[plane]);
+        }
+    }
+}
+
 static int config_input(AVFilterLink *inlink)
 {
     DelogoContext *s = inlink->dst->priv;
@@ -270,20 +410,50 @@  static int filter_frame(AVFilterLink *inlink, AVFrame *in)
     if (!sar.num)
         sar.num = sar.den = 1;
 
+    if (s->mode == MODE_UGLARM)
+        av_assert0(desc->nb_components <= MAX_PLANES);
+
     for (plane = 0; plane < desc->nb_components; plane++) {
         int hsub = plane == 1 || plane == 2 ? hsub0 : 0;
         int vsub = plane == 1 || plane == 2 ? vsub0 : 0;
 
+        /* Up and left borders were rounded down, inject lost bits
+        * into width and height to avoid error accumulation */
+        int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub);
+        int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub);
+
+        /* Init lookup tables once */
+        if (s->mode == MODE_UGLARM) {
+            if (!s->uglarmtable[plane]) {
+                s->uglarmtable[plane] =
+                    (double*)av_malloc((logo_w - 1) * (logo_h - 1) * sizeof(double));
+
+                if (!s->uglarmtable[plane]) {
+                    return AVERROR(ENOMEM);
+                }
+
+                s->uglarmweightsum[plane] =
+                    (double*)av_malloc((logo_w - 2) * (logo_h - 2) * sizeof(double));
+
+                if (!s->uglarmweightsum[plane]) {
+                    return AVERROR(ENOMEM);
+                }
+
+                calc_uglarm_tables(s->uglarmtable[plane],
+                                s->uglarmweightsum[plane],
+                                sar, logo_w, logo_h, s->exponent);
+            }
+        }
+
         apply_delogo(out->data[plane], out->linesize[plane],
                      in ->data[plane], in ->linesize[plane],
                      AV_CEIL_RSHIFT(inlink->w, hsub),
                      AV_CEIL_RSHIFT(inlink->h, vsub),
                      sar, s->x>>hsub, s->y>>vsub,
-                     /* Up and left borders were rounded down, inject lost bits
-                      * into width and height to avoid error accumulation */
-                     AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub),
-                     AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub),
+                     logo_w, logo_h,
                      s->band>>FFMIN(hsub, vsub),
+                     s->uglarmtable[plane],
+                     s->uglarmweightsum[plane],
                      s->show, direct);
     }
 
@@ -317,6 +487,7 @@  AVFilter ff_vf_delogo = {
     .priv_size     = sizeof(DelogoContext),
     .priv_class    = &delogo_class,
     .init          = init,
+    .uninit        = uninit,
     .query_formats = query_formats,
     .inputs        = avfilter_vf_delogo_inputs,
     .outputs       = avfilter_vf_delogo_outputs,
-- 
2.11.0