[FFmpeg-devel] delogo filter: new "uglarm" interpolation mode added

Submitted by Uwe Freese on Jan. 2, 2019, 10:34 p.m.

Details

Message ID 3cb20460-412f-0154-5c60-a916351ca480@gmx.de
State New
Headers show

Commit Message

Uwe Freese Jan. 2, 2019, 10:34 p.m.
Hello,

Here's a new version of the patch. Changes:


- My last version didn't compile because of moving code to config_input. 
Don't know why I didn't see this, sorry.
   I moved the code back to filter_frame because of two reasons. First, 
I need the "sar" to init my tables and it seems I need the "AVFrame *in" 
parameter for that. Secondly, I also need *desc, hsub0, vsub0, plane, 
logo_w, logo_h which means much duplicated code or creation of functions.

- Corrected use of sizeof in av_malloc_array

- changed calculation of the distance regarding exponent, avoid sqrt, 
use "aspect2" variable

- use double *uglarmtable[MAX_SUB + 1][MAX_SUB + 1], 
*uglarmweightsum[MAX_SUB + 1][MAX_SUB + 1]; (instead per 1-dimensional 
array for the planes)

- unconditional av_freep in uninit

- used the alternative loops as suggested by Nicolas (thanks)


Remaining points / answers / questions:

Am 02.01.19 um 16:25 schrieb Nicolas George:
>>>> +                interp = (uint64_t)(interpd /
>>>> +                    uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]);
>>> The cast to uint64_t seems suspicious. Can you explain?
>> Every pixel value of the inner logo area is an integer. Only for the
>> calculation of the weighted average, all values use floats. At the end, it
>> is rounded (truncated) to an int.
> int and uint64_t are not the same thing. Why uint64_t?

"interp" was defined as uint64_t in the original code.

Do you see a problem here? Then let us know.

>>> 	pow(x * x * aspect2 + y * y, exponent / 2);
>> Hm. Again, I'm unsure if this "optimization" is worth it. I wouldn't do this
>> normally.
> In this particular instance, definitely yes. Compilers have very little
> latitude to optimize floating point operations, and they will certainly
> not optimize mathematical functions based on subtle arithmetic
> properties. This is a C compiler, not a TI-89.

We have obviously distinct opinions here. I would leave the code because 
it's easier to understand (as written), and it runs once, taking maybe 
some microseconds more for a usual conversion of video taking seconds to 
hours.

But I have no problem changing it. - Done.


Question to "aspect2": is ^2 not good (slow) or something, or why not 
directly use

double aspect2 = ((double)sar.num / sar.den) ^ 2;


> [...]
>      av_assert2(table_t == uglarmtable + (logo_w - x) + table_stride * y);
>      av_assert2(table_b == uglarmtable + (logo_w - x) + table_stride * (logo_h - y - 1));
>      av_assert2(table_l == uglarmtable + table_stride * (logo_h - y - 1) + x);
>      av_assert2(table_r == uglarmtable + table_stride * (logo_h - y - 1) + logo_w - x - 1;
>
> That makes more lines, but the lines are way simpler: no tricky
> arithmetic, all blocks almost identical, with the changes easy to see
> and understand.

I took over these alternative loops for the calculations.

Question to the assert: Is this useful (compared to the running time)? I 
mean, basically it is the same expression as over the loops, only x and 
y are different by the amount the loops are counting them up. I wouldn't 
say that it is probable that one makes an error coding the loop counter 
or that it doesn't somehow function.

Is there another thing which this assert checks that I didn't see?

Regards,
Uwe

Comments

Uwe Freese Jan. 10, 2019, 7:52 p.m.
Hello,

just a kind reminder. - What is the next step, is there anything more I 
should improve, check, modify,...?

For me, the filter works as intended.

@Nicolas: Can you answer my remaining three questions below, please?

Regards,
Uwe

Am 02.01.19 um 23:34 schrieb Uwe Freese:
> Hello,
> 
> Here's a new version of the patch. Changes:
> 
> 
> - My last version didn't compile because of moving code to config_input. 
> Don't know why I didn't see this, sorry.
>    I moved the code back to filter_frame because of two reasons. First, 
> I need the "sar" to init my tables and it seems I need the "AVFrame *in" 
> parameter for that. Secondly, I also need *desc, hsub0, vsub0, plane, 
> logo_w, logo_h which means much duplicated code or creation of functions.
> 
> - Corrected use of sizeof in av_malloc_array
> 
> - changed calculation of the distance regarding exponent, avoid sqrt, 
> use "aspect2" variable
> 
> - use double *uglarmtable[MAX_SUB + 1][MAX_SUB + 1], 
> *uglarmweightsum[MAX_SUB + 1][MAX_SUB + 1]; (instead per 1-dimensional 
> array for the planes)
> 
> - unconditional av_freep in uninit
> 
> - used the alternative loops as suggested by Nicolas (thanks)
> 
> 
> Remaining points / answers / questions:
> 
> Am 02.01.19 um 16:25 schrieb Nicolas George:
>>>>> +                interp = (uint64_t)(interpd /
>>>>> +                    uglarmweightsum[(x - logo_x1) - 1 + (y - 
>>>>> logo_y1 - 1) * (logo_w - 2)]);
>>>> The cast to uint64_t seems suspicious. Can you explain?
>>> Every pixel value of the inner logo area is an integer. Only for the
>>> calculation of the weighted average, all values use floats. At the 
>>> end, it
>>> is rounded (truncated) to an int.
>> int and uint64_t are not the same thing. Why uint64_t?
> 
> "interp" was defined as uint64_t in the original code.
> 
> Do you see a problem here? Then let us know.
> 
>>>>     pow(x * x * aspect2 + y * y, exponent / 2);
>>> Hm. Again, I'm unsure if this "optimization" is worth it. I wouldn't 
>>> do this
>>> normally.
>> In this particular instance, definitely yes. Compilers have very little
>> latitude to optimize floating point operations, and they will certainly
>> not optimize mathematical functions based on subtle arithmetic
>> properties. This is a C compiler, not a TI-89.
> 
> We have obviously distinct opinions here. I would leave the code because 
> it's easier to understand (as written), and it runs once, taking maybe 
> some microseconds more for a usual conversion of video taking seconds to 
> hours.
> 
> But I have no problem changing it. - Done.
> 
> 
> Question to "aspect2": is ^2 not good (slow) or something, or why not 
> directly use
> 
> double aspect2 = ((double)sar.num / sar.den) ^ 2;
> 
> 
>> [...]
>>      av_assert2(table_t == uglarmtable + (logo_w - x) + table_stride * 
>> y);
>>      av_assert2(table_b == uglarmtable + (logo_w - x) + table_stride * 
>> (logo_h - y - 1));
>>      av_assert2(table_l == uglarmtable + table_stride * (logo_h - y - 
>> 1) + x);
>>      av_assert2(table_r == uglarmtable + table_stride * (logo_h - y - 
>> 1) + logo_w - x - 1;
>>
>> That makes more lines, but the lines are way simpler: no tricky
>> arithmetic, all blocks almost identical, with the changes easy to see
>> and understand.
> 
> I took over these alternative loops for the calculations.
> 
> Question to the assert: Is this useful (compared to the running time)? I 
> mean, basically it is the same expression as over the loops, only x and 
> y are different by the amount the loops are counting them up. I wouldn't 
> say that it is probable that one makes an error coding the loop counter 
> or that it doesn't somehow function.
> 
> Is there another thing which this assert checks that I didn't see?
> 
> Regards,
> Uwe
> 
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>

Patch hide | download patch | download mbox

From 7b221a27b29c7376672d4ee60092051f6b65c978 Mon Sep 17 00:00:00 2001
From: breaker27 <mail@uwe-freese.de>
Date: Wed, 26 Dec 2018 18:16:48 +0100
Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode

---
 doc/filters.texi        |  28 +++++++
 libavfilter/vf_delogo.c | 214 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 233 insertions(+), 9 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 65ce25bc18..792560ad79 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -7952,6 +7952,34 @@  Specify the thickness of the fuzzy edge of the rectangle (added to
 deprecated, setting higher values should no longer be necessary and
 is not recommended.
 
+@item mode
+Specify the interpolation mode used to remove the logo.
+It accepts the following values:
+@table @option
+@item xy
+Only the border pixels in straight x and y direction are considered
+to replace any logo pixel. This mode tends to flicker and to show
+horizontal and vertical lines.
+@item uglarm
+Consider the whole border for every logo pixel to replace. This mode
+uses the distance raised to the power of the given @var{exponent} as
+weight that decides to which amount every border pixel is taken into
+account. This results in a more blurred area, which tends to be less
+distracting. The uglarm mode was first implemented in VirtualDub's
+LogoAway filter and is also known from ffdshow and is the
+abbreviation for "Uwe's Great LogoAway Remove Mode".
+@end table
+The default value is @code{xy}.
+
+@item exponent
+Specify the exponent used to calculate the weight out of the
+distance in @code{uglarm} mode (weight = distance ^ @var{exponent}).
+The value @code{0} results in a logo area which has
+the same average color everywhere. The higher the value, the more
+relevant a near border pixel will get, meaning that the borders of
+the logo area are more similar to the surrounding pixels. The default
+value is @code{3}.
+
 @item show
 When set to 1, a green rectangle is drawn on the screen to simplify
 finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters.
diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c
index 065d093641..f489c6a586 100644
--- a/libavfilter/vf_delogo.c
+++ b/libavfilter/vf_delogo.c
@@ -2,6 +2,7 @@ 
  * Copyright (c) 2002 Jindrich Makovicka <makovick@gmail.com>
  * Copyright (c) 2011 Stefano Sabatini
  * Copyright (c) 2013, 2015 Jean Delvare <jdelvare@suse.com>
+ * Copyright (c) 2018, 2019 Uwe Freese <mail@uwe-freese.de>
  *
  * This file is part of FFmpeg.
  *
@@ -25,8 +26,12 @@ 
  * A very simple tv station logo remover
  * Originally imported from MPlayer libmpcodecs/vf_delogo.c,
  * the algorithm was later improved.
+ * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual
+ * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's
+ * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg.
  */
 
+#include "libavutil/avassert.h"
 #include "libavutil/common.h"
 #include "libavutil/imgutils.h"
 #include "libavutil/opt.h"
@@ -50,6 +55,10 @@ 
  * @param logo_w width of the logo
  * @param logo_h height of the logo
  * @param band   the size of the band around the processed area
+ * @param *uglarmtable      pointer to weight table in UGLARM interpolation mode,
+ *                          zero when x-y mode is used
+ * @param *uglarmweightsum  pointer to weight sum table in UGLARM interpolation mode,
+ *                          zero when x-y mode is used
  * @param show   show a rectangle around the processed area, useful for
  *               parameters tweaking
  * @param direct if non-zero perform in-place processing
@@ -58,7 +67,8 @@  static void apply_delogo(uint8_t *dst, int dst_linesize,
                          uint8_t *src, int src_linesize,
                          int w, int h, AVRational sar,
                          int logo_x, int logo_y, int logo_w, int logo_h,
-                         unsigned int band, int show, int direct)
+                         unsigned int band, double *uglarmtable,
+                         double *uglarmweightsum, int show, int direct)
 {
     int x, y;
     uint64_t interp, weightl, weightr, weightt, weightb, weight;
@@ -89,6 +99,7 @@  static void apply_delogo(uint8_t *dst, int dst_linesize,
     dst += (logo_y1 + 1) * dst_linesize;
     src += (logo_y1 + 1) * src_linesize;
 
+    if (!uglarmtable) {
     for (y = logo_y1+1; y < logo_y2; y++) {
         left_sample = topleft[src_linesize*(y-logo_y1)]   +
                       topleft[src_linesize*(y-logo_y1-1)] +
@@ -151,12 +162,154 @@  static void apply_delogo(uint8_t *dst, int dst_linesize,
         dst += dst_linesize;
         src += src_linesize;
     }
+    } else {
+        int bx, by, table_stride;
+        double interpd;
+        double *table_t, *table_b, *table_l, *table_r;
+
+        table_stride = logo_w - 1;
+
+        for (y = 1; y < logo_y2 - logo_y1; y++) {
+            for (x = 1, xdst = dst + logo_x1 + 1, xsrc = src + logo_x1 + 1;
+                x < logo_x2 - logo_x1;
+                x++, xdst++, xsrc++) {
+
+                if (show && (y == 1 || y == logo_y2 - logo_y1 - 1 ||
+                             x == 1 || x == logo_x2 - logo_x1 - 1)) {
+                    *xdst = 0;
+                    continue;
+                }
+
+                table_t = uglarmtable + x + table_stride * y;
+                table_b = uglarmtable + x + table_stride * (logo_h - y - 1);
+                table_l = uglarmtable + table_stride * (y - 1) + x;
+                table_r = uglarmtable + table_stride * (y - 1) + logo_w - x - 1;
+
+                interpd = 0;
+
+                /* top+bottom on the left of the current point */
+                for (bx = 0; bx < x; bx++) {
+                    interpd += topleft[bx] * *table_t;
+                    interpd += botleft[bx] * *table_b;
+                    table_t--;
+                    table_b--;
+                }
+                /* top+bottom on the right of the current point */
+                for (; bx < logo_w; bx++) {
+                    interpd += topleft[bx] * *table_t;
+                    interpd += botleft[bx] * *table_b;
+                    table_t++;
+                    table_b++;
+                }
+                /* left+right above the current point */
+                for (by = 1; by < y; by++) {
+                    interpd += topleft[by * src_linesize] * *table_l;
+                    interpd += topleft[by * src_linesize + table_stride] * *table_r;
+                    table_l -= table_stride;
+                    table_r -= table_stride;
+                }
+                /* left+right below the current point */
+                for (; by < logo_h - 1; by++) {
+                    interpd += topleft[by * src_linesize] * *table_l;
+                    interpd += topleft[by * src_linesize + table_stride] * *table_r;
+                    table_l += table_stride;
+                    table_r += table_stride;
+                }
+
+                av_assert2(table_t == uglarmtable + (logo_w - x) + table_stride * y);
+                av_assert2(table_b == uglarmtable + (logo_w - x) + table_stride * (logo_h - y - 1));
+                av_assert2(table_l == uglarmtable + table_stride * (logo_h - y - 1) + x);
+                av_assert2(table_r == uglarmtable + table_stride * (logo_h - y - 1) + logo_w - x - 1);
+
+                interp = (uint64_t)(interpd /
+                    uglarmweightsum[x - 1 + (y - 1) * (logo_w - 2)]);
+                *xdst = interp;
+            }
+
+            dst += dst_linesize;
+            src += src_linesize;
+        }
+    }
+}
+
+/**
+ * Calculate the lookup tables to be used in UGLARM interpolation mode.
+ *
+ * @param *uglarmtable      Pointer to table containing weights for each possible
+ *                          diagonal distance between a border pixel and an inner
+ *                          logo pixel.
+ * @param *uglarmweightsum  Pointer to a table containing the weight sum to divide
+ *                          by for each pixel within the logo area.
+ * @param sar               The sar to take into account when calculating lookup
+ *                          tables.
+ * @param logo_w            width of the logo
+ * @param logo_h            height of the logo
+ * @param exponent          exponent used in uglarm interpolation
+ */
+static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum,
+                               AVRational sar, int logo_w, int logo_h,
+                               float exponent)
+{
+    double aspect = (double)sar.num / sar.den;
+    double aspect2 = aspect * aspect;
+    int x, y;
+
+    /* uglarmtable will contain a weight for each possible diagonal distance
+     * between a border pixel and an inner logo pixel. The maximum distance in
+     * each direction between border and an inner pixel can be logo_w - 1. The
+     * weight of a border pixel which is x,y pixels away is stored at position
+     * x + y * (logo_w - 1). */
+    for (y = 0; y < logo_h - 1; y++)
+        for (x = 0; x < logo_w - 1; x++) {
+            if (x + y != 0) {
+                double d = pow(x * x * aspect2 + y * y, exponent / 2);
+                uglarmtable[x + y * (logo_w - 1)] = 1.0 / d;
+            } else {
+                uglarmtable[x + y * (logo_w - 1)] = 1.0;
+            }
+        }
+
+    /* uglarmweightsum will contain the sum of all weights which is used when
+     * an inner pixel of the logo at position x,y is calculated out of the
+     * border pixels. The aggregated value has to be divided by that. The value
+     * to use for the inner 1-based logo position x,y is stored at
+     * (x - 1) + (y - 1) * (logo_w - 2). */
+    for (y = 1; y < logo_h - 1; y++)
+        for (x = 1; x < logo_w - 1; x++) {
+            double weightsum = 0;
+
+            for (int bx = 0; bx < logo_w; bx++) {
+                /* top border */
+                weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)];
+                /* bottom border */
+                weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)];
+            }
+
+            for (int by = 1; by < logo_h - 1; by++) {
+                /* left border */
+                weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)];
+                /* right border */
+                weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * (logo_w - 1)];
+            }
+
+            uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum;
+        }
 }
 
+enum mode {
+    MODE_XY,
+    MODE_UGLARM,
+    MODE_NB
+};
+
+#define MAX_SUB 2
+
 typedef struct DelogoContext {
     const AVClass *class;
-    int x, y, w, h, band, show;
-}  DelogoContext;
+    int x, y, w, h, band, mode, show;
+    float exponent;
+    double *uglarmtable[MAX_SUB + 1][MAX_SUB + 1], *uglarmweightsum[MAX_SUB + 1][MAX_SUB + 1];
+} DelogoContext;
 
 #define OFFSET(x) offsetof(DelogoContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
@@ -171,6 +324,10 @@  static const AVOption delogo_options[]= {
     { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 =  0 },  0, INT_MAX, FLAGS },
     { "t",    "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 =  0 },  0, INT_MAX, FLAGS },
 #endif
+    { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" },
+        { "xy",     "use pixels in straight x and y direction",  OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY },     0, 0, FLAGS, "mode" },
+        { "uglarm", "UGLARM mode, use full border",              OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" },
+    { "exponent","exponent used for UGLARM interpolation", OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS },
     { "show", "show delogo area",          OFFSET(show), AV_OPT_TYPE_BOOL,{ .i64 =  0 },  0, 1,       FLAGS },
     { NULL }
 };
@@ -215,8 +372,8 @@  static av_cold int init(AVFilterContext *ctx)
 #else
     s->band = 1;
 #endif
-    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n",
-           s->x, s->y, s->w, s->h, s->band, s->show);
+    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d exponent:%f show:%d\n",
+           s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, s->show);
 
     s->w += s->band*2;
     s->h += s->band*2;
@@ -226,6 +383,17 @@  static av_cold int init(AVFilterContext *ctx)
     return 0;
 }
 
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    DelogoContext *s = ctx->priv;
+
+    for (int hsub = 0; hsub <= MAX_SUB; hsub++)
+        for (int vsub = 0; vsub <= MAX_SUB; vsub++) {
+            av_freep(&s->uglarmtable[hsub][vsub]);
+            av_freep(&s->uglarmweightsum[hsub][vsub]);
+        }
+}
+
 static int config_input(AVFilterLink *inlink)
 {
     DelogoContext *s = inlink->dst->priv;
@@ -270,20 +438,47 @@  static int filter_frame(AVFilterLink *inlink, AVFrame *in)
     if (!sar.num)
         sar.num = sar.den = 1;
 
+    if (s->mode == MODE_UGLARM)
+        av_assert0(hsub0 <= MAX_SUB && vsub0 <= MAX_SUB);
+
     for (plane = 0; plane < desc->nb_components; plane++) {
         int hsub = plane == 1 || plane == 2 ? hsub0 : 0;
         int vsub = plane == 1 || plane == 2 ? vsub0 : 0;
 
+        /* Up and left borders were rounded down, inject lost bits
+         * into width and height to avoid error accumulation */
+        int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub);
+        int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub);
+
+        /* Init lookup tables once */
+        if ((s->mode == MODE_UGLARM) && (!s->uglarmtable[hsub][vsub])) {
+            s->uglarmtable[hsub][vsub] =
+                av_malloc_array((logo_w - 1) * (logo_h - 1), sizeof(*s->uglarmtable[hsub][vsub]));
+
+            if (!s->uglarmtable[hsub][vsub])
+                return AVERROR(ENOMEM);
+
+            s->uglarmweightsum[hsub][vsub] =
+                av_malloc_array((logo_w - 2) * (logo_h - 2), sizeof(*s->uglarmweightsum[hsub][vsub]));
+
+            if (!s->uglarmweightsum[hsub][vsub])
+                return AVERROR(ENOMEM);
+
+            calc_uglarm_tables(s->uglarmtable[hsub][vsub],
+                            s->uglarmweightsum[hsub][vsub],
+                            sar, logo_w, logo_h, s->exponent);
+        }
+
         apply_delogo(out->data[plane], out->linesize[plane],
                      in ->data[plane], in ->linesize[plane],
                      AV_CEIL_RSHIFT(inlink->w, hsub),
                      AV_CEIL_RSHIFT(inlink->h, vsub),
                      sar, s->x>>hsub, s->y>>vsub,
-                     /* Up and left borders were rounded down, inject lost bits
-                      * into width and height to avoid error accumulation */
-                     AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub),
-                     AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub),
+                     logo_w,
+                     logo_h,
                      s->band>>FFMIN(hsub, vsub),
+                     s->uglarmtable[hsub][vsub],
+                     s->uglarmweightsum[hsub][vsub],
                      s->show, direct);
     }
 
@@ -317,6 +512,7 @@  AVFilter ff_vf_delogo = {
     .priv_size     = sizeof(DelogoContext),
     .priv_class    = &delogo_class,
     .init          = init,
+    .uninit        = uninit,
     .query_formats = query_formats,
     .inputs        = avfilter_vf_delogo_inputs,
     .outputs       = avfilter_vf_delogo_outputs,
-- 
2.11.0