From patchwork Wed Jan 2 22:34:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Uwe Freese X-Patchwork-Id: 11636 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id CD6B544C6C6 for ; Thu, 3 Jan 2019 00:34:23 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 86CF5689F27; Thu, 3 Jan 2019 00:34:20 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mout.kundenserver.de (mout.kundenserver.de [212.227.17.10]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 54093689E85 for ; Thu, 3 Jan 2019 00:34:14 +0200 (EET) Received: from butler2.ufnet ([84.57.61.127]) by mrelayeu.kundenserver.de (mreue106 [212.227.15.183]) with ESMTPSA (Nemesis) id 1MtfeD-1hSn8z1Lpf-00v4OQ for ; Wed, 02 Jan 2019 23:34:22 +0100 Received: from ixos.fritz.box ([192.168.0.86]) by butler2.ufnet with esmtp (Exim 4.89) (envelope-from ) id 1gep5h-0004vY-Qp for ffmpeg-devel@ffmpeg.org; Wed, 02 Jan 2019 23:34:21 +0100 References: <5f86c630-efe4-0b8b-9a68-8c624e522573@gmx.de> <20181226170207.GA24413@sunshine.barsnick.net> <0c15df5c-3667-c84b-e059-bc4daa0e743d@gmx.de> <1c96f17c-e274-e32e-06b3-7b20483823e3@gmx.de> <20181230130211.whrg4yptjwuxlpai@phare.normalesup.org> <20190102143750.eienalhtkiasdzmf@phare.normalesup.org> <20190102152517.7lb5fgvpgtan3tou@phare.normalesup.org> From: Uwe Freese To: ffmpeg-devel@ffmpeg.org Message-ID: <3cb20460-412f-0154-5c60-a916351ca480@gmx.de> Date: Wed, 2 Jan 2019 23:34:21 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.0 MIME-Version: 1.0 In-Reply-To: <20190102152517.7lb5fgvpgtan3tou@phare.normalesup.org> Content-Language: en-US X-Provags-ID: V03:K1:xsCGRp7IopEz1x7MHqoN02qJkGJGug6WQ7EL6pfF+CuHlurkf85 b5WMTAdGxlEsZKPFmXEp4Fz3if7KS33zBUVXmM+xuXH8msl5g6nQCrim/PpU1VCkNk2enyR BOaF1tGc0+7E3uYHmesPKLb0Fo2c5/GU4NpryMvrXdbm2ruVZahu6DOJAHG9DFI1dE3zMmp 4J/pdzZk9Xm3D1Xe7uEZw== X-Spam-Flag: NO X-UI-Out-Filterresults: notjunk:1; V03:K0:vUuQ7WigwdM=:rXADw4rpuzIwzEuxR0/+qm 2BtoA4R1Xv3hWUbEghKNqPs7LmKO/dR+GFlmG08DXXG7TVaRShy23nQ1rGwnOAZnNuCSO0qND 6gpuhgjiiu/+Lm6stc/GXPVlvCJMAeoIoIalJHvgD2ZaPcSggeUIf12//RWdzxBrp3WOAb4uI DR/dR1dzjVGC/xgV9fSM+nZxX+DkB4qa5JjJWj1sO3Unz5l3qTrbFXjtUQLYszMAj583N8DAH j7KxQ5jHd+Xp6bBWXVKp3/lGEG/eP1KeFN0pn+bqmH3qnwPWN3yGpX+dtDql6HmXFsW+aQEaH PhVWe3wZX06XOdeKXeUBmL//pD9wKyWxVzOJmjyCSBvK2Ac86VImXwfX9oTEv5XecB2dy4HFI XkrD+S2s40N7h+cqO7AhXom9mBLIEQv/bS7YVHn9juqHMBd7iuwfNbc+uD8OQ+LvCB4KN9nx8 UYCoyzWX4DwV1HVZ4l6V+scYVT8rsQ6JpqHkmexoxj6av+/O7a2qUq235b5bvlvEmmcD5iLvb JHURMFIi/HNHrHOjRlv9GFyIwlCoUhIXYhEvGAM8UMzeygnsMEFMgYbHm0wAXryg8abUoZAp2 ER4EL5h571pjrnNRMKexEFIU0Tu00suGRB2JLHoJ/zHMwA0WV4gwDV6oJeOZ5rPKcI9QQTlhx HPunPpd7rb1if1riCTabbB9vX75Pf6jTnz95zxhxZe+zA8tUas+PbUR6L4zrLRF5hvD3Tn666 oerQ8OEW4+4zxZV5sqxGUY4SUF+Y8JfKZF+2eg== Subject: Re: [FFmpeg-devel] [PATCH] delogo filter: new "uglarm" interpolation mode added X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Hello, Here's a new version of the patch. Changes: - My last version didn't compile because of moving code to config_input. Don't know why I didn't see this, sorry.   I moved the code back to filter_frame because of two reasons. First, I need the "sar" to init my tables and it seems I need the "AVFrame *in" parameter for that. Secondly, I also need *desc, hsub0, vsub0, plane, logo_w, logo_h which means much duplicated code or creation of functions. - Corrected use of sizeof in av_malloc_array - changed calculation of the distance regarding exponent, avoid sqrt, use "aspect2" variable - use double *uglarmtable[MAX_SUB + 1][MAX_SUB + 1], *uglarmweightsum[MAX_SUB + 1][MAX_SUB + 1]; (instead per 1-dimensional array for the planes) - unconditional av_freep in uninit - used the alternative loops as suggested by Nicolas (thanks) Remaining points / answers / questions: Am 02.01.19 um 16:25 schrieb Nicolas George: >>>> + interp = (uint64_t)(interpd / >>>> + uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]); >>> The cast to uint64_t seems suspicious. Can you explain? >> Every pixel value of the inner logo area is an integer. Only for the >> calculation of the weighted average, all values use floats. At the end, it >> is rounded (truncated) to an int. > int and uint64_t are not the same thing. Why uint64_t? "interp" was defined as uint64_t in the original code. Do you see a problem here? Then let us know. >>> pow(x * x * aspect2 + y * y, exponent / 2); >> Hm. Again, I'm unsure if this "optimization" is worth it. I wouldn't do this >> normally. > In this particular instance, definitely yes. Compilers have very little > latitude to optimize floating point operations, and they will certainly > not optimize mathematical functions based on subtle arithmetic > properties. This is a C compiler, not a TI-89. We have obviously distinct opinions here. I would leave the code because it's easier to understand (as written), and it runs once, taking maybe some microseconds more for a usual conversion of video taking seconds to hours. But I have no problem changing it. - Done. Question to "aspect2": is ^2 not good (slow) or something, or why not directly use double aspect2 = ((double)sar.num / sar.den) ^ 2; > [...] > av_assert2(table_t == uglarmtable + (logo_w - x) + table_stride * y); > av_assert2(table_b == uglarmtable + (logo_w - x) + table_stride * (logo_h - y - 1)); > av_assert2(table_l == uglarmtable + table_stride * (logo_h - y - 1) + x); > av_assert2(table_r == uglarmtable + table_stride * (logo_h - y - 1) + logo_w - x - 1; > > That makes more lines, but the lines are way simpler: no tricky > arithmetic, all blocks almost identical, with the changes easy to see > and understand. I took over these alternative loops for the calculations. Question to the assert: Is this useful (compared to the running time)? I mean, basically it is the same expression as over the loops, only x and y are different by the amount the loops are counting them up. I wouldn't say that it is probable that one makes an error coding the loop counter or that it doesn't somehow function. Is there another thing which this assert checks that I didn't see? Regards, Uwe From 7b221a27b29c7376672d4ee60092051f6b65c978 Mon Sep 17 00:00:00 2001 From: breaker27 Date: Wed, 26 Dec 2018 18:16:48 +0100 Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode --- doc/filters.texi | 28 +++++++ libavfilter/vf_delogo.c | 214 ++++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 233 insertions(+), 9 deletions(-) diff --git a/doc/filters.texi b/doc/filters.texi index 65ce25bc18..792560ad79 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the rectangle (added to deprecated, setting higher values should no longer be necessary and is not recommended. +@item mode +Specify the interpolation mode used to remove the logo. +It accepts the following values: +@table @option +@item xy +Only the border pixels in straight x and y direction are considered +to replace any logo pixel. This mode tends to flicker and to show +horizontal and vertical lines. +@item uglarm +Consider the whole border for every logo pixel to replace. This mode +uses the distance raised to the power of the given @var{exponent} as +weight that decides to which amount every border pixel is taken into +account. This results in a more blurred area, which tends to be less +distracting. The uglarm mode was first implemented in VirtualDub's +LogoAway filter and is also known from ffdshow and is the +abbreviation for "Uwe's Great LogoAway Remove Mode". +@end table +The default value is @code{xy}. + +@item exponent +Specify the exponent used to calculate the weight out of the +distance in @code{uglarm} mode (weight = distance ^ @var{exponent}). +The value @code{0} results in a logo area which has +the same average color everywhere. The higher the value, the more +relevant a near border pixel will get, meaning that the borders of +the logo area are more similar to the surrounding pixels. The default +value is @code{3}. + @item show When set to 1, a green rectangle is drawn on the screen to simplify finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters. diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c index 065d093641..f489c6a586 100644 --- a/libavfilter/vf_delogo.c +++ b/libavfilter/vf_delogo.c @@ -2,6 +2,7 @@ * Copyright (c) 2002 Jindrich Makovicka * Copyright (c) 2011 Stefano Sabatini * Copyright (c) 2013, 2015 Jean Delvare + * Copyright (c) 2018, 2019 Uwe Freese * * This file is part of FFmpeg. * @@ -25,8 +26,12 @@ * A very simple tv station logo remover * Originally imported from MPlayer libmpcodecs/vf_delogo.c, * the algorithm was later improved. + * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual + * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's + * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg. */ +#include "libavutil/avassert.h" #include "libavutil/common.h" #include "libavutil/imgutils.h" #include "libavutil/opt.h" @@ -50,6 +55,10 @@ * @param logo_w width of the logo * @param logo_h height of the logo * @param band the size of the band around the processed area + * @param *uglarmtable pointer to weight table in UGLARM interpolation mode, + * zero when x-y mode is used + * @param *uglarmweightsum pointer to weight sum table in UGLARM interpolation mode, + * zero when x-y mode is used * @param show show a rectangle around the processed area, useful for * parameters tweaking * @param direct if non-zero perform in-place processing @@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int dst_linesize, uint8_t *src, int src_linesize, int w, int h, AVRational sar, int logo_x, int logo_y, int logo_w, int logo_h, - unsigned int band, int show, int direct) + unsigned int band, double *uglarmtable, + double *uglarmweightsum, int show, int direct) { int x, y; uint64_t interp, weightl, weightr, weightt, weightb, weight; @@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int dst_linesize, dst += (logo_y1 + 1) * dst_linesize; src += (logo_y1 + 1) * src_linesize; + if (!uglarmtable) { for (y = logo_y1+1; y < logo_y2; y++) { left_sample = topleft[src_linesize*(y-logo_y1)] + topleft[src_linesize*(y-logo_y1-1)] + @@ -151,12 +162,154 @@ static void apply_delogo(uint8_t *dst, int dst_linesize, dst += dst_linesize; src += src_linesize; } + } else { + int bx, by, table_stride; + double interpd; + double *table_t, *table_b, *table_l, *table_r; + + table_stride = logo_w - 1; + + for (y = 1; y < logo_y2 - logo_y1; y++) { + for (x = 1, xdst = dst + logo_x1 + 1, xsrc = src + logo_x1 + 1; + x < logo_x2 - logo_x1; + x++, xdst++, xsrc++) { + + if (show && (y == 1 || y == logo_y2 - logo_y1 - 1 || + x == 1 || x == logo_x2 - logo_x1 - 1)) { + *xdst = 0; + continue; + } + + table_t = uglarmtable + x + table_stride * y; + table_b = uglarmtable + x + table_stride * (logo_h - y - 1); + table_l = uglarmtable + table_stride * (y - 1) + x; + table_r = uglarmtable + table_stride * (y - 1) + logo_w - x - 1; + + interpd = 0; + + /* top+bottom on the left of the current point */ + for (bx = 0; bx < x; bx++) { + interpd += topleft[bx] * *table_t; + interpd += botleft[bx] * *table_b; + table_t--; + table_b--; + } + /* top+bottom on the right of the current point */ + for (; bx < logo_w; bx++) { + interpd += topleft[bx] * *table_t; + interpd += botleft[bx] * *table_b; + table_t++; + table_b++; + } + /* left+right above the current point */ + for (by = 1; by < y; by++) { + interpd += topleft[by * src_linesize] * *table_l; + interpd += topleft[by * src_linesize + table_stride] * *table_r; + table_l -= table_stride; + table_r -= table_stride; + } + /* left+right below the current point */ + for (; by < logo_h - 1; by++) { + interpd += topleft[by * src_linesize] * *table_l; + interpd += topleft[by * src_linesize + table_stride] * *table_r; + table_l += table_stride; + table_r += table_stride; + } + + av_assert2(table_t == uglarmtable + (logo_w - x) + table_stride * y); + av_assert2(table_b == uglarmtable + (logo_w - x) + table_stride * (logo_h - y - 1)); + av_assert2(table_l == uglarmtable + table_stride * (logo_h - y - 1) + x); + av_assert2(table_r == uglarmtable + table_stride * (logo_h - y - 1) + logo_w - x - 1); + + interp = (uint64_t)(interpd / + uglarmweightsum[x - 1 + (y - 1) * (logo_w - 2)]); + *xdst = interp; + } + + dst += dst_linesize; + src += src_linesize; + } + } +} + +/** + * Calculate the lookup tables to be used in UGLARM interpolation mode. + * + * @param *uglarmtable Pointer to table containing weights for each possible + * diagonal distance between a border pixel and an inner + * logo pixel. + * @param *uglarmweightsum Pointer to a table containing the weight sum to divide + * by for each pixel within the logo area. + * @param sar The sar to take into account when calculating lookup + * tables. + * @param logo_w width of the logo + * @param logo_h height of the logo + * @param exponent exponent used in uglarm interpolation + */ +static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum, + AVRational sar, int logo_w, int logo_h, + float exponent) +{ + double aspect = (double)sar.num / sar.den; + double aspect2 = aspect * aspect; + int x, y; + + /* uglarmtable will contain a weight for each possible diagonal distance + * between a border pixel and an inner logo pixel. The maximum distance in + * each direction between border and an inner pixel can be logo_w - 1. The + * weight of a border pixel which is x,y pixels away is stored at position + * x + y * (logo_w - 1). */ + for (y = 0; y < logo_h - 1; y++) + for (x = 0; x < logo_w - 1; x++) { + if (x + y != 0) { + double d = pow(x * x * aspect2 + y * y, exponent / 2); + uglarmtable[x + y * (logo_w - 1)] = 1.0 / d; + } else { + uglarmtable[x + y * (logo_w - 1)] = 1.0; + } + } + + /* uglarmweightsum will contain the sum of all weights which is used when + * an inner pixel of the logo at position x,y is calculated out of the + * border pixels. The aggregated value has to be divided by that. The value + * to use for the inner 1-based logo position x,y is stored at + * (x - 1) + (y - 1) * (logo_w - 2). */ + for (y = 1; y < logo_h - 1; y++) + for (x = 1; x < logo_w - 1; x++) { + double weightsum = 0; + + for (int bx = 0; bx < logo_w; bx++) { + /* top border */ + weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)]; + /* bottom border */ + weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)]; + } + + for (int by = 1; by < logo_h - 1; by++) { + /* left border */ + weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)]; + /* right border */ + weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * (logo_w - 1)]; + } + + uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum; + } } +enum mode { + MODE_XY, + MODE_UGLARM, + MODE_NB +}; + +#define MAX_SUB 2 + typedef struct DelogoContext { const AVClass *class; - int x, y, w, h, band, show; -} DelogoContext; + int x, y, w, h, band, mode, show; + float exponent; + double *uglarmtable[MAX_SUB + 1][MAX_SUB + 1], *uglarmweightsum[MAX_SUB + 1][MAX_SUB + 1]; +} DelogoContext; #define OFFSET(x) offsetof(DelogoContext, x) #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM @@ -171,6 +324,10 @@ static const AVOption delogo_options[]= { { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, { "t", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, #endif + { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" }, + { "xy", "use pixels in straight x and y direction", OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY }, 0, 0, FLAGS, "mode" }, + { "uglarm", "UGLARM mode, use full border", OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" }, + { "exponent","exponent used for UGLARM interpolation", OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS }, { "show", "show delogo area", OFFSET(show), AV_OPT_TYPE_BOOL,{ .i64 = 0 }, 0, 1, FLAGS }, { NULL } }; @@ -215,8 +372,8 @@ static av_cold int init(AVFilterContext *ctx) #else s->band = 1; #endif - av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n", - s->x, s->y, s->w, s->h, s->band, s->show); + av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d exponent:%f show:%d\n", + s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, s->show); s->w += s->band*2; s->h += s->band*2; @@ -226,6 +383,17 @@ static av_cold int init(AVFilterContext *ctx) return 0; } +static av_cold void uninit(AVFilterContext *ctx) +{ + DelogoContext *s = ctx->priv; + + for (int hsub = 0; hsub <= MAX_SUB; hsub++) + for (int vsub = 0; vsub <= MAX_SUB; vsub++) { + av_freep(&s->uglarmtable[hsub][vsub]); + av_freep(&s->uglarmweightsum[hsub][vsub]); + } +} + static int config_input(AVFilterLink *inlink) { DelogoContext *s = inlink->dst->priv; @@ -270,20 +438,47 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in) if (!sar.num) sar.num = sar.den = 1; + if (s->mode == MODE_UGLARM) + av_assert0(hsub0 <= MAX_SUB && vsub0 <= MAX_SUB); + for (plane = 0; plane < desc->nb_components; plane++) { int hsub = plane == 1 || plane == 2 ? hsub0 : 0; int vsub = plane == 1 || plane == 2 ? vsub0 : 0; + /* Up and left borders were rounded down, inject lost bits + * into width and height to avoid error accumulation */ + int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<h + (s->y & ((1<mode == MODE_UGLARM) && (!s->uglarmtable[hsub][vsub])) { + s->uglarmtable[hsub][vsub] = + av_malloc_array((logo_w - 1) * (logo_h - 1), sizeof(*s->uglarmtable[hsub][vsub])); + + if (!s->uglarmtable[hsub][vsub]) + return AVERROR(ENOMEM); + + s->uglarmweightsum[hsub][vsub] = + av_malloc_array((logo_w - 2) * (logo_h - 2), sizeof(*s->uglarmweightsum[hsub][vsub])); + + if (!s->uglarmweightsum[hsub][vsub]) + return AVERROR(ENOMEM); + + calc_uglarm_tables(s->uglarmtable[hsub][vsub], + s->uglarmweightsum[hsub][vsub], + sar, logo_w, logo_h, s->exponent); + } + apply_delogo(out->data[plane], out->linesize[plane], in ->data[plane], in ->linesize[plane], AV_CEIL_RSHIFT(inlink->w, hsub), AV_CEIL_RSHIFT(inlink->h, vsub), sar, s->x>>hsub, s->y>>vsub, - /* Up and left borders were rounded down, inject lost bits - * into width and height to avoid error accumulation */ - AV_CEIL_RSHIFT(s->w + (s->x & ((1<h + (s->y & ((1<band>>FFMIN(hsub, vsub), + s->uglarmtable[hsub][vsub], + s->uglarmweightsum[hsub][vsub], s->show, direct); } @@ -317,6 +512,7 @@ AVFilter ff_vf_delogo = { .priv_size = sizeof(DelogoContext), .priv_class = &delogo_class, .init = init, + .uninit = uninit, .query_formats = query_formats, .inputs = avfilter_vf_delogo_inputs, .outputs = avfilter_vf_delogo_outputs, -- 2.11.0