From patchwork Thu Dec 27 12:00:29 2018
X-Patchwork-Submitter: Uwe Freese
X-Patchwork-Id: 11566
To: ffmpeg-devel@ffmpeg.org
References: <04cf53a1-0749-a4b5-1d1a-cef28c086a57@uwe-freese.de>
 <984b88c3-605e-a281-ec48-4b73956a4288@uwe-freese.de>
 <5f86c630-efe4-0b8b-9a68-8c624e522573@gmx.de>
 <20181226170207.GA24413@sunshine.barsnick.net>
From: Uwe Freese
Message-ID: <0c15df5c-3667-c84b-e059-bc4daa0e743d@gmx.de>
Date: Thu, 27 Dec 2018 13:00:29 +0100
In-Reply-To: <20181226170207.GA24413@sunshine.barsnick.net>
Subject: Re: [FFmpeg-devel] [PATCH] delogo filter: new "uglarm" interpolation mode added
List-Id: FFmpeg development discussions and patches

Hello,

thanks for the comments. Most points were clear and I've changed the code
accordingly (see attached new patch).

Here are the remaining questions / points to discuss:

> You can start writing it now already, because it needs to go into
> doc/filters.texi.

I've added the first version in the patch attached. Maybe some of the
sentences have to be improved - I'm not a native English speaker.

> The "10" should be a #define, [...]

I have now added the following error handling:

    av_log(inlink->src, AV_LOG_ERROR, "More planes in frame than expected.\n");
    return AVERROR(ENOMEM);

Is this ok, or how should this be implemented instead?

>> + * Copyright (c) 2019 Uwe Freese
> Considering you authored it in 2018, this is forward-looking. ;-)

I thought it would take some days to review, and by the time the code is
finally integrated it will already be 2019. Should I write "2018, 2019"
instead (assuming that it won't be integrated before next week)?

>
>> +            for (x = logo_x1+1,
>> +                xdst = dst+logo_x1+1,
>> +                xsrc = src+logo_x1+1; x < logo_x2; x++, xdst++, xsrc++) {
> Spaces around operators: x = logo_x1 + 1
> (Also everywhere else. Unless it's the original code, then leave it be.)

The original code had no spaces around operators in this case. To be clear:
Spaces are wanted here, right? So it should be "x = logo_x1 + 1" etc., right?

>
>> +    double e = 0.2 * power;
> Could power also be a double instead of an int? Would specifying a
> power of e.g. 2.5 make sense?

This is a good point. It was an int value in VirtualDub's delogo and in
ffdshow, I think mostly because this int value was mapped to a slider in the
GUI...

There is the multiplier of 0.2 so that the steps between the values the user
can set are small enough. A power of 2.5 is not needed when the 0.2 factor is
used.

But the question is whether a double parameter would be preferred over an int
(which is multiplied by 0.2). The parameter set by the user would then be
e.g. "3" or "2.2" instead of "15" or "11".

I personally would slightly prefer an int parameter, but for explaining how
the calculation works to the user, a double might be better ("At value 3, the
weight for the consideration of a border pixel is distance ^ 3.").

So shall I change this and use a double parameter?

>
>> +    {"mode", "set the interpolation mode", OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY}, 0, 1, FLAGS, "mode"},
> min and max are MODE_XY and MODE_UGLARM (or MODE_NB-1, if you code it
> that way to give room for more modes).

I wasn't sure if one can rely on the fact that MODE_UGLARM is mapped to 1 and
is the max value, because it's not specified in the enum declaration. But it
seems we can be sure:
https://stackoverflow.com/questions/42128376/what-is-the-rule-for-assignment-of-the-integer-value-of-enum

So I changed it.

It would be nice to get some information and opinions about these questions.
I'll change the code accordingly afterwards. Also let me know if there's
something else to change.

Regards,
Uwe

From f2c56ed60403b3bdee9749170c155720acd689c6 Mon Sep 17 00:00:00 2001
From: breaker27
Date: Wed, 26 Dec 2018 18:16:48 +0100
Subject: [PATCH] Add new delogo interpolation mode uglarm.

---
 doc/filters.texi        |  19 +++++
 libavfilter/vf_delogo.c | 189 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 199 insertions(+), 9 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 65ce25bc18..39bdb0d188 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -7952,6 +7952,25 @@ Specify the thickness of the fuzzy edge of the rectangle (added to
 deprecated, setting higher values should no longer be necessary and
 is not recommended.
 
+@item mode
+Specify the interpolation mode used to remove the logo. 'xy' uses
+only the border pixels in straight x and y direction to replace any
+logo pixel. It tends to flicker and to show horizontal and vertical
+lines. 'uglarm' considers the whole border for every logo pixel to
+replace. It uses the power of the distance to any border pixel as
+weight to which amount it's taken into account. This results in a
+more blurred area, which tends to be less distracting. The default
+value is 'xy'.
+
+@item power
+Specify the power (factor) used to calculate the weight out of the
+distance in 'uglarm' mode (weight = distance ^ 0.2 * power). The
+value 0 results in a logo area which has the same average color
+everywhere. The higher the value, the more relevant a near border
+pixel will get, meaning that the borders of the logo area are more
+similar to the surrounding pixels. The default value 15 results in
+power 3 (= 0.2 * 15).
+
 @item show
 When set to 1, a green rectangle is drawn on the screen to simplify
 finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters.
diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c
index 065d093641..f97ef0598c 100644
--- a/libavfilter/vf_delogo.c
+++ b/libavfilter/vf_delogo.c
@@ -2,6 +2,7 @@
  * Copyright (c) 2002 Jindrich Makovicka
  * Copyright (c) 2011 Stefano Sabatini
  * Copyright (c) 2013, 2015 Jean Delvare
+ * Copyright (c) 2019 Uwe Freese
  *
  * This file is part of FFmpeg.
  *
@@ -25,6 +26,9 @@
  * A very simple tv station logo remover
  * Originally imported from MPlayer libmpcodecs/vf_delogo.c,
  * the algorithm was later improved.
+ * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual
+ * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's
+ * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg.
  */
 
 #include "libavutil/common.h"
@@ -50,6 +54,10 @@
  * @param logo_w width of the logo
  * @param logo_h height of the logo
  * @param band   the size of the band around the processed area
+ * @param *uglarmtable      pointer to weight table in UGLARM interpolation mode,
+ *                          zero when x-y mode is used
+ * @param *uglarmweightsum  pointer to weight sum table in UGLARM interpolation mode,
+ *                          zero when x-y mode is used
  * @param show   show a rectangle around the processed area, useful for
  *               parameters tweaking
  * @param direct if non-zero perform in-place processing
@@ -58,7 +66,8 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
                          uint8_t *src, int src_linesize,
                          int w, int h, AVRational sar,
                          int logo_x, int logo_y, int logo_w, int logo_h,
-                         unsigned int band, int show, int direct)
+                         unsigned int band, double *uglarmtable,
+                         double *uglarmweightsum, int show, int direct)
 {
     int x, y;
     uint64_t interp, weightl, weightr, weightt, weightb, weight;
@@ -89,6 +98,7 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
     dst += (logo_y1 + 1) * dst_linesize;
     src += (logo_y1 + 1) * src_linesize;
 
+    if (!uglarmtable) {
     for (y = logo_y1+1; y < logo_y2; y++) {
         left_sample = topleft[src_linesize*(y-logo_y1)] +
                       topleft[src_linesize*(y-logo_y1-1)] +
@@ -151,12 +161,123 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
         dst += dst_linesize;
         src += src_linesize;
     }
+    } else {
+        int bx, by;
+        double interpd;
+
+        for (y = logo_y1 + 1; y < logo_y2; y++) {
+            for (x = logo_x1 + 1,
+                xdst = dst + logo_x1 + 1,
+                xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++, xsrc++) {
+
+                if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 ||
+                             x == logo_x1 + 1 || x == logo_x2 - 1)) {
+                    *xdst = 0;
+                    continue;
+                }
+
+                interpd = 0;
+
+                for (bx = 0; bx < logo_w; bx++) {
+                    interpd += topleft[bx] *
+                        uglarmtable[abs(bx - (x - logo_x1)) + (y - logo_y1) * (logo_w - 1)];
+                    interpd += botleft[bx] *
+                        uglarmtable[abs(bx - (x - logo_x1)) + (logo_h - (y - logo_y1) - 1) * (logo_w - 1)];
+                }
+
+                for (by = 1; by < logo_h - 1; by++) {
+                    interpd += topleft[by * src_linesize] *
+                        uglarmtable[(x - logo_x1) + abs(by - (y - logo_y1)) * (logo_w - 1)];
+                    interpd += topleft[by * src_linesize + (logo_w - 1)] *
+                        uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by - (y - logo_y1)) * (logo_w - 1)];
+                }
+
+                interp = (uint64_t)(interpd /
+                    uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]);
+                *xdst = interp;
+            }
+
+            dst += dst_linesize;
+            src += src_linesize;
+        }
+    }
 }
 
+/**
+ * Calculate the lookup tables to be used in UGLARM interpolation mode.
+ *
+ * @param *uglarmtable      Pointer to table containing weights for each possible
+ *                          diagonal distance between a border pixel and an inner
+ *                          logo pixel.
+ * @param *uglarmweightsum  Pointer to a table containing the weight sum to divide
+ *                          by for each pixel within the logo area.
+ * @param sar               The sar to take into account when calculating lookup
+ *                          tables.
+ * @param logo_w            width of the logo
+ * @param logo_h            height of the logo
+ * @param power             power of uglarm interpolation
+ */
+static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum,
+                               AVRational sar, int logo_w, int logo_h, int power)
+{
+    double e = 0.2 * power;
+    double aspect = (double)sar.num / sar.den;
+    int x, y;
+
+    /* uglarmtable will contain a weight for each possible diagonal distance
+     * between a border pixel and an inner logo pixel. The maximum distance in
+     * each direction between border and an inner pixel can be logo_w - 1. The
+     * weight of a border pixel which is x,y pixels away is stored at position
+     * x + y * (logo_w - 1). */
+    for (y = 0; y < logo_h - 1; y++)
+        for (x = 0; x < logo_w - 1; x++) {
+            if (x + y != 0) {
+                double d = pow(sqrt(x * x * aspect * aspect + y * y), e);
+                uglarmtable[x + y * (logo_w - 1)] = 1.0 / d;
+            } else {
+                uglarmtable[x + y * (logo_w - 1)] = 1.0;
+            }
+        }
+
+    /* uglarmweightsum will contain the sum of all weights which is used when
+     * an inner pixel of the logo at position x,y is calculated out of the
+     * border pixels. The aggregated value has to be divided by that. The value
+     * to use for the inner 1-based logo position x,y is stored at
+     * (x - 1) + (y - 1) * (logo_w - 2). */
+    for (y = 1; y < logo_h - 1; y++)
+        for (x = 1; x < logo_w - 1; x++) {
+            double weightsum = 0;
+
+            for (int bx = 0; bx < logo_w; bx++) {
+                /* top border */
+                weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)];
+                /* bottom border */
+                weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)];
+            }
+
+            for (int by = 1; by < logo_h - 1; by++) {
+                /* left border */
+                weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)];
+                /* right border */
+                weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * (logo_w - 1)];
+            }
+
+            uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum;
+        }
+}
+
+enum mode {
+    MODE_XY,
+    MODE_UGLARM
+};
+
+#define MAX_PLANES 10
+
 typedef struct DelogoContext {
     const AVClass *class;
-    int x, y, w, h, band, show;
-} DelogoContext;
+    int x, y, w, h, band, mode, power, show;
+    double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES];
+} DelogoContext;
 
 #define OFFSET(x) offsetof(DelogoContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
@@ -171,6 +292,10 @@ static const AVOption delogo_options[]= {
     { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS },
     { "t", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS },
 #endif
+    { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY }, MODE_XY, MODE_UGLARM, FLAGS, "mode" },
+    { "xy", "use pixels in straight x and y direction", OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY }, 0, 0, FLAGS, "mode" },
+    { "uglarm", "UGLARM mode, use full border", OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" },
+    { "power","power of UGLARM interpolation", OFFSET(power), AV_OPT_TYPE_INT, { .i64 = 15 }, 0, 30, FLAGS },
     { "show", "show delogo area", OFFSET(show), AV_OPT_TYPE_BOOL,{ .i64 = 0 }, 0, 1, FLAGS },
     { NULL }
 };
@@ -215,8 +340,8 @@ static av_cold int init(AVFilterContext *ctx)
 #else
     s->band = 1;
 #endif
-    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n",
-           s->x, s->y, s->w, s->h, s->band, s->show);
+    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d power:%d show:%d\n",
+           s->x, s->y, s->w, s->h, s->band, s->mode, s->power, s->show);
 
     s->w += s->band*2;
     s->h += s->band*2;
@@ -226,6 +351,19 @@ static av_cold int init(AVFilterContext *ctx)
     return 0;
 }
 
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    DelogoContext *s = ctx->priv;
+
+    if (s->mode == MODE_UGLARM)
+    {
+        for (int plane = 0; plane < MAX_PLANES; plane++) {
+            av_free(s->uglarmtable[plane]);
+            av_free(s->uglarmweightsum[plane]);
+        }
+    }
+}
+
 static int config_input(AVFilterLink *inlink)
 {
     DelogoContext *s = inlink->dst->priv;
@@ -274,16 +412,48 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
         int hsub = plane == 1 || plane == 2 ? hsub0 : 0;
         int vsub = plane == 1 || plane == 2 ? vsub0 : 0;
 
+        /* Up and left borders were rounded down, inject lost bits
+         * into width and height to avoid error accumulation */
+        int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub);
+        int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub);
+
+        if (s->mode == MODE_UGLARM) {
+            if (plane >= MAX_PLANES) {
+                av_log(inlink->src, AV_LOG_ERROR, "More planes in frame than expected.\n");
+                return AVERROR(ENOMEM);
+            }
+
+            if (!s->uglarmtable[plane]) {
+                s->uglarmtable[plane] =
+                    (double*)av_malloc((logo_w - 1) * (logo_h - 1) * sizeof(double));
+
+                if (!s->uglarmtable[plane]) {
+                    return AVERROR(ENOMEM);
+                }
+
+                s->uglarmweightsum[plane] =
+                    (double*)av_malloc((logo_w - 2) * (logo_h - 2) * sizeof(double));
+
+                if (!s->uglarmweightsum[plane]) {
+                    return AVERROR(ENOMEM);
+                }
+
+                calc_uglarm_tables(s->uglarmtable[plane],
+                                   s->uglarmweightsum[plane],
+                                   sar, logo_w, logo_h, s->power);
+            }
+        }
+
         apply_delogo(out->data[plane], out->linesize[plane],
                      in ->data[plane], in ->linesize[plane],
                      AV_CEIL_RSHIFT(inlink->w, hsub),
                      AV_CEIL_RSHIFT(inlink->h, vsub),
                      sar, s->x>>hsub, s->y>>vsub,
-                     /* Up and left borders were rounded down, inject lost bits
-                      * into width and height to avoid error accumulation */
-                     AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub),
-                     AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub),
+                     logo_w, logo_h,
                      s->band>>FFMIN(hsub, vsub),
+                     s->uglarmtable[plane],
+                     s->uglarmweightsum[plane],
                      s->show, direct);
     }
 
@@ -317,6 +487,7 @@ AVFilter ff_vf_delogo = {
     .priv_size = sizeof(DelogoContext),
     .priv_class = &delogo_class,
     .init = init,
+    .uninit = uninit,
     .query_formats = query_formats,
     .inputs = avfilter_vf_delogo_inputs,
     .outputs = avfilter_vf_delogo_outputs,
-- 
2.11.0
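
Regarding the enum question raised in the mail: C assigns 0 to the first enumerator that has no initializer and the previous value plus one to each following enumerator, so MODE_UGLARM is guaranteed to be 1. A minimal sketch of the MODE_NB variant mentioned in the review could look like the following. MODE_NB and mode_is_valid() are illustrative names only and are not part of the patch.

```c
#include <assert.h>

enum mode {
    MODE_XY,     /* 0: first enumerator without an initializer */
    MODE_UGLARM, /* 1: previous enumerator + 1 */
    MODE_NB      /* hypothetical sentinel: number of modes, keep it last */
};

/* Hypothetical range check mirroring an option range of
 * MODE_XY .. MODE_NB - 1. */
static int mode_is_valid(int mode)
{
    /* The sentinel must directly follow the last real mode. */
    assert(MODE_NB - 1 == MODE_UGLARM);
    return mode >= MODE_XY && mode <= MODE_NB - 1;
}
```

With such a sentinel, the option's maximum can be written as MODE_NB - 1, and adding a further mode before MODE_NB extends the valid range without touching the option table bounds.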