From patchwork Sat Dec 29 20:38:55 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uwe Freese
X-Patchwork-Id: 11585
To: ffmpeg-devel@ffmpeg.org
References: <04cf53a1-0749-a4b5-1d1a-cef28c086a57@uwe-freese.de>
 <984b88c3-605e-a281-ec48-4b73956a4288@uwe-freese.de>
 <5f86c630-efe4-0b8b-9a68-8c624e522573@gmx.de>
 <20181226170207.GA24413@sunshine.barsnick.net>
 <0c15df5c-3667-c84b-e059-bc4daa0e743d@gmx.de>
 <1c96f17c-e274-e32e-06b3-7b20483823e3@gmx.de>
From: Uwe Freese
Date: Sat, 29 Dec 2018 21:38:55 +0100
Subject: Re: [FFmpeg-devel] [PATCH] delogo filter: new "uglarm" interpolation mode added
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

Hello,

here's a new version of the patch.

Changes since the last version of the patch (mail from 27.12.
13:00), according to the comments from Moritz and Carl Eugen:

- using av_assert0 instead of returning an error if planes > MAX_PLANES
- using av_freep with the address-of operator (&) in uninit
- copyright year 2019 -> 2018
- added a MODE_NB value to the "mode" enum and limited the mode parameter
  in delogo_options to MODE_NB-1
- changed commit message style: "avfilter/vf_delogo: add uglarm
  interpolation mode"
- fixed bracket style, line 360
- changed documentation, using @var{...}, @code{...}, @table @option etc.,
  explaining "uglarm" and using the term "raised to the power of" (see
  https://en.wikipedia.org/wiki/Exponentiation). I hope I used the macros
  correctly (@var referencing a parameter and @code referencing a value).
- changed parameter "power" (int) to "exponent" (float)

Some considerations about the parameter name "power" / "exponent":

I noticed that for the specific calculation of the weight in the new mode,
the parameter is better named "exponent" than "power". Wikipedia
(https://en.wikipedia.org/wiki/Exponentiation) writes:

    base ^ exponent = power

So I changed the value type to float and the calculation to

    weight = distance ^ exponent (float value exponent).

That is IMHO more consistent and easier to explain.

The downside is that the term "power" (or "strength") would be more generic
and would probably be easier to reuse for other modes.

What do you think? Is "exponent" OK now, or should I change it back to
"power", "strength" or something similar?

Again, I would be glad about comments and reviews. Let me know what should
be changed and I'll create a new patch in a few days.

Regards,
Uwe

On 28.12.18 at 00:08, Carl Eugen Hoyos wrote:
> 2018-12-27 22:02 GMT+01:00, Uwe Freese :
>
>> On 27.12.18 at 20:25, Carl Eugen Hoyos wrote:
>>>> I have now added as error handling:
>>>>
>>>> av_log(inlink->src, AV_LOG_ERROR, "More planes in frame than
>>>> expected.\n");
>>>> return AVERROR(ENOMEM);
>>>>
>>>> Is this ok, or how should this be implemented instead?
>>> Not sure I understand: How can plane get >= MAX_PLANES?
>>> If this is impossible (as I believe), please use av_assert0().
>> I meant the use of "ENOMEM" and if there's a better error
>> constant to use here.
>>
>> At this line, the error is not about memory, but that the video input
>> format is unexpected.
> Is there a format with too many planes?
> If not, please use an assert, do not return an error.
>
> Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

From 83e79abb3311eb2dd92c63e8e15e0476af2f2891 Mon Sep 17 00:00:00 2001
From: breaker27
Date: Wed, 26 Dec 2018 18:16:48 +0100
Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode

---
 doc/filters.texi        |  28 +++++++
 libavfilter/vf_delogo.c | 189 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 208 insertions(+), 9 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 65ce25bc18..792560ad79 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the rectangle (added to
 deprecated, setting higher values should no longer be necessary and
 is not recommended.
 
+@item mode
+Specify the interpolation mode used to remove the logo.
+It accepts the following values:
+@table @option
+@item xy
+Only the border pixels in straight x and y direction are considered
+to replace any logo pixel. This mode tends to flicker and to show
+horizontal and vertical lines.
+@item uglarm
+Consider the whole border for every logo pixel to replace. This mode
+uses the distance raised to the power of the given @var{exponent} as
+weight that decides to which amount every border pixel is taken into
+account. This results in a more blurred area, which tends to be less
+distracting. The uglarm mode was first implemented in VirtualDub's
+LogoAway filter and is also known from ffdshow and is the
+abbreviation for "Uwe's Great LogoAway Remove Mode".
+@end table
+The default value is @code{xy}.
+
+@item exponent
+Specify the exponent used to calculate the weight out of the
+distance in @code{uglarm} mode (weight = distance ^ @var{exponent}).
+The value @code{0} results in a logo area which has
+the same average color everywhere. The higher the value, the more
+relevant a near border pixel will get, meaning that the borders of
+the logo area are more similar to the surrounding pixels. The default
+value is @code{3}.
+
 @item show
 When set to 1, a green rectangle is drawn on the screen to simplify
 finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters.
diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c
index 065d093641..dcb0366e63 100644
--- a/libavfilter/vf_delogo.c
+++ b/libavfilter/vf_delogo.c
@@ -2,6 +2,7 @@
  * Copyright (c) 2002 Jindrich Makovicka
  * Copyright (c) 2011 Stefano Sabatini
  * Copyright (c) 2013, 2015 Jean Delvare
+ * Copyright (c) 2018 Uwe Freese
  *
  * This file is part of FFmpeg.
  *
@@ -25,12 +26,16 @@
  * A very simple tv station logo remover
  * Originally imported from MPlayer libmpcodecs/vf_delogo.c,
  * the algorithm was later improved.
+ * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual
+ * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's
+ * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg.
  */
 
 #include "libavutil/common.h"
 #include "libavutil/imgutils.h"
 #include "libavutil/opt.h"
 #include "libavutil/pixdesc.h"
+#include "libavutil/avassert.h"
 #include "avfilter.h"
 #include "formats.h"
 #include "internal.h"
@@ -50,6 +55,10 @@
  * @param logo_w width of the logo
  * @param logo_h height of the logo
  * @param band   the size of the band around the processed area
+ * @param *uglarmtable      pointer to weight table in UGLARM interpolation mode,
+ *                          zero when x-y mode is used
+ * @param *uglarmweightsum  pointer to weight sum table in UGLARM interpolation mode,
+ *                          zero when x-y mode is used
  * @param show   show a rectangle around the processed area, useful for
  *               parameters tweaking
  * @param direct if non-zero perform in-place processing
@@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
                          uint8_t *src, int src_linesize,
                          int w, int h, AVRational sar,
                          int logo_x, int logo_y, int logo_w, int logo_h,
-                         unsigned int band, int show, int direct)
+                         unsigned int band, double *uglarmtable,
+                         double *uglarmweightsum, int show, int direct)
 {
     int x, y;
     uint64_t interp, weightl, weightr, weightt, weightb, weight;
@@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
     dst += (logo_y1 + 1) * dst_linesize;
     src += (logo_y1 + 1) * src_linesize;
 
+    if (!uglarmtable) {
     for (y = logo_y1+1; y < logo_y2; y++) {
         left_sample = topleft[src_linesize*(y-logo_y1)] +
                       topleft[src_linesize*(y-logo_y1-1)] +
@@ -151,12 +162,125 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
         dst += dst_linesize;
         src += src_linesize;
     }
+    } else {
+        int bx, by;
+        double interpd;
+
+        for (y = logo_y1 + 1; y < logo_y2; y++) {
+            for (x = logo_x1 + 1,
+                 xdst = dst + logo_x1 + 1,
+                 xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++, xsrc++) {
+
+                if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 ||
+                             x == logo_x1 + 1 || x == logo_x2 - 1)) {
+                    *xdst = 0;
+                    continue;
+                }
+
+                interpd = 0;
+
+                for (bx = 0; bx < logo_w; bx++) {
+                    interpd += topleft[bx] *
+                        uglarmtable[abs(bx - (x - logo_x1)) + (y - logo_y1) * (logo_w - 1)];
+                    interpd += botleft[bx] *
+                        uglarmtable[abs(bx - (x - logo_x1)) + (logo_h - (y - logo_y1) - 1) * (logo_w - 1)];
+                }
+
+                for (by = 1; by < logo_h - 1; by++) {
+                    interpd += topleft[by * src_linesize] *
+                        uglarmtable[(x - logo_x1) + abs(by - (y - logo_y1)) * (logo_w - 1)];
+                    interpd += topleft[by * src_linesize + (logo_w - 1)] *
+                        uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by - (y - logo_y1)) * (logo_w - 1)];
+                }
+
+                interp = (uint64_t)(interpd /
+                    uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]);
+                *xdst = interp;
+            }
+
+            dst += dst_linesize;
+            src += src_linesize;
+        }
+    }
+}
+
+/**
+ * Calculate the lookup tables to be used in UGLARM interpolation mode.
+ *
+ * @param *uglarmtable      Pointer to table containing weights for each possible
+ *                          diagonal distance between a border pixel and an inner
+ *                          logo pixel.
+ * @param *uglarmweightsum  Pointer to a table containing the weight sum to divide
+ *                          by for each pixel within the logo area.
+ * @param sar               The sar to take into account when calculating lookup
+ *                          tables.
+ * @param logo_w            width of the logo
+ * @param logo_h            height of the logo
+ * @param exponent          exponent used in uglarm interpolation
+ */
+static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum,
+                               AVRational sar, int logo_w, int logo_h,
+                               float exponent)
+{
+    double aspect = (double)sar.num / sar.den;
+    int x, y;
+
+    /* uglarmtable will contain a weight for each possible diagonal distance
+     * between a border pixel and an inner logo pixel. The maximum distance in
+     * each direction between border and an inner pixel can be logo_w - 1. The
+     * weight of a border pixel which is x,y pixels away is stored at position
+     * x + y * (logo_w - 1). */
+    for (y = 0; y < logo_h - 1; y++)
+        for (x = 0; x < logo_w - 1; x++) {
+            if (x + y != 0) {
+                double d = pow(sqrt(x * x * aspect * aspect + y * y), exponent);
+                uglarmtable[x + y * (logo_w - 1)] = 1.0 / d;
+            } else {
+                uglarmtable[x + y * (logo_w - 1)] = 1.0;
+            }
+        }
+
+    /* uglarmweightsum will contain the sum of all weights which is used when
+     * an inner pixel of the logo at position x,y is calculated out of the
+     * border pixels. The aggregated value has to be divided by that. The value
+     * to use for the inner 1-based logo position x,y is stored at
+     * (x - 1) + (y - 1) * (logo_w - 2). */
+    for (y = 1; y < logo_h - 1; y++)
+        for (x = 1; x < logo_w - 1; x++) {
+            double weightsum = 0;
+
+            for (int bx = 0; bx < logo_w; bx++) {
+                /* top border */
+                weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)];
+                /* bottom border */
+                weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)];
+            }
+
+            for (int by = 1; by < logo_h - 1; by++) {
+                /* left border */
+                weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)];
+                /* right border */
+                weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * (logo_w - 1)];
+            }
+
+            uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum;
+        }
 }
 
+enum mode {
+    MODE_XY,
+    MODE_UGLARM,
+    MODE_NB
+};
+
+#define MAX_PLANES 10
+
 typedef struct DelogoContext {
     const AVClass *class;
-    int x, y, w, h, band, show;
-} DelogoContext;
+    int x, y, w, h, band, mode, show;
+    float exponent;
+    double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES];
+} DelogoContext;
 
 #define OFFSET(x) offsetof(DelogoContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
@@ -171,6 +295,10 @@ static const AVOption delogo_options[]= {
     { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS },
     { "t", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS },
 #endif
+    { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" },
+    { "xy", "use pixels in straight x and y direction", OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY }, 0, 0, FLAGS, "mode" },
+    { "uglarm", "UGLARM mode, use full border", OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" },
+    { "exponent","exponent used for UGLARM interpolation", OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS },
     { "show", "show delogo area", OFFSET(show), AV_OPT_TYPE_BOOL,{ .i64 = 0 }, 0, 1, FLAGS },
     { NULL }
 };
@@ -215,8 +343,8 @@ static av_cold int init(AVFilterContext *ctx)
 #else
     s->band = 1;
 #endif
-    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n",
-           s->x, s->y, s->w, s->h, s->band, s->show);
+    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d exponent:%f show:%d\n",
+           s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, s->show);
 
     s->w += s->band*2;
     s->h += s->band*2;
@@ -226,6 +354,18 @@ static av_cold int init(AVFilterContext *ctx)
     return 0;
 }
 
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    DelogoContext *s = ctx->priv;
+
+    if (s->mode == MODE_UGLARM) {
+        for (int plane = 0; plane < MAX_PLANES; plane++) {
+            av_freep(&s->uglarmtable[plane]);
+            av_freep(&s->uglarmweightsum[plane]);
+        }
+    }
+}
+
 static int config_input(AVFilterLink *inlink)
 {
     DelogoContext *s = inlink->dst->priv;
@@ -270,20 +410,50 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
     if (!sar.num)
         sar.num = sar.den = 1;
 
+    if (s->mode == MODE_UGLARM)
+        av_assert0(desc->nb_components <= MAX_PLANES);
+
     for (plane = 0; plane < desc->nb_components; plane++) {
         int hsub = plane == 1 || plane == 2 ? hsub0 : 0;
         int vsub = plane == 1 || plane == 2 ? vsub0 : 0;
 
+        /* Up and left borders were rounded down, inject lost bits
+         * into width and height to avoid error accumulation */
+        int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub);
+        int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub);
+
+        if (s->mode == MODE_UGLARM) {
+            if (!s->uglarmtable[plane]) {
+                s->uglarmtable[plane] =
+                    (double*)av_malloc((logo_w - 1) * (logo_h - 1) * sizeof(double));
+
+                if (!s->uglarmtable[plane]) {
+                    return AVERROR(ENOMEM);
+                }
+
+                s->uglarmweightsum[plane] =
+                    (double*)av_malloc((logo_w - 2) * (logo_h - 2) * sizeof(double));
+
+                if (!s->uglarmweightsum[plane]) {
+                    return AVERROR(ENOMEM);
+                }
+
+                calc_uglarm_tables(s->uglarmtable[plane],
+                                   s->uglarmweightsum[plane],
+                                   sar, logo_w, logo_h, s->exponent);
+            }
+        }
+
         apply_delogo(out->data[plane], out->linesize[plane],
                      in ->data[plane], in ->linesize[plane],
                      AV_CEIL_RSHIFT(inlink->w, hsub),
                      AV_CEIL_RSHIFT(inlink->h, vsub),
                      sar, s->x>>hsub, s->y>>vsub,
-                     /* Up and left borders were rounded down, inject lost bits
-                      * into width and height to avoid error accumulation */
-                     AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub),
-                     AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub),
+                     logo_w,
+                     logo_h,
                      s->band>>FFMIN(hsub, vsub),
+                     s->uglarmtable[plane],
+                     s->uglarmweightsum[plane],
                      s->show, direct);
     }
 
@@ -317,6 +487,7 @@ AVFilter ff_vf_delogo = {
     .priv_size     = sizeof(DelogoContext),
     .priv_class    = &delogo_class,
     .init          = init,
+    .uninit        = uninit,
     .query_formats = query_formats,
     .inputs        = avfilter_vf_delogo_inputs,
     .outputs       = avfilter_vf_delogo_outputs,
-- 
2.11.0