Message ID | d803ca1a-72f1-98cb-7f61-952be8421162@gmx.de |
---|---|
State | Superseded |
Headers | show |
Uwe Freese (2019-01-01): > > This can be optimized, and since it is the inner loop of the filter, I > > think it is worth it. You can declare a pointer that will stay the same > > for the whole y loop: > > > > double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1); > > > > and then use it in that loop: > > > > interpd += ... * xweight[abs(bx - (x - logo_x1))]; > > > > It avoids a lot of multiplications. > > I'm not sure about this point. First, I would absolutely assume from the > compiler that it optimizes expressions like "(logo_w - 1)" or a fixed offset > to access the array in the loop and that they are only calculated once. Relying a lot on compiler optimizations is a sure way of being disappointed. But the logo_w_minus_one variable was not about optimization but about organization. Each time there is the same computation at several places, it is a sign that a specific variable should probably be used. That way, if the design changes a little, only the initialization of the variable needs to be changed. In this particular instance, logo_w_minus_one was not a good name (it was for explaining); table_stride would be better. That way, if you decide to change the structure of the table, you do not need to look at all uses of logo_w, you just need to update the initialization of table_stride. > Secondly, I don't exactly understand how *xweight in your example should be > used (and would mean smaller or easier code). XXX > > > + interp = (uint64_t)(interpd / > > > + uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]); > > The cast to uint64_t seems suspicious. Can you explain? > > Every pixel value of the inner logo area is an integer. Only for the > calculation of the weighted average, all values use floats. At the end, it > is rounded (truncated) to an int. int and uint64_t are not the same thing. Why uint64_t? > > pow(x * x * aspect2 + y * y, exponent / 2); > > Hm. Again, I'm unsure if this "optimization" is worth it. I wouldn't do this > normally. In this particular instance, definitely yes. Compilers have very little latitude to optimize floating point operations, and they will certainly not optimize mathematical functions based on subtle arithmetic properties. This is a C compiler, not a TI-89. > > But I have a better suggestion: > > > > #define MAX_SUB 2 > > > > double uglarmtable[MAX_SUB + 1][MAX_SUB + 1] > > > > and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the > > table will be computed only once for Y and A and once for U and V. > > > > The assert is still needed with that solution, though. > > I don't understand this. Please provide a complete example, or it stays as > it is. - and could of course be optimized later. I do not see how it could be more complete without including code that is irrelevant to the example. But since you insist: #define MAX_SUB 2 typedef struct DelogoContext { ... double uglarmtable[MAX_SUB + 1][MAX_SUB + 1] ... } ... av_assert0(hsub <= MAX_SUB && vsub <= MAX_SUB); if (!s->uglarmtable[hsub][vsub]) s->uglarmtable[hsub][vsub] = av_malloc_array(...); ... calc_uglarm_tables(s->uglarmtable[hsub][vsub], s->uglarmweightsum[hsub][vsub]); ... apply_delogo(..., s->uglarmtable[hsub][vsub], s->uglarmweightsum[hsub][vsub]); > But why should the for loop run in xy mode and check "planes count" times to > free the memory? The code without the "if" also looks to me more like this > is not checked by mistake. This is the usual way this project does things: free everything unconditionally. The rationale is that it is less likely to lead to leaks in case of code change. > Hope that the code is correct like this? > > s->uglarmtable[plane] = > av_malloc_array((logo_w - 1) * (logo_h - 1), sizeof(s->uglarmtable[plane])); Sorry, no: you are taking the size of the pointer uglarmtable[plane]; you need to take the size of the pointed objects *uglarmtable[plane]. Regards,
Nicolas George (2019-01-02): > Uwe Freese (2019-01-01): > > > This can be optimized, and since it is the inner loop of the filter, I > > > think it is worth it. You can declare a pointer that will stay the same > > > for the whole y loop: > > > > > > double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1); > > > > > > and then use it in that loop: > > > > > > interpd += ... * xweight[abs(bx - (x - logo_x1))]; > > > > > > It avoids a lot of multiplications. > > > Secondly, I don't exactly understand how *xweight in your example should be > > used (and would mean smaller or easier code). > > XXX Sorry, forgot to fill that part before sending. Completely untested code: double *table_t, *table_b, *table_l, *table_r; table_t = uglarmtable + x + table_stride * y; table_b = uglarmtable + x + table_stride * (logo_h - y - 1); table_l = uglarmtable + table_stride * (y - 1) + x; table_r = uglarmtable + table_stride * (y - 1) + logo_w - x - 1; /* top+bottom on the left of the current point */ for (bx = 0; bx < x; bx++) { weightsum += table_t; weightsum += table_b; table_t--; table_b--; } /* top+bottom on the right of the current point */ for (; bx < logo_w; bx++) { weightsum += table_t; weightsum += table_b; table_t++; table_b++; } /* left+right above the current point */ for (by = 1; by < y; by++) { weightsum += *table_l; weightsum += *table_r; table_l -= table_stride; table_r -= table_stride; } /* left+right below the current point */ for (; by < logo_w - 1; by++) { weightsum += *table_l; weightsum += *table_r; table_l += table_stride; table_r += table_stride; } av_assert2(table_t == uglarmtable + (logo_w - x) + table_stride * y); av_assert2(table_b == uglarmtable + (logo_w - x) + table_stride * (logo_h - y - 1)); av_assert2(table_l == uglarmtable + table_stride * (logo_h - y - 1) + x); av_assert2(table_r == uglarmtable + table_stride * (logo_h - y - 1) + logo_w - x - 1; That makes more lines, but the lines are way simpler: no tricky arithmetic, all blocks almost identical, with the changes easy to see and understand. Regards,
Hi, just an info: code didn't compile, don't know why I didn't see this... New patch is in work... Regards, Uwe Am 01.01.19 um 22:14 schrieb Uwe Freese: > Hello, > > here's a new version of the patch. > > Thanks for the infos. I used the raw output of a small test video > (where delogo is applied in both modes) before and after the changes > to make sure the output is bytewise identical (the changes don't > change the output). > > > In general I want to say that it seems to me that *some* of the points > go now in a more philosophical direction. > > I prefer code easy to understand and assume that compilers optimize > much. Yes, there may be many possibilities to probably "optimize" > something, but I tend to believe that many times this is not needed > nor helpful, because it doesn't really optimize and complicates the code. > > Additionally, when I don't have complete and functioning code to > replace my code (the same what's expected from me), I'm not willing to > spend many hours to guess what could be meant. > > I hope that not so many additional patches are needed anymore. I > understand that coding styles, syntax etc. shall be corrected, but > hope that we don't have to discuss too much alternative ways of > implementing basically the same. > > > > Changes since the last version of the patch (mail from 29.12. 21:38), > according the comments since then: > > - change include order at the beginning of the file > - change loop format (linebreaks) > - change loop borders > - change indentation (line 175) > - moved init code (create tables) to config_input function > > - used av_malloc_array instead of av_malloc. Avoid the cast. Use > variable in sizeof(...). > - copyright 2018, 2019 - happy new year BTW ;-) > > > Comments and remaining open points from the mails: > >>> + for (y = logo_y1 + 1; y < logo_y2; y++) { >>> + for (x = logo_x1 + 1, >>> + xdst = dst + logo_x1 + 1, >>> + xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++, >>> xsrc++) { >> I think a line break after the semicolons would make this easier to >> understand. > > Hm. I used the same format as in the original code. > > Nonetheless, I changed the format now, because I changed the loop > borders as requested and the loops are now different anyway. > >>> + for (bx = 0; bx < logo_w; bx++) { >>> + interpd += topleft[bx] * >>> + uglarmtable[abs(bx - (x - logo_x1)) + (y - >>> logo_y1) * (logo_w - 1)]; >>> + interpd += botleft[bx] * >>> + uglarmtable[abs(bx - (x - logo_x1)) + >>> (logo_h - (y - logo_y1) - 1) * (logo_w - 1)]; >>> + } >> This can be optimized, and since it is the inner loop of the filter, I >> think it is worth it. You can declare a pointer that will stay the same >> for the whole y loop: >> >> double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1); >> >> and then use it in that loop: >> >> interpd += ... * xweight[abs(bx - (x - logo_x1))]; >> >> It avoids a lot of multiplications. > > I'm not sure about this point. First, I would absolutely assume from > the compiler that it optimizes expressions like "(logo_w - 1)" or a > fixed offset to access the array in the loop and that they are only > calculated once. > > Secondly, I don't exactly understand how *xweight in your example > should be used (and would mean smaller or easier code). > > @All: What is the opinion about changing this loop regarding use of an > additional xweight pointer? > >> >>> + >>> + for (by = 1; by < logo_h - 1; by++) { >>> + interpd += topleft[by * src_linesize] * >>> + uglarmtable[(x - logo_x1) + abs(by - (y - >>> logo_y1)) * (logo_w - 1)]; >>> + interpd += topleft[by * src_linesize + (logo_w >>> - 1)] * >>> + uglarmtable[logo_w - (x - logo_x1) - 1 + >>> abs(by - (y - logo_y1)) * (logo_w - 1)]; >>> + } >> This one is more tricky to optimize, because of the abs() within the >> multiplication, but it can be done by splitting the loop in two: >> >> for (by = 1; by < y; by++) { >> interpd += ... * yweight[x - logo_x1]; >> yweight -= logo_w_minus_one; >> } >> for (; by < logo_h_minus_one; by++) { >> interpd += ... * yweight[x - logo_x1]; >> yweight += logo_w_minus_one; >> } >> av_assert2(yweight == the_correct_value); >> >> In fact, I think even the x loop would be a little more readable if it >> was split in two like that. > Sorry, I don't understand. Please give me a complete example that > replaces the previous for loop. >>> + >>> + interp = (uint64_t)(interpd / >>> + uglarmweightsum[(x - logo_x1) - 1 + (y - >>> logo_y1 - 1) * (logo_w - 2)]); >> The cast to uint64_t seems suspicious. Can you explain? > > Every pixel value of the inner logo area is an integer. Only for the > calculation of the weighted average, all values use floats. At the > end, it is rounded (truncated) to an int. > > Should work - and did work for 17 years... > >> >>> + *xdst = interp; >>> + } >>> + >>> + dst += dst_linesize; >>> + src += src_linesize; >>> + } >>> + } >>> +} >>> + >>> +/** >>> + * Calculate the lookup tables to be used in UGLARM interpolation >>> mode. >>> + * >>> + * @param *uglarmtable Pointer to table containing weights for >>> each possible >>> + * diagonal distance between a border >>> pixel and an inner >>> + * logo pixel. >>> + * @param *uglarmweightsum Pointer to a table containing the >>> weight sum to divide >>> + * by for each pixel within the logo area. >>> + * @param sar The sar to take into account when >>> calculating lookup >>> + * tables. >>> + * @param logo_w width of the logo >>> + * @param logo_h height of the logo >>> + * @param exponent exponent used in uglarm interpolation >>> + */ >>> +static void calc_uglarm_tables(double *uglarmtable, double >>> *uglarmweightsum, >>> + AVRational sar, int logo_w, int logo_h, >>> + float exponent) >>> +{ >>> + double aspect = (double)sar.num / sar.den; >> Tiny optimization: >> >> double aspect2 = aspect * aspect; >> >> for use later. > > I wouldn't consider this an optimization. First, I assume the compiler > to only calculate "aspect * aspect" once in that loop (as well as "y * > y" which doesn't change in the inner loop). Second, the code with > using the additional variable makes the code more complex and third, > this loop is only calculated once at startup. > > @All: What are your opinions? > >> >>> + double d = pow(sqrt(x * x * aspect * aspect + y * >>> y), exponent); >> pow(x * x * aspect2 + y * y, exponent / 2); > > Hm. Again, I'm unsure if this "optimization" is worth it. I wouldn't > do this normally. > Reasons: Again, I think the compiler would optimize it. At least the > compiled ffmpeg binaries are exactly the same size... And the original > code explains better the calculation: Calculate the distance > (*Pythagorean theorem, c = sqrt(a² + b²), and then calculate the > weighted distance value with the power. Again, this is only run once > at startup > * > > *@All: Other opinions? > * > >> >>> +#define MAX_PLANES 10 >> You could use FF_ARRAY_ELEMS(AVPixFmtDescriptor.comp), and the >> consistency would be guaranteed. Well, that syntax is not valid, but it >> can be expressed: >> >> #define MAX_PLANES FF_ARRAY_ELEMS(((AVPixFmtDescriptor *)0)->comp) >> >> But I have a better suggestion: >> >> #define MAX_SUB 2 >> >> double uglarmtable[MAX_SUB + 1][MAX_SUB + 1] >> >> and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the >> table will be computed only once for Y and A and once for U and V. >> >> The assert is still needed with that solution, though. > > I don't understand this. Please provide a complete example, or it > stays as it is. - and could of course be optimized later. > >>> + if (s->mode == MODE_UGLARM) { >> No need to test, we can free anyway. > > But why should the for loop run in xy mode and check "planes count" > times to free the memory? The code without the "if" also looks to me > more like this is not checked by mistake. > > @All: Opinions? > > >> [Carl Eugen] While there, please don't use sizeof(type) but >> sizeof(variable), > same below. > > Hope that the code is correct like this? > > s->uglarmtable[plane] = > av_malloc_array((logo_w - 1) * (logo_h - 1), > sizeof(s->uglarmtable[plane])); > > Regards, > Uwe > > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
From 001e57532567de410ecfa860df99610620dcd385 Mon Sep 17 00:00:00 2001 From: breaker27 <mail@uwe-freese.de> Date: Wed, 26 Dec 2018 18:16:48 +0100 Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode --- doc/filters.texi | 28 ++++++++ libavfilter/vf_delogo.c | 182 ++++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 205 insertions(+), 5 deletions(-) diff --git a/doc/filters.texi b/doc/filters.texi index 65ce25bc18..792560ad79 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the rectangle (added to deprecated, setting higher values should no longer be necessary and is not recommended. +@item mode +Specify the interpolation mode used to remove the logo. +It accepts the following values: +@table @option +@item xy +Only the border pixels in straight x and y direction are considered +to replace any logo pixel. This mode tends to flicker and to show +horizontal and vertical lines. +@item uglarm +Consider the whole border for every logo pixel to replace. This mode +uses the distance raised to the power of the given @var{exponent} as +weight that decides to which amount every border pixel is taken into +account. This results in a more blurred area, which tends to be less +distracting. The uglarm mode was first implemented in VirtualDub's +LogoAway filter and is also known from ffdshow and is the +abbreviation for "Uwe's Great LogoAway Remove Mode". +@end table +The default value is @code{xy}. + +@item exponent +Specify the exponent used to calculate the weight out of the +distance in @code{uglarm} mode (weight = distance ^ @var{exponent}). +The value @code{0} results in a logo area which has +the same average color everywhere. The higher the value, the more +relevant a near border pixel will get, meaning that the borders of +the logo area are more similar to the surrounding pixels. The default +value is @code{3}. + @item show When set to 1, a green rectangle is drawn on the screen to simplify finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters. diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c index 065d093641..45f2f9cd3f 100644 --- a/libavfilter/vf_delogo.c +++ b/libavfilter/vf_delogo.c @@ -2,6 +2,7 @@ * Copyright (c) 2002 Jindrich Makovicka <makovick@gmail.com> * Copyright (c) 2011 Stefano Sabatini * Copyright (c) 2013, 2015 Jean Delvare <jdelvare@suse.com> + * Copyright (c) 2018, 2019 Uwe Freese <mail@uwe-freese.de> * * This file is part of FFmpeg. * @@ -25,8 +26,12 @@ * A very simple tv station logo remover * Originally imported from MPlayer libmpcodecs/vf_delogo.c, * the algorithm was later improved. + * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual + * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's + * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg. */ +#include "libavutil/avassert.h" #include "libavutil/common.h" #include "libavutil/imgutils.h" #include "libavutil/opt.h" @@ -50,6 +55,10 @@ * @param logo_w width of the logo * @param logo_h height of the logo * @param band the size of the band around the processed area + * @param *uglarmtable pointer to weight table in UGLARM interpolation mode, + * zero when x-y mode is used + * @param *uglarmweightsum pointer to weight sum table in UGLARM interpolation mode, + * zero when x-y mode is used * @param show show a rectangle around the processed area, useful for * parameters tweaking * @param direct if non-zero perform in-place processing @@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int dst_linesize, uint8_t *src, int src_linesize, int w, int h, AVRational sar, int logo_x, int logo_y, int logo_w, int logo_h, - unsigned int band, int show, int direct) + unsigned int band, double *uglarmtable, + double *uglarmweightsum, int show, int direct) { int x, y; uint64_t interp, weightl, weightr, weightt, weightb, weight; @@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int dst_linesize, dst += (logo_y1 + 1) * dst_linesize; src += (logo_y1 + 1) * src_linesize; + if (!uglarmtable) { for (y = logo_y1+1; y < logo_y2; y++) { left_sample = topleft[src_linesize*(y-logo_y1)] + topleft[src_linesize*(y-logo_y1-1)] + @@ -151,12 +162,125 @@ static void apply_delogo(uint8_t *dst, int dst_linesize, dst += dst_linesize; src += src_linesize; } + } else { + int bx, by; + double interpd; + + for (y = 1; y < logo_y2 - logo_y1; y++) { + for (x = 1, xdst = dst + logo_x1 + 1, xsrc = src + logo_x1 + 1; + x < logo_x2 - logo_x1; + x++, xdst++, xsrc++) { + + if (show && (y == 1 || y == logo_y2 - logo_y1 - 1 || + x == 1 || x == logo_x2 - logo_x1 - 1)) { + *xdst = 0; + continue; + } + + interpd = 0; + + for (bx = 0; bx < logo_w; bx++) { + interpd += topleft[bx] * + uglarmtable[abs(bx - x) + y * (logo_w - 1)]; + interpd += botleft[bx] * + uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)]; + } + + for (by = 1; by < logo_h - 1; by++) { + interpd += topleft[by * src_linesize] * + uglarmtable[x + abs(by - y) * (logo_w - 1)]; + interpd += topleft[by * src_linesize + (logo_w - 1)] * + uglarmtable[logo_w - x - 1 + abs(by - y) * (logo_w - 1)]; + } + + interp = (uint64_t)(interpd / + uglarmweightsum[x - 1 + (y - 1) * (logo_w - 2)]); + *xdst = interp; + } + + dst += dst_linesize; + src += src_linesize; + } + } +} + +/** + * Calculate the lookup tables to be used in UGLARM interpolation mode. + * + * @param *uglarmtable Pointer to table containing weights for each possible + * diagonal distance between a border pixel and an inner + * logo pixel. + * @param *uglarmweightsum Pointer to a table containing the weight sum to divide + * by for each pixel within the logo area. + * @param sar The sar to take into account when calculating lookup + * tables. + * @param logo_w width of the logo + * @param logo_h height of the logo + * @param exponent exponent used in uglarm interpolation + */ +static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum, + AVRational sar, int logo_w, int logo_h, + float exponent) +{ + double aspect = (double)sar.num / sar.den; + int x, y; + + /* uglarmtable will contain a weight for each possible diagonal distance + * between a border pixel and an inner logo pixel. The maximum distance in + * each direction between border and an inner pixel can be logo_w - 1. The + * weight of a border pixel which is x,y pixels away is stored at position + * x + y * (logo_w - 1). */ + for (y = 0; y < logo_h - 1; y++) + for (x = 0; x < logo_w - 1; x++) { + if (x + y != 0) { + double d = pow(sqrt(x * x * aspect * aspect + y * y), exponent); + uglarmtable[x + y * (logo_w - 1)] = 1.0 / d; + } else { + uglarmtable[x + y * (logo_w - 1)] = 1.0; + } + } + + /* uglarmweightsum will contain the sum of all weights which is used when + * an inner pixel of the logo at position x,y is calculated out of the + * border pixels. The aggregated value has to be divided by that. The value + * to use for the inner 1-based logo position x,y is stored at + * (x - 1) + (y - 1) * (logo_w - 2). */ + for (y = 1; y < logo_h - 1; y++) + for (x = 1; x < logo_w - 1; x++) { + double weightsum = 0; + + for (int bx = 0; bx < logo_w; bx++) { + /* top border */ + weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)]; + /* bottom border */ + weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)]; + } + + for (int by = 1; by < logo_h - 1; by++) { + /* left border */ + weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)]; + /* right border */ + weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * (logo_w - 1)]; + } + + uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum; + } } +enum mode { + MODE_XY, + MODE_UGLARM, + MODE_NB +}; + +#define MAX_PLANES 10 + typedef struct DelogoContext { const AVClass *class; - int x, y, w, h, band, show; -} DelogoContext; + int x, y, w, h, band, mode, show; + float exponent; + double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES]; +} DelogoContext; #define OFFSET(x) offsetof(DelogoContext, x) #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM @@ -171,6 +295,10 @@ static const AVOption delogo_options[]= { { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, { "t", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, #endif + { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" }, + { "xy", "use pixels in straight x and y direction", OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY }, 0, 0, FLAGS, "mode" }, + { "uglarm", "UGLARM mode, use full border", OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" }, + { "exponent","exponent used for UGLARM interpolation", OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS }, { "show", "show delogo area", OFFSET(show), AV_OPT_TYPE_BOOL,{ .i64 = 0 }, 0, 1, FLAGS }, { NULL } }; @@ -215,8 +343,8 @@ static av_cold int init(AVFilterContext *ctx) #else s->band = 1; #endif - av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n", - s->x, s->y, s->w, s->h, s->band, s->show); + av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d exponent:%f show:%d\n", + s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, s->show); s->w += s->band*2; s->h += s->band*2; @@ -226,6 +354,18 @@ static av_cold int init(AVFilterContext *ctx) return 0; } +static av_cold void uninit(AVFilterContext *ctx) +{ + DelogoContext *s = ctx->priv; + + if (s->mode == MODE_UGLARM) { + for (int plane = 0; plane < MAX_PLANES; plane++) { + av_freep(&s->uglarmtable[plane]); + av_freep(&s->uglarmweightsum[plane]); + } + } +} + static int config_input(AVFilterLink *inlink) { DelogoContext *s = inlink->dst->priv; @@ -237,6 +377,32 @@ static int config_input(AVFilterLink *inlink) return AVERROR(EINVAL); } + /* Init lookup tables once */ + if (s->mode == MODE_UGLARM) { + /* Up and left borders were rounded down, inject lost bits + * into width and height to avoid error accumulation */ + int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub); + int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub); + + if (!s->uglarmtable[plane]) { + s->uglarmtable[plane] = + av_malloc_array((logo_w - 1) * (logo_h - 1), sizeof(s->uglarmtable[plane])); + + if (!s->uglarmtable[plane]) + return AVERROR(ENOMEM); + + s->uglarmweightsum[plane] = + av_malloc_array((logo_w - 2) * (logo_h - 2), sizeof(s->uglarmweightsum[plane])); + + if (!s->uglarmweightsum[plane]) + return AVERROR(ENOMEM); + + calc_uglarm_tables(s->uglarmtable[plane], + s->uglarmweightsum[plane], + sar, logo_w, logo_h, s->exponent); + } + } + return 0; } @@ -270,6 +436,9 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in) if (!sar.num) sar.num = sar.den = 1; + if (s->mode == MODE_UGLARM) + av_assert0(desc->nb_components <= MAX_PLANES); + for (plane = 0; plane < desc->nb_components; plane++) { int hsub = plane == 1 || plane == 2 ? hsub0 : 0; int vsub = plane == 1 || plane == 2 ? vsub0 : 0; @@ -284,6 +453,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in) AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub), AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub), s->band>>FFMIN(hsub, vsub), + s->uglarmtable[plane], + s->uglarmweightsum[plane], s->show, direct); } @@ -317,6 +488,7 @@ AVFilter ff_vf_delogo = { .priv_size = sizeof(DelogoContext), .priv_class = &delogo_class, .init = init, + .uninit = uninit, .query_formats = query_formats, .inputs = avfilter_vf_delogo_inputs, .outputs = avfilter_vf_delogo_outputs, -- 2.11.0