[FFmpeg-devel] beautified + accelerated vf_fillborders – Please review

Submitted by Ulf Zibis on March 28, 2019, 9:01 p.m.

Details

Message ID 8760af5a-88ad-1e6e-8515-1d6b0b88ba6d@CoSoCo.de
State New
Headers show

Commit Message

Ulf Zibis March 28, 2019, 9:01 p.m.
Hi again,

Am 25.03.19 um 12:31 schrieb Ulf Zibis:
>> There are two patches "1", one with wrong indentation.
> I intentionally have provided 2 patches with the same number, one for
> the code base an one with additions for the benchmark. I've catched the
> wrong indentation, hopefully at the place you meant.
>
> I'm preparing a new set of patches to follow your advice.
>
>> Do I read the results correctly that for all patches some cases
>> get faster and others get slower?
> Correct. I'm wondering about the cases, where it gets such slower. So
> I'm interested in an answer from you experienced developers to
> understand this. Maybe a compiler option would help.

Here they are, my new set of patches.

The most patches are more or less cosmetic, but good for preparing the
essential patch of #9.

As you can see from the benchmark log included in the
vf_fillbd_benchmark_9.patch I have attained a performance gain up to 45 %.
It is remarkable, that in several cases the processing of 16-bit planes
is often faster as of 8-bit planes of same image dimension.

Regards,

-Ulf

Comments

Carl Eugen Hoyos March 28, 2019, 9:45 p.m.
2019-03-28 22:01 GMT+01:00, Ulf Zibis <Ulf.Zibis@cosoco.de>:
> Hi again,
>
> Am 25.03.19 um 12:31 schrieb Ulf Zibis:
>>> There are two patches "1", one with wrong indentation.
>> I intentionally have provided 2 patches with the same number, one for
>> the code base an one with additions for the benchmark. I've catched the
>> wrong indentation, hopefully at the place you meant.
>>
>> I'm preparing a new set of patches to follow your advice.
>>
>>> Do I read the results correctly that for all patches some cases
>>> get faster and others get slower?
>> Correct. I'm wondering about the cases, where it gets such slower. So
>> I'm interested in an answer from you experienced developers to
>> understand this. Maybe a compiler option would help.
>
> Here they are, my new set of patches.

Patch 1 is wrong.

I don't understand the benchmarks (actually benchmark) you attached
but before sending another patch, you may want to wait for a comment
from Paul, he maintains this filter.

Carl Eugen
Carl Eugen Hoyos March 28, 2019, 9:48 p.m.
2019-03-28 22:45 GMT+01:00, Carl Eugen Hoyos <ceffmpeg@gmail.com>:

> Patch 1 is wrong.
>
> I don't understand the benchmarks

Ok, numer 9 looks like a good idea, either send only this patch or wait
for another comment.

Carl Eugen
Paul B Mahol March 28, 2019, 10:18 p.m.
On 3/28/19, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
> 2019-03-28 22:45 GMT+01:00, Carl Eugen Hoyos <ceffmpeg@gmail.com>:
>
>> Patch 1 is wrong.
>>
>> I don't understand the benchmarks
>
> Ok, numer 9 looks like a good idea, either send only this patch or wait
> for another comment.

Patches are big mess. Until this is fixed I not going to apply this mess.
Carl Eugen Hoyos March 28, 2019, 10:22 p.m.
2019-03-28 23:18 GMT+01:00, Paul B Mahol <onemda@gmail.com>:
> On 3/28/19, Carl Eugen Hoyos <ceffmpeg@gmail.com> wrote:
>> 2019-03-28 22:45 GMT+01:00, Carl Eugen Hoyos <ceffmpeg@gmail.com>:
>>
>>> Patch 1 is wrong.
>>>
>>> I don't understand the benchmarks
>>
>> Ok, numer 9 looks like a good idea, either send only this patch or wait
>> for another comment.
>
> Patches are big mess. Until this is fixed I not going to apply this mess.

More important is probably if you agree to the variable
renaming and code moving or not. If you don't, sending
only patch 9 (which, if the benchmarks are correct,
indeed shows a large speedup) could be resent.
Not sure where you see a "mess" in the latest patchset,
it looks quite clean once patch 1 is removed...

Carl Eugen
Ulf Zibis April 1, 2019, 7:39 p.m.
Hi Carl Eugen,

Am 28.03.19 um 22:45 schrieb Carl Eugen Hoyos:
>> Here they are, my new set of patches.
> Patch 1 is wrong.

Can you please tell me more detailed, what is wrong with the indentations?

-Ulf
Carl Eugen Hoyos April 1, 2019, 8:08 p.m.
> Am 01.04.2019 um 21:39 schrieb Ulf Zibis <Ulf.Zibis@cosoco.de>:
> 
> Hi Carl Eugen,
> 
> Am 28.03.19 um 22:45 schrieb Carl Eugen Hoyos:
>>> Here they are, my new set of patches.
>> Patch 1 is wrong.
> 
> Can you please tell me more detailed, what is wrong with the indentations?

It’s correct as it is now, please don’t change it.

Carl Eugen
Ulf Zibis April 2, 2019, 3:24 p.m.
Am 01.04.19 um 22:08 schrieb Carl Eugen Hoyos:
>> Can you please tell me more detailed, what is wrong with the indentations?
> It’s correct as it is now.

You are sure?
line 125: there is a double space
line 130: the indentation is not aligned with the upper square bracket
lines 310..318: the code is hard to read without highlighting the
line-continuation by extra indentation

All other cases are surly matter of taste, but I prefer unified
line-continuation indentation of 8. This has the advantage, that
renaming a variable, e.g. "ptr" -> "data", does not entail an additional
correction of the indentations each time and all line-continuations are
clearly distinguishable from new lines.

-Ulf

Patch hide | download patch | download mbox

From b75d289b25c4eb764b76f035c028647a56e52899 Mon Sep 17 00:00:00 2001
From: Ulf Zibis <Ulf.Zibis@CoSoCo.de>
Date: 28.03.2019, 20:32:33

avfilter/fillborders: move definitions to their context, also to reduce their scope

diff --git a/libavfilter/vf_fillborders.c b/libavfilter/vf_fillborders.c
index 3757351..7cf6acd 100644
--- a/libavfilter/vf_fillborders.c
+++ b/libavfilter/vf_fillborders.c
@@ -24,9 +24,6 @@ 
 #include "drawutils.h"
 #include "internal.h"
 
-enum { Y, U, V, A };
-enum { R, G, B };
-
 enum FillMode { FM_SMEAR, FM_MIRROR, FM_FIXED, FM_NB_MODES };
 
 typedef struct Borders {
@@ -66,33 +63,11 @@ 
     { NULL }
 };
 
-static int query_formats(AVFilterContext *ctx)
-{
-    static const enum AVPixelFormat pix_fmts[] = {
-        AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV440P,
-        AV_PIX_FMT_YUVJ444P, AV_PIX_FMT_YUVJ440P,
-        AV_PIX_FMT_YUVA422P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUVA420P, AV_PIX_FMT_YUV420P,
-        AV_PIX_FMT_YUVJ422P, AV_PIX_FMT_YUVJ420P,
-        AV_PIX_FMT_YUVJ411P, AV_PIX_FMT_YUV411P, AV_PIX_FMT_YUV410P,
-        AV_PIX_FMT_YUV420P9, AV_PIX_FMT_YUV422P9, AV_PIX_FMT_YUV444P9,
-        AV_PIX_FMT_YUV420P10, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV444P10,
-        AV_PIX_FMT_YUV420P12, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_YUV440P12,
-        AV_PIX_FMT_YUV420P14, AV_PIX_FMT_YUV422P14, AV_PIX_FMT_YUV444P14,
-        AV_PIX_FMT_YUV420P16, AV_PIX_FMT_YUV422P16, AV_PIX_FMT_YUV444P16,
-        AV_PIX_FMT_YUVA420P9, AV_PIX_FMT_YUVA422P9, AV_PIX_FMT_YUVA444P9,
-        AV_PIX_FMT_YUVA420P10, AV_PIX_FMT_YUVA422P10, AV_PIX_FMT_YUVA444P10,
-        AV_PIX_FMT_YUVA420P16, AV_PIX_FMT_YUVA422P16, AV_PIX_FMT_YUVA444P16,
-        AV_PIX_FMT_GBRP, AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10,
-        AV_PIX_FMT_GBRP12, AV_PIX_FMT_GBRP14, AV_PIX_FMT_GBRP16,
-        AV_PIX_FMT_GBRAP, AV_PIX_FMT_GBRAP10, AV_PIX_FMT_GBRAP12, AV_PIX_FMT_GBRAP16,
-        AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY9, AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16,
-        AV_PIX_FMT_NONE
-    };
-    AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
-    if (!fmts_list)
-        return AVERROR(ENOMEM);
-    return ff_set_common_formats(ctx, fmts_list);
-}
+AVFILTER_DEFINE_CLASS(fillborders);
+
+/***********************/
+/*  Private functions  */
+/***********************/
 
 static void smear_borders8(FillBordersContext *s, AVFrame *frame)
 {
@@ -270,11 +245,14 @@ 
 
 static char testcase[128];
 
+/*****************************/
+/*  Global access functions  */
+/*****************************/
+
 static int config_props(AVFilterLink *inlink)
 {
     AVFilterContext *ctx = inlink->dst;
     FillBordersContext *s = ctx->priv;
-    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format);
 
     if (inlink->w < s->left + s->right ||
             inlink->w <= s->left ||
@@ -290,6 +268,7 @@ 
         return AVERROR(EINVAL);
     }
 
+    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format);
     s->nb_planes = desc->nb_components;
     s->depth = desc->comp[0].depth;
 
@@ -317,6 +296,8 @@ 
             for (int i = 0; i < sizeof(rgba_map); i++)
                 s->fill[rgba_map[i]] = s->rgba_color[i];
         } else {
+            enum { Y, U, V, A };
+            enum { R, G, B };
             s->yuv_color[Y] = RGB_TO_Y_CCIR(s->rgba_color[R], s->rgba_color[G], s->rgba_color[B]);
             s->yuv_color[U] = RGB_TO_U_CCIR(s->rgba_color[R], s->rgba_color[G], s->rgba_color[B], 0);
             s->yuv_color[V] = RGB_TO_V_CCIR(s->rgba_color[R], s->rgba_color[G], s->rgba_color[B], 0);
@@ -348,7 +329,33 @@ 
     return ff_filter_frame(inlink->dst->outputs[0], frame);
 }
 
-AVFILTER_DEFINE_CLASS(fillborders);
+static int query_formats(AVFilterContext *ctx)
+{
+    static const enum AVPixelFormat pix_fmts[] = {
+        AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV440P,
+        AV_PIX_FMT_YUVJ444P, AV_PIX_FMT_YUVJ440P,
+        AV_PIX_FMT_YUVA422P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUVA420P, AV_PIX_FMT_YUV420P,
+        AV_PIX_FMT_YUVJ422P, AV_PIX_FMT_YUVJ420P,
+        AV_PIX_FMT_YUVJ411P, AV_PIX_FMT_YUV411P, AV_PIX_FMT_YUV410P,
+        AV_PIX_FMT_YUV420P9, AV_PIX_FMT_YUV422P9, AV_PIX_FMT_YUV444P9,
+        AV_PIX_FMT_YUV420P10, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV444P10,
+        AV_PIX_FMT_YUV420P12, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_YUV440P12,
+        AV_PIX_FMT_YUV420P14, AV_PIX_FMT_YUV422P14, AV_PIX_FMT_YUV444P14,
+        AV_PIX_FMT_YUV420P16, AV_PIX_FMT_YUV422P16, AV_PIX_FMT_YUV444P16,
+        AV_PIX_FMT_YUVA420P9, AV_PIX_FMT_YUVA422P9, AV_PIX_FMT_YUVA444P9,
+        AV_PIX_FMT_YUVA420P10, AV_PIX_FMT_YUVA422P10, AV_PIX_FMT_YUVA444P10,
+        AV_PIX_FMT_YUVA420P16, AV_PIX_FMT_YUVA422P16, AV_PIX_FMT_YUVA444P16,
+        AV_PIX_FMT_GBRP, AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10,
+        AV_PIX_FMT_GBRP12, AV_PIX_FMT_GBRP14, AV_PIX_FMT_GBRP16,
+        AV_PIX_FMT_GBRAP, AV_PIX_FMT_GBRAP10, AV_PIX_FMT_GBRAP12, AV_PIX_FMT_GBRAP16,
+        AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY9, AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16,
+        AV_PIX_FMT_NONE
+    };
+    AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
+    if (!fmts_list)
+        return AVERROR(ENOMEM);
+    return ff_set_common_formats(ctx, fmts_list);
+}
 
 static const AVFilterPad fillborders_inputs[] = {
     {