From patchwork Thu Mar 28 21:01:38 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ulf Zibis
X-Patchwork-Id: 12523
Return-Path:
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
 by ffaux.localdomain (Postfix) with ESMTP id B47744497F2
 for ; Thu, 28 Mar 2019 23:01:48 +0200 (EET)
Received: from [127.0.1.1] (localhost [127.0.0.1])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 93F7568A965;
 Thu, 28 Mar 2019 23:01:48 +0200 (EET)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from wp215.webpack.hosteurope.de (wp215.webpack.hosteurope.de [80.237.132.222])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id E7180689D16
 for ; Thu, 28 Mar 2019 23:01:41 +0200 (EET)
Received: from dslb-088-077-117-041.088.077.pools.vodafone-ip.de ([88.77.117.41] helo=[192.168.178.140]);
 authenticated by wp215.webpack.hosteurope.de running ExIM with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 id 1h9c9b-0007Ih-5G; Thu, 28 Mar 2019 22:01:41 +0100
To: ffmpeg-devel@ffmpeg.org
References: <20190311232534.GG31978@sunshine.barsnick.net>
From: Ulf Zibis
Message-ID: <8760af5a-88ad-1e6e-8515-1d6b0b88ba6d@CoSoCo.de>
Date: Thu, 28 Mar 2019 22:01:38 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1
MIME-Version: 1.0
In-Reply-To:
Content-Language: de-DE
X-bounce-key: webpack.hosteurope.de; ulf.zibis@cosoco.de; 1553806906; ce7ecb85;
X-HE-SMSGID: 1h9c9b-0007Ih-5G
Subject: Re: [FFmpeg-devel] =?utf-8?q?=5BPatch=5D_beautified_+_accelerated_vf?=
 =?utf-8?q?=5Ffillborders_=E2=80=93_Please_review?=
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions
 and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

Hi again,

On 25.03.19 at 12:31, Ulf Zibis wrote:
>> There are two patches "1", one with wrong indentation.
> I intentionally provided 2 patches with the same number, one for
> the code base and one with additions for the benchmark. I've caught the
> wrong indentation, hopefully at the place you meant.
>
> I'm preparing a new set of patches to follow your advice.
>
>> Do I read the results correctly that for all patches some cases
>> get faster and others get slower?
> Correct. I'm wondering about the cases where it gets so much slower. So
> I'm interested in an answer from you experienced developers to
> understand this. Maybe a compiler option would help.

Here is my new set of patches. Most of the patches are more or less
cosmetic, but they prepare the ground for the essential patch, #9. As the
benchmark log included in vf_fillbd_benchmark_9.patch shows, I have
attained a performance gain of up to 45 %. It is remarkable that in
several cases processing 16-bit planes is faster than processing 8-bit
planes of the same image dimensions.

Regards,
-Ulf

From b75d289b25c4eb764b76f035c028647a56e52899 Mon Sep 17 00:00:00 2001
From: Ulf Zibis
Date: 28.03.2019, 20:32:33

avfilter/fillborders: move definitions to their context, also to reduce their scope

diff --git a/libavfilter/vf_fillborders.c b/libavfilter/vf_fillborders.c
index 3757351..7cf6acd 100644
--- a/libavfilter/vf_fillborders.c
+++ b/libavfilter/vf_fillborders.c
@@ -24,9 +24,6 @@
 #include "drawutils.h"
 #include "internal.h"
 
-enum { Y, U, V, A };
-enum { R, G, B };
-
 enum FillMode { FM_SMEAR, FM_MIRROR, FM_FIXED, FM_NB_MODES };
 
 typedef struct Borders {
@@ -66,33 +63,11 @@
     { NULL }
 };
 
-static int query_formats(AVFilterContext *ctx)
-{
-    static const enum AVPixelFormat pix_fmts[] = {
-        AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV440P,
-        AV_PIX_FMT_YUVJ444P, AV_PIX_FMT_YUVJ440P,
-        AV_PIX_FMT_YUVA422P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUVA420P, AV_PIX_FMT_YUV420P,
-        AV_PIX_FMT_YUVJ422P, AV_PIX_FMT_YUVJ420P,
-        AV_PIX_FMT_YUVJ411P, AV_PIX_FMT_YUV411P, AV_PIX_FMT_YUV410P,
-        AV_PIX_FMT_YUV420P9, AV_PIX_FMT_YUV422P9, AV_PIX_FMT_YUV444P9,
-        AV_PIX_FMT_YUV420P10, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV444P10,
-        AV_PIX_FMT_YUV420P12, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_YUV440P12,
-        AV_PIX_FMT_YUV420P14, AV_PIX_FMT_YUV422P14, AV_PIX_FMT_YUV444P14,
-        AV_PIX_FMT_YUV420P16, AV_PIX_FMT_YUV422P16, AV_PIX_FMT_YUV444P16,
-        AV_PIX_FMT_YUVA420P9, AV_PIX_FMT_YUVA422P9, AV_PIX_FMT_YUVA444P9,
-        AV_PIX_FMT_YUVA420P10, AV_PIX_FMT_YUVA422P10, AV_PIX_FMT_YUVA444P10,
-        AV_PIX_FMT_YUVA420P16, AV_PIX_FMT_YUVA422P16, AV_PIX_FMT_YUVA444P16,
-        AV_PIX_FMT_GBRP, AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10,
-        AV_PIX_FMT_GBRP12, AV_PIX_FMT_GBRP14, AV_PIX_FMT_GBRP16,
-        AV_PIX_FMT_GBRAP, AV_PIX_FMT_GBRAP10, AV_PIX_FMT_GBRAP12, AV_PIX_FMT_GBRAP16,
-        AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY9, AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16,
-        AV_PIX_FMT_NONE
-    };
-    AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
-    if (!fmts_list)
-        return AVERROR(ENOMEM);
-    return ff_set_common_formats(ctx, fmts_list);
-}
+AVFILTER_DEFINE_CLASS(fillborders);
+
+/***********************/
+/*  Private functions  */
+/***********************/
 
 static void smear_borders8(FillBordersContext *s, AVFrame *frame)
 {
@@ -270,11 +245,14 @@
 
 static char testcase[128];
 
+/*****************************/
+/*  Global access functions  */
+/*****************************/
+
 static int config_props(AVFilterLink *inlink)
 {
     AVFilterContext *ctx = inlink->dst;
     FillBordersContext *s = ctx->priv;
-    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format);
 
     if (inlink->w < s->left + s->right ||
         inlink->w <= s->left ||
@@ -290,6 +268,7 @@
         return AVERROR(EINVAL);
     }
 
+    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(inlink->format);
     s->nb_planes = desc->nb_components;
     s->depth = desc->comp[0].depth;
 
@@ -317,6 +296,8 @@
         for (int i = 0; i < sizeof(rgba_map); i++)
             s->fill[rgba_map[i]] = s->rgba_color[i];
     } else {
+        enum { Y, U, V, A };
+        enum { R, G, B };
         s->yuv_color[Y] = RGB_TO_Y_CCIR(s->rgba_color[R], s->rgba_color[G], s->rgba_color[B]);
         s->yuv_color[U] = RGB_TO_U_CCIR(s->rgba_color[R], s->rgba_color[G], s->rgba_color[B], 0);
         s->yuv_color[V] = RGB_TO_V_CCIR(s->rgba_color[R], s->rgba_color[G], s->rgba_color[B], 0);
@@ -348,7 +329,33 @@
     return ff_filter_frame(inlink->dst->outputs[0], frame);
 }
 
-AVFILTER_DEFINE_CLASS(fillborders);
+static int query_formats(AVFilterContext *ctx)
+{
+    static const enum AVPixelFormat pix_fmts[] = {
+        AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV440P,
+        AV_PIX_FMT_YUVJ444P, AV_PIX_FMT_YUVJ440P,
+        AV_PIX_FMT_YUVA422P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUVA420P, AV_PIX_FMT_YUV420P,
+        AV_PIX_FMT_YUVJ422P, AV_PIX_FMT_YUVJ420P,
+        AV_PIX_FMT_YUVJ411P, AV_PIX_FMT_YUV411P, AV_PIX_FMT_YUV410P,
+        AV_PIX_FMT_YUV420P9, AV_PIX_FMT_YUV422P9, AV_PIX_FMT_YUV444P9,
+        AV_PIX_FMT_YUV420P10, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV444P10,
+        AV_PIX_FMT_YUV420P12, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_YUV440P12,
+        AV_PIX_FMT_YUV420P14, AV_PIX_FMT_YUV422P14, AV_PIX_FMT_YUV444P14,
+        AV_PIX_FMT_YUV420P16, AV_PIX_FMT_YUV422P16, AV_PIX_FMT_YUV444P16,
+        AV_PIX_FMT_YUVA420P9, AV_PIX_FMT_YUVA422P9, AV_PIX_FMT_YUVA444P9,
+        AV_PIX_FMT_YUVA420P10, AV_PIX_FMT_YUVA422P10, AV_PIX_FMT_YUVA444P10,
+        AV_PIX_FMT_YUVA420P16, AV_PIX_FMT_YUVA422P16, AV_PIX_FMT_YUVA444P16,
+        AV_PIX_FMT_GBRP, AV_PIX_FMT_GBRP9, AV_PIX_FMT_GBRP10,
+        AV_PIX_FMT_GBRP12, AV_PIX_FMT_GBRP14, AV_PIX_FMT_GBRP16,
+        AV_PIX_FMT_GBRAP, AV_PIX_FMT_GBRAP10, AV_PIX_FMT_GBRAP12, AV_PIX_FMT_GBRAP16,
+        AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY9, AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16,
+        AV_PIX_FMT_NONE
+    };
+    AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
+    if (!fmts_list)
+        return AVERROR(ENOMEM);
+    return ff_set_common_formats(ctx, fmts_list);
+}
 
 static const AVFilterPad fillborders_inputs[] = {
     {