From patchwork Sat Mar 23 23:13:32 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ulf Zibis
X-Patchwork-Id: 12408
To: ffmpeg-devel@ffmpeg.org
References: <20190311232534.GG31978@sunshine.barsnick.net>
From: Ulf Zibis
Date: Sun, 24 Mar 2019 00:13:32 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1
Content-Language: de-DE
Subject: Re: [FFmpeg-devel] [Patch] beautified + accelerated vf_fillborders – Please review
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

Hi again,

On 19.03.19 at 17:31, Carl Eugen Hoyos wrote:
> One run is not good.
> Either use the loop option to filter the same frame again and
> again or feed a video to ffmpeg.

I have new patches.

Patch 1 is just a small renaming and preparation for the benchmark timer code.

Patch 2 slightly improves performance for cases where only the top and bottom borders are filled.

Patch 3 beautifies the code and really improves the performance. See the results included in the benchmark patch.

-Ulf

From 727ead94327e04b9f1128af97dc8607b5276d739 Mon Sep 17 00:00:00 2001
From: Ulf Zibis
Date: 24.03.2019, 00:02:09

avfilter/fillborders: enhanced readability; side effect: better performance by less indirections in for loops

diff --git a/libavfilter/vf_fillborders.c b/libavfilter/vf_fillborders.c
index 43de099..f6631c3 100644
--- a/libavfilter/vf_fillborders.c
+++ b/libavfilter/vf_fillborders.c
@@ -101,28 +101,30 @@
     for (p = 0; p < s->nb_planes; p++) {
         uint8_t *data = frame->data[p];
         int linesize = frame->linesize[p];
+        int width = s->planewidth[p];
+        int height = s->planeheight[p];
+        int left = s->borders[p].left;
+        int right = s->borders[p].right;
+        int top = s->borders[p].top;
+        int bottom = s->borders[p].bottom;
 
         /* fill left and right borders from top to bottom border */
-        if (s->borders[p].left != 0 ||
-            s->borders[p].right != s->planewidth[p]) // in case skip for performance
-            for (y = s->borders[p].top; y < s->planeheight[p] - s->borders[p].bottom; y++) {
+        if (left != 0 || right != width) // in case skip for performance
+            for (y = top; y < height - bottom; y++) {
                 memset(data + y * linesize,
-                       *(data + y * linesize + s->borders[p].left),
-                       s->borders[p].left);
-                memset(data + y * linesize + s->planewidth[p] - s->borders[p].right,
-                       *(data + y * linesize + s->planewidth[p] - s->borders[p].right - 1),
-                       s->borders[p].right);
+                       *(data + y * linesize + left), left);
+                memset(data + y * linesize + width - right,
+                       *(data + y * linesize + width - right - 1), right);
             }
 
         /* fill top and bottom borders */
-        for (y = 0; y < s->borders[p].top; y++) {
+        for (y = 0; y < top; y++) {
             memcpy(data + y * linesize,
-                   data + s->borders[p].top * linesize, s->planewidth[p]);
+                   data + top * linesize, width);
         }
-        for (y = s->planeheight[p] - s->borders[p].bottom; y < s->planeheight[p]; y++) {
+        for (y = height - bottom; y < height; y++) {
             memcpy(data + y * linesize,
-                   data + (s->planeheight[p] - s->borders[p].bottom - 1) * linesize,
-                   s->planewidth[p]);
+                   data + (height - bottom - 1) * linesize, width);
         }
     }
 }
@@ -134,29 +136,33 @@
     for (p = 0; p < s->nb_planes; p++) {
         uint16_t *data = (uint16_t *)frame->data[p];
         int linesize = frame->linesize[p] / sizeof(uint16_t);
+        int width = s->planewidth[p];
+        int height = s->planeheight[p];
+        int left = s->borders[p].left;
+        int right = s->borders[p].right;
+        int top = s->borders[p].top;
+        int bottom = s->borders[p].bottom;
 
         /* fill left and right borders from top to bottom border */
-        if (s->borders[p].left != 0 ||
-            s->borders[p].right != s->planewidth[p]) // in case skip for performance
-            for (y = s->borders[p].top; y < s->planeheight[p] - s->borders[p].bottom; y++) {
-                for (x = 0; x < s->borders[p].left; x++) {
-                    data[y * linesize + x] = *(data + y * linesize + s->borders[p].left);
+        if (left != 0 || right != width) // in case skip for performance
+            for (y = top; y < height - bottom; y++) {
+                for (x = 0; x < left; x++) {
+                    data[y * linesize + x] = data[y * linesize + left];
                 }
-                for (x = 0; x < s->borders[p].right; x++) {
-                    data[y * linesize + s->planewidth[p] - s->borders[p].right + x] =
-                        *(data + y * linesize + s->planewidth[p] - s->borders[p].right - 1);
+                for (x = 0; x < right; x++) {
+                    data[y * linesize + width - right + x] =
+                        data[y * linesize + width - right - 1];
                 }
             }
 
         /* fill top and bottom borders */
-        for (y = 0; y < s->borders[p].top; y++) {
+        for (y = 0; y < top; y++) {
             memcpy(data + y * linesize,
-                   data + s->borders[p].top * linesize, s->planewidth[p] * sizeof(uint16_t));
+                   data + top * linesize, width * sizeof(uint16_t));
         }
-        for (y = s->planeheight[p] - s->borders[p].bottom; y < s->planeheight[p]; y++) {
+        for (y = height - bottom; y < height; y++) {
             memcpy(data + y * linesize,
-                   data + (s->planeheight[p] - s->borders[p].bottom - 1) * linesize,
-                   s->planewidth[p] * sizeof(uint16_t));
+                   data + (height - bottom - 1) * linesize, width * sizeof(uint16_t));
         }
     }
 }
@@ -168,30 +174,33 @@
     for (p = 0; p < s->nb_planes; p++) {
         uint8_t *data = frame->data[p];
         int linesize = frame->linesize[p];
+        int width = s->planewidth[p];
+        int height = s->planeheight[p];
+        int left = s->borders[p].left;
+        int right = s->borders[p].right;
+        int top = s->borders[p].top;
+        int bottom = s->borders[p].bottom;
 
         /* fill left and right borders from top to bottom border */
-        if (s->borders[p].left != 0 ||
-            s->borders[p].right != s->planewidth[p]) // in case skip for performance
-            for (y = s->borders[p].top; y < s->planeheight[p] - s->borders[p].bottom; y++) {
-                for (x = 0; x < s->borders[p].left; x++) {
-                    data[y * linesize + x] = data[y * linesize + s->borders[p].left * 2 - 1 - x];
+        if (left != 0 || right != width) // in case skip for performance
+            for (y = top; y < height - bottom; y++) {
+                for (x = 0; x < left; x++) {
+                    data[y * linesize + x] = data[y * linesize + left * 2 - 1 - x];
                 }
-                for (x = 0; x < s->borders[p].right; x++) {
-                    data[y * linesize + s->planewidth[p] - s->borders[p].right + x] =
-                        data[y * linesize + s->planewidth[p] - s->borders[p].right - 1 - x];
+                for (x = 0; x < right; x++) {
+                    data[y * linesize + width - right + x] =
+                        data[y * linesize + width - right - 1 - x];
                 }
             }
 
         /* fill top and bottom borders */
-        for (y = 0; y < s->borders[p].top; y++) {
+        for (y = 0; y < top; y++) {
             memcpy(data + y * linesize,
-                   data + (s->borders[p].top * 2 - 1 - y) * linesize,
-                   s->planewidth[p]);
+                   data + (top * 2 - 1 - y) * linesize, width);
         }
-        for (y = 0; y < s->borders[p].bottom; y++) {
-            memcpy(data + (s->planeheight[p] - s->borders[p].bottom + y) * linesize,
-                   data + (s->planeheight[p] - s->borders[p].bottom - 1 - y) * linesize,
-                   s->planewidth[p]);
+        for (y = 0; y < bottom; y++) {
+            memcpy(data + (height - bottom + y) * linesize,
+                   data + (height - bottom - 1 - y) * linesize, width);
         }
     }
 }
@@ -203,31 +212,34 @@
     for (p = 0; p < s->nb_planes; p++) {
         uint16_t *data = (uint16_t *)frame->data[p];
         int linesize = frame->linesize[p] / sizeof(uint16_t);
+        int width = s->planewidth[p];
+        int height = s->planeheight[p];
+        int left = s->borders[p].left;
+        int right = s->borders[p].right;
+        int top = s->borders[p].top;
+        int bottom = s->borders[p].bottom;
 
         /* fill left and right borders from top to bottom border */
-        if (s->borders[p].left != 0 ||
-            s->borders[p].right != s->planewidth[p]) // in case skip for performance
-            for (y = s->borders[p].top; y < s->planeheight[p] - s->borders[p].bottom; y++) {
-                for (x = 0; x < s->borders[p].left; x++) {
-                    data[y * linesize + x] = data[y * linesize + s->borders[p].left * 2 - 1 - x];
+        if (left != 0 || right != width) // in case skip for performance
+            for (y = top; y < height - bottom; y++) {
+                for (x = 0; x < left; x++) {
+                    data[y * linesize + x] = data[y * linesize + left * 2 - 1 - x];
                 }
-                for (x = 0; x < s->borders[p].right; x++) {
-                    data[y * linesize + s->planewidth[p] - s->borders[p].right + x] =
-                        data[y * linesize + s->planewidth[p] - s->borders[p].right - 1 - x];
+                for (x = 0; x < right; x++) {
+                    data[y * linesize + width - right + x] =
+                        data[y * linesize + width - right - 1 - x];
                 }
             }
 
         /* fill top and bottom borders */
-        for (y = 0; y < s->borders[p].top; y++) {
+        for (y = 0; y < top; y++) {
             memcpy(data + y * linesize,
-                   data + (s->borders[p].top * 2 - 1 - y) * linesize,
-                   s->planewidth[p] * sizeof(uint16_t));
+                   data + (top * 2 - 1 - y) * linesize, width * sizeof(uint16_t));
         }
 
-        for (y = 0; y < s->borders[p].bottom; y++) {
-            memcpy(data + (s->planeheight[p] - s->borders[p].bottom + y) * linesize,
-                   data + (s->planeheight[p] - s->borders[p].bottom - 1 - y) * linesize,
-                   s->planewidth[p] * sizeof(uint16_t));
+        for (y = 0; y < bottom; y++) {
+            memcpy(data + (height - bottom + y) * linesize,
+                   data + (height - bottom - 1 - y) * linesize, width * sizeof(uint16_t));
         }
     }
 }
@@ -238,24 +250,28 @@
 
     for (p = 0; p < s->nb_planes; p++) {
         uint8_t *data = frame->data[p];
-        uint8_t fill = s->fill[p];
         int linesize = frame->linesize[p];
+        int width = s->planewidth[p];
+        int height = s->planeheight[p];
+        int left = s->borders[p].left;
+        int right = s->borders[p].right;
+        int top = s->borders[p].top;
+        int bottom = s->borders[p].bottom;
+        uint8_t fill = s->fill[p];
 
         /* fill left and right borders from top to bottom border */
-        if (s->borders[p].left != 0 ||
-            s->borders[p].right != s->planewidth[p]) // in case skip for performance
-            for (y = s->borders[p].top; y < s->planeheight[p] - s->borders[p].bottom; y++) {
-                memset(data + y * linesize, fill, s->borders[p].left);
-                memset(data + y * linesize + s->planewidth[p] - s->borders[p].right, fill,
-                       s->borders[p].right);
+        if (left != 0 || right != width) // in case skip for performance
+            for (y = top; y < height - bottom; y++) {
+                memset(data + y * linesize, fill, left);
+                memset(data + y * linesize + width - right, fill, right);
             }
 
         /* fill top and bottom borders */
-        for (y = 0; y < s->borders[p].top; y++) {
-            memset(data + y * linesize, fill, s->planewidth[p]);
+        for (y = 0; y < top; y++) {
+            memset(data + y * linesize, fill, width);
         }
-        for (y = s->planeheight[p] - s->borders[p].bottom; y < s->planeheight[p]; y++) {
-            memset(data + y * linesize, fill, s->planewidth[p]);
+        for (y = height - bottom; y < height; y++) {
+            memset(data + y * linesize, fill, width);
         }
     }
 }
@@ -266,29 +282,34 @@
 
     for (p = 0; p < s->nb_planes; p++) {
         uint16_t *data = (uint16_t *)frame->data[p];
-        uint16_t fill = s->fill[p] << (s->depth - 8);
         int linesize = frame->linesize[p] / sizeof(uint16_t);
+        int width = s->planewidth[p];
+        int height = s->planeheight[p];
+        int left = s->borders[p].left;
+        int right = s->borders[p].right;
+        int top = s->borders[p].top;
+        int bottom = s->borders[p].bottom;
+        uint16_t fill = s->fill[p] << (s->depth - 8);
 
         /* fill left and right borders from top to bottom border */
-        if (s->borders[p].left != 0 ||
-            s->borders[p].right != s->planewidth[p]) // in case skip for performance
-            for (y = s->borders[p].top; y < s->planeheight[p] - s->borders[p].bottom; y++) {
-                for (x = 0; x < s->borders[p].left; x++) {
+        if (left != 0 || right != width) // in case skip for performance
+            for (y = top; y < height - bottom; y++) {
+                for (x = 0; x < left; x++) {
                     data[y * linesize + x] = fill;
                 }
-                for (x = 0; x < s->borders[p].right; x++) {
-                    data[y * linesize + s->planewidth[p] - s->borders[p].right + x] = fill;
+                for (x = 0; x < right; x++) {
+                    data[y * linesize + width - right + x] = fill;
                 }
             }
 
         /* fill top and bottom borders */
-        for (y = 0; y < s->borders[p].top; y++) {
-            for (x = 0; x < s->planewidth[p]; x++) {
+        for (y = 0; y < top; y++) {
+            for (x = 0; x < width; x++) {
                 data[y * linesize + x] = fill;
             }
         }
-        for (y = s->planeheight[p] - s->borders[p].bottom; y < s->planeheight[p]; y++) {
-            for (x = 0; x < s->planewidth[p]; x++) {
+        for (y = height - bottom; y < height; y++) {
+            for (x = 0; x < width; x++) {
                 data[y * linesize + x] = fill;
             }
         }
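
For anyone who wants to try the idea outside of FFmpeg: below is a minimal, self-contained sketch of what the patch does for the 8-bit smear case, i.e. reading the per-plane values once into locals before the loops instead of going through s->...[p] in every iteration. PlaneCtx, Borders and smear_plane8 are hypothetical names made up for this example, not FFmpeg API; the real filter works on FillBordersContext and AVFrame. The assert is an assumption of this sketch as well. One plausible reason for the speedup is that stores through a uint8_t pointer may alias the context, so the compiler might otherwise reload those fields in the loop, but that is a guess and only a benchmark can confirm it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the filter's per-plane state. */
typedef struct Borders { int left, right, top, bottom; } Borders;

typedef struct PlaneCtx {
    int width, height;   /* plane size in pixels */
    Borders borders;     /* border sizes in pixels */
} PlaneCtx;

/* Smear-fill the borders of one 8-bit plane, reading the context only once. */
static void smear_plane8(const PlaneCtx *c, uint8_t *data, int linesize)
{
    const int width  = c->width,        height = c->height;
    const int left   = c->borders.left, right  = c->borders.right;
    const int top    = c->borders.top,  bottom = c->borders.bottom;
    int y;

    /* Assumption of this sketch: borders leave at least one inner pixel. */
    assert(left + right < width && top + bottom < height);

    /* left and right borders, only on the rows between top and bottom */
    for (y = top; y < height - bottom; y++) {
        uint8_t *row = data + y * linesize;
        memset(row, row[left], left);
        memset(row + width - right, row[width - right - 1], right);
    }

    /* top and bottom borders: copy the nearest already-filled row */
    for (y = 0; y < top; y++)
        memcpy(data + y * linesize, data + top * linesize, width);
    for (y = height - bottom; y < height; y++)
        memcpy(data + y * linesize,
               data + (height - bottom - 1) * linesize, width);
}
```

To use it, fill a buffer of height * linesize bytes, set up a PlaneCtx with the plane size and border widths, and call smear_plane8(&ctx, buf, linesize). Bytes beyond width in each row (the linesize padding) are left untouched.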