[FFmpeg-devel,V1] lavfi/normalize: improve the performance

Message ID	1561038226-451-1-git-send-email-mypopydev@gmail.com
State	New
Headers	show Return-Path: <ffmpeg-devel-bounces@ffmpeg.org> From: Jun Zhao <mypopydev@gmail.com> To: ffmpeg-devel@ffmpeg.org Date: Thu, 20 Jun 2019 21:43:46 +0800 Message-Id: <1561038226-451-1-git-send-email-mypopydev@gmail.com> Subject: [FFmpeg-devel] [PATCH V1] lavfi/normalize: improve the performance Precedence: list Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Cc: Jun Zhao <barryjzhao@tencent.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

Message ID

1561038226-451-1-git-send-email-mypopydev@gmail.com

State

New

Headers

From: Jun Zhao <mypopydev@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 20 Jun 2019 21:43:46 +0800
Message-Id: <1561038226-451-1-git-send-email-mypopydev@gmail.com>
Subject: [FFmpeg-devel] [PATCH V1] lavfi/normalize: improve the performance
Precedence: list
Reply-To: FFmpeg development discussions and patches
	<ffmpeg-devel@ffmpeg.org>
Cc: Jun Zhao <barryjzhao@tencent.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

Commit Message

Jun Zhao June 20, 2019, 1:43 p.m. UTC

From: Jun Zhao <barryjzhao@tencent.com>

Remove unnecessary max value found, it's will improve the performance
about 10%. Used the test command like:
ffmpeg -i 1080P.mp4 -an -vf normalize -f null /dev/null, the FPS change
from 96fps to 107fps.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
---
 libavfilter/vf_normalize.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

Comments

Reimar Döffinger June 21, 2019, 8:39 a.m. UTC | #1

On Thu, Jun 20, 2019 at 09:43:46PM +0800, Jun Zhao wrote:
> From: Jun Zhao <barryjzhao@tencent.com>
>
> Remove unnecessary max value found, it's will improve the performance
> about 10%. Used the test command like:
> ffmpeg -i 1080P.mp4 -an -vf normalize -f null /dev/null, the FPS change
> from 96fps to 107fps.
>
> Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
> ---
>  libavfilter/vf_normalize.c |    7 +++----
>  1 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/libavfilter/vf_normalize.c b/libavfilter/vf_normalize.c
> index 48eea59..40dc031 100644
> --- a/libavfilter/vf_normalize.c
> +++ b/libavfilter/vf_normalize.c
> @@ -143,14 +143,13 @@ static void normalize(NormalizeContext *s, AVFrame *in, AVFrame *out)
>          min[c].in = max[c].in = in->data[0][s->co[c]];
>      for (y = 0; y < in->height; y++) {
>          uint8_t *inp = in->data[0] + y * in->linesize[0];
> -        uint8_t *outp = out->data[0] + y * out->linesize[0];
>          for (x = 0; x < in->width; x++) {
>              for (c = 0; c < 3; c++) {
> -                min[c].in = FFMIN(min[c].in, inp[s->co[c]]);
> -                max[c].in = FFMAX(max[c].in, inp[s->co[c]]);
> +                uint8_t val = inp[s->co[c]];
> +                if (val < min[c].in)      min[c].in = val;
> +                else if (val > max[c].in) max[c].in = val;
>              }
>              inp += s->step;
> -            outp += s->step;

outp should just be removed, a patch for that certainly would be
accepted.
But I am very skeptical towards the rest of the change.
In a way I would expect the compiler should be able to
optimize this.
However this code is clearly completely unoptimized, and
thus I don't think making it more complex for a minor optimization
that gives just 10% is worth it.
A proper optimization would get rid of the s->co indirection in the
inner loop - it really is not necessary for the limited format
support the code has.
This would then also allow for SIMD optimization, which should give
a vastly larger speedup - and possibly the compiler would even be able
to auto-vectorize.
This applies even more to the processing step - the transform
calculation done in SIMD is almost certain to be cheaper than
a LUT lookup - at least if switch to fixed-point arithmetic
instead of float.
Side note: I haven't really figured out what the point
of all this s->co complexity is supposed to be.
The only thing the code seems to actually care is whether
there is an alpha channel, and if so where it is.
But the quickest path to a significant speedup would
be to write a SIMD version of the normalize function specific
to that format and the SIMD instruction set available
and call it instead of normalize if applicable.
I would expect that you should see at least a 3x speedup
if using e.g. SSE2 (which is why I don't consider a
10% speedup worth changing the code for really).

Best regards,
Reimar Döffinger

diff --git a/libavfilter/vf_normalize.c b/libavfilter/vf_normalize.c
index 48eea59..40dc031 100644
--- a/libavfilter/vf_normalize.c
+++ b/libavfilter/vf_normalize.c
@@ -143,14 +143,13 @@  static void normalize(NormalizeContext *s, AVFrame *in, AVFrame *out)
         min[c].in = max[c].in = in->data[0][s->co[c]];
     for (y = 0; y < in->height; y++) {
         uint8_t *inp = in->data[0] + y * in->linesize[0];
-        uint8_t *outp = out->data[0] + y * out->linesize[0];
         for (x = 0; x < in->width; x++) {
             for (c = 0; c < 3; c++) {
-                min[c].in = FFMIN(min[c].in, inp[s->co[c]]);
-                max[c].in = FFMAX(max[c].in, inp[s->co[c]]);
+                uint8_t val = inp[s->co[c]];
+                if (val < min[c].in)      min[c].in = val;
+                else if (val > max[c].in) max[c].in = val;
             }
             inp += s->step;
-            outp += s->step;
         }
     }

[FFmpeg-devel,V1] lavfi/normalize: improve the performance

Commit Message

Comments

Patch