From patchwork Wed Oct 23 10:27:43 2019
X-Patchwork-Submitter: Lance Wang
X-Patchwork-Id: 15912
From: lance.lmwang@gmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 23 Oct 2019 18:27:43 +0800
Message-Id: <20191023102743.19979-3-lance.lmwang@gmail.com>
X-Mailer: git-send-email 2.9.5
In-Reply-To: <20191023102743.19979-1-lance.lmwang@gmail.com>
References: <20191023102743.19979-1-lance.lmwang@gmail.com>
Subject: [FFmpeg-devel] [PATCH v1 3/3] avfilter/colorlevels: add slice threading support with less code
Cc: Limin Wang

From: Limin Wang

Signed-off-by: Limin Wang
---
 libavfilter/vf_colorlevels.c | 176 +++++++++++++++--------------------
 1 file changed, 77 insertions(+), 99 deletions(-)

diff --git a/libavfilter/vf_colorlevels.c b/libavfilter/vf_colorlevels.c
index 5385a5e754..f8645a08bd 100644
--- a/libavfilter/vf_colorlevels.c
+++ b/libavfilter/vf_colorlevels.c
@@ -26,6 +26,7 @@
 #include "formats.h"
 #include "internal.h"
 #include "video.h"
+#include "thread.h"
 
 #define R 0
 #define G 1
@@ -37,6 +38,11 @@ typedef struct Range {
     double out_min, out_max;
 } Range;
 
+typedef struct ThreadData {
+    AVFrame *in;
+    AVFrame *out;
+} ThreadData;
+
 typedef struct ColorLevelsContext {
     const AVClass *class;
     Range range[4];
@@ -45,6 +51,7 @@ typedef struct ColorLevelsContext {
     int step;
     uint8_t rgba_map[4];
     int linesize;
+    int (*colorlevels_slice)(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs);
 } ColorLevelsContext;
 
 #define OFFSET(x) offsetof(ColorLevelsContext, x)
@@ -90,6 +97,68 @@ static int query_formats(AVFilterContext *ctx)
     return ff_set_common_formats(ctx, fmts_list);
 }
 
+#define DEFINE_COLORLEVELS(type, nbits) \
+static int do_##nbits##bit_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs) \
+{ \
+    ColorLevelsContext *s = ctx->priv; \
+    AVFilterLink *inlink = ctx->inputs[0]; \
+    const int step = s->step; \
+    int x, y, i; \
+    ThreadData *td = arg; \
+    const AVFrame *in = td->in; \
+    AVFrame *out = td->out; \
+ \
+    for (i = 0; i < s->nb_comp; i++) { \
+        Range *r = &s->range[i]; \
+        const int slice_start = (inlink->h * jobnr) / nb_jobs; \
+        const int slice_end = (inlink->h * (jobnr+1)) / nb_jobs; \
+        const uint8_t offset = s->rgba_map[i]; \
+        const uint8_t *srcrow = in->data[0] + slice_start * in->linesize[0]; \
+        uint8_t *dstrow = out->data[0] + slice_start * out->linesize[0]; \
+        int imin = lrint(r->in_min * UINT##nbits##_MAX); \
+        int imax = lrint(r->in_max * UINT##nbits##_MAX); \
+        int omin = lrint(r->out_min * UINT##nbits##_MAX); \
+        int omax = lrint(r->out_max * UINT##nbits##_MAX); \
+        double coeff; \
+ \
+        if (imin < 0) { \
+            imin = UINT##nbits##_MAX; \
+            for (y = slice_start; y < slice_end; y++) { \
+                const type *src = (const type *)srcrow; \
+ \
+                for (x = 0; x < s->linesize; x += step) \
+                    imin = FFMIN(imin, src[x + offset]); \
+                srcrow += in->linesize[0]; \
+            } \
+        } \
+        if (imax < 0) { \
+            imax = 0; \
+            for (y = slice_start; y < slice_end; y++) { \
+                const type *src = (const type *)srcrow; \
+ \
+                for (x = 0; x < s->linesize; x += step) \
+                    imax = FFMAX(imax, src[x + offset]); \
+                srcrow += in->linesize[0]; \
+            } \
+        } \
+ \
+        coeff = (omax - omin) / (double)(imax - imin); \
+        for (y = slice_start; y < slice_end; y++) { \
+            const type *src = (const type*)srcrow; \
+            type *dst = (type *)dstrow; \
+ \
+            for (x = 0; x < s->linesize; x += step) \
+                dst[x + offset] = av_clip_uint##nbits( \
+                    (src[x + offset] - imin) * coeff + omin); \
+            dstrow += out->linesize[0]; \
+            srcrow += in->linesize[0]; \
+        } \
+    } \
+    return 0; \
+}
+DEFINE_COLORLEVELS(uint8_t, 8)
+DEFINE_COLORLEVELS(uint16_t, 16)
+
 static int config_input(AVFilterLink *inlink)
 {
     AVFilterContext *ctx = inlink->dst;
@@ -102,17 +171,17 @@ static int config_input(AVFilterLink *inlink)
     s->linesize = inlink->w * s->step;
     ff_fill_rgba_map(s->rgba_map, inlink->format);
 
+    s->colorlevels_slice = s->bpp <= 1 ? do_8bit_slice : do_16bit_slice;
     return 0;
 }
 
 static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
     AVFilterContext *ctx = inlink->dst;
-    ColorLevelsContext *s = ctx->priv;
     AVFilterLink *outlink = ctx->outputs[0];
-    const int step = s->step;
+    ColorLevelsContext *s = ctx->priv;
     AVFrame *out;
-    int x, y, i;
+    ThreadData td;
 
     if (av_frame_is_writable(in)) {
         out = in;
@@ -125,101 +194,10 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
         av_frame_copy_props(out, in);
     }
 
-    switch (s->bpp) {
-    case 1:
-        for (i = 0; i < s->nb_comp; i++) {
-            Range *r = &s->range[i];
-            const uint8_t offset = s->rgba_map[i];
-            const uint8_t *srcrow = in->data[0];
-            uint8_t *dstrow = out->data[0];
-            int imin = lrint(r->in_min * UINT8_MAX);
-            int imax = lrint(r->in_max * UINT8_MAX);
-            int omin = lrint(r->out_min * UINT8_MAX);
-            int omax = lrint(r->out_max * UINT8_MAX);
-            double coeff;
-
-            if (imin < 0) {
-                imin = UINT8_MAX;
-                for (y = 0; y < inlink->h; y++) {
-                    const uint8_t *src = srcrow;
-
-                    for (x = 0; x < s->linesize; x += step)
-                        imin = FFMIN(imin, src[x + offset]);
-                    srcrow += in->linesize[0];
-                }
-            }
-            if (imax < 0) {
-                srcrow = in->data[0];
-                imax = 0;
-                for (y = 0; y < inlink->h; y++) {
-                    const uint8_t *src = srcrow;
-
-                    for (x = 0; x < s->linesize; x += step)
-                        imax = FFMAX(imax, src[x + offset]);
-                    srcrow += in->linesize[0];
-                }
-            }
-
-            srcrow = in->data[0];
-            coeff = (omax - omin) / (double)(imax - imin);
-            for (y = 0; y < inlink->h; y++) {
-                const uint8_t *src = srcrow;
-                uint8_t *dst = dstrow;
-
-                for (x = 0; x < s->linesize; x += step)
-                    dst[x + offset] = av_clip_uint8((src[x + offset] - imin) * coeff + omin);
-                dstrow += out->linesize[0];
-                srcrow += in->linesize[0];
-            }
-        }
-        break;
-    case 2:
-        for (i = 0; i < s->nb_comp; i++) {
-            Range *r = &s->range[i];
-            const uint8_t offset = s->rgba_map[i];
-            const uint8_t *srcrow = in->data[0];
-            uint8_t *dstrow = out->data[0];
-            int imin = lrint(r->in_min * UINT16_MAX);
-            int imax = lrint(r->in_max * UINT16_MAX);
-            int omin = lrint(r->out_min * UINT16_MAX);
-            int omax = lrint(r->out_max * UINT16_MAX);
-            double coeff;
-
-            if (imin < 0) {
-                imin = UINT16_MAX;
-                for (y = 0; y < inlink->h; y++) {
-                    const uint16_t *src = (const uint16_t *)srcrow;
-
-                    for (x = 0; x < s->linesize; x += step)
-                        imin = FFMIN(imin, src[x + offset]);
-                    srcrow += in->linesize[0];
-                }
-            }
-            if (imax < 0) {
-                srcrow = in->data[0];
-                imax = 0;
-                for (y = 0; y < inlink->h; y++) {
-                    const uint16_t *src = (const uint16_t *)srcrow;
-
-                    for (x = 0; x < s->linesize; x += step)
-                        imax = FFMAX(imax, src[x + offset]);
-                    srcrow += in->linesize[0];
-                }
-            }
-
-            srcrow = in->data[0];
-            coeff = (omax - omin) / (double)(imax - imin);
-            for (y = 0; y < inlink->h; y++) {
-                const uint16_t *src = (const uint16_t*)srcrow;
-                uint16_t *dst = (uint16_t *)dstrow;
-
-                for (x = 0; x < s->linesize; x += step)
-                    dst[x + offset] = av_clip_uint16((src[x + offset] - imin) * coeff + omin);
-                dstrow += out->linesize[0];
-                srcrow += in->linesize[0];
-            }
-        }
-    }
+    td.in = in;
+    td.out = out;
+    ctx->internal->execute(ctx, s->colorlevels_slice, &td, NULL,
+                           FFMIN(inlink->h, ff_filter_get_nb_threads(ctx)));
 
     if (in != out)
         av_frame_free(&in);
@@ -252,5 +230,5 @@ AVFilter ff_vf_colorlevels = {
     .query_formats = query_formats,
     .inputs        = colorlevels_inputs,
     .outputs       = colorlevels_outputs,
-    .flags         = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC,
+    .flags         = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC | AVFILTER_FLAG_SLICE_THREADS,
 };