From patchwork Tue Feb 14 19:43:34 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Oberhoff
X-Patchwork-Id: 2548
From: Daniel Oberhoff
Message-Id: <1BC1F5EA-265A-453F-8BBD-EEF387E5BADD@googlemail.com>
Date: Tue, 14 Feb 2017 20:43:34 +0100
To: FFmpeg development discussions and patches
Subject: [FFmpeg-devel] [PATCH] avfilter: parallelize vf_remap
Cc: Michael Riss

Signed-off-by: Daniel Oberhoff
---
 libavfilter/vf_remap.c | 149 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 97 insertions(+), 52 deletions(-)

diff --git a/libavfilter/vf_remap.c b/libavfilter/vf_remap.c
index e70956d..84b2466 100644
--- a/libavfilter/vf_remap.c
+++ b/libavfilter/vf_remap.c
@@ -44,6 +44,7 @@
 #include "framesync.h"
 #include "internal.h"
 #include "video.h"
+#include "pthread.h"
 
 typedef struct RemapContext {
     const AVClass *class;
@@ -52,9 +53,8 @@ typedef struct RemapContext {
     int step;
     FFFrameSync fs;
 
-    void (*remap)(struct RemapContext *s, const AVFrame *in,
-                  const AVFrame *xin, const AVFrame *yin,
-                  AVFrame *out);
+    void (*remap_slice)(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs);
+
 } RemapContext;
 
 #define OFFSET(x) offsetof(RemapContext, x)
@@ -66,6 +66,13 @@ static const AVOption remap_options[] = {
 
 AVFILTER_DEFINE_CLASS(remap);
 
+typedef struct ThreadData {
+    AVFrame *in, *xin, *yin, *out;
+    int nb_planes;
+    int nb_components;
+    int step;
+} ThreadData;
+
 static int query_formats(AVFilterContext *ctx)
 {
     static const enum AVPixelFormat pix_fmts[] = {
@@ -113,28 +120,36 @@ fail:
     return ret;
 }
 
+
 /**
  * remap_planar algorithm expects planes of same size
  * pixels are copied from source to target using :
  * Target_frame[y][x] = Source_frame[ ymap[y][x] ][ [xmap[y][x] ];
  */
-static void remap_planar(RemapContext *s, const AVFrame *in,
-                         const AVFrame *xin, const AVFrame *yin,
-                         AVFrame *out)
+static void remap_planar_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
+    const ThreadData *td = (ThreadData*)arg;
+    const AVFrame *in = td->in;
+    const AVFrame *xin = td->xin;
+    const AVFrame *yin = td->yin;
+    const AVFrame *out = td->out;
+
+    const int slice_start = (out->height * jobnr ) / nb_jobs;
+    const int slice_end = (out->height * (jobnr+1)) / nb_jobs;
+
     const int xlinesize = xin->linesize[0] / 2;
     const int ylinesize = yin->linesize[0] / 2;
     int x , y, plane;
 
-    for (plane = 0; plane < s->nb_planes ; plane++) {
-        uint8_t *dst = out->data[plane];
+    for (plane = 0; plane < td->nb_planes ; plane++) {
         const int dlinesize = out->linesize[plane];
         const uint8_t *src = in->data[plane];
+        uint8_t *dst = out->data[plane] + slice_start * dlinesize;
         const int slinesize = in->linesize[plane];
-        const uint16_t *xmap = (const uint16_t *)xin->data[0];
-        const uint16_t *ymap = (const uint16_t *)yin->data[0];
+        const uint16_t *xmap = (const uint16_t *)xin->data[0] + slice_start * xlinesize;
+        const uint16_t *ymap = (const uint16_t *)yin->data[0] + slice_start * ylinesize;
 
-        for (y = 0; y < out->height; y++) {
+        for (y = slice_start; y < slice_end; y++) {
             for (x = 0; x < out->width; x++) {
                 if (ymap[x] < in->height && xmap[x] < in->width) {
                     dst[x] = src[ymap[x] * slinesize + xmap[x]];
@@ -149,23 +164,30 @@ static void remap_planar(RemapContext *s, const AVFrame *in,
     }
 }
 
-static void remap_planar16(RemapContext *s, const AVFrame *in,
-                           const AVFrame *xin, const AVFrame *yin,
-                           AVFrame *out)
+static void remap_planar16_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
+    const ThreadData *td = (ThreadData*)arg;
+    const AVFrame *in = td->in;
+    const AVFrame *xin = td->xin;
+    const AVFrame *yin = td->yin;
+    const AVFrame *out = td->out;
+
+    const int slice_start = (out->height * jobnr ) / nb_jobs;
+    const int slice_end = (out->height * (jobnr+1)) / nb_jobs;
+
     const int xlinesize = xin->linesize[0] / 2;
     const int ylinesize = yin->linesize[0] / 2;
     int x , y, plane;
 
-    for (plane = 0; plane < s->nb_planes ; plane++) {
-        uint16_t *dst = (uint16_t *)out->data[plane];
+    for (plane = 0; plane < td->nb_planes ; plane++) {
         const int dlinesize = out->linesize[plane] / 2;
         const uint16_t *src = (const uint16_t *)in->data[plane];
+        uint16_t *dst = (uint16_t *)out->data[plane] + slice_start * dlinesize;
         const int slinesize = in->linesize[plane] / 2;
-        const uint16_t *xmap = (const uint16_t *)xin->data[0];
-        const uint16_t *ymap = (const uint16_t *)yin->data[0];
+        const uint16_t *xmap = (const uint16_t *)xin->data[0] + slice_start * xlinesize;
+        const uint16_t *ymap = (const uint16_t *)yin->data[0] + slice_start * ylinesize;
 
-        for (y = 0; y < out->height; y++) {
+        for (y = slice_start; y < slice_end; y++) {
             for (x = 0; x < out->width; x++) {
                 if (ymap[x] < in->height && xmap[x] < in->width) {
                     dst[x] = src[ymap[x] * slinesize + xmap[x]];
@@ -186,24 +208,31 @@ static void remap_planar16(RemapContext *s, const AVFrame *in,
  * pixels are copied from source to target using :
  * Target_frame[y][x] = Source_frame[ ymap[y][x] ][ [xmap[y][x] ];
  */
-static void remap_packed(RemapContext *s, const AVFrame *in,
-                         const AVFrame *xin, const AVFrame *yin,
-                         AVFrame *out)
+static void remap_packed_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    uint8_t *dst = out->data[0];
-    const uint8_t *src = in->data[0];
-    const int dlinesize = out->linesize[0];
-    const int slinesize = in->linesize[0];
-    const int xlinesize = xin->linesize[0] / 2;
-    const int ylinesize = yin->linesize[0] / 2;
-    const uint16_t *xmap = (const uint16_t *)xin->data[0];
-    const uint16_t *ymap = (const uint16_t *)yin->data[0];
-    const int step = s->step;
+    const ThreadData *td = (ThreadData*)arg;
+    const AVFrame *in = td->in;
+    const AVFrame *xin = td->xin;
+    const AVFrame *yin = td->yin;
+    const AVFrame *out = td->out;
+
+    const int slice_start = (out->height * jobnr ) / nb_jobs;
+    const int slice_end = (out->height * (jobnr+1)) / nb_jobs;
+
+    const int dlinesize = out->linesize[0];
+    const int slinesize = in->linesize[0];
+    const int xlinesize = xin->linesize[0] / 2;
+    const int ylinesize = yin->linesize[0] / 2;
+    const uint8_t *src = in->data[0];
+    uint8_t *dst = out->data[0] + slice_start * dlinesize;
+    const uint16_t *xmap = (const uint16_t *)xin->data[0] + slice_start * xlinesize;
+    const uint16_t *ymap = (const uint16_t *)yin->data[0] + slice_start * ylinesize;
+    const int step = td->step;
     int c, x, y;
 
-    for (y = 0; y < out->height; y++) {
+    for (y = slice_start; y < slice_end; y++) {
         for (x = 0; x < out->width; x++) {
-            for (c = 0; c < s->nb_components; c++) {
+            for (c = 0; c < td->nb_components; c++) {
                 if (ymap[x] < in->height && xmap[x] < in->width) {
                     dst[x * step + c] = src[ymap[x] * slinesize + xmap[x] * step + c];
                 } else {
@@ -217,24 +246,31 @@ static void remap_packed(RemapContext *s, const AVFrame *in,
     }
 }
 
-static void remap_packed16(RemapContext *s, const AVFrame *in,
-                           const AVFrame *xin, const AVFrame *yin,
-                           AVFrame *out)
+static void remap_packed16_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    uint16_t *dst = (uint16_t *)out->data[0];
+    const ThreadData *td = (ThreadData*)arg;
+    const AVFrame *in = td->in;
+    const AVFrame *xin = td->xin;
+    const AVFrame *yin = td->yin;
+    const AVFrame *out = td->out;
+
+    const int slice_start = (out->height * jobnr ) / nb_jobs;
+    const int slice_end = (out->height * (jobnr+1)) / nb_jobs;
+
+    const int dlinesize = out->linesize[0] / 2;
+    const int slinesize = in->linesize[0] / 2;
+    const int xlinesize = xin->linesize[0] / 2;
+    const int ylinesize = yin->linesize[0] / 2;
     const uint16_t *src = (const uint16_t *)in->data[0];
-    const int dlinesize = out->linesize[0] / 2;
-    const int slinesize = in->linesize[0] / 2;
-    const int xlinesize = xin->linesize[0] / 2;
-    const int ylinesize = yin->linesize[0] / 2;
-    const uint16_t *xmap = (const uint16_t *)xin->data[0];
-    const uint16_t *ymap = (const uint16_t *)yin->data[0];
-    const int step = s->step / 2;
+    uint16_t *dst = (uint16_t *)out->data[0] + slice_start * dlinesize;
+    const uint16_t *xmap = (const uint16_t *)xin->data[0] + slice_start * xlinesize;
+    const uint16_t *ymap = (const uint16_t *)yin->data[0] + slice_start * ylinesize;
+    const int step = td->step / 2;
     int c, x, y;
 
-    for (y = 0; y < out->height; y++) {
+    for (y = slice_start; y < slice_end; y++) {
         for (x = 0; x < out->width; x++) {
-            for (c = 0; c < s->nb_components; c++) {
+            for (c = 0; c < td->nb_components; c++) {
                 if (ymap[x] < in->height && xmap[x] < in->width) {
                     dst[x * step + c] = src[ymap[x] * slinesize + xmap[x] * step + c];
                 } else {
@@ -259,15 +295,15 @@ static int config_input(AVFilterLink *inlink)
 
     if (desc->comp[0].depth == 8) {
         if (s->nb_planes > 1 || s->nb_components == 1) {
-            s->remap = remap_planar;
+            s->remap_slice = remap_planar_slice;
         } else {
-            s->remap = remap_packed;
+            s->remap_slice = remap_packed_slice;
         }
     } else {
         if (s->nb_planes > 1 || s->nb_components == 1) {
-            s->remap = remap_planar16;
+            s->remap_slice = remap_planar16_slice;
         } else {
-            s->remap = remap_packed16;
+            s->remap_slice = remap_packed16_slice;
         }
     }
 
@@ -293,12 +329,21 @@ static int process_frame(FFFrameSync *fs)
         if (!out)
             return AVERROR(ENOMEM);
     } else {
+        ThreadData td;
+
         out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
         if (!out)
             return AVERROR(ENOMEM);
         av_frame_copy_props(out, in);
 
-        s->remap(s, in, xpic, ypic, out);
+        td.in = in;
+        td.xin = xpic;
+        td.yin = ypic;
+        td.out = out;
+        td.nb_planes = s->nb_planes;
+        td.nb_components = s->nb_components;
+        td.step = s->step;
+        ctx->internal->execute(ctx, s->remap_slice, &td, NULL, FFMIN(outlink->h, ctx->graph->nb_threads));
     }
     out->pts = av_rescale_q(in->pts, s->fs.time_base, outlink->time_base);
 
@@ -411,5 +456,5 @@ AVFilter ff_vf_remap = {
     .inputs = remap_inputs,
     .outputs = remap_outputs,
     .priv_class = &remap_class,
-    .flags = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC,
+    .flags = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC | AVFILTER_FLAG_SLICE_THREADS,
 };
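
Reading aid, not part of the patch: every *_slice function above splits the output rows among jobs with the same integer arithmetic for slice_start and slice_end. The standalone sketch below reproduces only that arithmetic so it can be checked outside FFmpeg. The names slice_bounds and slices_cover_once are hypothetical helpers invented for this illustration; they do not exist in vf_remap.c or libavfilter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in the patch): rows [*start, *end) handled by
 * job `jobnr` of `nb_jobs`, using the same integer arithmetic as
 * slice_start and slice_end in the diff. */
static void slice_bounds(int height, int jobnr, int nb_jobs,
                         int *start, int *end)
{
    assert(nb_jobs > 0 && jobnr >= 0 && jobnr < nb_jobs);
    *start = (height * jobnr)       / nb_jobs;
    *end   = (height * (jobnr + 1)) / nb_jobs;
}

/* Hypothetical check: run every job in turn and count how often each row
 * is visited. Returns 1 if each of the `height` rows is covered exactly
 * once, 0 otherwise. `hits` must have room for `height` entries. */
static int slices_cover_once(int height, int nb_jobs, uint8_t *hits)
{
    int job, y, start, end;

    memset(hits, 0, (size_t)height);
    for (job = 0; job < nb_jobs; job++) {
        slice_bounds(height, job, nb_jobs, &start, &end);
        for (y = start; y < end; y++)
            hits[y]++;
    }
    for (y = 0; y < height; y++)
        if (hits[y] != 1)
            return 0;
    return 1;
}
```

With height 10 and 3 jobs the ranges come out as [0,3), [3,6) and [6,10): consecutive slices share a boundary, so no row is skipped or written twice. Because the patch passes FFMIN(outlink->h, ctx->graph->nb_threads) as the job count, the number of jobs never exceeds the number of rows, which under this arithmetic means every job gets at least one row.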