From patchwork Sat Apr 28 10:00:46 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul B Mahol
X-Patchwork-Id: 8683
From: Paul B Mahol
To: ffmpeg-devel@ffmpeg.org
Date: Sat, 28 Apr 2018 12:00:46 +0200
Message-Id: <20180428100046.9993-1-onemda@gmail.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20180427125049.12305-1-onemda@gmail.com>
References: <20180427125049.12305-1-onemda@gmail.com>
Subject: [FFmpeg-devel] [PATCH] avfilter/vf_overlay: add slice threading

Signed-off-by: Paul B Mahol
---
 libavfilter/vf_overlay.c | 281 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 190 insertions(+), 91 deletions(-)

diff --git a/libavfilter/vf_overlay.c b/libavfilter/vf_overlay.c
index c6a6ac82f3..cb304e9522 100644
--- a/libavfilter/vf_overlay.c
+++ b/libavfilter/vf_overlay.c
@@ -40,6 +40,10 @@
 #include "framesync.h"
 #include "video.h"
 
+typedef struct ThreadData {
+    AVFrame *dst, *src;
+} ThreadData;
+
 static const char *const var_names[] = {
     "main_w", "W", ///< width of the main video
     "main_h", "H", ///< height of the main video
@@ -124,7 +128,7 @@ typedef struct OverlayContext {
 
     AVExpr *x_pexpr, *y_pexpr;
 
-    void (*blend_image)(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y);
+    int (*blend_slice)(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs);
 } OverlayContext;
 
 static av_cold void uninit(AVFilterContext *ctx)
@@ -403,10 +407,10 @@ static int config_output(AVFilterLink *outlink)
  * Blend image in src to destination buffer dst at position (x, y).
  */
 
-static av_always_inline void blend_image_packed_rgb(AVFilterContext *ctx,
+static av_always_inline void blend_slice_packed_rgb(AVFilterContext *ctx,
                                    AVFrame *dst, const AVFrame *src,
                                    int main_has_alpha, int x, int y,
-                                   int is_straight)
+                                   int is_straight, int jobnr, int nb_jobs)
 {
     OverlayContext *s = ctx->priv;
     int i, imax, j, jmax;
@@ -425,13 +429,19 @@ static av_always_inline void blend_image_packed_rgb(AVFilterContext *ctx,
     const int sb = s->overlay_rgba_map[B];
     const int sa = s->overlay_rgba_map[A];
     const int sstep = s->overlay_pix_step[0];
+    int slice_start, slice_end;
     uint8_t *S, *sp, *d, *dp;
 
     i = FFMAX(-y, 0);
-    sp = src->data[0] + i * src->linesize[0];
-    dp = dst->data[0] + (y+i) * dst->linesize[0];
+    imax = FFMIN(-y + dst_h, src_h);
+
+    slice_start = (imax * jobnr) / nb_jobs;
+    slice_end = (imax * (jobnr+1)) / nb_jobs;
+
+    sp = src->data[0] + (i + slice_start) * src->linesize[0];
+    dp = dst->data[0] + (y + i + slice_start) * dst->linesize[0];
 
-    for (imax = FFMIN(-y + dst_h, src_h); i < imax; i++) {
+    for (i = i + slice_start; i < slice_end; i++) {
         j = FFMAX(-x, 0);
         S = sp + j * sstep;
         d = dp + (x+j) * dstep;
@@ -495,7 +505,9 @@ static av_always_inline void blend_plane(AVFilterContext *ctx,
                                          int dst_offset,
                                          int dst_step,
                                          int straight,
-                                         int yuv)
+                                         int yuv,
+                                         int jobnr,
+                                         int nb_jobs)
 {
     int src_wp = AV_CEIL_RSHIFT(src_w, hsub);
     int src_hp = AV_CEIL_RSHIFT(src_h, vsub);
@@ -505,16 +517,22 @@ static av_always_inline void blend_plane(AVFilterContext *ctx,
     int xp = x>>hsub;
     uint8_t *s, *sp, *d, *dp, *dap, *a, *da, *ap;
     int jmax, j, k, kmax;
+    int slice_start, slice_end;
 
     j = FFMAX(-yp, 0);
-    sp = src->data[i] + j * src->linesize[i];
+    jmax = FFMIN(-yp + dst_hp, src_hp);
+
+    slice_start = (jmax * jobnr) / nb_jobs;
+    slice_end = ((jmax * (jobnr+1)) / nb_jobs);
+
+    sp = src->data[i] + slice_start * src->linesize[i];
     dp = dst->data[dst_plane]
-                      + (yp+j) * dst->linesize[dst_plane]
+                      + (yp + slice_start) * dst->linesize[dst_plane]
                       + dst_offset;
-    ap = src->data[3] + (j<<vsub) * src->linesize[3];
-    dap = dst->data[3] + ((yp+j) << vsub) * dst->linesize[3];
+    ap = src->data[3] + (slice_start << vsub) * src->linesize[3];
+    dap = dst->data[3] + ((yp + slice_start) << vsub) * dst->linesize[3];
 
-    for (jmax = FFMIN(-yp + dst_hp, src_hp); j < jmax; j++) {
+    for (j = j + slice_start; j < slice_end; j++) {
         k = FFMAX(-xp, 0);
         d = dp + (xp+k) * dst_step;
         s = sp + k;
@@ -577,17 +595,23 @@ static av_always_inline void blend_plane(AVFilterContext *ctx,
 static inline void alpha_composite(const AVFrame *src, const AVFrame *dst,
                                    int src_w, int src_h,
                                    int dst_w, int dst_h,
-                                   int x, int y)
+                                   int x, int y,
+                                   int jobnr, int nb_jobs)
 {
     uint8_t alpha; ///< the amount of overlay to blend on to main
     uint8_t *s, *sa, *d, *da;
     int i, imax, j, jmax;
+    int slice_start, slice_end;
+
+    imax = FFMIN(-y + dst_h, src_h);
+    slice_start = (imax * jobnr) / nb_jobs;
+    slice_end = ((imax * (jobnr+1)) / nb_jobs);
 
     i = FFMAX(-y, 0);
-    sa = src->data[3] + i * src->linesize[3];
-    da = dst->data[3] + (y+i) * dst->linesize[3];
+    sa = src->data[3] + (i + slice_start) * src->linesize[3];
+    da = dst->data[3] + (y + i + slice_start) * dst->linesize[3];
 
-    for (imax = FFMIN(-y + dst_h, src_h); i < imax; i++) {
+    for (i = i + slice_start; i < slice_end; i++) {
         j = FFMAX(-x, 0);
         s = sa + j;
         d = da + x+j;
@@ -616,12 +640,13 @@ static inline void alpha_composite(const AVFrame *src, const AVFrame *dst,
     }
 }
 
-static av_always_inline void blend_image_yuv(AVFilterContext *ctx,
+static av_always_inline void blend_slice_yuv(AVFilterContext *ctx,
                                              AVFrame *dst, const AVFrame *src,
                                              int hsub, int vsub,
                                              int main_has_alpha,
                                              int x, int y,
-                                             int is_straight)
+                                             int is_straight,
+                                             int jobnr, int nb_jobs)
 {
     OverlayContext *s = ctx->priv;
     const int src_w = src->width;
@@ -630,22 +655,27 @@ static av_always_inline void blend_image_yuv(AVFilterContext *ctx,
     const int dst_h = dst->height;
 
     blend_plane(ctx, dst, src, src_w, src_h, dst_w, dst_h, 0, 0, 0, x, y, main_has_alpha,
-                s->main_desc->comp[0].plane, s->main_desc->comp[0].offset, s->main_desc->comp[0].step, is_straight, 1);
+                s->main_desc->comp[0].plane, s->main_desc->comp[0].offset, s->main_desc->comp[0].step, is_straight, 1,
+                jobnr, nb_jobs);
     blend_plane(ctx, dst, src, src_w, src_h, dst_w, dst_h, 1, hsub, vsub, x, y, main_has_alpha,
-                s->main_desc->comp[1].plane, s->main_desc->comp[1].offset, s->main_desc->comp[1].step, is_straight, 1);
+                s->main_desc->comp[1].plane, s->main_desc->comp[1].offset, s->main_desc->comp[1].step, is_straight, 1,
+                jobnr, nb_jobs);
     blend_plane(ctx, dst, src, src_w, src_h, dst_w, dst_h, 2, hsub, vsub, x, y, main_has_alpha,
-                s->main_desc->comp[2].plane, s->main_desc->comp[2].offset, s->main_desc->comp[2].step, is_straight, 1);
+                s->main_desc->comp[2].plane, s->main_desc->comp[2].offset, s->main_desc->comp[2].step, is_straight, 1,
+                jobnr, nb_jobs);
 
     if (main_has_alpha)
-        alpha_composite(src, dst, src_w, src_h, dst_w, dst_h, x, y);
+        alpha_composite(src, dst, src_w, src_h, dst_w, dst_h, x, y, jobnr, nb_jobs);
 }
 
-static av_always_inline void blend_image_planar_rgb(AVFilterContext *ctx,
+static av_always_inline void blend_slice_planar_rgb(AVFilterContext *ctx,
                                                     AVFrame *dst, const AVFrame *src,
                                                     int hsub, int vsub,
                                                     int main_has_alpha,
                                                     int x, int y,
-                                                    int is_straight)
+                                                    int is_straight,
+                                                    int jobnr,
+                                                    int nb_jobs)
 {
     OverlayContext *s = ctx->priv;
     const int src_w = src->width;
@@ -654,114 +684,177 @@ static av_always_inline void blend_image_planar_rgb(AVFilterContext *ctx,
     const int dst_h = dst->height;
 
     blend_plane(ctx, dst, src, src_w, src_h, dst_w, dst_h, 0, 0, 0, x, y, main_has_alpha,
-                s->main_desc->comp[1].plane, s->main_desc->comp[1].offset, s->main_desc->comp[1].step, is_straight, 0);
+                s->main_desc->comp[1].plane, s->main_desc->comp[1].offset, s->main_desc->comp[1].step, is_straight, 0,
+                jobnr, nb_jobs);
     blend_plane(ctx, dst, src, src_w, src_h, dst_w, dst_h, 1, hsub, vsub, x, y, main_has_alpha,
-                s->main_desc->comp[2].plane, s->main_desc->comp[2].offset, s->main_desc->comp[2].step, is_straight, 0);
+                s->main_desc->comp[2].plane, s->main_desc->comp[2].offset, s->main_desc->comp[2].step, is_straight, 0,
+                jobnr, nb_jobs);
     blend_plane(ctx, dst, src, src_w, src_h, dst_w, dst_h, 2, hsub, vsub, x, y, main_has_alpha,
-                s->main_desc->comp[0].plane, s->main_desc->comp[0].offset, s->main_desc->comp[0].step, is_straight, 0);
+                s->main_desc->comp[0].plane, s->main_desc->comp[0].offset, s->main_desc->comp[0].step, is_straight, 0,
+                jobnr, nb_jobs);
 
     if (main_has_alpha)
-        alpha_composite(src, dst, src_w, src_h, dst_w, dst_h, x, y);
+        alpha_composite(src, dst, src_w, src_h, dst_w, dst_h, x, y, jobnr, nb_jobs);
 }
 
-static void blend_image_yuv420(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuv420(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 1, 1, 0, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 1, 1, 0, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuva420(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuva420(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 1, 1, 1, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 1, 1, 1, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuv422(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuv422(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 1, 0, 0, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 1, 0, 0, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuva422(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuva422(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 1, 0, 1, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 1, 0, 1, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuv444(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuv444(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 0, 0, 0, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 0, 0, 0, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuva444(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuva444(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 0, 0, 1, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 0, 0, 1, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_gbrp(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_gbrp(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_planar_rgb(ctx, dst, src, 0, 0, 0, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_planar_rgb(ctx, td->dst, td->src, 0, 0, 0, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_gbrap(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_gbrap(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_planar_rgb(ctx, dst, src, 0, 0, 1, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_planar_rgb(ctx, td->dst, td->src, 0, 0, 1, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuv420_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuv420_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 1, 1, 0, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 1, 1, 0, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuva420_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuva420_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 1, 1, 1, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 1, 1, 1, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuv422_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuv422_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 1, 0, 0, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 1, 0, 0, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuva422_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuva422_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 1, 0, 1, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 1, 0, 1, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuv444_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuv444_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 0, 0, 0, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 0, 0, 0, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_yuva444_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_yuva444_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_yuv(ctx, dst, src, 0, 0, 1, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_yuv(ctx, td->dst, td->src, 0, 0, 1, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_gbrp_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_gbrp_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_planar_rgb(ctx, dst, src, 0, 0, 0, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_planar_rgb(ctx, td->dst, td->src, 0, 0, 0, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_gbrap_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_gbrap_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_planar_rgb(ctx, dst, src, 0, 0, 1, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_planar_rgb(ctx, td->dst, td->src, 0, 0, 1, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_rgb(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_rgb(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_packed_rgb(ctx, dst, src, 0, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_packed_rgb(ctx, td->dst, td->src, 0, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_rgba(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_rgba(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_packed_rgb(ctx, dst, src, 1, x, y, 1);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_packed_rgb(ctx, td->dst, td->src, 1, s->x, s->y, 1, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_rgb_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_rgb_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_packed_rgb(ctx, dst, src, 0, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_packed_rgb(ctx, td->dst, td->src, 0, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
-static void blend_image_rgba_pm(AVFilterContext *ctx, AVFrame *dst, const AVFrame *src, int x, int y)
+static int blend_slice_rgba_pm(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
-    blend_image_packed_rgb(ctx, dst, src, 1, x, y, 0);
+    OverlayContext *s = ctx->priv;
+    ThreadData *td = arg;
+    blend_slice_packed_rgb(ctx, td->dst, td->src, 1, s->x, s->y, 0, jobnr, nb_jobs);
+    return 0;
 }
 
 static int config_input_main(AVFilterLink *inlink)
@@ -781,39 +874,39 @@ static int config_input_main(AVFilterLink *inlink)
     s->main_has_alpha = ff_fmt_is_in(inlink->format, alpha_pix_fmts);
     switch (s->format) {
     case OVERLAY_FORMAT_YUV420:
-        s->blend_image = s->main_has_alpha ? blend_image_yuva420 : blend_image_yuv420;
+        s->blend_slice = s->main_has_alpha ? blend_slice_yuva420 : blend_slice_yuv420;
         break;
     case OVERLAY_FORMAT_YUV422:
-        s->blend_image = s->main_has_alpha ? blend_image_yuva422 : blend_image_yuv422;
+        s->blend_slice = s->main_has_alpha ? blend_slice_yuva422 : blend_slice_yuv422;
         break;
     case OVERLAY_FORMAT_YUV444:
-        s->blend_image = s->main_has_alpha ? blend_image_yuva444 : blend_image_yuv444;
+        s->blend_slice = s->main_has_alpha ? blend_slice_yuva444 : blend_slice_yuv444;
         break;
     case OVERLAY_FORMAT_RGB:
-        s->blend_image = s->main_has_alpha ? blend_image_rgba : blend_image_rgb;
+        s->blend_slice = s->main_has_alpha ? blend_slice_rgba : blend_slice_rgb;
         break;
     case OVERLAY_FORMAT_GBRP:
-        s->blend_image = s->main_has_alpha ? blend_image_gbrap : blend_image_gbrp;
+        s->blend_slice = s->main_has_alpha ? blend_slice_gbrap : blend_slice_gbrp;
         break;
     case OVERLAY_FORMAT_AUTO:
         switch (inlink->format) {
         case AV_PIX_FMT_YUVA420P:
-            s->blend_image = blend_image_yuva420;
+            s->blend_slice = blend_slice_yuva420;
             break;
         case AV_PIX_FMT_YUVA422P:
-            s->blend_image = blend_image_yuva422;
+            s->blend_slice = blend_slice_yuva422;
             break;
         case AV_PIX_FMT_YUVA444P:
-            s->blend_image = blend_image_yuva444;
+            s->blend_slice = blend_slice_yuva444;
             break;
         case AV_PIX_FMT_ARGB:
         case AV_PIX_FMT_RGBA:
         case AV_PIX_FMT_BGRA:
         case AV_PIX_FMT_ABGR:
-            s->blend_image = blend_image_rgba;
+            s->blend_slice = blend_slice_rgba;
             break;
         case AV_PIX_FMT_GBRAP:
-            s->blend_image = blend_image_gbrap;
+            s->blend_slice = blend_slice_gbrap;
             break;
         default:
             av_assert0(0);
@@ -827,39 +920,39 @@ static int config_input_main(AVFilterLink *inlink)
 
     switch (s->format) {
     case OVERLAY_FORMAT_YUV420:
-        s->blend_image = s->main_has_alpha ? blend_image_yuva420_pm : blend_image_yuv420_pm;
+        s->blend_slice = s->main_has_alpha ? blend_slice_yuva420_pm : blend_slice_yuv420_pm;
         break;
     case OVERLAY_FORMAT_YUV422:
-        s->blend_image = s->main_has_alpha ? blend_image_yuva422_pm : blend_image_yuv422_pm;
+        s->blend_slice = s->main_has_alpha ? blend_slice_yuva422_pm : blend_slice_yuv422_pm;
         break;
     case OVERLAY_FORMAT_YUV444:
-        s->blend_image = s->main_has_alpha ? blend_image_yuva444_pm : blend_image_yuv444_pm;
+        s->blend_slice = s->main_has_alpha ? blend_slice_yuva444_pm : blend_slice_yuv444_pm;
         break;
     case OVERLAY_FORMAT_RGB:
-        s->blend_image = s->main_has_alpha ? blend_image_rgba_pm : blend_image_rgb_pm;
+        s->blend_slice = s->main_has_alpha ? blend_slice_rgba_pm : blend_slice_rgb_pm;
         break;
     case OVERLAY_FORMAT_GBRP:
-        s->blend_image = s->main_has_alpha ? blend_image_gbrap_pm : blend_image_gbrp_pm;
+        s->blend_slice = s->main_has_alpha ? blend_slice_gbrap_pm : blend_slice_gbrp_pm;
         break;
     case OVERLAY_FORMAT_AUTO:
         switch (inlink->format) {
         case AV_PIX_FMT_YUVA420P:
-            s->blend_image = blend_image_yuva420_pm;
+            s->blend_slice = blend_slice_yuva420_pm;
             break;
         case AV_PIX_FMT_YUVA422P:
-            s->blend_image = blend_image_yuva422_pm;
+            s->blend_slice = blend_slice_yuva422_pm;
             break;
         case AV_PIX_FMT_YUVA444P:
-            s->blend_image = blend_image_yuva444_pm;
+            s->blend_slice = blend_slice_yuva444_pm;
             break;
         case AV_PIX_FMT_ARGB:
         case AV_PIX_FMT_RGBA:
         case AV_PIX_FMT_BGRA:
         case AV_PIX_FMT_ABGR:
-            s->blend_image = blend_image_rgba_pm;
+            s->blend_slice = blend_slice_rgba_pm;
             break;
         case AV_PIX_FMT_GBRAP:
-            s->blend_image = blend_image_gbrap_pm;
+            s->blend_slice = blend_slice_gbrap_pm;
             break;
         default:
             av_assert0(0);
@@ -905,8 +998,13 @@ static int do_blend(FFFrameSync *fs)
     }
 
     if (s->x < mainpic->width && s->x + second->width >= 0 ||
-        s->y < mainpic->height && s->y + second->height >= 0)
-        s->blend_image(ctx, mainpic, second, s->x, s->y);
+        s->y < mainpic->height && s->y + second->height >= 0) {
+        ThreadData td;
+
+        td.dst = mainpic;
+        td.src = second;
+        ctx->internal->execute(ctx, s->blend_slice, &td, NULL, ff_filter_get_nb_threads(ctx));
+    }
     return ff_filter_frame(ctx->outputs[0], mainpic);
 }
 
@@ -992,5 +1090,6 @@ AVFilter ff_vf_overlay = {
     .process_command = process_command,
     .inputs = avfilter_vf_overlay_inputs,
     .outputs = avfilter_vf_overlay_outputs,
-    .flags = AVFILTER_FLAG_SUPPORT_TIMELINE_INTERNAL,
+    .flags = AVFILTER_FLAG_SUPPORT_TIMELINE_INTERNAL |
+             AVFILTER_FLAG_SLICE_THREADS,
 };