From patchwork Sun Jul 2 12:32:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Cox
X-Patchwork-Id: 42381
From: John Cox
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 2 Jul 2023 12:32:42 +0000
Message-Id: <20230702123242.232484-16-jc@kynesim.co.uk>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230702123242.232484-1-jc@kynesim.co.uk>
References: <20230702123242.232484-1-jc@kynesim.co.uk>
Subject: [FFmpeg-devel] [PATCH v2 15/15] avfilter/vf_bwdif: Block filter slices into a multiple of 4 lines
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: thomas.mundt@hr.de, John Cox, martin@martin.st

Round job start lines down to a multiple of 4. This means that if
filter_line3 exists, filter_line is no longer called for a single stray
line at the end of a slice, which previously happened for some thread
counts. The final slice may do up to 3 extra lines, but filter_edge is
faster than filter_line, so this is unlikely to create any noticeable
thread load variation.

Signed-off-by: John Cox
---
 libavfilter/vf_bwdif.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/libavfilter/vf_bwdif.c b/libavfilter/vf_bwdif.c
index 52bc676cf8..6701208efe 100644
--- a/libavfilter/vf_bwdif.c
+++ b/libavfilter/vf_bwdif.c
@@ -237,6 +237,13 @@ static void filter_edge_16bit(void *dst1, void *prev1, void *cur1, void *next1,
     FILTER2()
 }
 
+// Round job start line down to multiple of 4 so that if filter_line3 exists
+// and the frame is a multiple of 4 high then filter_line will never be called
+static inline int job_start(const int jobnr, const int nb_jobs, const int h)
+{
+    return jobnr >= nb_jobs ? h : ((h * jobnr) / nb_jobs) & ~3;
+}
+
 static int filter_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
     BWDIFContext *s = ctx->priv;
@@ -246,8 +253,8 @@ static int filter_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
     int clip_max = (1 << (yadif->csp->comp[td->plane].depth)) - 1;
     int df = (yadif->csp->comp[td->plane].depth + 7) / 8;
     int refs = linesize / df;
-    int slice_start = (td->h *  jobnr   ) / nb_jobs;
-    int slice_end   = (td->h * (jobnr+1)) / nb_jobs;
+    int slice_start = job_start(jobnr, nb_jobs, td->h);
+    int slice_end   = job_start(jobnr + 1, nb_jobs, td->h);
     int y;
 
     for (y = slice_start; y < slice_end; y++) {
@@ -310,7 +317,7 @@ static void filter(AVFilterContext *ctx, AVFrame *dstpic,
         td.plane  = i;
 
         ff_filter_execute(ctx, filter_slice, &td, NULL,
-                          FFMIN(h, ff_filter_get_nb_threads(ctx)));
+                          FFMIN((h+3)/4, ff_filter_get_nb_threads(ctx)));
     }
     if (yadif->current_field == YADIF_FIELD_END) {
         yadif->current_field = YADIF_FIELD_NORMAL;
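
For reference, the slice boundaries this change produces can be checked
with a small standalone C sketch. job_start() below uses the same
expression as the patch; job_count() and check_slices() are hypothetical
helpers written only for this illustration and are not part of
vf_bwdif.c. The sketch assumes the job count is min((h+3)/4, threads),
matching the FFMIN() call in the diff above.

```c
#include <assert.h>
#include <stdio.h>

/* Same expression as job_start() in the patch: the start line of job
 * jobnr is rounded down to a multiple of 4, and any index at or past
 * nb_jobs maps to the frame height h. */
static int job_start(int jobnr, int nb_jobs, int h)
{
    return jobnr >= nb_jobs ? h : ((h * jobnr) / nb_jobs) & ~3;
}

/* Hypothetical helper: job count as in the patched FFMIN() call,
 * i.e. at most one job per block of 4 lines. */
static int job_count(int h, int nb_threads)
{
    int blocks = (h + 3) / 4;
    return blocks < nb_threads ? blocks : nb_threads;
}

/* Hypothetical helper: walk every slice of a plane of height h, check
 * that the slices tile [0, h) without gaps or overlap and that every
 * start is a multiple of 4. Returns the length of the last slice. */
static int check_slices(int h, int nb_threads)
{
    int nb_jobs = job_count(h, nb_threads);
    int expected_start = 0;
    int last_len = 0;

    for (int j = 0; j < nb_jobs; j++) {
        int start = job_start(j, nb_jobs, h);
        int end   = job_start(j + 1, nb_jobs, h);

        assert(start == expected_start); /* contiguous with previous slice */
        assert(start % 4 == 0);          /* start is 4-line aligned */
        assert(end >= start);            /* a slice may be empty, never negative */

        printf("job %d: lines [%d, %d)\n", j, start, end);
        expected_start = end;
        last_len = end - start;
    }
    assert(expected_start == h);         /* last slice ends at the frame height */
    return last_len;
}
```

Calling check_slices(1080, 7) from a main() walks seven slices starting
at lines 0, 152, 308, 460, 616, 768 and 924. The last slice covers 156
lines, one more than the 155 lines the unrounded start of 925 would
give, which is the "up to 3 extra lines" case described in the commit
message.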