From patchwork Wed Oct 2 01:43:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nuo Mi
X-Patchwork-Id: 51983
From: Nuo Mi
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 2 Oct 2024 09:43:56 +0800
Message-Id: <20241002014358.296769-1-nuomi2021@gmail.com>
X-Mailer: git-send-email 2.34.1
Subject: [FFmpeg-devel] [PATCH v4 1/3] avcodec: make a local copy of executor
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Nuo Mi

We still need several refactors to improve the current VVC decoder's performance,
which will frequently break the API/ABI. To mitigate this, we've copied the executor from
avutil to avcodec. Once the API/ABI is stable, we will move this class back to avutil
---
 libavcodec/Makefile     |   1 +
 libavcodec/executor.c   | 221 ++++++++++++++++++++++++++++++++++++++++
 libavcodec/executor.h   |  73 +++++++++++++
 libavcodec/vvc/dec.h    |   2 +-
 libavcodec/vvc/thread.c |  22 ++--
 libavcodec/vvc/thread.h |   4 +-
 6 files changed, 309 insertions(+), 14 deletions(-)
 create mode 100644 libavcodec/executor.c
 create mode 100644 libavcodec/executor.h

diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index a4fcce3b42..da1a1aa945 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -43,6 +43,7 @@ OBJS = ac3_parser.o \
        dirac.o \
        dv_profile.o \
        encode.o \
+       executor.o \
        get_buffer.o \
        imgconvert.o \
        jni.o \
diff --git a/libavcodec/executor.c b/libavcodec/executor.c
new file mode 100644
index 0000000000..574c5c7be7
--- /dev/null
+++ b/libavcodec/executor.c
@@ -0,0 +1,221 @@
+/*
+ * Copyright (C) 2024 Nuo Mi
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "config.h"
+
+#include
+
+#include "libavutil/mem.h"
+#include "libavutil/thread.h"
+
+#include "executor.h"
+
+#if !HAVE_THREADS
+
+#define ExecutorThread char
+
+#define executor_thread_create(t, a, s, ar) 0
+#define executor_thread_join(t, r) do {} while(0)
+
+#else
+
+#define ExecutorThread pthread_t
+
+#define executor_thread_create(t, a, s, ar) pthread_create(t, a, s, ar)
+#define executor_thread_join(t, r) pthread_join(t, r)
+
+#endif //!HAVE_THREADS
+
+typedef struct ThreadInfo {
+    FFExecutor *e;
+    ExecutorThread thread;
+} ThreadInfo;
+
+struct FFExecutor {
+    FFTaskCallbacks cb;
+    int thread_count;
+    bool recursive;
+
+    ThreadInfo *threads;
+    uint8_t *local_contexts;
+
+    AVMutex lock;
+    AVCond cond;
+    int die;
+
+    FFTask *tasks;
+};
+
+static FFTask* remove_task(FFTask **prev, FFTask *t)
+{
+    *prev = t->next;
+    t->next = NULL;
+    return t;
+}
+
+static void add_task(FFTask **prev, FFTask *t)
+{
+    t->next = *prev;
+    *prev = t;
+}
+
+static int run_one_task(FFExecutor *e, void *lc)
+{
+    FFTaskCallbacks *cb = &e->cb;
+    FFTask **prev;
+
+    for (prev = &e->tasks; *prev && !cb->ready(*prev, cb->user_data); prev = &(*prev)->next)
+        /* nothing */;
+    if (*prev) {
+        FFTask *t = remove_task(prev, *prev);
+        if (e->thread_count > 0)
+            ff_mutex_unlock(&e->lock);
+        cb->run(t, lc, cb->user_data);
+        if (e->thread_count > 0)
+            ff_mutex_lock(&e->lock);
+        return 1;
+    }
+    return 0;
+}
+
+#if HAVE_THREADS
+static void *executor_worker_task(void *data)
+{
+    ThreadInfo *ti = (ThreadInfo*)data;
+    FFExecutor *e = ti->e;
+    void *lc = e->local_contexts + (ti - e->threads) * e->cb.local_context_size;
+
+    ff_mutex_lock(&e->lock);
+    while (1) {
+        if (e->die) break;
+
+        if (!run_one_task(e, lc)) {
+            //no task in one loop
+            ff_cond_wait(&e->cond, &e->lock);
+        }
+    }
+    ff_mutex_unlock(&e->lock);
+    return NULL;
+}
+#endif
+
+static void executor_free(FFExecutor *e, const int has_lock, const int has_cond)
+{
+    if (e->thread_count) {
+        //signal die
+        ff_mutex_lock(&e->lock);
+        e->die = 1;
+        ff_cond_broadcast(&e->cond);
+        ff_mutex_unlock(&e->lock);
+
+        for (int i = 0; i < e->thread_count; i++)
+            executor_thread_join(e->threads[i].thread, NULL);
+    }
+    if (has_cond)
+        ff_cond_destroy(&e->cond);
+    if (has_lock)
+        ff_mutex_destroy(&e->lock);
+
+    av_free(e->threads);
+    av_free(e->local_contexts);
+
+    av_free(e);
+}
+
+FFExecutor* ff_executor_alloc(const FFTaskCallbacks *cb, int thread_count)
+{
+    FFExecutor *e;
+    int has_lock = 0, has_cond = 0;
+    if (!cb || !cb->user_data || !cb->ready || !cb->run || !cb->priority_higher)
+        return NULL;
+
+    e = av_mallocz(sizeof(*e));
+    if (!e)
+        return NULL;
+    e->cb = *cb;
+
+    e->local_contexts = av_calloc(FFMAX(thread_count, 1), e->cb.local_context_size);
+    if (!e->local_contexts)
+        goto free_executor;
+
+    e->threads = av_calloc(FFMAX(thread_count, 1), sizeof(*e->threads));
+    if (!e->threads)
+        goto free_executor;
+
+    if (!thread_count)
+        return e;
+
+    has_lock = !ff_mutex_init(&e->lock, NULL);
+    has_cond = !ff_cond_init(&e->cond, NULL);
+
+    if (!has_lock || !has_cond)
+        goto free_executor;
+
+    for (/* nothing */; e->thread_count < thread_count; e->thread_count++) {
+        ThreadInfo *ti = e->threads + e->thread_count;
+        ti->e = e;
+        if (executor_thread_create(&ti->thread, NULL, executor_worker_task, ti))
+            goto free_executor;
+    }
+    return e;
+
+free_executor:
+    executor_free(e, has_lock, has_cond);
+    return NULL;
+}
+
+void ff_executor_free(FFExecutor **executor)
+{
+    int thread_count;
+
+    if (!executor || !*executor)
+        return;
+    thread_count = (*executor)->thread_count;
+    executor_free(*executor, thread_count, thread_count);
+    *executor = NULL;
+}
+
+void ff_executor_execute(FFExecutor *e, FFTask *t)
+{
+    FFTaskCallbacks *cb = &e->cb;
+    FFTask **prev;
+
+    if (e->thread_count)
+        ff_mutex_lock(&e->lock);
+    if (t) {
+        for (prev = &e->tasks; *prev && cb->priority_higher(*prev, t); prev = &(*prev)->next)
+            /* nothing */;
+        add_task(prev, t);
+    }
+    if (e->thread_count) {
+        ff_cond_signal(&e->cond);
+        ff_mutex_unlock(&e->lock);
+    }
+
+    if (!e->thread_count || !HAVE_THREADS) {
+        if (e->recursive)
+            return;
+        e->recursive = true;
+        // We are running in a single-threaded environment, so we must handle all tasks ourselves
+        while (run_one_task(e, e->local_contexts))
+            /* nothing */;
+        e->recursive = false;
+    }
+}
diff --git a/libavcodec/executor.h b/libavcodec/executor.h
new file mode 100644
index 0000000000..2d02734ad6
--- /dev/null
+++ b/libavcodec/executor.h
@@ -0,0 +1,73 @@
+/*
+ * Copyright (C) 2024 Nuo Mi
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/*
+ * We still need several refactors to improve the current VVC decoder's performance,
+ * which will frequently break the API/ABI. To mitigate this, we've copied the executor from
+ * avutil to avcodec. Once the API/ABI is stable, we will move this class back to avutil
+ */
+
+#ifndef AVCODEC_EXECUTOR_H
+#define AVCODEC_EXECUTOR_H
+
+typedef struct FFExecutor FFExecutor;
+typedef struct FFTask FFTask;
+
+struct FFTask {
+    FFTask *next;
+};
+
+typedef struct FFTaskCallbacks {
+    void *user_data;
+
+    int local_context_size;
+
+    // return 1 if a's priority > b's priority
+    int (*priority_higher)(const FFTask *a, const FFTask *b);
+
+    // task is ready for run
+    int (*ready)(const FFTask *t, void *user_data);
+
+    // run the task
+    int (*run)(FFTask *t, void *local_context, void *user_data);
+} FFTaskCallbacks;
+
+/**
+ * Alloc executor
+ * @param callbacks callback structure for executor
+ * @param thread_count worker thread number, 0 for run on caller's thread directly
+ * @return return the executor
+ */
+FFExecutor* ff_executor_alloc(const FFTaskCallbacks *callbacks, int thread_count);
+
+/**
+ * Free executor
+ * @param e pointer to executor
+ */
+void ff_executor_free(FFExecutor **e);
+
+/**
+ * Add task to executor
+ * @param e pointer to executor
+ * @param t pointer to task. If NULL, it will wakeup one work thread
+ */
+void ff_executor_execute(FFExecutor *e, FFTask *t);
+
+#endif //AVCODEC_EXECUTOR_H
diff --git a/libavcodec/vvc/dec.h b/libavcodec/vvc/dec.h
index d27cf52ca2..159c60942b 100644
--- a/libavcodec/vvc/dec.h
+++ b/libavcodec/vvc/dec.h
@@ -236,7 +236,7 @@ typedef struct VVCContext {
     uint16_t seq_decode;
     uint16_t seq_output;
 
-    struct AVExecutor *executor;
+    struct FFExecutor *executor;
 
     VVCFrameContext *fcs;
     int nb_fcs;
diff --git a/libavcodec/vvc/thread.c b/libavcodec/vvc/thread.c
index 86a7753c6a..e6907fd764 100644
--- a/libavcodec/vvc/thread.c
+++ b/libavcodec/vvc/thread.c
@@ -22,7 +22,7 @@
 
 #include
 
-#include "libavutil/executor.h"
+#include "libavcodec/executor.h"
 #include "libavutil/mem.h"
 #include "libavutil/thread.h"
 
@@ -55,7 +55,7 @@ typedef enum VVCTaskStage {
 typedef struct VVCTask {
     union {
         struct VVCTask *next; //for executor debug only
-        AVTask task;
+        FFTask task;
     } u;
 
     VVCTaskStage stage;
@@ -109,7 +109,7 @@ static void add_task(VVCContext *s, VVCTask *t)
 
     atomic_fetch_add(&ft->nb_scheduled_tasks, 1);
 
-    av_executor_execute(s->executor, &t->u.task);
+    ff_executor_execute(s->executor, &t->u.task);
 }
 
 static void task_init(VVCTask *t, VVCTaskStage stage, VVCFrameContext *fc, const int rx, const int ry)
@@ -372,7 +372,7 @@ static int task_is_stage_ready(VVCTask *t, int add)
     return task_has_target_score(t, stage, score);
 }
 
-static int task_ready(const AVTask *_t, void *user_data)
+static int task_ready(const FFTask *_t, void *user_data)
 {
     VVCTask *t = (VVCTask*)_t;
 
@@ -385,7 +385,7 @@ static int task_ready(const AVTask *_t, void *user_data)
             return (a) < (b); \
     } while (0)
 
-static int task_priority_higher(const AVTask *_a, const AVTask *_b)
+static int task_priority_higher(const FFTask *_a, const FFTask *_b)
 {
     const VVCTask *a = (const VVCTask*)_a;
     const VVCTask *b = (const VVCTask*)_b;
@@ -661,7 +661,7 @@ static void task_run_stage(VVCTask *t, VVCContext *s, VVCLocalContext *lc)
     return;
 }
 
-static int task_run(AVTask *_t, void *local_context, void *user_data)
+static int task_run(FFTask *_t, void *local_context, void *user_data)
 {
     VVCTask *t = (VVCTask*)_t;
     VVCContext *s = (VVCContext *)user_data;
@@ -683,21 +683,21 @@ static int task_run(AVTask *_t, void *local_context, void *user_data)
     return 0;
 }
 
-AVExecutor* ff_vvc_executor_alloc(VVCContext *s, const int thread_count)
+FFExecutor* ff_vvc_executor_alloc(VVCContext *s, const int thread_count)
 {
-    AVTaskCallbacks callbacks = {
+    FFTaskCallbacks callbacks = {
         s,
         sizeof(VVCLocalContext),
         task_priority_higher,
         task_ready,
         task_run,
     };
-    return av_executor_alloc(&callbacks, thread_count);
+    return ff_executor_alloc(&callbacks, thread_count);
 }
 
-void ff_vvc_executor_free(AVExecutor **e)
+void ff_vvc_executor_free(FFExecutor **e)
 {
-    av_executor_free(e);
+    ff_executor_free(e);
 }
 
 void ff_vvc_frame_thread_free(VVCFrameContext *fc)
diff --git a/libavcodec/vvc/thread.h b/libavcodec/vvc/thread.h
index 7b15dbee59..b89aee3b32 100644
--- a/libavcodec/vvc/thread.h
+++ b/libavcodec/vvc/thread.h
@@ -25,8 +25,8 @@
 
 #include "dec.h"
 
-struct AVExecutor* ff_vvc_executor_alloc(VVCContext *s, int thread_count);
-void ff_vvc_executor_free(struct AVExecutor **e);
+struct FFExecutor* ff_vvc_executor_alloc(VVCContext *s, int thread_count);
+void ff_vvc_executor_free(struct FFExecutor **e);
 
 int ff_vvc_frame_thread_init(VVCFrameContext *fc);
 void ff_vvc_frame_thread_free(VVCFrameContext *fc);

From patchwork Wed Oct 2 01:43:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nuo Mi
X-Patchwork-Id: 51981
From: Nuo Mi
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 2 Oct 2024 09:43:57 +0800
Message-Id: <20241002014358.296769-2-nuomi2021@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241002014358.296769-1-nuomi2021@gmail.com>
References: <20241002014358.296769-1-nuomi2021@gmail.com>
Subject: [FFmpeg-devel] [PATCH v4 2/3] avcodec/executor: remove unused ready callback
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Nuo Mi

Due to the nature of multithreading, using a "ready check" mechanism may introduce a deadlock.
For example: Suppose all tasks have been submitted to the executor, and the last thread checks
the entire list and finds no ready tasks. It then goes to sleep, waiting for a new task. However,
for some multithreading-related reason, a task becomes ready after the check. Since no other
thread is aware of this and no new tasks are being added to the executor, a deadlock occurs.

In VVC, this function is unnecessary because we use a scoreboard. All tasks submitted to the
executor are ready tasks.
---
 libavcodec/executor.c   | 6 ++----
 libavcodec/executor.h   | 3 ---
 libavcodec/vvc/thread.c | 8 --------
 3 files changed, 2 insertions(+), 15 deletions(-)

diff --git a/libavcodec/executor.c b/libavcodec/executor.c
index 574c5c7be7..21ebad3def 100644
--- a/libavcodec/executor.c
+++ b/libavcodec/executor.c
@@ -79,10 +79,8 @@ static void add_task(FFTask **prev, FFTask *t)
 static int run_one_task(FFExecutor *e, void *lc)
 {
     FFTaskCallbacks *cb = &e->cb;
-    FFTask **prev;
+    FFTask **prev = &e->tasks;
 
-    for (prev = &e->tasks; *prev && !cb->ready(*prev, cb->user_data); prev = &(*prev)->next)
-        /* nothing */;
     if (*prev) {
         FFTask *t = remove_task(prev, *prev);
         if (e->thread_count > 0)
@@ -143,7 +141,7 @@ FFExecutor* ff_executor_alloc(const FFTaskCallbacks *cb, int thread_count)
 {
     FFExecutor *e;
     int has_lock = 0, has_cond = 0;
-    if (!cb || !cb->user_data || !cb->ready || !cb->run || !cb->priority_higher)
+    if (!cb || !cb->user_data || !cb->run || !cb->priority_higher)
         return NULL;
 
     e = av_mallocz(sizeof(*e));
diff --git a/libavcodec/executor.h b/libavcodec/executor.h
index 2d02734ad6..51763ec25e 100644
--- a/libavcodec/executor.h
+++ b/libavcodec/executor.h
@@ -42,9 +42,6 @@ typedef struct FFTaskCallbacks {
     // return 1 if a's priority > b's priority
     int (*priority_higher)(const FFTask *a, const FFTask *b);
 
-    // task is ready for run
-    int (*ready)(const FFTask *t, void *user_data);
-
     // run the task
     int (*run)(FFTask *t, void *local_context, void *user_data);
 } FFTaskCallbacks;
diff --git a/libavcodec/vvc/thread.c b/libavcodec/vvc/thread.c
index e6907fd764..a8c19b17cf 100644
--- a/libavcodec/vvc/thread.c
+++ b/libavcodec/vvc/thread.c
@@ -372,13 +372,6 @@ static int task_is_stage_ready(VVCTask *t, int add)
     return task_has_target_score(t, stage, score);
 }
 
-static int task_ready(const FFTask *_t, void *user_data)
-{
-    VVCTask *t = (VVCTask*)_t;
-
-    return task_is_stage_ready(t, 0);
-}
-
 #define CHECK(a, b) \
     do { \
         if ((a) != (b)) \
@@ -689,7 +682,6 @@ FFExecutor* ff_vvc_executor_alloc(VVCContext *s, const int thread_count)
         s,
         sizeof(VVCLocalContext),
         task_priority_higher,
-        task_ready,
         task_run,
     };
     return ff_executor_alloc(&callbacks, thread_count);

From patchwork Wed Oct 2 01:43:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Nuo Mi
X-Patchwork-Id: 51982
From: Nuo Mi
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 2 Oct 2024 09:43:58 +0800
Message-Id: <20241002014358.296769-3-nuomi2021@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241002014358.296769-1-nuomi2021@gmail.com>
References: <20241002014358.296769-1-nuomi2021@gmail.com>
Subject: [FFmpeg-devel] [PATCH v4 3/3] avcodec/vvc: simplify priority logical to improve performance for 4K/8K
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Nuo Mi

For 4K/8K video processing, it's possible to have over 1,000 tasks pending on the executor.
In such cases, O(n) and O(log(n)) insertion times are too costly.
Reducing this to O(1) will significantly decrease the time spent in critical sections.

clip                                                        | before |  after | delta
------------------------------------------------------------|--------|--------|-------
VVC_HDR_UHDTV2_OpenGOP_7680x4320_50fps_HLG10.bit            |     24 |     27 | 12.5%
VVC_HDR_UHDTV2_OpenGOP_7680x4320_50fps_HLG10_HighBitrate.bit|     12 |     17 | 41.7%
tears_of_steel_4k_8M_8bit_2000.vvc                          |     34 |    102 | 200.0%
VVC_UHDTV1_OpenGOP_3840x2160_60fps_HLG10.bit                |    126 |    128 | 1.6%
RitualDance_1920x1080_60_10_420_37_RA.266                   |    350 |    378 | 8.0%
NovosobornayaSquare_1920x1080.bin                           |    341 |    369 | 8.2%
Tango2_3840x2160_60_10_420_27_LD.266                        |     69 |     70 | 1.4%
RitualDance_1920x1080_60_10_420_32_LD.266                   |    243 |    259 | 6.6%
Chimera_8bit_1080P_1000_frames.vvc                          |    420 |    392 | -6.7%
BQTerrace_1920x1080_60_10_420_22_RA.vvc                     |    148 |    144 | -2.7%
---
 libavcodec/executor.c   | 52 ++++++++++++++++++++++++++---------------
 libavcodec/executor.h   |  5 ++--
 libavcodec/vvc/thread.c | 48 +++++++++++++++----------------------
 3 files changed, 55 insertions(+), 50 deletions(-)

diff --git a/libavcodec/executor.c b/libavcodec/executor.c
index 21ebad3def..7a86e894f8 100644
--- a/libavcodec/executor.c
+++ b/libavcodec/executor.c
@@ -48,6 +48,11 @@ typedef struct ThreadInfo {
     ExecutorThread thread;
 } ThreadInfo;
 
+typedef struct Queue {
+    FFTask *head;
+    FFTask *tail;
+} Queue;
+
 struct FFExecutor {
     FFTaskCallbacks cb;
     int thread_count;
@@ -60,29 +65,39 @@ struct FFExecutor {
     AVCond cond;
     int die;
 
-    FFTask *tasks;
+    Queue *q;
 };
 
-static FFTask* remove_task(FFTask **prev, FFTask *t)
+static FFTask* remove_task(Queue *q)
 {
-    *prev = t->next;
-    t->next = NULL;
+    FFTask *t = q->head;
+    if (t) {
+        q->head = t->next;
+        t->next = NULL;
+        if (!q->head)
+            q->tail = NULL;
+    }
     return t;
 }
 
-static void add_task(FFTask **prev, FFTask *t)
+static void add_task(Queue *q, FFTask *t)
 {
-    t->next = *prev;
-    *prev = t;
+    t->next = NULL;
+    if (!q->head)
+        q->tail = q->head = t;
+    else
+        q->tail = q->tail->next = t;
 }
 
 static int run_one_task(FFExecutor *e, void *lc)
 {
     FFTaskCallbacks *cb = &e->cb;
-    FFTask **prev = &e->tasks;
+    FFTask *t = NULL;
+
+    for (int i = 0; i < e->cb.priorities && !t; i++)
+        t = remove_task(e->q + i);
 
-    if (*prev) {
-        FFTask *t = remove_task(prev, *prev);
+    if (t) {
         if (e->thread_count > 0)
             ff_mutex_unlock(&e->lock);
         cb->run(t, lc, cb->user_data);
@@ -132,6 +147,7 @@ static void executor_free(FFExecutor *e, const int has_lock, const int has_cond)
         ff_mutex_destroy(&e->lock);
 
     av_free(e->threads);
+    av_free(e->q);
     av_free(e->local_contexts);
 
     av_free(e);
@@ -141,7 +157,7 @@ FFExecutor* ff_executor_alloc(const FFTaskCallbacks *cb, int thread_count)
 {
     FFExecutor *e;
     int has_lock = 0, has_cond = 0;
-    if (!cb || !cb->user_data || !cb->run || !cb->priority_higher)
+    if (!cb || !cb->user_data || !cb->run || !cb->priorities)
         return NULL;
 
     e = av_mallocz(sizeof(*e));
@@ -153,6 +169,10 @@ FFExecutor* ff_executor_alloc(const FFTaskCallbacks *cb, int thread_count)
     if (!e->local_contexts)
         goto free_executor;
 
+    e->q = av_calloc(e->cb.priorities, sizeof(Queue));
+    if (!e->q)
+        goto free_executor;
+
     e->threads = av_calloc(FFMAX(thread_count, 1), sizeof(*e->threads));
     if (!e->threads)
         goto free_executor;
@@ -192,16 +212,10 @@ void ff_executor_free(FFExecutor **executor)
 
 void ff_executor_execute(FFExecutor *e, FFTask *t)
 {
-    FFTaskCallbacks *cb = &e->cb;
-    FFTask **prev;
-
     if (e->thread_count)
         ff_mutex_lock(&e->lock);
-    if (t) {
-        for (prev = &e->tasks; *prev && cb->priority_higher(*prev, t); prev = &(*prev)->next)
-            /* nothing */;
-        add_task(prev, t);
-    }
+    if (t)
+        add_task(e->q + t->priority % e->cb.priorities, t);
     if (e->thread_count) {
         ff_cond_signal(&e->cond);
         ff_mutex_unlock(&e->lock);
diff --git a/libavcodec/executor.h b/libavcodec/executor.h
index 51763ec25e..cd13d4c518 100644
--- a/libavcodec/executor.h
+++ b/libavcodec/executor.h
@@ -32,6 +32,7 @@ typedef struct FFTask FFTask;
 
 struct FFTask {
     FFTask *next;
+    int priority;   // task priority, must be >= 0 and < FFTaskCallbacks.priorities
 };
 
 typedef struct FFTaskCallbacks {
@@ -39,8 +40,8 @@ typedef struct FFTaskCallbacks {
 
     int local_context_size;
 
-    // return 1 if a's priority > b's priority
-    int (*priority_higher)(const FFTask *a, const FFTask *b);
+    // how many priorities do we have?
+    int priorities;
 
     // run the task
     int (*run)(FFTask *t, void *local_context, void *user_data);
diff --git a/libavcodec/vvc/thread.c b/libavcodec/vvc/thread.c
index a8c19b17cf..d75784e242 100644
--- a/libavcodec/vvc/thread.c
+++ b/libavcodec/vvc/thread.c
@@ -103,13 +103,28 @@ typedef struct VVCFrameThread {
     AVCond cond;
 } VVCFrameThread;
 
+#define PRIORITY_LOWEST 2
 static void add_task(VVCContext *s, VVCTask *t)
 {
-    VVCFrameThread *ft = t->fc->ft;
+    VVCFrameThread *ft  = t->fc->ft;
+    FFTask *task        = &t->u.task;
+    const int priorities[] = {
+        0,                  // VVC_TASK_STAGE_INIT,
+        0,                  // VVC_TASK_STAGE_PARSE,
+        // For an 8K clip, a CTU line completed in the reference frame may trigger 64 or more inter tasks.
+        // We assign these tasks the lowest priority to avoid being overwhelmed with inter tasks.
+        PRIORITY_LOWEST,    // VVC_TASK_STAGE_INTER
+        1,                  // VVC_TASK_STAGE_RECON,
+        1,                  // VVC_TASK_STAGE_LMCS,
+        1,                  // VVC_TASK_STAGE_DEBLOCK_V,
+        1,                  // VVC_TASK_STAGE_DEBLOCK_H,
+        1,                  // VVC_TASK_STAGE_SAO,
+        1,                  // VVC_TASK_STAGE_ALF,
+    };
 
     atomic_fetch_add(&ft->nb_scheduled_tasks, 1);
-
-    ff_executor_execute(s->executor, &t->u.task);
+    task->priority = priorities[t->stage];
+    ff_executor_execute(s->executor, task);
 }
 
 static void task_init(VVCTask *t, VVCTaskStage stage, VVCFrameContext *fc, const int rx, const int ry)
@@ -372,31 +387,6 @@ static int task_is_stage_ready(VVCTask *t, int add)
     return task_has_target_score(t, stage, score);
 }
 
-#define CHECK(a, b)                            \
-    do {                                       \
-        if ((a) != (b))                        \
-            return (a) < (b);                  \
-    } while (0)
-
-static int task_priority_higher(const FFTask *_a, const FFTask *_b)
-{
-    const VVCTask *a = (const VVCTask*)_a;
-    const VVCTask *b = (const VVCTask*)_b;
-
-
-    if (a->stage <= VVC_TASK_STAGE_PARSE || b->stage <= VVC_TASK_STAGE_PARSE) {
-        CHECK(a->stage, b->stage);
-        CHECK(a->fc->decode_order, b->fc->decode_order);                //decode order
-        CHECK(a->ry, b->ry);
-        return a->rx < b->rx;
-    }
-
-    CHECK(a->fc->decode_order, b->fc->decode_order);                    //decode order
-    CHECK(a->rx + a->ry + a->stage, b->rx + b->ry + b->stage);          //zigzag with type
-    CHECK(a->rx + a->ry, b->rx + b->ry);                                //zigzag
-    return a->ry < b->ry;
-}
-
 static void check_colocation(VVCContext *s, VVCTask *t)
 {
     const VVCFrameContext *fc = t->fc;
@@ -681,7 +671,7 @@ FFExecutor* ff_vvc_executor_alloc(VVCContext *s, const int thread_count)
     FFTaskCallbacks callbacks = {
         s,
         sizeof(VVCLocalContext),
-        task_priority_higher,
+        PRIORITY_LOWEST + 1,
         task_run,
     };
     return ff_executor_alloc(&callbacks, thread_count);
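
For readers who want to try the queueing idea outside the executor, here is a minimal standalone sketch of one FIFO per priority level, which is what makes insertion O(1). It is not the patch's code: Task, TaskQueue, Sched, sched_push, sched_pop and NB_PRIORITIES are hypothetical names chosen for illustration, there is no locking, and an out-of-range priority is rejected with assert() where the patch wraps it with a modulo.

```c
#include <assert.h>
#include <stddef.h>

#define NB_PRIORITIES 3            /* hypothetical: 0 is the most urgent level */

typedef struct Task {
    struct Task *next;
    int priority;                  /* must be in [0, NB_PRIORITIES) */
    int id;                        /* payload used only for illustration */
} Task;

typedef struct TaskQueue {
    Task *head;
    Task *tail;
} TaskQueue;

typedef struct Sched {
    TaskQueue q[NB_PRIORITIES];    /* one FIFO per priority level */
} Sched;

/* O(1): append to the tail of the FIFO that matches the task's priority. */
static void sched_push(Sched *s, Task *t)
{
    TaskQueue *q;

    assert(t->priority >= 0 && t->priority < NB_PRIORITIES);
    q = &s->q[t->priority];
    t->next = NULL;
    if (q->head)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
}

/* O(NB_PRIORITIES): take the head of the first non-empty FIFO, or NULL. */
static Task *sched_pop(Sched *s)
{
    for (int i = 0; i < NB_PRIORITIES; i++) {
        TaskQueue *q = &s->q[i];
        Task *t = q->head;

        if (t) {
            q->head = t->next;
            if (!q->head)
                q->tail = NULL;
            t->next = NULL;
            return t;
        }
    }
    return NULL;
}
```

Within one level, tasks come out in submission order, so the finer ordering the removed comparator provided (decode order, zigzag position) is not reproduced. The sketch only shows the cost trade: a constant-time push and a pop bounded by the small, fixed number of levels.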