From patchwork Wed Aug  4 11:51:34 2021
From: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
To: ffmpeg-devel@ffmpeg.org
Cc: Shubhanshu Saxena
Date: Wed, 4 Aug 2021 17:21:34 +0530
Message-Id: <20210804115138.64475-4-shubhanshu.e01@gmail.com>
In-Reply-To: <20210804115138.64475-1-shubhanshu.e01@gmail.com>
References: <20210804115138.64475-1-shubhanshu.e01@gmail.com>
Subject: [FFmpeg-devel] [PATCH V2 4/8] [GSoC] lavfi/dnn: Async Support for TensorFlow Backend

This commit enables async execution in the TensorFlow backend and adds a function to flush extra frames.
The async execution mechanism executes the TFInferRequests on a separate
thread, which is joined before the next execution of the same
TFRequestItem, or while freeing the model.

The following compares this mechanism with the existing sync mechanism,
using the TensorFlow C API 2.5 CPU variant:

Async Mode: 4m32.846s
Sync Mode:  5m17.582s

The timings above were measured on the super resolution filter with the
SRCNN model.

Signed-off-by: Shubhanshu Saxena <shubhanshu.e01@gmail.com>
---
 libavfilter/dnn/dnn_backend_tf.c | 121 ++++++++++++++++++++++++++-----
 libavfilter/dnn/dnn_backend_tf.h |   3 +
 libavfilter/dnn/dnn_interface.c  |   3 +
 3 files changed, 109 insertions(+), 18 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_tf.c b/libavfilter/dnn/dnn_backend_tf.c
index 805a0328b6..d3658c3308 100644
--- a/libavfilter/dnn/dnn_backend_tf.c
+++ b/libavfilter/dnn/dnn_backend_tf.c
@@ -37,7 +37,6 @@
 #include "dnn_io_proc.h"
 #include "dnn_backend_common.h"
 #include "safe_queue.h"
-#include "queue.h"
 #include <tensorflow/c/c_api.h>
 
 typedef struct TFOptions{
@@ -58,6 +57,7 @@ typedef struct TFModel{
     TF_Status *status;
     SafeQueue *request_queue;
     Queue *inference_queue;
+    Queue *task_queue;
 } TFModel;
 
 /**
@@ -74,7 +74,7 @@ typedef struct TFInferRequest {
 typedef struct TFRequestItem {
     TFInferRequest *infer_request;
     InferenceItem *inference;
-    // further properties will be added later for async
+    DNNAsyncExecModule exec_module;
 } TFRequestItem;
 
 #define OFFSET(x) offsetof(TFContext, x)
@@ -88,6 +88,7 @@ static const AVOption dnn_tensorflow_options[] = {
 AVFILTER_DEFINE_CLASS(dnn_tensorflow);
 
 static DNNReturnType execute_model_tf(TFRequestItem *request, Queue *inference_queue);
+static void infer_completion_callback(void *args);
 
 static void free_buffer(void *data, size_t length)
 {
@@ -885,6 +886,9 @@ DNNModel *ff_dnn_load_model_tf(const char *model_filename, DNNFunctionType func_
             av_freep(&item);
             goto err;
         }
+        item->exec_module.start_inference = &tf_start_inference;
+        item->exec_module.callback = &infer_completion_callback;
+        item->exec_module.args = item;
 
         if (ff_safe_queue_push_back(tf_model->request_queue, item) < 0) {
             av_freep(&item->infer_request);
@@ -898,6 +902,11 @@ DNNModel *ff_dnn_load_model_tf(const char *model_filename, DNNFunctionType func_
         goto err;
     }
 
+    tf_model->task_queue = ff_queue_create();
+    if (!tf_model->task_queue) {
+        goto err;
+    }
+
     model->model = tf_model;
     model->get_input = &get_input_tf;
     model->get_output = &get_output_tf;
@@ -1060,7 +1069,6 @@ static DNNReturnType execute_model_tf(TFRequestItem *request, Queue *inference_q
 {
     TFModel *tf_model;
     TFContext *ctx;
-    TFInferRequest *infer_request;
     InferenceItem *inference;
     TaskItem *task;
 
@@ -1073,23 +1081,14 @@ static DNNReturnType execute_model_tf(TFRequestItem *request, Queue *inference_q
     tf_model = task->model;
     ctx = &tf_model->ctx;
 
-    if (task->async) {
-        avpriv_report_missing_feature(ctx, "Async execution not supported");
+    if (fill_model_input_tf(tf_model, request) != DNN_SUCCESS) {
         return DNN_ERROR;
-    } else {
-        if (fill_model_input_tf(tf_model, request) != DNN_SUCCESS) {
-            return DNN_ERROR;
-        }
+    }
 
-        infer_request = request->infer_request;
-        TF_SessionRun(tf_model->session, NULL,
-                      infer_request->tf_input, &infer_request->input_tensor, 1,
-                      infer_request->tf_outputs, infer_request->output_tensors,
-                      task->nb_output, NULL, 0, NULL,
-                      tf_model->status);
-        if (TF_GetCode(tf_model->status) != TF_OK) {
-            tf_free_request(infer_request);
-            av_log(ctx, AV_LOG_ERROR, "Failed to run session when executing model\n");
+    if (task->async) {
+        return ff_dnn_start_inference_async(ctx, &request->exec_module);
+    } else {
+        if (tf_start_inference(request) != DNN_SUCCESS) {
             return DNN_ERROR;
         }
         infer_completion_callback(request);
@@ -1126,6 +1125,83 @@ DNNReturnType ff_dnn_execute_model_tf(const DNNModel *model, DNNExecBaseParams *
     return execute_model_tf(request, tf_model->inference_queue);
 }
 
+DNNReturnType ff_dnn_execute_model_async_tf(const DNNModel *model, DNNExecBaseParams *exec_params) {
+    TFModel *tf_model = model->model;
+    TFContext *ctx = &tf_model->ctx;
+    TaskItem *task;
+    TFRequestItem *request;
+
+    if (ff_check_exec_params(ctx, DNN_TF, model->func_type, exec_params) != 0) {
+        return DNN_ERROR;
+    }
+
+    task = av_malloc(sizeof(*task));
+    if (!task) {
+        av_log(ctx, AV_LOG_ERROR, "unable to alloc memory for task item.\n");
+        return DNN_ERROR;
+    }
+
+    if (ff_dnn_fill_task(task, exec_params, tf_model, 1, 1) != DNN_SUCCESS) {
+        av_freep(&task);
+        return DNN_ERROR;
+    }
+
+    if (ff_queue_push_back(tf_model->task_queue, task) < 0) {
+        av_freep(&task);
+        av_log(ctx, AV_LOG_ERROR, "unable to push back task_queue.\n");
+        return DNN_ERROR;
+    }
+
+    if (extract_inference_from_task(task, tf_model->inference_queue) != DNN_SUCCESS) {
+        av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+        return DNN_ERROR;
+    }
+
+    request = ff_safe_queue_pop_front(tf_model->request_queue);
+    if (!request) {
+        av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+        return DNN_ERROR;
+    }
+    return execute_model_tf(request, tf_model->inference_queue);
+}
+
+DNNAsyncStatusType ff_dnn_get_async_result_tf(const DNNModel *model, AVFrame **in, AVFrame **out)
+{
+    TFModel *tf_model = model->model;
+    return ff_dnn_get_async_result_common(tf_model->task_queue, in, out);
+}
+
+DNNReturnType ff_dnn_flush_tf(const DNNModel *model)
+{
+    TFModel *tf_model = model->model;
+    TFContext *ctx = &tf_model->ctx;
+    TFRequestItem *request;
+    DNNReturnType ret;
+
+    if (ff_queue_size(tf_model->inference_queue) == 0) {
+        // no pending task need to flush
+        return DNN_SUCCESS;
+    }
+
+    request = ff_safe_queue_pop_front(tf_model->request_queue);
+    if (!request) {
+        av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+        return DNN_ERROR;
+    }
+
+    ret = fill_model_input_tf(tf_model, request);
+    if (ret != DNN_SUCCESS) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to fill model input.\n");
+        if (ff_safe_queue_push_back(tf_model->request_queue, request) < 0) {
+            av_freep(&request->infer_request);
+            av_freep(&request);
+        }
+        return ret;
+    }
+
+    return ff_dnn_start_inference_async(ctx, &request->exec_module);
+}
+
 void ff_dnn_free_model_tf(DNNModel **model)
 {
     TFModel *tf_model;
@@ -1134,6 +1210,7 @@ void ff_dnn_free_model_tf(DNNModel **model)
         tf_model = (*model)->model;
         while (ff_safe_queue_size(tf_model->request_queue) != 0) {
             TFRequestItem *item = ff_safe_queue_pop_front(tf_model->request_queue);
+            ff_dnn_async_module_cleanup(&item->exec_module);
             tf_free_request(item->infer_request);
             av_freep(&item->infer_request);
             av_freep(&item);
@@ -1146,6 +1223,14 @@ void ff_dnn_free_model_tf(DNNModel **model)
         }
         ff_queue_destroy(tf_model->inference_queue);
 
+        while (ff_queue_size(tf_model->task_queue) != 0) {
+            TaskItem *item = ff_queue_pop_front(tf_model->task_queue);
+            av_frame_free(&item->in_frame);
+            av_frame_free(&item->out_frame);
+            av_freep(&item);
+        }
+        ff_queue_destroy(tf_model->task_queue);
+
         if (tf_model->graph){
             TF_DeleteGraph(tf_model->graph);
         }
diff --git a/libavfilter/dnn/dnn_backend_tf.h b/libavfilter/dnn/dnn_backend_tf.h
index 3dfd6e4280..aec0fc2011 100644
--- a/libavfilter/dnn/dnn_backend_tf.h
+++ b/libavfilter/dnn/dnn_backend_tf.h
@@ -32,6 +32,9 @@
 DNNModel *ff_dnn_load_model_tf(const char *model_filename, DNNFunctionType func_type, const char *options, AVFilterContext *filter_ctx);
 
 DNNReturnType ff_dnn_execute_model_tf(const DNNModel *model, DNNExecBaseParams *exec_params);
+DNNReturnType ff_dnn_execute_model_async_tf(const DNNModel *model, DNNExecBaseParams *exec_params);
+DNNAsyncStatusType ff_dnn_get_async_result_tf(const DNNModel *model, AVFrame **in, AVFrame **out);
+DNNReturnType ff_dnn_flush_tf(const DNNModel *model);
 
 void ff_dnn_free_model_tf(DNNModel **model);
 
diff --git a/libavfilter/dnn/dnn_interface.c b/libavfilter/dnn/dnn_interface.c
index 02e532fc1b..81af934dd5 100644
--- a/libavfilter/dnn/dnn_interface.c
+++ b/libavfilter/dnn/dnn_interface.c
@@ -48,6 +48,9 @@ DNNModule *ff_get_dnn_module(DNNBackendType backend_type)
 #if (CONFIG_LIBTENSORFLOW == 1)
         dnn_module->load_model = &ff_dnn_load_model_tf;
         dnn_module->execute_model = &ff_dnn_execute_model_tf;
+        dnn_module->execute_model_async = &ff_dnn_execute_model_async_tf;
+        dnn_module->get_async_result = &ff_dnn_get_async_result_tf;
+        dnn_module->flush = &ff_dnn_flush_tf;
         dnn_module->free_model = &ff_dnn_free_model_tf;
 #else
         av_freep(&dnn_module);