From patchwork Sun Apr 18 10:07:57 2021
X-Patchwork-Submitter: "Guo, Yejun"
X-Patchwork-Id: 26964
From: "Guo, Yejun"
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 18 Apr 2021 18:07:57 +0800
Message-Id: <20210418100802.19017-1-yejun.guo@intel.com>
Subject: [FFmpeg-devel] [PATCH 1/6] lavfi/dnn_backend_openvino.c: unify code for infer request for sync/async
---
 libavfilter/dnn/dnn_backend_openvino.c | 49 +++++++++++---------------
 1 file changed, 21 insertions(+), 28 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c
index 0757727a9c..874354ecef 100644
--- a/libavfilter/dnn/dnn_backend_openvino.c
+++ b/libavfilter/dnn/dnn_backend_openvino.c
@@ -52,9 +52,6 @@ typedef struct OVModel{
     ie_core_t *core;
     ie_network_t *network;
     ie_executable_network_t *exe_network;
-    ie_infer_request_t *infer_request;
-
-    /* for async execution */
     SafeQueue *request_queue;   // holds RequestItem
     Queue *task_queue;          // holds TaskItem
 } OVModel;
@@ -269,12 +266,9 @@ static void infer_completion_callback(void *args)
     ie_blob_free(&output_blob);

     request->task_count = 0;
-
-    if (task->async) {
-        if (ff_safe_queue_push_back(requestq, request) < 0) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
-            return;
-        }
+    if (ff_safe_queue_push_back(requestq, request) < 0) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
+        return;
     }
 }
@@ -347,11 +341,6 @@ static DNNReturnType init_model_ov(OVModel *ov_model, const char *input_name, co
         goto err;
     }

-    // create infer_request for sync execution
-    status = ie_exec_network_create_infer_request(ov_model->exe_network, &ov_model->infer_request);
-    if (status != OK)
-        goto err;
-
     // create infer_requests for async execution
     if (ctx->options.nireq <= 0) {
         // the default value is a rough estimation
@@ -502,10 +491,9 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu
     OVModel *ov_model = model;
     OVContext *ctx = &ov_model->ctx;
     TaskItem task;
-    RequestItem request;
+    RequestItem *request;
     AVFrame *in_frame = NULL;
     AVFrame *out_frame = NULL;
-    TaskItem *ptask = &task;
     IEStatusCode status;
     input_shapes_t input_shapes;
@@ -557,11 +545,16 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu
     task.out_frame = out_frame;
     task.ov_model = ov_model;

-    request.infer_request = ov_model->infer_request;
-    request.task_count = 1;
-    request.tasks = &ptask;
+    request = ff_safe_queue_pop_front(ov_model->request_queue);
+    if (!request) {
+        av_frame_free(&out_frame);
+        av_frame_free(&in_frame);
+        av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+        return DNN_ERROR;
+    }
+    request->tasks[request->task_count++] = &task;

-    ret = execute_model_ov(&request);
+    ret = execute_model_ov(request);
     *output_width = out_frame->width;
     *output_height = out_frame->height;
@@ -633,8 +626,7 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n
     OVModel *ov_model = model->model;
     OVContext *ctx = &ov_model->ctx;
     TaskItem task;
-    RequestItem request;
-    TaskItem *ptask = &task;
+    RequestItem *request;

     if (!in_frame) {
         av_log(ctx, AV_LOG_ERROR, "in frame is NULL when execute model.\n");
@@ -674,11 +666,14 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n
     task.out_frame = out_frame;
     task.ov_model = ov_model;

-    request.infer_request = ov_model->infer_request;
-    request.task_count = 1;
-    request.tasks = &ptask;
+    request = ff_safe_queue_pop_front(ov_model->request_queue);
+    if (!request) {
+        av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+        return DNN_ERROR;
+    }
+    request->tasks[request->task_count++] = &task;

-    return execute_model_ov(&request);
+    return execute_model_ov(request);
 }

 DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *input_name, AVFrame *in_frame,
@@ -821,8 +816,6 @@ void ff_dnn_free_model_ov(DNNModel **model)
     }
     ff_queue_destroy(ov_model->task_queue);
-    if (ov_model->infer_request)
-        ie_infer_request_free(&ov_model->infer_request);
     if (ov_model->exe_network)
         ie_exec_network_free(&ov_model->exe_network);
     if (ov_model->network)

From patchwork Sun Apr 18 10:07:58 2021
X-Patchwork-Submitter: "Guo, Yejun"
X-Patchwork-Id: 26967
From: "Guo, Yejun"
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 18 Apr 2021 18:07:58 +0800
Message-Id: <20210418100802.19017-2-yejun.guo@intel.com>
In-Reply-To: <20210418100802.19017-1-yejun.guo@intel.com>
Subject: [FFmpeg-devel] [PATCH 2/6] lavfi/dnn_backend_openvino.c: add InferenceItem between TaskItem and RequestItem

There is one task item for each function call from the dnn interface,
and one request item for each call to openvino. For classification,
one task might need multiple inferences (one for each bounding box),
so add InferenceItem.
---
 libavfilter/dnn/dnn_backend_openvino.c | 157 ++++++++++++++++++-------
 1 file changed, 115 insertions(+), 42 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c
index 874354ecef..3692a381e2 100644
--- a/libavfilter/dnn/dnn_backend_openvino.c
+++ b/libavfilter/dnn/dnn_backend_openvino.c
@@ -54,8 +54,10 @@ typedef struct OVModel{
     ie_executable_network_t *exe_network;
     SafeQueue *request_queue;   // holds RequestItem
     Queue *task_queue;          // holds TaskItem
+    Queue *inference_queue;     // holds InferenceItem
 } OVModel;

+// one task for one function call from dnn interface
 typedef struct TaskItem {
     OVModel *ov_model;
     const char *input_name;
@@ -64,13 +66,20 @@ typedef struct TaskItem {
     AVFrame *out_frame;
     int do_ioproc;
     int async;
-    int done;
+    uint32_t inference_todo;
+    uint32_t inference_done;
 } TaskItem;

+// one task might have multiple inferences
+typedef struct InferenceItem {
+    TaskItem *task;
+} InferenceItem;
+
+// one request for one call to openvino
 typedef struct RequestItem {
     ie_infer_request_t *infer_request;
-    TaskItem **tasks;
-    int task_count;
+    InferenceItem **inferences;
+    uint32_t inference_count;
     ie_complete_call_back_t callback;
 } RequestItem;

@@ -127,7 +136,12 @@ static DNNReturnType fill_model_input_ov(OVModel *ov_model, RequestItem *request
     IEStatusCode status;
     DNNData input;
     ie_blob_t *input_blob = NULL;
-    TaskItem *task = request->tasks[0];
+    InferenceItem *inference;
+    TaskItem *task;
+
+    inference = ff_queue_peek_front(ov_model->inference_queue);
+    av_assert0(inference);
+    task = inference->task;

     status = ie_infer_request_get_blob(request->infer_request, task->input_name, &input_blob);
     if (status != OK) {
@@ -159,9 +173,14 @@ static DNNReturnType fill_model_input_ov(OVModel *ov_model, RequestItem *request
     // change to be an option when necessary.
     input.order = DCO_BGR;

-    av_assert0(request->task_count <= dims.dims[0]);
-    for (int i = 0; i < request->task_count; ++i) {
-        task = request->tasks[i];
+    for (int i = 0; i < ctx->options.batch_size; ++i) {
+        inference = ff_queue_pop_front(ov_model->inference_queue);
+        if (!inference) {
+            break;
+        }
+        request->inferences[i] = inference;
+        request->inference_count = i + 1;
+        task = inference->task;
         if (task->do_ioproc) {
             if (ov_model->model->frame_pre_proc != NULL) {
                 ov_model->model->frame_pre_proc(task->in_frame, &input, ov_model->model->filter_ctx);
@@ -183,7 +202,8 @@ static void infer_completion_callback(void *args)
     precision_e precision;
     IEStatusCode status;
     RequestItem *request = args;
-    TaskItem *task = request->tasks[0];
+    InferenceItem *inference = request->inferences[0];
+    TaskItem *task = inference->task;
     SafeQueue *requestq = task->ov_model->request_queue;
     ie_blob_t *output_blob = NULL;
     ie_blob_buffer_t blob_buffer;
@@ -229,10 +249,11 @@ static void infer_completion_callback(void *args)
     output.dt = precision_to_datatype(precision);
     output.data = blob_buffer.buffer;

-    av_assert0(request->task_count <= dims.dims[0]);
-    av_assert0(request->task_count >= 1);
-    for (int i = 0; i < request->task_count; ++i) {
-        task = request->tasks[i];
+    av_assert0(request->inference_count <= dims.dims[0]);
+    av_assert0(request->inference_count >= 1);
+    for (int i = 0; i < request->inference_count; ++i) {
+        task = request->inferences[i]->task;
+        task->inference_done++;

         switch (task->ov_model->model->func_type) {
         case DFT_PROCESS_FRAME:
@@ -259,13 +280,13 @@ static void infer_completion_callback(void *args)
             break;
         }

-        task->done = 1;
+        av_freep(&request->inferences[i]);
         output.data = (uint8_t *)output.data
                       + output.width * output.height * output.channels * get_datatype_size(output.dt);
     }
     ie_blob_free(&output_blob);

-    request->task_count = 0;
+    request->inference_count = 0;
     if (ff_safe_queue_push_back(requestq, request) < 0) {
         av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
         return;
@@ -370,11 +391,11 @@ static DNNReturnType init_model_ov(OVModel *ov_model, const char *input_name, co
             goto err;
         }

-        item->tasks = av_malloc_array(ctx->options.batch_size, sizeof(*item->tasks));
-        if (!item->tasks) {
+        item->inferences = av_malloc_array(ctx->options.batch_size, sizeof(*item->inferences));
+        if (!item->inferences) {
             goto err;
         }
-        item->task_count = 0;
+        item->inference_count = 0;
     }

     ov_model->task_queue = ff_queue_create();
@@ -382,6 +403,11 @@ static DNNReturnType init_model_ov(OVModel *ov_model, const char *input_name, co
         goto err;
     }

+    ov_model->inference_queue = ff_queue_create();
+    if (!ov_model->inference_queue) {
+        goto err;
+    }
+
     return DNN_SUCCESS;

 err:
@@ -389,15 +415,24 @@ err:
     return DNN_ERROR;
 }

-static DNNReturnType execute_model_ov(RequestItem *request)
+static DNNReturnType execute_model_ov(RequestItem *request, Queue *inferenceq)
 {
     IEStatusCode status;
     DNNReturnType ret;
-    TaskItem *task = request->tasks[0];
-    OVContext *ctx = &task->ov_model->ctx;
+    InferenceItem *inference;
+    TaskItem *task;
+    OVContext *ctx;
+
+    if (ff_queue_size(inferenceq) == 0) {
+        return DNN_SUCCESS;
+    }
+
+    inference = ff_queue_peek_front(inferenceq);
+    task = inference->task;
+    ctx = &task->ov_model->ctx;

     if (task->async) {
-        if (request->task_count < ctx->options.batch_size) {
+        if (ff_queue_size(inferenceq) < ctx->options.batch_size) {
             if (ff_safe_queue_push_front(task->ov_model->request_queue, request) < 0) {
                 av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
                 return DNN_ERROR;
@@ -430,7 +465,7 @@ static DNNReturnType execute_model_ov(RequestItem *request)
             return DNN_ERROR;
         }
         infer_completion_callback(request);
-        return task->done ? DNN_SUCCESS : DNN_ERROR;
+        return (task->inference_done == task->inference_todo) ? DNN_SUCCESS : DNN_ERROR;
     }
 }

@@ -484,6 +519,31 @@ static DNNReturnType get_input_ov(void *model, DNNData *input, const char *input
     return DNN_ERROR;
 }

+static DNNReturnType extract_inference_from_task(DNNFunctionType func_type, TaskItem *task, Queue *inference_queue)
+{
+    switch (func_type) {
+    case DFT_PROCESS_FRAME:
+    case DFT_ANALYTICS_DETECT:
+    {
+        InferenceItem *inference = av_malloc(sizeof(*inference));
+        if (!inference) {
+            return DNN_ERROR;
+        }
+        task->inference_todo = 1;
+        task->inference_done = 0;
+        inference->task = task;
+        if (ff_queue_push_back(inference_queue, inference) < 0) {
+            av_freep(&inference);
+            return DNN_ERROR;
+        }
+        return DNN_SUCCESS;
+    }
+    default:
+        av_assert0(!"should not reach here");
+        return DNN_ERROR;
+    }
+}
+
 static DNNReturnType get_output_ov(void *model, const char *input_name, int input_width, int input_height,
                                    const char *output_name, int *output_width, int *output_height)
 {
@@ -536,7 +596,6 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu
         return DNN_ERROR;
     }

-    task.done = 0;
     task.do_ioproc = 0;
     task.async = 0;
     task.input_name = input_name;
@@ -545,6 +604,13 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu
     task.out_frame = out_frame;
     task.ov_model = ov_model;

+    if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue) != DNN_SUCCESS) {
+        av_frame_free(&out_frame);
+        av_frame_free(&in_frame);
+        av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+        return DNN_ERROR;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_frame_free(&out_frame);
@@ -552,9 +618,8 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
         return DNN_ERROR;
     }
-    request->tasks[request->task_count++] = &task;

-    ret = execute_model_ov(request);
+    ret = execute_model_ov(request, ov_model->inference_queue);
     *output_width = out_frame->width;
     *output_height = out_frame->height;
@@ -657,7 +722,6 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n
         }
     }

-    task.done = 0;
     task.do_ioproc = 1;
     task.async = 0;
     task.input_name = input_name;
@@ -666,14 +730,18 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n
     task.out_frame = out_frame;
     task.ov_model = ov_model;

+    if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue) != DNN_SUCCESS) {
+        av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+        return DNN_ERROR;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
         return DNN_ERROR;
     }
-    request->tasks[request->task_count++] = &task;

-    return execute_model_ov(request);
+    return execute_model_ov(request, ov_model->inference_queue);
 }

 DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *input_name, AVFrame *in_frame,
@@ -707,7 +775,6 @@ DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *i
         return DNN_ERROR;
     }

-    task->done = 0;
     task->do_ioproc = 1;
     task->async = 1;
     task->input_name = input_name;
@@ -721,14 +788,18 @@ DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *i
         return DNN_ERROR;
     }

+    if (extract_inference_from_task(ov_model->model->func_type, task, ov_model->inference_queue) != DNN_SUCCESS) {
+        av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+        return DNN_ERROR;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
         return DNN_ERROR;
     }
-    request->tasks[request->task_count++] = task;

-    return execute_model_ov(request);
+    return execute_model_ov(request, ov_model->inference_queue);
 }

 DNNAsyncStatusType ff_dnn_get_async_result_ov(const DNNModel *model, AVFrame **in, AVFrame **out)
@@ -740,7 +811,7 @@ DNNAsyncStatusType ff_dnn_get_async_result_ov(const DNNModel *model, AVFrame **i
         return DAST_EMPTY_QUEUE;
     }

-    if (!task->done) {
+    if (task->inference_done != task->inference_todo) {
         return DAST_NOT_READY;
     }

@@ -760,21 +831,17 @@ DNNReturnType ff_dnn_flush_ov(const DNNModel *model)
     IEStatusCode status;
     DNNReturnType ret;

+    if (ff_queue_size(ov_model->inference_queue) == 0) {
+        // no pending task need to flush
+        return DNN_SUCCESS;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
         return DNN_ERROR;
     }

-    if (request->task_count == 0) {
-        // no pending task need to flush
-        if (ff_safe_queue_push_back(ov_model->request_queue, request) < 0) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
-            return DNN_ERROR;
-        }
-        return DNN_SUCCESS;
-    }
-
     ret = fill_model_input_ov(ov_model, request);
     if (ret != DNN_SUCCESS) {
         av_log(ctx, AV_LOG_ERROR, "Failed to fill model input.\n");
@@ -803,11 +870,17 @@ void ff_dnn_free_model_ov(DNNModel **model)
             if (item && item->infer_request) {
                 ie_infer_request_free(&item->infer_request);
             }
-            av_freep(&item->tasks);
+            av_freep(&item->inferences);
             av_freep(&item);
         }
         ff_safe_queue_destroy(ov_model->request_queue);

+        while (ff_queue_size(ov_model->inference_queue) != 0) {
+            TaskItem *item = ff_queue_pop_front(ov_model->inference_queue);
+            av_freep(&item);
+        }
+        ff_queue_destroy(ov_model->inference_queue);
+
         while (ff_queue_size(ov_model->task_queue) != 0) {
             TaskItem *item = ff_queue_pop_front(ov_model->task_queue);
             av_frame_free(&item->in_frame);

From patchwork Sun Apr 18 10:07:59 2021
X-Patchwork-Submitter: "Guo, Yejun"
X-Patchwork-Id: 26961
From: "Guo, Yejun"
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 18 Apr 2021 18:07:59 +0800
Message-Id: <20210418100802.19017-3-yejun.guo@intel.com>
In-Reply-To: <20210418100802.19017-1-yejun.guo@intel.com>
Subject: [FFmpeg-devel] [PATCH 3/6] lavfi/dnn_backend_openvino.c: move the logic for batch mode earlier

---
 libavfilter/dnn/dnn_backend_openvino.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c
index 3692a381e2..a695d863b5 100644
--- a/libavfilter/dnn/dnn_backend_openvino.c
+++ b/libavfilter/dnn/dnn_backend_openvino.c
@@ -432,13 +432,6 @@ static DNNReturnType execute_model_ov(RequestItem *request, Queue *inferenceq)
     ctx = &task->ov_model->ctx;

     if (task->async) {
-        if (ff_queue_size(inferenceq) < ctx->options.batch_size) {
-            if (ff_safe_queue_push_front(task->ov_model->request_queue, request) < 0) {
-                av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
-                return DNN_ERROR;
-            }
-            return DNN_SUCCESS;
-        }
         ret = fill_model_input_ov(task->ov_model, request);
         if (ret != DNN_SUCCESS) {
             return ret;
@@ -793,6 +786,11 @@ DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *i
         return DNN_ERROR;
     }

+    if (ff_queue_size(ov_model->inference_queue) < ctx->options.batch_size) {
+        // not enough inference items queued for a batch
+        return DNN_SUCCESS;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");

From patchwork Sun Apr 18 10:08:00 2021
dis=NONE) header.from=intel.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id j10si217189eds.134.2021.04.18.03.20.33; Sun, 18 Apr 2021 03:20:34 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id DB38A680A7F; Sun, 18 Apr 2021 13:20:08 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 46D656809D2 for ; Sun, 18 Apr 2021 13:20:01 +0300 (EEST) IronPort-SDR: tbWtIf/FwhKxZd6UDDQx7hX5VEG70GWz7enWedjUwUJfb9aLWK4nxR0OLza66Qlt9zNXOqICqg hh5Ioby02TtQ== X-IronPort-AV: E=McAfee;i="6200,9189,9957"; a="256523547" X-IronPort-AV: E=Sophos;i="5.82,231,1613462400"; d="scan'208";a="256523547" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Apr 2021 03:19:55 -0700 IronPort-SDR: YnErYWxAMSgVHK1yh6eK9+4Wi3iuXou0vPZWQslg9g+E8+u9oSuVINrFS8VTypXi7Ax5lTWNki +U9eEmP4tY1Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.82,231,1613462400"; d="scan'208";a="453918926" Received: from yguo18-skl-u1604.sh.intel.com ([10.239.159.53]) by fmsmga002.fm.intel.com with ESMTP; 18 Apr 2021 03:19:54 -0700 From: "Guo, Yejun" To: ffmpeg-devel@ffmpeg.org Date: Sun, 18 Apr 2021 18:08:00 +0800 Message-Id: <20210418100802.19017-4-yejun.guo@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210418100802.19017-1-yejun.guo@intel.com> References: 
<20210418100802.19017-1-yejun.guo@intel.com> Subject: [FFmpeg-devel] [PATCH 4/6] lavfi/dnn: refine dnn interface to add DNNExecBaseParams

Different model function types require different parameters. For example, object detection detects many objects (cat/dog/...) in the frame, while classification needs to know which object (cat or dog) it is going to classify. With the current interface, every new requirement means adding a new function with more parameters. With this change, we can instead add a new struct (for example DNNExecClassifyParams) based on DNNExecBaseParams, and keep using the existing execute_model interface with only the params changed.
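The extension pattern the commit message describes can be sketched in a few lines: a function-type-specific struct embeds DNNExecBaseParams as its first member, so the generic execute_model entry point keeps one signature and the backend downcasts when needed. This is a simplified illustration, not the exact FFmpeg definitions; the derived struct and its `target` field follow patch 5/6 of this series (where it is named DNNExecClassificationParams), and AVFrame is left opaque:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct AVFrame AVFrame; /* opaque stand-in for the libavutil type */

/* Base parameters shared by every execute_model() call. */
typedef struct DNNExecBaseParams {
    const char *input_name;
    const char **output_names;
    uint32_t nb_output;
    AVFrame *in_frame;
    AVFrame *out_frame;
} DNNExecBaseParams;

/* Function-type-specific parameters embed the base as the FIRST member,
 * so a DNNExecClassificationParams* can be passed wherever a
 * DNNExecBaseParams* is expected, and recovered by a cast. */
typedef struct DNNExecClassificationParams {
    DNNExecBaseParams base;
    const char *target; /* label of the detected objects to classify */
} DNNExecClassificationParams;

/* Backend side: recover the classification-specific field. */
static const char *get_classify_target(DNNExecBaseParams *exec_params)
{
    DNNExecClassificationParams *params =
        (DNNExecClassificationParams *)exec_params;
    return params->target;
}
```

The cast is well-defined C because a pointer to a struct points to its first member; supporting a new function type then only needs a new derived struct, not a new execute function.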
--- libavfilter/dnn/Makefile | 1 + libavfilter/dnn/dnn_backend_common.c | 51 ++++++++++++++++++++++++++ libavfilter/dnn/dnn_backend_common.h | 31 ++++++++++++++++ libavfilter/dnn/dnn_backend_native.c | 15 +++----- libavfilter/dnn/dnn_backend_native.h | 3 +- libavfilter/dnn/dnn_backend_openvino.c | 50 ++++++++----------------- libavfilter/dnn/dnn_backend_openvino.h | 6 +-- libavfilter/dnn/dnn_backend_tf.c | 18 +++------ libavfilter/dnn/dnn_backend_tf.h | 3 +- libavfilter/dnn_filter_common.c | 20 ++++++++-- libavfilter/dnn_interface.h | 14 +++++-- 11 files changed, 139 insertions(+), 73 deletions(-) create mode 100644 libavfilter/dnn/dnn_backend_common.c create mode 100644 libavfilter/dnn/dnn_backend_common.h diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile index d6d58f4b61..4cfbce0efc 100644 --- a/libavfilter/dnn/Makefile +++ b/libavfilter/dnn/Makefile @@ -2,6 +2,7 @@ OBJS-$(CONFIG_DNN) += dnn/dnn_interface.o OBJS-$(CONFIG_DNN) += dnn/dnn_io_proc.o OBJS-$(CONFIG_DNN) += dnn/queue.o OBJS-$(CONFIG_DNN) += dnn/safe_queue.o +OBJS-$(CONFIG_DNN) += dnn/dnn_backend_common.o OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native.o OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layers.o OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_avgpool.o diff --git a/libavfilter/dnn/dnn_backend_common.c b/libavfilter/dnn/dnn_backend_common.c new file mode 100644 index 0000000000..a522ab5650 --- /dev/null +++ b/libavfilter/dnn/dnn_backend_common.c @@ -0,0 +1,51 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/** + * @file + * DNN common functions for different backends. + */ + +#include "dnn_backend_common.h" + +int ff_check_exec_params(void *ctx, DNNBackendType backend, DNNFunctionType func_type, DNNExecBaseParams *exec_params) +{ + if (!exec_params) { + av_log(ctx, AV_LOG_ERROR, "exec_params is null when executing the model.\n"); + return AVERROR(EINVAL); + } + + if (!exec_params->in_frame) { + av_log(ctx, AV_LOG_ERROR, "in frame is NULL when executing the model.\n"); + return AVERROR(EINVAL); + } + + if (!exec_params->out_frame) { + av_log(ctx, AV_LOG_ERROR, "out frame is NULL when executing the model.\n"); + return AVERROR(EINVAL); + } + + if (exec_params->nb_output != 1 && backend != DNN_TF) { + // currently, the filter does not need multiple outputs, + // so we just postpone the support until we really need it. + avpriv_report_missing_feature(ctx, "multiple outputs"); + return AVERROR(EINVAL); + } + + return 0; +} diff --git a/libavfilter/dnn/dnn_backend_common.h b/libavfilter/dnn/dnn_backend_common.h new file mode 100644 index 0000000000..cd9c0f5339 --- /dev/null +++ b/libavfilter/dnn/dnn_backend_common.h @@ -0,0 +1,31 @@ +/* + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details.
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/** + * @file + * DNN common functions for different backends. + */ + +#ifndef AVFILTER_DNN_DNN_BACKEND_COMMON_H +#define AVFILTER_DNN_DNN_BACKEND_COMMON_H + +#include "../dnn_interface.h" + +int ff_check_exec_params(void *ctx, DNNBackendType backend, DNNFunctionType func_type, DNNExecBaseParams *exec_params); + +#endif diff --git a/libavfilter/dnn/dnn_backend_native.c b/libavfilter/dnn/dnn_backend_native.c index d9762eeaf6..b5f1c16538 100644 --- a/libavfilter/dnn/dnn_backend_native.c +++ b/libavfilter/dnn/dnn_backend_native.c @@ -28,6 +28,7 @@ #include "dnn_backend_native_layer_conv2d.h" #include "dnn_backend_native_layers.h" #include "dnn_io_proc.h" +#include "dnn_backend_common.h" #define OFFSET(x) offsetof(NativeContext, x) #define FLAGS AV_OPT_FLAG_FILTERING_PARAM @@ -372,23 +373,17 @@ static DNNReturnType execute_model_native(const DNNModel *model, const char *inp return DNN_SUCCESS; } -DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame) +DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNExecBaseParams *exec_params) { NativeModel *native_model = model->model; NativeContext *ctx = &native_model->ctx; - if (!in_frame) { - av_log(ctx, AV_LOG_ERROR, "in frame is NULL when execute model.\n"); - return DNN_ERROR; - } - - if (!out_frame) { - av_log(ctx, AV_LOG_ERROR, "out frame is NULL when execute model.\n"); + if (ff_check_exec_params(ctx, DNN_NATIVE, model->func_type, exec_params) != 0) { return DNN_ERROR; } - return execute_model_native(model, input_name, in_frame, output_names, nb_output, out_frame, 1); + return execute_model_native(model, exec_params->input_name, exec_params->in_frame, +
exec_params->output_names, exec_params->nb_output, exec_params->out_frame, 1); } int32_t ff_calculate_operand_dims_count(const DnnOperand *oprd) diff --git a/libavfilter/dnn/dnn_backend_native.h b/libavfilter/dnn/dnn_backend_native.h index d313c48f3a..89bcb8e358 100644 --- a/libavfilter/dnn/dnn_backend_native.h +++ b/libavfilter/dnn/dnn_backend_native.h @@ -130,8 +130,7 @@ typedef struct NativeModel{ DNNModel *ff_dnn_load_model_native(const char *model_filename, DNNFunctionType func_type, const char *options, AVFilterContext *filter_ctx); -DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame); +DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNExecBaseParams *exec_params); void ff_dnn_free_model_native(DNNModel **model); diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c index a695d863b5..fcdd738f8a 100644 --- a/libavfilter/dnn/dnn_backend_openvino.c +++ b/libavfilter/dnn/dnn_backend_openvino.c @@ -33,6 +33,7 @@ #include "queue.h" #include "safe_queue.h" #include +#include "dnn_backend_common.h" typedef struct OVOptions{ char *device_type; @@ -678,28 +679,14 @@ err: return NULL; } -DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame) +DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, DNNExecBaseParams *exec_params) { OVModel *ov_model = model->model; OVContext *ctx = &ov_model->ctx; TaskItem task; RequestItem *request; - if (!in_frame) { - av_log(ctx, AV_LOG_ERROR, "in frame is NULL when execute model.\n"); - return DNN_ERROR; - } - - if (!out_frame && model->func_type == DFT_PROCESS_FRAME) { - av_log(ctx, AV_LOG_ERROR, "out frame is NULL when execute model.\n"); - return DNN_ERROR; - } - - if (nb_output != 1) { - // currently, the filter does not need 
multiple outputs, - // so we just pending the support until we really need it. - avpriv_report_missing_feature(ctx, "multiple outputs"); + if (ff_check_exec_params(ctx, DNN_OV, model->func_type, exec_params) != 0) { return DNN_ERROR; } @@ -709,7 +696,7 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n } if (!ov_model->exe_network) { - if (init_model_ov(ov_model, input_name, output_names[0]) != DNN_SUCCESS) { + if (init_model_ov(ov_model, exec_params->input_name, exec_params->output_names[0]) != DNN_SUCCESS) { av_log(ctx, AV_LOG_ERROR, "Failed init OpenVINO exectuable network or inference request\n"); return DNN_ERROR; } @@ -717,10 +704,10 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n task.do_ioproc = 1; task.async = 0; - task.input_name = input_name; - task.in_frame = in_frame; - task.output_name = output_names[0]; - task.out_frame = out_frame; + task.input_name = exec_params->input_name; + task.in_frame = exec_params->in_frame; + task.output_name = exec_params->output_names[0]; + task.out_frame = exec_params->out_frame ? 
exec_params->out_frame : exec_params->in_frame; task.ov_model = ov_model; if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue) != DNN_SUCCESS) { @@ -737,26 +724,19 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n return execute_model_ov(request, ov_model->inference_queue); } -DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame) +DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, DNNExecBaseParams *exec_params) { OVModel *ov_model = model->model; OVContext *ctx = &ov_model->ctx; RequestItem *request; TaskItem *task; - if (!in_frame) { - av_log(ctx, AV_LOG_ERROR, "in frame is NULL when async execute model.\n"); - return DNN_ERROR; - } - - if (!out_frame && model->func_type == DFT_PROCESS_FRAME) { - av_log(ctx, AV_LOG_ERROR, "out frame is NULL when async execute model.\n"); + if (ff_check_exec_params(ctx, DNN_OV, model->func_type, exec_params) != 0) { return DNN_ERROR; } if (!ov_model->exe_network) { - if (init_model_ov(ov_model, input_name, output_names[0]) != DNN_SUCCESS) { + if (init_model_ov(ov_model, exec_params->input_name, exec_params->output_names[0]) != DNN_SUCCESS) { av_log(ctx, AV_LOG_ERROR, "Failed init OpenVINO exectuable network or inference request\n"); return DNN_ERROR; } @@ -770,10 +750,10 @@ DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *i task->do_ioproc = 1; task->async = 1; - task->input_name = input_name; - task->in_frame = in_frame; - task->output_name = output_names[0]; - task->out_frame = out_frame; + task->input_name = exec_params->input_name; + task->in_frame = exec_params->in_frame; + task->output_name = exec_params->output_names[0]; + task->out_frame = exec_params->out_frame ? 
exec_params->out_frame : exec_params->in_frame; task->ov_model = ov_model; if (ff_queue_push_back(ov_model->task_queue, task) < 0) { av_freep(&task); diff --git a/libavfilter/dnn/dnn_backend_openvino.h b/libavfilter/dnn/dnn_backend_openvino.h index a484a7be32..046d0c5b5a 100644 --- a/libavfilter/dnn/dnn_backend_openvino.h +++ b/libavfilter/dnn/dnn_backend_openvino.h @@ -31,10 +31,8 @@ DNNModel *ff_dnn_load_model_ov(const char *model_filename, DNNFunctionType func_type, const char *options, AVFilterContext *filter_ctx); -DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame); -DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame); +DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, DNNExecBaseParams *exec_params); +DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, DNNExecBaseParams *exec_params); DNNAsyncStatusType ff_dnn_get_async_result_ov(const DNNModel *model, AVFrame **in, AVFrame **out); DNNReturnType ff_dnn_flush_ov(const DNNModel *model); diff --git a/libavfilter/dnn/dnn_backend_tf.c b/libavfilter/dnn/dnn_backend_tf.c index fb799d2b70..427edb08e1 100644 --- a/libavfilter/dnn/dnn_backend_tf.c +++ b/libavfilter/dnn/dnn_backend_tf.c @@ -33,7 +33,7 @@ #include "dnn_backend_native_layer_pad.h" #include "dnn_backend_native_layer_maximum.h" #include "dnn_io_proc.h" - +#include "dnn_backend_common.h" #include typedef struct TFOptions{ @@ -840,23 +840,17 @@ static DNNReturnType execute_model_tf(const DNNModel *model, const char *input_n return DNN_SUCCESS; } -DNNReturnType ff_dnn_execute_model_tf(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame) +DNNReturnType ff_dnn_execute_model_tf(const DNNModel *model, 
DNNExecBaseParams *exec_params) { TFModel *tf_model = model->model; TFContext *ctx = &tf_model->ctx; - if (!in_frame) { - av_log(ctx, AV_LOG_ERROR, "in frame is NULL when execute model.\n"); - return DNN_ERROR; - } - - if (!out_frame) { - av_log(ctx, AV_LOG_ERROR, "out frame is NULL when execute model.\n"); - return DNN_ERROR; + if (ff_check_exec_params(ctx, DNN_TF, model->func_type, exec_params) != 0) { + return DNN_ERROR; } - return execute_model_tf(model, input_name, in_frame, output_names, nb_output, out_frame, 1); + return execute_model_tf(model, exec_params->input_name, exec_params->in_frame, + exec_params->output_names, exec_params->nb_output, exec_params->out_frame, 1); } void ff_dnn_free_model_tf(DNNModel **model) diff --git a/libavfilter/dnn/dnn_backend_tf.h b/libavfilter/dnn/dnn_backend_tf.h index 8cec04748e..3dfd6e4280 100644 --- a/libavfilter/dnn/dnn_backend_tf.h +++ b/libavfilter/dnn/dnn_backend_tf.h @@ -31,8 +31,7 @@ DNNModel *ff_dnn_load_model_tf(const char *model_filename, DNNFunctionType func_type, const char *options, AVFilterContext *filter_ctx); -DNNReturnType ff_dnn_execute_model_tf(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame); +DNNReturnType ff_dnn_execute_model_tf(const DNNModel *model, DNNExecBaseParams *exec_params); void ff_dnn_free_model_tf(DNNModel **model); diff --git a/libavfilter/dnn_filter_common.c b/libavfilter/dnn_filter_common.c index 1b922455a3..c085884eb4 100644 --- a/libavfilter/dnn_filter_common.c +++ b/libavfilter/dnn_filter_common.c @@ -90,14 +90,26 @@ DNNReturnType ff_dnn_get_output(DnnContext *ctx, int input_width, int input_heig DNNReturnType ff_dnn_execute_model(DnnContext *ctx, AVFrame *in_frame, AVFrame *out_frame) { - return (ctx->dnn_module->execute_model)(ctx->model, ctx->model_inputname, in_frame, - (const char **)&ctx->model_outputname, 1, out_frame); + DNNExecBaseParams exec_params = { + .input_name = ctx->model_inputname, 
+ .output_names = (const char **)&ctx->model_outputname, + .nb_output = 1, + .in_frame = in_frame, + .out_frame = out_frame, + }; + return (ctx->dnn_module->execute_model)(ctx->model, &exec_params); } DNNReturnType ff_dnn_execute_model_async(DnnContext *ctx, AVFrame *in_frame, AVFrame *out_frame) { - return (ctx->dnn_module->execute_model_async)(ctx->model, ctx->model_inputname, in_frame, - (const char **)&ctx->model_outputname, 1, out_frame); + DNNExecBaseParams exec_params = { + .input_name = ctx->model_inputname, + .output_names = (const char **)&ctx->model_outputname, + .nb_output = 1, + .in_frame = in_frame, + .out_frame = out_frame, + }; + return (ctx->dnn_module->execute_model_async)(ctx->model, &exec_params); } DNNAsyncStatusType ff_dnn_get_async_result(DnnContext *ctx, AVFrame **in_frame, AVFrame **out_frame) diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h index ae5a488341..941670675d 100644 --- a/libavfilter/dnn_interface.h +++ b/libavfilter/dnn_interface.h @@ -63,6 +63,14 @@ typedef struct DNNData{ DNNColorOrder order; } DNNData; +typedef struct DNNExecBaseParams { + const char *input_name; + const char **output_names; + uint32_t nb_output; + AVFrame *in_frame; + AVFrame *out_frame; +} DNNExecBaseParams; + typedef int (*FramePrePostProc)(AVFrame *frame, DNNData *model, AVFilterContext *filter_ctx); typedef int (*DetectPostProc)(AVFrame *frame, DNNData *output, uint32_t nb, AVFilterContext *filter_ctx); @@ -96,11 +104,9 @@ typedef struct DNNModule{ // Loads model and parameters from given file. Returns NULL if it is not possible. DNNModel *(*load_model)(const char *model_filename, DNNFunctionType func_type, const char *options, AVFilterContext *filter_ctx); // Executes model with specified input and output. Returns DNN_ERROR otherwise. 
- DNNReturnType (*execute_model)(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame); + DNNReturnType (*execute_model)(const DNNModel *model, DNNExecBaseParams *exec_params); // Executes model with specified input and output asynchronously. Returns DNN_ERROR otherwise. - DNNReturnType (*execute_model_async)(const DNNModel *model, const char *input_name, AVFrame *in_frame, - const char **output_names, uint32_t nb_output, AVFrame *out_frame); + DNNReturnType (*execute_model_async)(const DNNModel *model, DNNExecBaseParams *exec_params); // Retrieve inference result. DNNAsyncStatusType (*get_async_result)(const DNNModel *model, AVFrame **in, AVFrame **out); // Flush all the pending tasks.

From patchwork Sun Apr 18 10:08:01 2021 X-Patchwork-Submitter: "Guo, Yejun" X-Patchwork-Id: 26977
From: "Guo, Yejun" To: ffmpeg-devel@ffmpeg.org Date: Sun, 18 Apr 2021 18:08:01 +0800 Message-Id: <20210418100802.19017-5-yejun.guo@intel.com> In-Reply-To: <20210418100802.19017-1-yejun.guo@intel.com> References: <20210418100802.19017-1-yejun.guo@intel.com> Subject: [FFmpeg-devel] [PATCH 5/6] lavfi/dnn: add classify support with openvino backend

Signed-off-by: Guo, Yejun --- libavfilter/dnn/dnn_backend_openvino.c | 143 +++++++++++++++++++++---- libavfilter/dnn/dnn_io_proc.c | 60 +++++++++++ libavfilter/dnn/dnn_io_proc.h | 1 + libavfilter/dnn_filter_common.c | 21 ++++ libavfilter/dnn_filter_common.h | 2 + libavfilter/dnn_interface.h | 10 +- 6 files changed, 218 insertions(+), 19 deletions(-) diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c index fcdd738f8a..4bdd6a3a6d 100644 --- a/libavfilter/dnn/dnn_backend_openvino.c +++ b/libavfilter/dnn/dnn_backend_openvino.c @@ -29,6 +29,7 @@ #include "libavutil/avassert.h" #include "libavutil/opt.h" #include "libavutil/avstring.h" +#include
"libavutil/detection_bbox.h" #include "../internal.h" #include "queue.h" #include "safe_queue.h" @@ -74,6 +75,7 @@ typedef struct TaskItem { // one task might have multiple inferences typedef struct InferenceItem { TaskItem *task; + uint32_t bbox_index; } InferenceItem; // one request for one call to openvino @@ -182,12 +184,23 @@ static DNNReturnType fill_model_input_ov(OVModel *ov_model, RequestItem *request request->inferences[i] = inference; request->inference_count = i + 1; task = inference->task; - if (task->do_ioproc) { - if (ov_model->model->frame_pre_proc != NULL) { - ov_model->model->frame_pre_proc(task->in_frame, &input, ov_model->model->filter_ctx); - } else { - ff_proc_from_frame_to_dnn(task->in_frame, &input, ov_model->model->func_type, ctx); + switch (task->ov_model->model->func_type) { + case DFT_PROCESS_FRAME: + case DFT_ANALYTICS_DETECT: + if (task->do_ioproc) { + if (ov_model->model->frame_pre_proc != NULL) { + ov_model->model->frame_pre_proc(task->in_frame, &input, ov_model->model->filter_ctx); + } else { + ff_proc_from_frame_to_dnn(task->in_frame, &input, ov_model->model->func_type, ctx); + } } + break; + case DFT_ANALYTICS_CLASSIFY: + ff_frame_to_dnn_classify(task->in_frame, &input, inference->bbox_index, ctx); + break; + default: + av_assert0(!"should not reach here"); + break; } input.data = (uint8_t *)input.data + input.width * input.height * input.channels * get_datatype_size(input.dt); @@ -276,6 +289,13 @@ static void infer_completion_callback(void *args) } task->ov_model->model->detect_post_proc(task->out_frame, &output, 1, task->ov_model->model->filter_ctx); break; + case DFT_ANALYTICS_CLASSIFY: + if (!task->ov_model->model->classify_post_proc) { + av_log(ctx, AV_LOG_ERROR, "classify filter needs to provide post proc\n"); + return; + } + task->ov_model->model->classify_post_proc(task->out_frame, &output, request->inferences[i]->bbox_index, task->ov_model->model->filter_ctx); + break; default: av_assert0(!"should not reach here"); break; 
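The InferenceItem change above is the core of the classify support: one TaskItem fans out into one InferenceItem per detected bounding box whose label matches the filter's target, with `bbox_index` telling each inference which box to crop. A minimal self-contained sketch of that fan-out, with simplified stand-ins for the FFmpeg types (a plain array replaces the real Queue, and POSIX strcasecmp stands in for av_strncasecmp):

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h> /* strcasecmp, standing in for av_strncasecmp */

#define MAX_BBOXES 8

typedef struct TaskItem {
    uint32_t inference_todo;
    uint32_t inference_done;
} TaskItem;

/* one task might have multiple inferences, one per matching bounding box */
typedef struct InferenceItem {
    TaskItem *task;
    uint32_t bbox_index;
} InferenceItem;

typedef struct BBox {
    char detect_label[16];
} BBox;

/* Simplified fan-out in the spirit of extract_inference_from_task():
 * queue one inference per bbox whose label matches the classify target. */
static int fan_out_classify(TaskItem *task, const BBox *bboxes, uint32_t nb_bboxes,
                            const char *target, InferenceItem *queue, uint32_t *nb_queued)
{
    task->inference_todo = 0;
    task->inference_done = 0;
    *nb_queued = 0;
    for (uint32_t i = 0; i < nb_bboxes; i++) {
        if (strcasecmp(bboxes[i].detect_label, target) != 0)
            continue; /* not the object type we classify */
        queue[*nb_queued].task = task;
        queue[*nb_queued].bbox_index = i;
        (*nb_queued)++;
        task->inference_todo++;
    }
    return 0;
}
```

The completion callback can then compare `inference_done` against `inference_todo` to decide when the whole frame's task is finished.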
@@ -513,7 +533,44 @@ static DNNReturnType get_input_ov(void *model, DNNData *input, const char *input return DNN_ERROR; } -static DNNReturnType extract_inference_from_task(DNNFunctionType func_type, TaskItem *task, Queue *inference_queue) +static int contain_valid_detection_bbox(AVFrame *frame) +{ + AVFrameSideData *sd; + const AVDetectionBBoxHeader *header; + const AVDetectionBBox *bbox; + + sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES); + if (!sd) { // this frame has nothing detected + return 0; + } + + if (!sd->size) { + return 0; + } + + header = (const AVDetectionBBoxHeader *)sd->data; + if (!header->nb_bboxes) { + return 0; + } + + for (uint32_t i = 0; i < header->nb_bboxes; i++) { + bbox = av_get_detection_bbox(header, i); + if (bbox->x < 0 || bbox->w < 0 || bbox->x + bbox->w >= frame->width) { + return 0; + } + if (bbox->y < 0 || bbox->h < 0 || bbox->y + bbox->h >= frame->height) { + return 0; + } + + if (bbox->classify_count == AV_NUM_DETECTION_BBOX_CLASSIFY) { + return 0; + } + } + + return 1; +} + +static DNNReturnType extract_inference_from_task(DNNFunctionType func_type, TaskItem *task, Queue *inference_queue, DNNExecBaseParams *exec_params) { switch (func_type) { case DFT_PROCESS_FRAME: @@ -532,6 +589,45 @@ static DNNReturnType extract_inference_from_task(DNNFunctionType func_type, Task } return DNN_SUCCESS; } + case DFT_ANALYTICS_CLASSIFY: + { + const AVDetectionBBoxHeader *header; + AVFrame *frame = task->in_frame; + AVFrameSideData *sd; + DNNExecClassificationParams *params = (DNNExecClassificationParams *)exec_params; + + task->inference_todo = 0; + task->inference_done = 0; + + if (!contain_valid_detection_bbox(frame)) { + return DNN_SUCCESS; + } + + sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES); + header = (const AVDetectionBBoxHeader *)sd->data; + + for (uint32_t i = 0; i < header->nb_bboxes; i++) { + InferenceItem *inference; + const AVDetectionBBox *bbox = av_get_detection_bbox(header, i); + + if
(av_strncasecmp(bbox->detect_label, params->target, sizeof(bbox->detect_label)) != 0) { + continue; + } + + inference = av_malloc(sizeof(*inference)); + if (!inference) { + return DNN_ERROR; + } + task->inference_todo++; + inference->task = task; + inference->bbox_index = i; + if (ff_queue_push_back(inference_queue, inference) < 0) { + av_freep(&inference); + return DNN_ERROR; + } + } + return DNN_SUCCESS; + } default: av_assert0(!"should not reach here"); return DNN_ERROR; @@ -598,7 +694,7 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu task.out_frame = out_frame; task.ov_model = ov_model; - if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue) != DNN_SUCCESS) { + if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue, NULL) != DNN_SUCCESS) { av_frame_free(&out_frame); av_frame_free(&in_frame); av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n"); @@ -690,6 +786,14 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, DNNExecBaseParams * return DNN_ERROR; } + if (model->func_type == DFT_ANALYTICS_CLASSIFY) { + // Once we add async support for tensorflow backend and native backend, + // we'll combine the two sync/async functions in dnn_interface.h to + // simplify the code in filter, and async will be an option within backends. + // so, do not support now, and classify filter will not call this function. + return DNN_ERROR; + } + if (ctx->options.batch_size > 1) { avpriv_report_missing_feature(ctx, "batch mode for sync execution"); return DNN_ERROR; @@ -710,7 +814,7 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, DNNExecBaseParams * task.out_frame = exec_params->out_frame ? 
exec_params->out_frame : exec_params->in_frame; task.ov_model = ov_model; - if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue) != DNN_SUCCESS) { + if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue, exec_params) != DNN_SUCCESS) { av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n"); return DNN_ERROR; } @@ -730,6 +834,7 @@ DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, DNNExecBasePa OVContext *ctx = &ov_model->ctx; RequestItem *request; TaskItem *task; + DNNReturnType ret; if (ff_check_exec_params(ctx, DNN_OV, model->func_type, exec_params) != 0) { return DNN_ERROR; @@ -761,23 +866,25 @@ DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, DNNExecBasePa return DNN_ERROR; } - if (extract_inference_from_task(ov_model->model->func_type, task, ov_model->inference_queue) != DNN_SUCCESS) { + if (extract_inference_from_task(model->func_type, task, ov_model->inference_queue, exec_params) != DNN_SUCCESS) { av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n"); return DNN_ERROR; } - if (ff_queue_size(ov_model->inference_queue) < ctx->options.batch_size) { - // not enough inference items queued for a batch - return DNN_SUCCESS; - } + while (ff_queue_size(ov_model->inference_queue) >= ctx->options.batch_size) { + request = ff_safe_queue_pop_front(ov_model->request_queue); + if (!request) { + av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n"); + return DNN_ERROR; + } - request = ff_safe_queue_pop_front(ov_model->request_queue); - if (!request) { - av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n"); - return DNN_ERROR; + ret = execute_model_ov(request, ov_model->inference_queue); + if (ret != DNN_SUCCESS) { + return ret; + } } - return execute_model_ov(request, ov_model->inference_queue); + return DNN_SUCCESS; } DNNAsyncStatusType ff_dnn_get_async_result_ov(const DNNModel *model, AVFrame **in, AVFrame **out) diff 
diff --git a/libavfilter/dnn/dnn_io_proc.c b/libavfilter/dnn/dnn_io_proc.c
index e104cc5064..5f60d68078 100644
--- a/libavfilter/dnn/dnn_io_proc.c
+++ b/libavfilter/dnn/dnn_io_proc.c
@@ -22,6 +22,7 @@
 #include "libavutil/imgutils.h"
 #include "libswscale/swscale.h"
 #include "libavutil/avassert.h"
+#include "libavutil/detection_bbox.h"
 
 DNNReturnType ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx)
 {
@@ -175,6 +176,65 @@ static enum AVPixelFormat get_pixel_format(DNNData *data)
     return AV_PIX_FMT_BGR24;
 }
 
+DNNReturnType ff_frame_to_dnn_classify(AVFrame *frame, DNNData *input, uint32_t bbox_index, void *log_ctx)
+{
+    const AVPixFmtDescriptor *desc;
+    int offsetx[4], offsety[4];
+    uint8_t *bbox_data[4];
+    struct SwsContext *sws_ctx;
+    int linesizes[4];
+    enum AVPixelFormat fmt;
+    int left, top, width, height;
+    const AVDetectionBBoxHeader *header;
+    const AVDetectionBBox *bbox;
+    AVFrameSideData *sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES);
+    av_assert0(sd);
+
+    header = (const AVDetectionBBoxHeader *)sd->data;
+    bbox = av_get_detection_bbox(header, bbox_index);
+
+    left = bbox->x;
+    width = bbox->w;
+    top = bbox->y;
+    height = bbox->h;
+
+    fmt = get_pixel_format(input);
+    sws_ctx = sws_getContext(width, height, frame->format,
+                             input->width, input->height, fmt,
+                             SWS_FAST_BILINEAR, NULL, NULL, NULL);
+    if (!sws_ctx) {
+        av_log(log_ctx, AV_LOG_ERROR, "Failed to create scale context for the conversion "
+               "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n",
+               av_get_pix_fmt_name(frame->format), width, height,
+               av_get_pix_fmt_name(fmt), input->width, input->height);
+        return DNN_ERROR;
+    }
+
+    if (av_image_fill_linesizes(linesizes, fmt, input->width) < 0) {
+        av_log(log_ctx, AV_LOG_ERROR, "unable to get linesizes with av_image_fill_linesizes");
+        sws_freeContext(sws_ctx);
+        return DNN_ERROR;
+    }
+
+    desc = av_pix_fmt_desc_get(frame->format);
+    offsetx[1] = offsetx[2] = AV_CEIL_RSHIFT(left, desc->log2_chroma_w);
+    offsetx[0] = offsetx[3] = left;
+
+    offsety[1] = offsety[2] = AV_CEIL_RSHIFT(top, desc->log2_chroma_h);
+    offsety[0] = offsety[3] = top;
+
+    for (int k = 0; frame->data[k]; k++)
+        bbox_data[k] = frame->data[k] + offsety[k] * frame->linesize[k] + offsetx[k];
+
+    sws_scale(sws_ctx, (const uint8_t *const *)&bbox_data, frame->linesize,
+              0, height,
+              (uint8_t *const *)(&input->data), linesizes);
+
+    sws_freeContext(sws_ctx);
+
+    return DNN_SUCCESS;
+}
+
 static DNNReturnType proc_from_frame_to_dnn_analytics(AVFrame *frame, DNNData *input, void *log_ctx)
 {
     struct SwsContext *sws_ctx;
diff --git a/libavfilter/dnn/dnn_io_proc.h b/libavfilter/dnn/dnn_io_proc.h
index 91ad3cb261..16dcdd6d1a 100644
--- a/libavfilter/dnn/dnn_io_proc.h
+++ b/libavfilter/dnn/dnn_io_proc.h
@@ -32,5 +32,6 @@
 DNNReturnType ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, DNNFunctionType func_type, void *log_ctx);
 DNNReturnType ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx);
+DNNReturnType ff_frame_to_dnn_classify(AVFrame *frame, DNNData *input, uint32_t bbox_index, void *log_ctx);
 
 #endif
diff --git a/libavfilter/dnn_filter_common.c b/libavfilter/dnn_filter_common.c
index c085884eb4..52c7a5392a 100644
--- a/libavfilter/dnn_filter_common.c
+++ b/libavfilter/dnn_filter_common.c
@@ -77,6 +77,12 @@ int ff_dnn_set_detect_post_proc(DnnContext *ctx, DetectPostProc post_proc)
     return 0;
 }
 
+int ff_dnn_set_classify_post_proc(DnnContext *ctx, ClassifyPostProc post_proc)
+{
+    ctx->model->classify_post_proc = post_proc;
+    return 0;
+}
+
 DNNReturnType ff_dnn_get_input(DnnContext *ctx, DNNData *input)
 {
     return ctx->model->get_input(ctx->model->model, input, ctx->model_inputname);
@@ -112,6 +118,21 @@ DNNReturnType ff_dnn_execute_model_async(DnnContext *ctx, AVFrame *in_frame, AVF
     return (ctx->dnn_module->execute_model_async)(ctx->model, &exec_params);
 }
 
+DNNReturnType ff_dnn_execute_model_classification(DnnContext *ctx, AVFrame *in_frame, AVFrame *out_frame, char *target)
+{
+    DNNExecClassificationParams class_params = {
+        {
+            .input_name   = ctx->model_inputname,
+            .output_names = (const char **)&ctx->model_outputname,
+            .nb_output    = 1,
+            .in_frame     = in_frame,
+            .out_frame    = out_frame,
+        },
+        .target = target,
+    };
+    return (ctx->dnn_module->execute_model_async)(ctx->model, &class_params.base);
+}
+
 DNNAsyncStatusType ff_dnn_get_async_result(DnnContext *ctx, AVFrame **in_frame, AVFrame **out_frame)
 {
     return (ctx->dnn_module->get_async_result)(ctx->model, in_frame, out_frame);
diff --git a/libavfilter/dnn_filter_common.h b/libavfilter/dnn_filter_common.h
index 8deb18b39a..e7736d2bac 100644
--- a/libavfilter/dnn_filter_common.h
+++ b/libavfilter/dnn_filter_common.h
@@ -50,10 +50,12 @@ typedef struct DnnContext {
 int ff_dnn_init(DnnContext *ctx, DNNFunctionType func_type, AVFilterContext *filter_ctx);
 int ff_dnn_set_frame_proc(DnnContext *ctx, FramePrePostProc pre_proc, FramePrePostProc post_proc);
 int ff_dnn_set_detect_post_proc(DnnContext *ctx, DetectPostProc post_proc);
+int ff_dnn_set_classify_post_proc(DnnContext *ctx, ClassifyPostProc post_proc);
 DNNReturnType ff_dnn_get_input(DnnContext *ctx, DNNData *input);
 DNNReturnType ff_dnn_get_output(DnnContext *ctx, int input_width, int input_height, int *output_width, int *output_height);
 DNNReturnType ff_dnn_execute_model(DnnContext *ctx, AVFrame *in_frame, AVFrame *out_frame);
 DNNReturnType ff_dnn_execute_model_async(DnnContext *ctx, AVFrame *in_frame, AVFrame *out_frame);
+DNNReturnType ff_dnn_execute_model_classification(DnnContext *ctx, AVFrame *in_frame, AVFrame *out_frame, char *target);
 DNNAsyncStatusType ff_dnn_get_async_result(DnnContext *ctx, AVFrame **in_frame, AVFrame **out_frame);
 DNNReturnType ff_dnn_flush(DnnContext *ctx);
 void ff_dnn_uninit(DnnContext *ctx);
diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h
index 941670675d..799244ee14 100644
--- a/libavfilter/dnn_interface.h
+++ b/libavfilter/dnn_interface.h
@@ -52,7 +52,7 @@ typedef enum {
     DFT_NONE,
     DFT_PROCESS_FRAME,      // process the whole frame
     DFT_ANALYTICS_DETECT,   // detect from the whole frame
-    // we can add more such as detect_from_crop, classify_from_bbox, etc.
+    DFT_ANALYTICS_CLASSIFY, // classify for each bounding box
 }DNNFunctionType;
 
 typedef struct DNNData{
@@ -71,8 +71,14 @@ typedef struct DNNExecBaseParams {
     AVFrame *out_frame;
 } DNNExecBaseParams;
 
+typedef struct DNNExecClassificationParams {
+    DNNExecBaseParams base;
+    const char *target;
+} DNNExecClassificationParams;
+
 typedef int (*FramePrePostProc)(AVFrame *frame, DNNData *model, AVFilterContext *filter_ctx);
 typedef int (*DetectPostProc)(AVFrame *frame, DNNData *output, uint32_t nb, AVFilterContext *filter_ctx);
+typedef int (*ClassifyPostProc)(AVFrame *frame, DNNData *output, uint32_t bbox_index, AVFilterContext *filter_ctx);
 
 typedef struct DNNModel{
     // Stores model that can be different for different backends.
@@ -97,6 +103,8 @@ typedef struct DNNModel{
     FramePrePostProc frame_post_proc;
     // set the post process to interpret detect result from DNNData
     DetectPostProc detect_post_proc;
+    // set the post process to interpret classify result from DNNData
+    ClassifyPostProc classify_post_proc;
 } DNNModel;
 
 // Stores pointers to functions for loading, executing, freeing DNN models for one of the backends.
From patchwork Sun Apr 18 10:08:02 2021
X-Patchwork-Submitter: "Guo, Yejun"
X-Patchwork-Id: 26957
From: "Guo, Yejun"
To: ffmpeg-devel@ffmpeg.org
Date: Sun, 18 Apr 2021 18:08:02 +0800
Message-Id: <20210418100802.19017-6-yejun.guo@intel.com>
In-Reply-To: <20210418100802.19017-1-yejun.guo@intel.com>
References:
<20210418100802.19017-1-yejun.guo@intel.com>
Subject: [FFmpeg-devel] [PATCH 6/6] lavfi/dnn_classify: add filter dnn_classify for classification based on detection bounding boxes

Classification is done on every detection bounding box in the frame's side data, which is the result of object detection (filter dnn_detect). Please refer to the commit log of dnn_detect for the detection material; see below for classification.

- download material for classification:
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.bin
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.xml
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.label

- run command as:
./ffmpeg -i cici.jpg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,dnn_classify=dnn_backend=openvino:model=emotions-recognition-retail-0003.xml:input=data:output=prob_emotion:confidence=0.3:labels=emotions-recognition-retail-0003.label:target=face,showinfo -f null -

The detect & classify results are shown below:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] side data - detection bounding boxes:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] source: face-detection-adas-0001.xml, emotions-recognition-retail-0003.xml
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 0, region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: happy, confidence: 6757/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 1, region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: anger, confidence: 4320/10000.

Signed-off-by: Guo, Yejun
---
 configure                     |   1 +
 doc/filters.texi              |  36 ++++
 libavfilter/Makefile          |   1 +
 libavfilter/allfilters.c      |   1 +
 libavfilter/vf_dnn_classify.c | 330 ++++++++++++++++++++++++++++++++++
 5 files changed, 369 insertions(+)
 create mode 100644 libavfilter/vf_dnn_classify.c

diff --git a/configure b/configure
index cc1013fb1d..d1fc0d05a7 100755
--- a/configure
+++ b/configure
@@ -3555,6 +3555,7 @@ derain_filter_select="dnn"
 deshake_filter_select="pixelutils"
 deshake_opencl_filter_deps="opencl"
 dilation_opencl_filter_deps="opencl"
+dnn_classify_filter_select="dnn"
 dnn_detect_filter_select="dnn"
 dnn_processing_filter_select="dnn"
 drawtext_filter_deps="libfreetype"
diff --git a/doc/filters.texi b/doc/filters.texi
index 68f17dd563..9975db7326 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -10127,6 +10127,42 @@ ffmpeg -i INPUT -f lavfi -i nullsrc=hd720,geq='r=128+80*(sin(sqrt((X-W/2)*(X-W/2
 @end example
 @end itemize
 
+@section dnn_classify
+
+Do classification with deep neural networks based on bounding boxes.
+
+The filter accepts the following options:
+
+@table @option
+@item dnn_backend
+Specify which DNN backend to use for model loading and execution. This option
+accepts only openvino now; the tensorflow backend will be added later.
+
+@item model
+Set path to model file specifying network architecture and its parameters.
+Note that different backends use different file formats.
+
+@item input
+Set the input name of the dnn network.
+
+@item output
+Set the output name of the dnn network.
+
+@item confidence
+Set the confidence threshold (default: 0.5).
+
+@item labels
+Set path to label file specifying the mapping between label id and name.
+Each label name is written on one line; trailing spaces and empty lines are skipped.
+The first line is the name of label id 0, the second line is the name of label id 1, etc.
+The label id is used as the name if the label file is not provided.
+
+@item backend_configs
+Set the configs to be passed into the backend.
+
+@end table
+
 @section dnn_detect
 
 Do object detection with deep neural networks.
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index b77f2276a4..dd4decdd71 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -245,6 +245,7 @@
 OBJS-$(CONFIG_DILATION_FILTER) += vf_neighbor.o
 OBJS-$(CONFIG_DILATION_OPENCL_FILTER) += vf_neighbor_opencl.o opencl.o \
                                          opencl/neighbor.o
 OBJS-$(CONFIG_DISPLACE_FILTER) += vf_displace.o framesync.o
+OBJS-$(CONFIG_DNN_CLASSIFY_FILTER) += vf_dnn_classify.o
 OBJS-$(CONFIG_DNN_DETECT_FILTER) += vf_dnn_detect.o
 OBJS-$(CONFIG_DNN_PROCESSING_FILTER) += vf_dnn_processing.o
 OBJS-$(CONFIG_DOUBLEWEAVE_FILTER) += vf_weave.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 0d2bf7bbee..9b24a2da29 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -230,6 +230,7 @@ extern AVFilter ff_vf_detelecine;
 extern AVFilter ff_vf_dilation;
 extern AVFilter ff_vf_dilation_opencl;
 extern AVFilter ff_vf_displace;
+extern AVFilter ff_vf_dnn_classify;
 extern AVFilter ff_vf_dnn_detect;
 extern AVFilter ff_vf_dnn_processing;
 extern AVFilter ff_vf_doubleweave;
diff --git a/libavfilter/vf_dnn_classify.c b/libavfilter/vf_dnn_classify.c
new file mode 100644
index 0000000000..dd61f743d6
--- /dev/null
+++ b/libavfilter/vf_dnn_classify.c
@@ -0,0 +1,330 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * Implement a classification filter using deep learning networks.
+ */
+
+#include "libavformat/avio.h"
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/avassert.h"
+#include "libavutil/imgutils.h"
+#include "filters.h"
+#include "dnn_filter_common.h"
+#include "formats.h"
+#include "internal.h"
+#include "libavutil/time.h"
+#include "libavutil/avstring.h"
+#include "libavutil/detection_bbox.h"
+
+typedef struct DnnClassifyContext {
+    const AVClass *class;
+    DnnContext dnnctx;
+    float confidence;
+    char *labels_filename;
+    char *target;
+    char **labels;
+    int label_count;
+} DnnClassifyContext;
+
+#define OFFSET(x) offsetof(DnnClassifyContext, dnnctx.x)
+#define OFFSET2(x) offsetof(DnnClassifyContext, x)
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
+static const AVOption dnn_classify_options[] = {
+    { "dnn_backend", "DNN backend", OFFSET(backend_type), AV_OPT_TYPE_INT, { .i64 = 2 }, INT_MIN, INT_MAX, FLAGS, "backend" },
+#if (CONFIG_LIBOPENVINO == 1)
+    { "openvino", "openvino backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 2 }, 0, 0, FLAGS, "backend" },
+#endif
+    DNN_COMMON_OPTIONS
+    { "confidence", "threshold of confidence", OFFSET2(confidence), AV_OPT_TYPE_FLOAT, { .dbl = 0.5 }, 0, 1, FLAGS },
+    { "labels", "path to labels file", OFFSET2(labels_filename), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },
+    { "target", "which one to be classified", OFFSET2(target), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },
+    { NULL }
+};
+
+AVFILTER_DEFINE_CLASS(dnn_classify);
+
+static int dnn_classify_post_proc(AVFrame *frame, DNNData *output, uint32_t bbox_index, AVFilterContext *filter_ctx)
+{
+    DnnClassifyContext *ctx = filter_ctx->priv;
+    float conf_threshold = ctx->confidence;
+    AVDetectionBBoxHeader *header;
+    AVDetectionBBox *bbox;
+    float *classifications;
+    uint32_t label_id;
+    float confidence;
+    AVFrameSideData *sd;
+
+    if (output->channels <= 0) {
+        return -1;
+    }
+
+    sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES);
+    header = (AVDetectionBBoxHeader *)sd->data;
+
+    if (bbox_index == 0) {
+        av_strlcat(header->source, ", ", sizeof(header->source));
+        av_strlcat(header->source, ctx->dnnctx.model_filename, sizeof(header->source));
+    }
+
+    classifications = output->data;
+    label_id = 0;
+    confidence = classifications[0];
+    for (int i = 1; i < output->channels; i++) {
+        if (classifications[i] > confidence) {
+            label_id = i;
+            confidence = classifications[i];
+        }
+    }
+
+    if (confidence < conf_threshold) {
+        return 0;
+    }
+
+    bbox = av_get_detection_bbox(header, bbox_index);
+    bbox->classify_confidences[bbox->classify_count] = av_make_q((int)(confidence * 10000), 10000);
+
+    if (ctx->labels && label_id < ctx->label_count) {
+        av_strlcpy(bbox->classify_labels[bbox->classify_count], ctx->labels[label_id], sizeof(bbox->classify_labels[bbox->classify_count]));
+    } else {
+        snprintf(bbox->classify_labels[bbox->classify_count], sizeof(bbox->classify_labels[bbox->classify_count]), "%d", label_id);
+    }
+
+    bbox->classify_count++;
+
+    return 0;
+}
+
+static void free_classify_labels(DnnClassifyContext *ctx)
+{
+    for (int i = 0; i < ctx->label_count; i++) {
+        av_freep(&ctx->labels[i]);
+    }
+    ctx->label_count = 0;
+    av_freep(&ctx->labels);
+}
+
+static int read_classify_label_file(AVFilterContext *context)
+{
+    int line_len;
+    FILE *file;
+    DnnClassifyContext *ctx = context->priv;
+
+    file = av_fopen_utf8(ctx->labels_filename, "r");
+    if (!file){
+        av_log(context, AV_LOG_ERROR, "failed to open file %s\n", ctx->labels_filename);
+        return AVERROR(EINVAL);
+    }
+
+    while (!feof(file)) {
+        char *label;
+        char buf[256];
+        if (!fgets(buf, 256, file)) {
+            break;
+        }
+
+        line_len = strlen(buf);
+        while (line_len) {
+            int i = line_len - 1;
+            if (buf[i] == '\n' || buf[i] == '\r' || buf[i] == ' ') {
+                buf[i] = '\0';
+                line_len--;
+            } else {
+                break;
+            }
+        }
+
+        if (line_len == 0) // empty line
+            continue;
+
+        if (line_len >= AV_DETECTION_BBOX_LABEL_NAME_MAX_SIZE) {
+            av_log(context, AV_LOG_ERROR, "label %s too long\n", buf);
+            fclose(file);
+            return AVERROR(EINVAL);
+        }
+
+        label = av_strdup(buf);
+        if (!label) {
+            av_log(context, AV_LOG_ERROR, "failed to allocate memory for label %s\n", buf);
+            fclose(file);
+            return AVERROR(ENOMEM);
+        }
+
+        if (av_dynarray_add_nofree(&ctx->labels, &ctx->label_count, label) < 0) {
+            av_log(context, AV_LOG_ERROR, "failed to do av_dynarray_add\n");
+            fclose(file);
+            av_freep(&label);
+            return AVERROR(ENOMEM);
+        }
+    }
+
+    fclose(file);
+    return 0;
+}
+
+static av_cold int dnn_classify_init(AVFilterContext *context)
+{
+    DnnClassifyContext *ctx = context->priv;
+    int ret = ff_dnn_init(&ctx->dnnctx, DFT_ANALYTICS_CLASSIFY, context);
+    if (ret < 0)
+        return ret;
+    ff_dnn_set_classify_post_proc(&ctx->dnnctx, dnn_classify_post_proc);
+
+    if (ctx->labels_filename) {
+        return read_classify_label_file(context);
+    }
+    return 0;
+}
+
+static int dnn_classify_query_formats(AVFilterContext *context)
+{
+    static const enum AVPixelFormat pix_fmts[] = {
+        AV_PIX_FMT_RGB24, AV_PIX_FMT_BGR24,
+        AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAYF32,
+        AV_PIX_FMT_YUV420P, AV_PIX_FMT_YUV422P,
+        AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV410P, AV_PIX_FMT_YUV411P,
+        AV_PIX_FMT_NV12,
+        AV_PIX_FMT_NONE
+    };
+    AVFilterFormats *fmts_list = ff_make_format_list(pix_fmts);
+    return ff_set_common_formats(context, fmts_list);
+}
+
+static int dnn_classify_flush_frame(AVFilterLink *outlink, int64_t pts, int64_t *out_pts)
+{
+    DnnClassifyContext *ctx = outlink->src->priv;
+    int ret;
+    DNNAsyncStatusType async_state;
+
+    ret = ff_dnn_flush(&ctx->dnnctx);
+    if (ret != DNN_SUCCESS) {
+        return -1;
+    }
+
+    do {
+        AVFrame *in_frame = NULL;
+        AVFrame *out_frame = NULL;
+        async_state = ff_dnn_get_async_result(&ctx->dnnctx, &in_frame, &out_frame);
+        if (out_frame) {
+            av_assert0(in_frame == out_frame);
+            ret = ff_filter_frame(outlink, out_frame);
+            if (ret < 0)
+                return ret;
+            if (out_pts)
+                *out_pts = out_frame->pts + pts;
+        }
+        av_usleep(5000);
+    } while (async_state >= DAST_NOT_READY);
+
+    return 0;
+}
+
+static int dnn_classify_activate(AVFilterContext *filter_ctx)
+{
+    AVFilterLink *inlink = filter_ctx->inputs[0];
+    AVFilterLink *outlink = filter_ctx->outputs[0];
+    DnnClassifyContext *ctx = filter_ctx->priv;
+    AVFrame *in = NULL;
+    int64_t pts;
+    int ret, status;
+    int got_frame = 0;
+    int async_state;
+
+    FF_FILTER_FORWARD_STATUS_BACK(outlink, inlink);
+
+    do {
+        // drain all input frames
+        ret = ff_inlink_consume_frame(inlink, &in);
+        if (ret < 0)
+            return ret;
+        if (ret > 0) {
+            if (ff_dnn_execute_model_classification(&ctx->dnnctx, in, in, ctx->target) != DNN_SUCCESS) {
+                return AVERROR(EIO);
+            }
+        }
+    } while (ret > 0);
+
+    // drain all processed frames
+    do {
+        AVFrame *in_frame = NULL;
+        AVFrame *out_frame = NULL;
+        async_state = ff_dnn_get_async_result(&ctx->dnnctx, &in_frame, &out_frame);
+        if (out_frame) {
+            av_assert0(in_frame == out_frame);
+            ret = ff_filter_frame(outlink, out_frame);
+            if (ret < 0)
+                return ret;
+            got_frame = 1;
+        }
+    } while (async_state == DAST_SUCCESS);
+
+    // if frame got, schedule to next filter
+    if (got_frame)
+        return 0;
+
+    if (ff_inlink_acknowledge_status(inlink, &status, &pts)) {
+        if (status == AVERROR_EOF) {
+            int64_t out_pts = pts;
+            ret = dnn_classify_flush_frame(outlink, pts, &out_pts);
+            ff_outlink_set_status(outlink, status, out_pts);
+            return ret;
+        }
+    }
+
+    FF_FILTER_FORWARD_WANTED(outlink, inlink);
+
+    return 0;
+}
+
+static av_cold void dnn_classify_uninit(AVFilterContext *context)
+{
+    DnnClassifyContext *ctx = context->priv;
+    ff_dnn_uninit(&ctx->dnnctx);
+    free_classify_labels(ctx);
+}
+
+static const AVFilterPad dnn_classify_inputs[] = {
+    {
+        .name = "default",
+        .type = AVMEDIA_TYPE_VIDEO,
+    },
+    { NULL }
+};
+
+static const AVFilterPad dnn_classify_outputs[] = {
+    {
+        .name = "default",
+        .type = AVMEDIA_TYPE_VIDEO,
+    },
+    { NULL }
+};
+
+AVFilter ff_vf_dnn_classify = {
+    .name          = "dnn_classify",
+    .description   = NULL_IF_CONFIG_SMALL("Apply DNN classify filter to the input."),
+    .priv_size     = sizeof(DnnClassifyContext),
+    .init          = dnn_classify_init,
+    .uninit        = dnn_classify_uninit,
+    .query_formats = dnn_classify_query_formats,
+    .inputs        = dnn_classify_inputs,
+    .outputs       = dnn_classify_outputs,
+    .priv_class    = &dnn_classify_class,
+    .activate      = dnn_classify_activate,
+};