From patchwork Thu Apr 29 13:36:53 2021
X-Patchwork-Submitter: "Guo, Yejun" <yejun.guo@intel.com>
X-Patchwork-Id: 27477
From: "Guo, Yejun" <yejun.guo@intel.com>
To: ffmpeg-devel@ffmpeg.org
Cc: yejun.guo@intel.com
Date: Thu, 29 Apr 2021 21:36:53 +0800
Message-Id: <20210429133657.23076-2-yejun.guo@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210429133657.23076-1-yejun.guo@intel.com>
References: <20210429133657.23076-1-yejun.guo@intel.com>
Subject: [FFmpeg-devel] [PATCH V2 2/6] lavfi/dnn_backend_openvino.c: add InferenceItem between TaskItem and RequestItem

There is one TaskItem for each function call from the DNN interface,
and one RequestItem for each call into OpenVINO. For classification,
one task might need multiple inferences, one for each detected
bounding box, so add an InferenceItem between TaskItem and RequestItem.
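To make the new ownership chain concrete, here is a condensed standalone
sketch of how the three items relate after this patch (simplified for
illustration: the OpenVINO handles and the I/O fields are omitted, and
task_is_done() is a hypothetical helper, not code from the patch):

    #include <stdint.h>

    // one task for one function call from the dnn interface
    typedef struct TaskItem {
        uint32_t inference_todo;   // inferences this task was split into
        uint32_t inference_done;   // inferences finished so far
    } TaskItem;

    // one task might have multiple inferences; each points back at its task
    typedef struct InferenceItem {
        TaskItem *task;
    } InferenceItem;

    // one request for one call to openvino; it batches up to batch_size
    // inferences popped from the model's inference_queue
    typedef struct RequestItem {
        InferenceItem **inferences;
        uint32_t inference_count;
    } RequestItem;

    // a task is complete once every inference it spawned has finished
    static inline int task_is_done(const TaskItem *task)
    {
        return task->inference_done == task->inference_todo;
    }

Note that within this patch, extract_inference_from_task() still maps one
task to exactly one inference (inference_todo = 1) for both
DFT_PROCESS_FRAME and DFT_ANALYTICS_DETECT; the classification case that
needs several is presumably added later in this series.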
---
 libavfilter/dnn/dnn_backend_openvino.c | 157 ++++++++++++++++++-------
 1 file changed, 115 insertions(+), 42 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c
index 267c154c87..a8a02d7589 100644
--- a/libavfilter/dnn/dnn_backend_openvino.c
+++ b/libavfilter/dnn/dnn_backend_openvino.c
@@ -54,8 +54,10 @@ typedef struct OVModel{
     ie_executable_network_t *exe_network;
     SafeQueue *request_queue;   // holds RequestItem
     Queue *task_queue;          // holds TaskItem
+    Queue *inference_queue;     // holds InferenceItem
 } OVModel;
 
+// one task for one function call from dnn interface
 typedef struct TaskItem {
     OVModel *ov_model;
     const char *input_name;
@@ -64,13 +66,20 @@ typedef struct TaskItem {
     AVFrame *out_frame;
     int do_ioproc;
     int async;
-    int done;
+    uint32_t inference_todo;
+    uint32_t inference_done;
 } TaskItem;
 
+// one task might have multiple inferences
+typedef struct InferenceItem {
+    TaskItem *task;
+} InferenceItem;
+
+// one request for one call to openvino
 typedef struct RequestItem {
     ie_infer_request_t *infer_request;
-    TaskItem **tasks;
-    int task_count;
+    InferenceItem **inferences;
+    uint32_t inference_count;
     ie_complete_call_back_t callback;
 } RequestItem;
 
@@ -127,7 +136,12 @@ static DNNReturnType fill_model_input_ov(OVModel *ov_model, RequestItem *request
     IEStatusCode status;
     DNNData input;
     ie_blob_t *input_blob = NULL;
-    TaskItem *task = request->tasks[0];
+    InferenceItem *inference;
+    TaskItem *task;
+
+    inference = ff_queue_peek_front(ov_model->inference_queue);
+    av_assert0(inference);
+    task = inference->task;
 
     status = ie_infer_request_get_blob(request->infer_request, task->input_name, &input_blob);
     if (status != OK) {
@@ -159,9 +173,14 @@ static DNNReturnType fill_model_input_ov(OVModel *ov_model, RequestItem *request
     // change to be an option when necessary.
     input.order = DCO_BGR;
 
-    av_assert0(request->task_count <= dims.dims[0]);
-    for (int i = 0; i < request->task_count; ++i) {
-        task = request->tasks[i];
+    for (int i = 0; i < ctx->options.batch_size; ++i) {
+        inference = ff_queue_pop_front(ov_model->inference_queue);
+        if (!inference) {
+            break;
+        }
+        request->inferences[i] = inference;
+        request->inference_count = i + 1;
+        task = inference->task;
         if (task->do_ioproc) {
             if (ov_model->model->frame_pre_proc != NULL) {
                 ov_model->model->frame_pre_proc(task->in_frame, &input, ov_model->model->filter_ctx);
@@ -183,7 +202,8 @@ static void infer_completion_callback(void *args)
     precision_e precision;
     IEStatusCode status;
     RequestItem *request = args;
-    TaskItem *task = request->tasks[0];
+    InferenceItem *inference = request->inferences[0];
+    TaskItem *task = inference->task;
     SafeQueue *requestq = task->ov_model->request_queue;
     ie_blob_t *output_blob = NULL;
     ie_blob_buffer_t blob_buffer;
@@ -229,10 +249,11 @@ static void infer_completion_callback(void *args)
     output.dt = precision_to_datatype(precision);
     output.data = blob_buffer.buffer;
 
-    av_assert0(request->task_count <= dims.dims[0]);
-    av_assert0(request->task_count >= 1);
-    for (int i = 0; i < request->task_count; ++i) {
-        task = request->tasks[i];
+    av_assert0(request->inference_count <= dims.dims[0]);
+    av_assert0(request->inference_count >= 1);
+    for (int i = 0; i < request->inference_count; ++i) {
+        task = request->inferences[i]->task;
+        task->inference_done++;
 
         switch (task->ov_model->model->func_type) {
         case DFT_PROCESS_FRAME:
@@ -259,13 +280,13 @@ static void infer_completion_callback(void *args)
             break;
         }
 
-        task->done = 1;
+        av_freep(&request->inferences[i]);
         output.data = (uint8_t *)output.data
                       + output.width * output.height * output.channels * get_datatype_size(output.dt);
     }
     ie_blob_free(&output_blob);
 
-    request->task_count = 0;
+    request->inference_count = 0;
     if (ff_safe_queue_push_back(requestq, request) < 0) {
         av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
         return;
     }
@@ -370,11 +391,11 @@ static DNNReturnType init_model_ov(OVModel *ov_model, const char *input_name, co
             goto err;
         }
 
-        item->tasks = av_malloc_array(ctx->options.batch_size, sizeof(*item->tasks));
-        if (!item->tasks) {
+        item->inferences = av_malloc_array(ctx->options.batch_size, sizeof(*item->inferences));
+        if (!item->inferences) {
             goto err;
         }
-        item->task_count = 0;
+        item->inference_count = 0;
     }
 
     ov_model->task_queue = ff_queue_create();
@@ -382,6 +403,11 @@ static DNNReturnType init_model_ov(OVModel *ov_model, const char *input_name, co
         goto err;
     }
 
+    ov_model->inference_queue = ff_queue_create();
+    if (!ov_model->inference_queue) {
+        goto err;
+    }
+
     return DNN_SUCCESS;
 
 err:
@@ -389,15 +415,24 @@ err:
     return DNN_ERROR;
 }
 
-static DNNReturnType execute_model_ov(RequestItem *request)
+static DNNReturnType execute_model_ov(RequestItem *request, Queue *inferenceq)
 {
     IEStatusCode status;
     DNNReturnType ret;
-    TaskItem *task = request->tasks[0];
-    OVContext *ctx = &task->ov_model->ctx;
+    InferenceItem *inference;
+    TaskItem *task;
+    OVContext *ctx;
+
+    if (ff_queue_size(inferenceq) == 0) {
+        return DNN_SUCCESS;
+    }
+
+    inference = ff_queue_peek_front(inferenceq);
+    task = inference->task;
+    ctx = &task->ov_model->ctx;
 
     if (task->async) {
-        if (request->task_count < ctx->options.batch_size) {
+        if (ff_queue_size(inferenceq) < ctx->options.batch_size) {
             if (ff_safe_queue_push_front(task->ov_model->request_queue, request) < 0) {
                 av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
                 return DNN_ERROR;
             }
@@ -430,7 +465,7 @@ static DNNReturnType execute_model_ov(RequestItem *request)
             return DNN_ERROR;
         }
         infer_completion_callback(request);
-        return task->done ? DNN_SUCCESS : DNN_ERROR;
+        return (task->inference_done == task->inference_todo) ? DNN_SUCCESS : DNN_ERROR;
     }
 }
 
@@ -484,6 +519,31 @@ static DNNReturnType get_input_ov(void *model, DNNData *input, const char *input
     return DNN_ERROR;
 }
 
+static DNNReturnType extract_inference_from_task(DNNFunctionType func_type, TaskItem *task, Queue *inference_queue)
+{
+    switch (func_type) {
+    case DFT_PROCESS_FRAME:
+    case DFT_ANALYTICS_DETECT:
+    {
+        InferenceItem *inference = av_malloc(sizeof(*inference));
+        if (!inference) {
+            return DNN_ERROR;
+        }
+        task->inference_todo = 1;
+        task->inference_done = 0;
+        inference->task = task;
+        if (ff_queue_push_back(inference_queue, inference) < 0) {
+            av_freep(&inference);
+            return DNN_ERROR;
+        }
+        return DNN_SUCCESS;
+    }
+    default:
+        av_assert0(!"should not reach here");
+        return DNN_ERROR;
+    }
+}
+
 static DNNReturnType get_output_ov(void *model, const char *input_name, int input_width, int input_height,
                                    const char *output_name, int *output_width, int *output_height)
 {
@@ -536,7 +596,6 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu
         return DNN_ERROR;
     }
 
-    task.done = 0;
     task.do_ioproc = 0;
     task.async = 0;
     task.input_name = input_name;
@@ -545,6 +604,13 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu
     task.out_frame = out_frame;
     task.ov_model = ov_model;
 
+    if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue) != DNN_SUCCESS) {
+        av_frame_free(&out_frame);
+        av_frame_free(&in_frame);
+        av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+        return DNN_ERROR;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_frame_free(&out_frame);
@@ -552,9 +618,8 @@ static DNNReturnType get_output_ov(void *model, const char *input_name, int inpu
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
         return DNN_ERROR;
     }
-    request->tasks[request->task_count++] = &task;
 
-    ret = execute_model_ov(request);
+    ret = execute_model_ov(request, ov_model->inference_queue);
 
     *output_width = out_frame->width;
     *output_height = out_frame->height;
@@ -657,7 +722,6 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n
         }
     }
 
-    task.done = 0;
     task.do_ioproc = 1;
     task.async = 0;
     task.input_name = input_name;
@@ -666,14 +730,18 @@ DNNReturnType ff_dnn_execute_model_ov(const DNNModel *model, const char *input_n
     task.out_frame = out_frame;
     task.ov_model = ov_model;
 
+    if (extract_inference_from_task(ov_model->model->func_type, &task, ov_model->inference_queue) != DNN_SUCCESS) {
+        av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+        return DNN_ERROR;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
         return DNN_ERROR;
     }
-    request->tasks[request->task_count++] = &task;
 
-    return execute_model_ov(request);
+    return execute_model_ov(request, ov_model->inference_queue);
 }
 
 DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *input_name, AVFrame *in_frame,
@@ -707,7 +775,6 @@ DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *i
         return DNN_ERROR;
     }
 
-    task->done = 0;
     task->do_ioproc = 1;
     task->async = 1;
     task->input_name = input_name;
@@ -721,14 +788,18 @@ DNNReturnType ff_dnn_execute_model_async_ov(const DNNModel *model, const char *i
         return DNN_ERROR;
     }
 
+    if (extract_inference_from_task(ov_model->model->func_type, task, ov_model->inference_queue) != DNN_SUCCESS) {
+        av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+        return DNN_ERROR;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
         return DNN_ERROR;
     }
-    request->tasks[request->task_count++] = task;
 
-    return execute_model_ov(request);
+    return execute_model_ov(request, ov_model->inference_queue);
 }
 
 DNNAsyncStatusType ff_dnn_get_async_result_ov(const DNNModel *model, AVFrame **in, AVFrame **out)
@@ -740,7 +811,7 @@ DNNAsyncStatusType ff_dnn_get_async_result_ov(const DNNModel *model, AVFrame **i
         return DAST_EMPTY_QUEUE;
     }
 
-    if (!task->done) {
+    if (task->inference_done != task->inference_todo) {
         return DAST_NOT_READY;
     }
 
@@ -760,21 +831,17 @@ DNNReturnType ff_dnn_flush_ov(const DNNModel *model)
     IEStatusCode status;
     DNNReturnType ret;
 
+    if (ff_queue_size(ov_model->inference_queue) == 0) {
+        // no pending task need to flush
+        return DNN_SUCCESS;
+    }
+
     request = ff_safe_queue_pop_front(ov_model->request_queue);
     if (!request) {
         av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
         return DNN_ERROR;
     }
 
-    if (request->task_count == 0) {
-        // no pending task need to flush
-        if (ff_safe_queue_push_back(ov_model->request_queue, request) < 0) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
-            return DNN_ERROR;
-        }
-        return DNN_SUCCESS;
-    }
-
     ret = fill_model_input_ov(ov_model, request);
     if (ret != DNN_SUCCESS) {
         av_log(ctx, AV_LOG_ERROR, "Failed to fill model input.\n");
@@ -803,11 +870,17 @@ void ff_dnn_free_model_ov(DNNModel **model)
             if (item && item->infer_request) {
                 ie_infer_request_free(&item->infer_request);
             }
-            av_freep(&item->tasks);
+            av_freep(&item->inferences);
             av_freep(&item);
         }
         ff_safe_queue_destroy(ov_model->request_queue);
 
+        while (ff_queue_size(ov_model->inference_queue) != 0) {
+            TaskItem *item = ff_queue_pop_front(ov_model->inference_queue);
+            av_freep(&item);
+        }
+        ff_queue_destroy(ov_model->inference_queue);
+
         while (ff_queue_size(ov_model->task_queue) != 0) {
             TaskItem *item = ff_queue_pop_front(ov_model->task_queue);
             av_frame_free(&item->in_frame);
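For reviewers, the resulting synchronous flow is roughly the following (a
sketch distilled from the functions above, not literal code from the patch;
error handling omitted):

    // 1. the caller fills a TaskItem and splits it into InferenceItems,
    //    which are queued on ov_model->inference_queue
    extract_inference_from_task(ov_model->model->func_type, &task,
                                ov_model->inference_queue);

    // 2. an idle RequestItem is taken, and execute_model_ov() lets
    //    fill_model_input_ov() pop up to batch_size inferences into it
    request = ff_safe_queue_pop_front(ov_model->request_queue);
    ret = execute_model_ov(request, ov_model->inference_queue);

    // 3. infer_completion_callback() increments inference_done on each
    //    inference's task and frees the InferenceItem; the task is done
    //    once task->inference_done == task->inference_todo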