From patchwork Thu May 6 08:46:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Fu, Ting" X-Patchwork-Id: 27614 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a6b:6109:0:0:0:0:0 with SMTP id v9csp1098254iob; Thu, 6 May 2021 01:56:11 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz0DkCgqrpc+8nBk7ZZQTyPWKAMp8pWxtIxnEBmd9ytUHgFRmORFe8bo1DuLBGrzeTZjzzM X-Received: by 2002:a17:906:e206:: with SMTP id gf6mr3193920ejb.434.1620291371113; Thu, 06 May 2021 01:56:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1620291371; cv=none; d=google.com; s=arc-20160816; b=A//XENbkgNBEuEh5KyRdgYsPosha11COG/sSezOIqY6lCSIUo0qSz9CTwYGC1Ey4wZ sqW26c8AroGHOAE71IDFh9qRZH8FU7NRVBeL1L73xEZU61G9TsTEhzRGSmOkB2O4IaYE zWC2YAvDNyst9n1NpEgpa23hbWKdC9s37HKSAUOhqewqRWYj29tpBpcvxeDYHNBQZhSt VK2IPkhSZWi8y819MrZtSk0pXDsXRv1RnysumsHH4qAMZCZwsRiFliII3qAKe36kUq4B wV/oKA5Pog5O/OC5mNO6DUSKRxurJGxK+iKVYKCWiabWsEbZiO34uIjJ0xEC2N7l24P0 RJcg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:mime-version:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:message-id:date:to:from:ironport-sdr :ironport-sdr:delivered-to; bh=NAWxRGGlma+qwKJZ8gG5IraveUDaXiln4PrUU5HDVBg=; b=ZJoal81z4cSKFHBIa7ZyNyw2h29ysWp0rah1+dN4mSZI+/+UOHvE5rRG8PIstN0Lk2 CX2vqGG4VZqfb7bak93UUkAUaytvaOEnvicIj19ygbFHnwQTbJ5/uywJhGl0Bf5Yu7Lv vaniQDZhBT0eOvFKtrKUHef+0ShZsAZhkhnBQri/M5T824mfULgmXreKm/sEPGFr5UpT Ii3cOxakiwg5Mk7SUecmVPo9ilNjtf1LwyZ01kB6VlUCO1y7JTH8BFrruOzsRLoAHFw8 WAuTor/AF5veEUV2uUuyq/PUl/N+IeRDJka6mKj9DOEFHGwNJvC8G5W65OeEIqwNPqH9 /Zmw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from 
ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id a20si1707613edx.287.2021.05.06.01.56.10; Thu, 06 May 2021 01:56:11 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 7888668022F; Thu, 6 May 2021 11:56:06 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1E5F1680260 for ; Thu, 6 May 2021 11:55:56 +0300 (EEST) IronPort-SDR: M2UdRm74NIXrZ3RxBfqVNrA1tIVsGLzeP4uYCPhF9OwTYyjpYbfSRlul9SB9jDYt+VP4GyXORR g/Ym2FG5s9GQ== X-IronPort-AV: E=McAfee;i="6200,9189,9975"; a="177977575" X-IronPort-AV: E=Sophos;i="5.82,277,1613462400"; d="scan'208";a="177977575" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 May 2021 01:55:52 -0700 IronPort-SDR: 6vJ0kLw2WRPvATJQUdsxnzRnzWjCWaalbn/pGfMNVfbmbYcc/gOBHvFYHchJ0yoGZmSOdBQwTy v8n5BSpv3N1g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.82,277,1613462400"; d="scan'208";a="607740287" Received: from semmer-ubuntu.sh.intel.com ([10.239.159.83]) by orsmga005.jf.intel.com with ESMTP; 06 May 2021 01:55:51 -0700 From: Ting Fu To: ffmpeg-devel@ffmpeg.org Date: Thu, 6 May 2021 16:46:07 +0800 Message-Id: <20210506084610.23487-1-ting.fu@intel.com> X-Mailer: git-send-email 2.17.1 Subject: [FFmpeg-devel] [PATCH V2 1/4] dnn: add DCO_RGB color order to enum DNNColorOrder X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 
2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 9ZkyM3GhQmhE

Add the DCO_RGB color order to DNNColorOrder, since TensorFlow models
need this color order as input.

Signed-off-by: Ting Fu
---
V2: Rebase patch to latest code

 libavfilter/dnn/dnn_backend_tf.c |  1 +
 libavfilter/dnn/dnn_io_proc.c    | 14 +++++++++++---
 libavfilter/dnn_interface.h      |  1 +
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_tf.c b/libavfilter/dnn/dnn_backend_tf.c
index 03fe310b03..45da29ae70 100644
--- a/libavfilter/dnn/dnn_backend_tf.c
+++ b/libavfilter/dnn/dnn_backend_tf.c
@@ -143,6 +143,7 @@ static DNNReturnType get_input_tf(void *model, DNNData *input, const char *input

     tf_output.index = 0;
     input->dt = TF_OperationOutputType(tf_output);
+    input->order = DCO_RGB;

     status = TF_NewStatus();
     TF_GraphGetTensorShape(tf_model->graph, tf_output, dims, 4, status);
diff --git a/libavfilter/dnn/dnn_io_proc.c b/libavfilter/dnn/dnn_io_proc.c
index 5f60d68078..1e2bef3f9a 100644
--- a/libavfilter/dnn/dnn_io_proc.c
+++ b/libavfilter/dnn/dnn_io_proc.c
@@ -168,11 +168,19 @@ static DNNReturnType proc_from_frame_to_dnn_frameprocessing(AVFrame *frame, DNND

 static enum AVPixelFormat get_pixel_format(DNNData *data)
 {
-    if (data->dt == DNN_UINT8 && data->order == DCO_BGR) {
-        return AV_PIX_FMT_BGR24;
+    if (data->dt == DNN_UINT8) {
+        switch (data->order) {
+        case DCO_BGR:
+            return AV_PIX_FMT_BGR24;
+        case DCO_RGB:
+            return AV_PIX_FMT_RGB24;
+        default:
+            av_assert0(!"unsupported data pixel format.\n");
+            return AV_PIX_FMT_BGR24;
+        }
     }

-    av_assert0(!"not supported yet.\n");
+    av_assert0(!"unsupported data type.\n");
     return AV_PIX_FMT_BGR24;
 }

diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h
index
799244ee14..5e9ffeb077 100644 --- a/libavfilter/dnn_interface.h +++ b/libavfilter/dnn_interface.h @@ -39,6 +39,7 @@ typedef enum {DNN_FLOAT = 1, DNN_UINT8 = 4} DNNDataType; typedef enum { DCO_NONE, DCO_BGR, + DCO_RGB, } DNNColorOrder; typedef enum { From patchwork Thu May 6 08:46:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Fu, Ting" X-Patchwork-Id: 27613 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a6b:6109:0:0:0:0:0 with SMTP id v9csp1098329iob; Thu, 6 May 2021 01:56:22 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxt9M4aO6tMTc2gtH3rEm+WXSovIJcPjlEk3i+qkYMMyFzqaUMm+SkHlGdDgudNIrXr0FEu X-Received: by 2002:a50:ba88:: with SMTP id x8mr3822004ede.28.1620291382774; Thu, 06 May 2021 01:56:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1620291382; cv=none; d=google.com; s=arc-20160816; b=e8AsRVLru9MlL6gg97qoHf5fCCy/clxtJUg7q15Z5wNkfNXa+N27pMSLXqDsuQt7YG J6vxjDsDEY1npLycncHj3QTfw3hq16shNco2+h5DomVb0fNq4e376Q/UjGdBajNKLi6X zrsSYYKceuzL6SJ8XLstPkZkiRfO368aDjn4Yh416Og7kwZ3zWD9dkC8fHivoS/F5NOI bENbxdrFFyP9arycrQFv/Lzs4FMlxZHP/8rvYkBxePAOxxd4jP9vCxXHVE17Szx+Cy0e k94Oy9TcZS20arq9dyufsvuQjGvepdQFR6XVg41hnODGdsQSLuCkffAriKGnHxrB9WaJ Ml8Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:mime-version:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:references:in-reply-to:message-id:date :to:from:ironport-sdr:ironport-sdr:delivered-to; bh=i9wCDQmGp9lWXZjB6U8IlsEWm6lVVshGBKN+Raepeog=; b=j3uRiui/YapTnYKuWn2yneBWT9AIyQ8G2QSu73M6Zk80BGkAgffC2FOPsqFSySCu75 E3S099DksqOV/5nCIyNOcJ9xrWDZuCeHm7dpHbwzbIiEXd1suP19A+RhN/IHPndGHz6u mIYiy55rUppVQ9h8/L0+DZpohFOlNOIDPsb/7sXO2jDBsN1lFsb/Qkw6WmOzZa0xg+yt uk//k8wFlnHethDKAYbou7b/ZScLtGoDrLGjWIOA7cH9qUD11nhOEH8RwZ3fwnLU/55x FIgAoLd4OO4LwZLw77v2UW/BJwTBd1PJWYdEJ9cX/C2hkXZxYV6VhWhnL6C1uemOgsN8 bfyg== ARC-Authentication-Results: 
i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id p15si1729407ejb.277.2021.05.06.01.56.22; Thu, 06 May 2021 01:56:22 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 900B468074A; Thu, 6 May 2021 11:56:12 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 74D05680267 for ; Thu, 6 May 2021 11:56:05 +0300 (EEST) IronPort-SDR: sjoIxdzNhzOerxPNUojSE2YmEXLJ302Ea4pHNNheNnod/LZNib6os8xQI/Ge1qfOKOS+T4yOg1 g5u1F+gGL6SA== X-IronPort-AV: E=McAfee;i="6200,9189,9975"; a="177977588" X-IronPort-AV: E=Sophos;i="5.82,277,1613462400"; d="scan'208";a="177977588" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 May 2021 01:55:53 -0700 IronPort-SDR: Q3/YOWjrs6hteMKKmnoOYJBLgsjSOSrsjaVkxssararrCIzal8XEwDSz+rUKH8vpnFLznh1Crn wunM6j7AG3ag== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.82,277,1613462400"; d="scan'208";a="607740295" Received: from semmer-ubuntu.sh.intel.com ([10.239.159.83]) by orsmga005.jf.intel.com with ESMTP; 06 May 2021 01:55:52 -0700 From: Ting Fu To: ffmpeg-devel@ffmpeg.org Date: Thu, 
6 May 2021 16:46:08 +0800
Message-Id: <20210506084610.23487-2-ting.fu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210506084610.23487-1-ting.fu@intel.com>
References: <20210506084610.23487-1-ting.fu@intel.com>
Subject: [FFmpeg-devel] [PATCH V2 2/4] lavfi/dnn_backend_tensorflow: add multiple outputs support
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: djVSNJ2dSll9

Signed-off-by: Ting Fu
---
 libavfilter/dnn/dnn_backend_tf.c | 49 ++++++++++++++---------------
 libavfilter/dnn_filter_common.c  | 53 ++++++++++++++++++++++++++------
 libavfilter/dnn_filter_common.h  |  6 ++--
 libavfilter/vf_derain.c          |  2 +-
 libavfilter/vf_sr.c              |  2 +-
 5 files changed, 75 insertions(+), 37 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_tf.c b/libavfilter/dnn/dnn_backend_tf.c
index 45da29ae70..b6b1812cd9 100644
--- a/libavfilter/dnn/dnn_backend_tf.c
+++ b/libavfilter/dnn/dnn_backend_tf.c
@@ -155,7 +155,7 @@ static DNNReturnType get_input_tf(void *model, DNNData *input, const char *input
     TF_DeleteStatus(status);

     // currently only NHWC is supported
-    av_assert0(dims[0] == 1);
+    av_assert0(dims[0] == 1 || dims[0] == -1);
     input->height = dims[1];
     input->width = dims[2];
     input->channels = dims[3];
@@ -707,7 +707,7 @@ static DNNReturnType execute_model_tf(const DNNModel *model, const char *input_n
     TF_Output *tf_outputs;
     TFModel *tf_model = model->model;
     TFContext *ctx = &tf_model->ctx;
-    DNNData input, output;
+    DNNData input, *outputs;
     TF_Tensor **output_tensors;
     TF_Output tf_input;
     TF_Tensor *input_tensor;
@@ -738,14 +738,6 @@ static DNNReturnType execute_model_tf(const DNNModel *model, const char *input_n
         }
     }

-    if (nb_output != 1) {
-        // currently, the filter does not need multiple outputs,
-        // so we just pending the support until we really need it.
-        TF_DeleteTensor(input_tensor);
-        avpriv_report_missing_feature(ctx, "multiple outputs");
-        return DNN_ERROR;
-    }
-
     tf_outputs = av_malloc_array(nb_output, sizeof(*tf_outputs));
     if (tf_outputs == NULL) {
         TF_DeleteTensor(input_tensor);
@@ -785,23 +777,31 @@ static DNNReturnType execute_model_tf(const DNNModel *model, const char *input_n
         return DNN_ERROR;
     }

+    outputs = av_malloc_array(nb_output, sizeof(*outputs));
+    if (!outputs) {
+        TF_DeleteTensor(input_tensor);
+        av_freep(&tf_outputs);
+        av_freep(&output_tensors);
+        av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for *outputs\n"); \
+        return DNN_ERROR;
+    }
+
     for (uint32_t i = 0; i < nb_output; ++i) {
-        output.height = TF_Dim(output_tensors[i], 1);
-        output.width = TF_Dim(output_tensors[i], 2);
-        output.channels = TF_Dim(output_tensors[i], 3);
-        output.data = TF_TensorData(output_tensors[i]);
-        output.dt = TF_TensorType(output_tensors[i]);
-
-        if (do_ioproc) {
-            if (tf_model->model->frame_post_proc != NULL) {
-                tf_model->model->frame_post_proc(out_frame, &output, tf_model->model->filter_ctx);
-            } else {
-                ff_proc_from_dnn_to_frame(out_frame, &output, ctx);
-            }
+        outputs[i].height = TF_Dim(output_tensors[i], 1);
+        outputs[i].width = TF_Dim(output_tensors[i], 2);
+        outputs[i].channels = TF_Dim(output_tensors[i], 3);
+        outputs[i].data = TF_TensorData(output_tensors[i]);
+        outputs[i].dt = TF_TensorType(output_tensors[i]);
+    }
+    if (do_ioproc) {
+        if (tf_model->model->frame_post_proc != NULL) {
+            tf_model->model->frame_post_proc(out_frame, outputs, tf_model->model->filter_ctx);
         } else {
-            out_frame->width = output.width;
-            out_frame->height = output.height;
+            ff_proc_from_dnn_to_frame(out_frame, outputs, ctx);
         }
+    } else {
+        out_frame->width = outputs[0].width;
+        out_frame->height = outputs[0].height;
     }

     for (uint32_t i = 0; i < nb_output; ++i) {
@@ -812,6 +812,7 @@ static DNNReturnType execute_model_tf(const DNNModel *model, const char *input_n
     TF_DeleteTensor(input_tensor);
     av_freep(&output_tensors);
     av_freep(&tf_outputs);
+    av_freep(&outputs);
     return DNN_SUCCESS;
 }

diff --git a/libavfilter/dnn_filter_common.c b/libavfilter/dnn_filter_common.c
index 52c7a5392a..0ed0ac2e30 100644
--- a/libavfilter/dnn_filter_common.c
+++ b/libavfilter/dnn_filter_common.c
@@ -17,6 +17,39 @@
  */

 #include "dnn_filter_common.h"
+#include "libavutil/avstring.h"
+
+#define MAX_SUPPORTED_OUTPUTS_NB 4
+
+static char **separate_output_names(const char *expr, const char *val_sep, int *separated_nb)
+{
+    char *val, **parsed_vals = NULL;
+    int val_num = 0;
+    if (!expr || !val_sep || !separated_nb) {
+        return NULL;
+    }
+
+    parsed_vals = av_mallocz_array(MAX_SUPPORTED_OUTPUTS_NB, sizeof(*parsed_vals));
+    if (!parsed_vals) {
+        return NULL;
+    }
+
+    do {
+        val = av_get_token(&expr, val_sep);
+        if(val) {
+            parsed_vals[val_num] = val;
+            val_num++;
+        }
+        if (*expr) {
+            expr++;
+        }
+    } while(*expr);
+
+    parsed_vals[val_num] = NULL;
+    *separated_nb = val_num;
+
+    return parsed_vals;
+}

 int ff_dnn_init(DnnContext *ctx, DNNFunctionType func_type, AVFilterContext *filter_ctx)
 {
@@ -28,8 +61,10 @@ int ff_dnn_init(DnnContext *ctx, DNNFunctionType func_type, AVFilterContext *fil
         av_log(filter_ctx, AV_LOG_ERROR, "input name of the model network is not specified\n");
         return AVERROR(EINVAL);
     }
-    if (!ctx->model_outputname) {
-        av_log(filter_ctx, AV_LOG_ERROR, "output name of the model network is not specified\n");
+
+    ctx->model_outputnames = separate_output_names(ctx->model_outputnames_string, "&", &ctx->nb_outputs);
+    if (!ctx->model_outputnames) {
+        av_log(filter_ctx, AV_LOG_ERROR, "could not parse model output names\n");
         return AVERROR(EINVAL);
     }

@@ -91,15 +126,15 @@ DNNReturnType ff_dnn_get_input(DnnContext *ctx, DNNData *input)
 DNNReturnType ff_dnn_get_output(DnnContext *ctx, int input_width, int input_height, int *output_width, int *output_height)
 {
     return ctx->model->get_output(ctx->model->model, ctx->model_inputname, input_width, input_height,
-                                  ctx->model_outputname, output_width, output_height);
+                                  (const char *)ctx->model_outputnames[0], output_width, output_height);
 }

 DNNReturnType ff_dnn_execute_model(DnnContext *ctx, AVFrame *in_frame, AVFrame *out_frame)
 {
     DNNExecBaseParams exec_params = {
         .input_name = ctx->model_inputname,
-        .output_names = (const char **)&ctx->model_outputname,
-        .nb_output = 1,
+        .output_names = (const char **)ctx->model_outputnames,
+        .nb_output = ctx->nb_outputs,
         .in_frame = in_frame,
         .out_frame = out_frame,
     };
@@ -110,8 +145,8 @@ DNNReturnType ff_dnn_execute_model_async(DnnContext *ctx, AVFrame *in_frame, AVF
 {
     DNNExecBaseParams exec_params = {
         .input_name = ctx->model_inputname,
-        .output_names = (const char **)&ctx->model_outputname,
-        .nb_output = 1,
+        .output_names = (const char **)ctx->model_outputnames,
+        .nb_output = ctx->nb_outputs,
         .in_frame = in_frame,
         .out_frame = out_frame,
     };
@@ -123,8 +158,8 @@ DNNReturnType ff_dnn_execute_model_classification(DnnContext *ctx, AVFrame *in_f
     DNNExecClassificationParams class_params = {
         {
             .input_name = ctx->model_inputname,
-            .output_names = (const char **)&ctx->model_outputname,
-            .nb_output = 1,
+            .output_names = (const char **)ctx->model_outputnames,
+            .nb_output = ctx->nb_outputs,
             .in_frame = in_frame,
             .out_frame = out_frame,
         },
diff --git a/libavfilter/dnn_filter_common.h b/libavfilter/dnn_filter_common.h
index e7736d2bac..e3a396d74a 100644
--- a/libavfilter/dnn_filter_common.h
+++ b/libavfilter/dnn_filter_common.h
@@ -30,10 +30,12 @@ typedef struct DnnContext {
     char *model_filename;
     DNNBackendType backend_type;
     char *model_inputname;
-    char *model_outputname;
+    char *model_outputnames_string;
+    uint32_t nb_outputs;
     char *backend_options;
     int async;

+    char **model_outputnames;
     DNNModule *dnn_module;
     DNNModel *model;
 } DnnContext;
@@ -41,7 +43,7 @@ typedef struct DnnContext {
 #define DNN_COMMON_OPTIONS \
     { "model", "path to model file", OFFSET(model_filename), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },\
     { "input", "input name of the model", OFFSET(model_inputname), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },\
-    { "output", "output name of the model", OFFSET(model_outputname), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },\
+    { "output", "output name of the model", OFFSET(model_outputnames_string), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },\
     { "backend_configs", "backend configs", OFFSET(backend_options), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },\
     { "options", "backend configs", OFFSET(backend_options), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },\
     { "async", "use DNN async inference", OFFSET(async), AV_OPT_TYPE_BOOL, { .i64 = 1}, 0, 1, FLAGS},
diff --git a/libavfilter/vf_derain.c b/libavfilter/vf_derain.c
index 76c4ef414f..5037f3a5f7 100644
--- a/libavfilter/vf_derain.c
+++ b/libavfilter/vf_derain.c
@@ -50,7 +50,7 @@ static const AVOption derain_options[] = {
 #endif
     { "model", "path to model file", OFFSET(dnnctx.model_filename), AV_OPT_TYPE_STRING, { .str = NULL }, 0, 0, FLAGS },
     { "input", "input name of the model", OFFSET(dnnctx.model_inputname), AV_OPT_TYPE_STRING, { .str = "x" }, 0, 0, FLAGS },
-    { "output", "output name of the model", OFFSET(dnnctx.model_outputname), AV_OPT_TYPE_STRING, { .str = "y" }, 0, 0, FLAGS },
+    { "output", "output name of the model", OFFSET(dnnctx.model_outputnames_string), AV_OPT_TYPE_STRING, { .str = "y" }, 0, 0, FLAGS },
     { NULL }
 };

diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
index 4360439ca6..f930b38748 100644
--- a/libavfilter/vf_sr.c
+++ b/libavfilter/vf_sr.c
@@ -54,7 +54,7 @@ static const AVOption sr_options[] = {
     { "scale_factor", "scale factor for SRCNN model", OFFSET(scale_factor), AV_OPT_TYPE_INT, { .i64 = 2 }, 2, 4, FLAGS },
     { "model", "path to model file specifying network architecture and its parameters", OFFSET(dnnctx.model_filename), AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, FLAGS },
     { "input", "input name of the model", OFFSET(dnnctx.model_inputname), AV_OPT_TYPE_STRING,
{ .str = "x" }, 0, 0, FLAGS }, - { "output", "output name of the model", OFFSET(dnnctx.model_outputname), AV_OPT_TYPE_STRING, { .str = "y" }, 0, 0, FLAGS }, + { "output", "output name of the model", OFFSET(dnnctx.model_outputnames_string), AV_OPT_TYPE_STRING, { .str = "y" }, 0, 0, FLAGS }, { NULL } }; From patchwork Thu May 6 08:46:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Fu, Ting" X-Patchwork-Id: 27611 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a6b:6109:0:0:0:0:0 with SMTP id v9csp1098469iob; Thu, 6 May 2021 01:56:45 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzwviAFpkaelHFsTlzVy2U+jnOt/BJG2gHtbnPq4Bt841F3amKo6MNM+2H4xJ7iKoyLRZRK X-Received: by 2002:a17:906:4d02:: with SMTP id r2mr3231042eju.464.1620291392720; Thu, 06 May 2021 01:56:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1620291392; cv=none; d=google.com; s=arc-20160816; b=gg6aqVGxJKgyYZLcoMo4amwudMHe0FYChigHMACqelRXyU9P/z5yd808Z6HC53Iplf 7cVPTdUm5VkkaXSHnxZuy6jHOgzgZigqxxe9gXufihQ5Kh6MyI+Uonv80SNNNMeGCbes h/WCwAdFpChgGiU9W3TCwg2LuxyclDyhIJhJSHVPkfp3n3ahRpo5ZJ71XghgxgXvGXZ8 yXSROn+NC7omvZ/SBU2inFNGIf2MJ5p3C/fXGBCprWgXb+IjvxBpKDgS9dzVy2kFsjEV qcygyVXKbAvNKJCt+MW2imT+8kyJ7ncPGmPClRmClYI4PU1DXgYy39cwFlFE0QRngUFv QBGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:mime-version:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:references:in-reply-to:message-id:date :to:from:ironport-sdr:ironport-sdr:delivered-to; bh=/TeRKr6GnUKyPGfQLBVXLstWDYaYxnHpMEmdCCiquEQ=; b=iKmPndqbIwOnbSH9SPBNP1VljKzBXoXxfCIiFYWaXS+1YMN1K3ioak93+gNamDQD55 Rw0BVu1nRFcnwiEyPS5PY3Yzp29pnu23X2sH/FaFm9LOtfVBfH4Efm10HmBYPmcBVuW2 MD4Ht99COCMC7PApK4F60yBH7ueFy7IWKc5+TEDLklNH+3cSVNX44a87bnvU4N7a21VT pWTvkvbpdFUOyWwo3qCiI+6uUBYUTBpkViwb/IpDGNPFVLYA7i0gwLh4+1B5plL668Oe 
pdfTm2sKFOKYuUGL8IhCmVN0aJGGidnU/IEjHIxHj1Nf7fqKVllt9VeRqx5nvuUD/kGd UIKg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id q19si1742838ejd.250.2021.05.06.01.56.32; Thu, 06 May 2021 01:56:32 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 7CA3A6808A7; Thu, 6 May 2021 11:56:13 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id AF81768074A for ; Thu, 6 May 2021 11:56:06 +0300 (EEST) IronPort-SDR: NZJAe4tvaSR9vp7u0f2+0Axxz0rdknO8xaLc7TFDPetu8oes+eh/a36UZXQOXL3n7xK83TEu3m MTY21UJGgz5Q== X-IronPort-AV: E=McAfee;i="6200,9189,9975"; a="177977595" X-IronPort-AV: E=Sophos;i="5.82,277,1613462400"; d="scan'208";a="177977595" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 May 2021 01:55:54 -0700 IronPort-SDR: /HJ3BlMdtCczWBvKxISdzzgcKlzY4Q6wCz3c9aoi471AXLfTC3cmwuRY9h0JkgDAbg0j9VAghq Uq9vTNFGcdmw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.82,277,1613462400"; d="scan'208";a="607740308" Received: from semmer-ubuntu.sh.intel.com ([10.239.159.83]) by 
orsmga005.jf.intel.com with ESMTP; 06 May 2021 01:55:53 -0700
From: Ting Fu
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 6 May 2021 16:46:09 +0800
Message-Id: <20210506084610.23487-3-ting.fu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210506084610.23487-1-ting.fu@intel.com>
References: <20210506084610.23487-1-ting.fu@intel.com>
Subject: [FFmpeg-devel] [PATCH V2 3/4] lavfi/dnn_backend_tensorflow: support detect model
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 0ouQnccBRgQc

Signed-off-by: Ting Fu
---
 libavfilter/dnn/dnn_backend_tf.c | 39 ++++++++++++++++++++++++++------
 libavfilter/vf_dnn_detect.c      | 32 +++++++++++++++++++++++++-
 2 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_tf.c b/libavfilter/dnn/dnn_backend_tf.c
index b6b1812cd9..622b5a8464 100644
--- a/libavfilter/dnn/dnn_backend_tf.c
+++ b/libavfilter/dnn/dnn_backend_tf.c
@@ -793,15 +793,40 @@ static DNNReturnType execute_model_tf(const DNNModel *model, const char *input_n
         outputs[i].data = TF_TensorData(output_tensors[i]);
         outputs[i].dt = TF_TensorType(output_tensors[i]);
     }
-    if (do_ioproc) {
-        if (tf_model->model->frame_post_proc != NULL) {
-            tf_model->model->frame_post_proc(out_frame, outputs, tf_model->model->filter_ctx);
+    switch (model->func_type) {
+    case DFT_PROCESS_FRAME:
+        //it only support 1 output if it's frame in & frame out
+        if (do_ioproc) {
+            if (tf_model->model->frame_post_proc != NULL) {
+                tf_model->model->frame_post_proc(out_frame, outputs, tf_model->model->filter_ctx);
+            } else {
+                ff_proc_from_dnn_to_frame(out_frame, outputs, ctx);
+            }
         } else {
-            ff_proc_from_dnn_to_frame(out_frame, outputs, ctx);
+            out_frame->width = outputs[0].width;
+            out_frame->height = outputs[0].height;
+        }
+        break;
+    case DFT_ANALYTICS_DETECT:
+        if (!model->detect_post_proc) {
+            av_log(ctx, AV_LOG_ERROR, "Detect filter needs provide post proc\n");
+            return DNN_ERROR;
+        }
+        model->detect_post_proc(out_frame, outputs, nb_output, model->filter_ctx);
+        break;
+    default:
+        for (uint32_t i = 0; i < nb_output; ++i) {
+            if (output_tensors[i]) {
+                TF_DeleteTensor(output_tensors[i]);
+            }
         }
-    } else {
-        out_frame->width = outputs[0].width;
-        out_frame->height = outputs[0].height;
+        TF_DeleteTensor(input_tensor);
+        av_freep(&output_tensors);
+        av_freep(&tf_outputs);
+        av_freep(&outputs);
+
+        av_log(ctx, AV_LOG_ERROR, "Tensorflow backend does not support this kind of dnn filter now\n");
+        return DNN_ERROR;
     }

     for (uint32_t i = 0; i < nb_output; ++i) {
diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c
index 1dbe4f29a4..7d39acb653 100644
--- a/libavfilter/vf_dnn_detect.c
+++ b/libavfilter/vf_dnn_detect.c
@@ -203,10 +203,40 @@ static int read_detect_label_file(AVFilterContext *context)
     return 0;
 }

+static int check_output_nb(DnnDetectContext *ctx, DNNBackendType backend_type, int output_nb)
+{
+    switch(backend_type) {
+    case DNN_TF:
+        if (output_nb != 4) {
+            av_log(ctx, AV_LOG_ERROR, "Only support tensorflow detect model with 4 outputs, \
+                                       but get %d instead\n", output_nb);
+            return AVERROR(EINVAL);
+        }
+        return 0;
+    case DNN_OV:
+        if (output_nb != 1) {
+            av_log(ctx, AV_LOG_ERROR, "Dnn detect filter with openvino backend needs 1 output only, \
+                                       but get %d instead\n", output_nb);
+            return AVERROR(EINVAL);
+        }
+        return 0;
+    default:
+        avpriv_report_missing_feature(ctx, "Dnn detect filter does not support current backend\n");
+        return AVERROR(EINVAL);
+    }
+    return 0;
+}
+
 static av_cold int dnn_detect_init(AVFilterContext *context)
 {
     DnnDetectContext *ctx = context->priv;
-    int ret = ff_dnn_init(&ctx->dnnctx, DFT_ANALYTICS_DETECT, context);
+    DnnContext *dnn_ctx = &ctx->dnnctx;
+    int ret;
+
+    ret =
ff_dnn_init(&ctx->dnnctx, DFT_ANALYTICS_DETECT, context); + if (ret < 0) + return ret; + ret = check_output_nb(ctx, dnn_ctx->backend_type, dnn_ctx->nb_outputs); if (ret < 0) return ret; ff_dnn_set_detect_post_proc(&ctx->dnnctx, dnn_detect_post_proc); From patchwork Thu May 6 08:46:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Fu, Ting" X-Patchwork-Id: 27612 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a6b:6109:0:0:0:0:0 with SMTP id v9csp1098450iob; Thu, 6 May 2021 01:56:43 -0700 (PDT) X-Google-Smtp-Source: ABdhPJynLmxeyRWoQD6G4UkBOsQDobvJUJN80wstG6DgysOmV5fX5jjLURTLviijvvI92QW2PPSb X-Received: by 2002:a05:6402:1a2f:: with SMTP id be15mr3840292edb.207.1620291403207; Thu, 06 May 2021 01:56:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1620291403; cv=none; d=google.com; s=arc-20160816; b=jSh9Kh/PsWBUWNqo87HcBSXo3UabOruxyyQVxiTgnJEPunpj3Ej1mPuDmYN392sgIj rpTQezsbIZQTU9eQv64Ef2gEkAe+pe6ZXufGOADmsEuPOCZJSLNpSMImUgzUK5wOcJ+Q 4/KlfDgpGt/ayDhHMOZJdPg/XYzLyappWygHI4YhaVcjpuvowN/TVPTgBZfs5sEYis5W FQGkhzYc5ZnIHly1gwjVrXmGGTwW2R9cl8/OparFOJVch1MarrgCZZbBGuZpVhz8For+ +YyXwVxq0LIoWdl8g79RJ/qKaSEoh3qomlQNXsB5Ad2hmwO4u9/B6cTr7v2JgROpFt9F 7f7A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:mime-version:reply-to :list-subscribe:list-help:list-post:list-archive:list-unsubscribe :list-id:precedence:subject:references:in-reply-to:message-id:date :to:from:ironport-sdr:ironport-sdr:delivered-to; bh=JNCtRzTD5qlR+ulMJNMdrT9v2ETcj0LbWb9wpqQ7Frg=; b=zcbmd4sewi489PIy4XZ72FY6iDA2VWy77928cx7y1BdbQd7CPi2712ffVfOwBAc8Gv Pc9JWSpLZZ85CEqWWbZEpji8QV4jV3f8ATplPHl7v7nK4U95UQO2lqxGW9BvM0X8rZac kFb7KVLz5dSqzSA+wC4xAbzgkYR+3G0cUGeLXbLLQ1gAj4dWuEe6F7JH9rdlggMvK4Gc 7vVhBQ7aEkxgsMr+r8oTNPU4JnApmNelQEdrAwm5KgytQtTQxLbJ5U1B10RAcGh92fxT p6wruFRjDaiT3jp4y6GkloAJ6+TafeBQJYFo8yrN+MQlpLwYoMxTTdGr1RjXTsvdoKwy iWLw== 
ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id l7si2248952ejk.583.2021.05.06.01.56.42; Thu, 06 May 2021 01:56:43 -0700 (PDT) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 6E5E06808E6; Thu, 6 May 2021 11:56:19 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 8B6506807AA for ; Thu, 6 May 2021 11:56:11 +0300 (EEST) IronPort-SDR: 01DgP0q8a2C93okCdM2t0abUFpKpnDaHwq362mmF6Tz7GVXVSIA+91SoHCgzzVnKdiY9QLFyNb 8LFSFlOjKFug== X-IronPort-AV: E=McAfee;i="6200,9189,9975"; a="177977607" X-IronPort-AV: E=Sophos;i="5.82,277,1613462400"; d="scan'208";a="177977607" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 May 2021 01:55:55 -0700 IronPort-SDR: TzaRl45vcFV120WM2aueGRvDUKFIPLTGTv1Y3fEwL+kVQGvp3F5ZQ6ys5WxqlA9zhkb8bRk/a7 Muqg2abfvRsQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.82,277,1613462400"; d="scan'208";a="607740318" Received: from semmer-ubuntu.sh.intel.com ([10.239.159.83]) by orsmga005.jf.intel.com with ESMTP; 06 May 2021 01:55:54 -0700 From: Ting Fu To: 
ffmpeg-devel@ffmpeg.org
Date: Thu, 6 May 2021 16:46:10 +0800
Message-Id: <20210506084610.23487-4-ting.fu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210506084610.23487-1-ting.fu@intel.com>
References: <20210506084610.23487-1-ting.fu@intel.com>
Subject: [FFmpeg-devel] [PATCH V2 4/4] dnn/vf_dnn_detect: add tensorflow output parse support
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"
X-TUID: 53ztXwTksQzZ

The test model is an official TensorFlow model from its GitHub repo; please refer to
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md
to download the detection model you need.
For example, local testing was carried out with 'ssd_mobilenet_v2_coco_2018_03_29.tar.gz',
using one image of dogs from
https://github.com/tensorflow/models/blob/master/research/object_detection/test_images/image1.jpg

The test command is:
./ffmpeg -i image1.jpg -vf dnn_detect=dnn_backend=tensorflow:input=image_tensor:output=\
"num_detections&detection_scores&detection_classes&detection_boxes":model=ssd_mobilenet_v2_coco.pb,\
showinfo -f null -

The result should look similar to the following:
[Parsed_showinfo_1 @ 0x33e65f0] side data - detection bounding boxes:
[Parsed_showinfo_1 @ 0x33e65f0] source: ssd_mobilenet_v2_coco.pb
[Parsed_showinfo_1 @ 0x33e65f0] index: 0, region: (382, 60) -> (1005, 593), label: 18, confidence: 9834/10000.
[Parsed_showinfo_1 @ 0x33e65f0] index: 1, region: (12, 8) -> (328, 549), label: 18, confidence: 8555/10000.
[Parsed_showinfo_1 @ 0x33e65f0] index: 2, region: (293, 7) -> (682, 458), label: 1, confidence: 8033/10000.
[Parsed_showinfo_1 @ 0x33e65f0] index: 3, region: (342, 0) -> (690, 325), label: 1, confidence: 5878/10000.
The output contains two dog boxes (label 18) with scores 98.34% and 85.55%,
and two person boxes (label 1) with scores 80.33% and 58.78%.

Signed-off-by: Ting Fu
---
 libavfilter/vf_dnn_detect.c | 95 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 94 insertions(+), 1 deletion(-)

diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c
index 7d39acb653..818b53a052 100644
--- a/libavfilter/vf_dnn_detect.c
+++ b/libavfilter/vf_dnn_detect.c
@@ -48,6 +48,9 @@ typedef struct DnnDetectContext {
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption dnn_detect_options[] = {
     { "dnn_backend", "DNN backend", OFFSET(backend_type), AV_OPT_TYPE_INT, { .i64 = 2 }, INT_MIN, INT_MAX, FLAGS, "backend" },
+#if (CONFIG_LIBTENSORFLOW == 1)
+    { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
+#endif
 #if (CONFIG_LIBOPENVINO == 1)
     { "openvino", "openvino backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 2 }, 0, 0, FLAGS, "backend" },
 #endif
@@ -59,7 +62,7 @@ static const AVOption dnn_detect_options[] = {
 
 AVFILTER_DEFINE_CLASS(dnn_detect);
 
-static int dnn_detect_post_proc(AVFrame *frame, DNNData *output, uint32_t nb, AVFilterContext *filter_ctx)
+static int dnn_detect_post_proc_ov(AVFrame *frame, DNNData *output, AVFilterContext *filter_ctx)
 {
     DnnDetectContext *ctx = filter_ctx->priv;
     float conf_threshold = ctx->confidence;
@@ -136,6 +139,96 @@ static int dnn_detect_post_proc(AVFrame *frame, DNNData *output, uint32_t nb, AV
     return 0;
 }
 
+static int dnn_detect_post_proc_tf(AVFrame *frame, DNNData *output, AVFilterContext *filter_ctx)
+{
+    DnnDetectContext *ctx = filter_ctx->priv;
+    int proposal_count;
+    float conf_threshold = ctx->confidence;
+    float *conf, *position, *label_id, x0, y0, x1, y1;
+    int nb_bboxes = 0;
+    AVFrameSideData *sd;
+    AVDetectionBBox *bbox;
+    AVDetectionBBoxHeader *header;
+
+    proposal_count = *(float *)(output[0].data);
+    conf = output[1].data;
+    position = output[3].data;
+    label_id = output[2].data;
+
+    sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES);
+    if (sd) {
+        av_log(filter_ctx, AV_LOG_ERROR, "already have dnn bounding boxes in side data.\n");
+        return -1;
+    }
+
+    for (int i = 0; i < proposal_count; ++i) {
+        if (conf[i] < conf_threshold)
+            continue;
+        nb_bboxes++;
+    }
+
+    if (nb_bboxes == 0) {
+        av_log(filter_ctx, AV_LOG_VERBOSE, "nothing detected in this frame.\n");
+        return 0;
+    }
+
+    header = av_detection_bbox_create_side_data(frame, nb_bboxes);
+    if (!header) {
+        av_log(filter_ctx, AV_LOG_ERROR, "failed to create side data with %d bounding boxes\n", nb_bboxes);
+        return -1;
+    }
+
+    av_strlcpy(header->source, ctx->dnnctx.model_filename, sizeof(header->source));
+
+    for (int i = 0; i < proposal_count; ++i) {
+        y0 = position[i * 4];
+        x0 = position[i * 4 + 1];
+        y1 = position[i * 4 + 2];
+        x1 = position[i * 4 + 3];
+
+        bbox = av_get_detection_bbox(header, i);
+
+        if (conf[i] < conf_threshold) {
+            continue;
+        }
+
+        bbox->x = (int)(x0 * frame->width);
+        bbox->w = (int)(x1 * frame->width) - bbox->x;
+        bbox->y = (int)(y0 * frame->height);
+        bbox->h = (int)(y1 * frame->height) - bbox->y;
+
+        bbox->detect_confidence = av_make_q((int)(conf[i] * 10000), 10000);
+        bbox->classify_count = 0;
+
+        if (ctx->labels && label_id[i] < ctx->label_count) {
+            av_strlcpy(bbox->detect_label, ctx->labels[(int)label_id[i]], sizeof(bbox->detect_label));
+        } else {
+            snprintf(bbox->detect_label, sizeof(bbox->detect_label), "%d", (int)label_id[i]);
+        }
+
+        nb_bboxes--;
+        if (nb_bboxes == 0) {
+            break;
+        }
+    }
+    return 0;
+}
+
+static int dnn_detect_post_proc(AVFrame *frame, DNNData *output, uint32_t nb, AVFilterContext *filter_ctx)
+{
+    DnnDetectContext *ctx = filter_ctx->priv;
+    DnnContext *dnn_ctx = &ctx->dnnctx;
+    switch (dnn_ctx->backend_type) {
+    case DNN_OV:
+        return dnn_detect_post_proc_ov(frame, output, filter_ctx);
+    case DNN_TF:
+        return dnn_detect_post_proc_tf(frame, output, filter_ctx);
+    default:
+        avpriv_report_missing_feature(filter_ctx, "Current dnn backend do not support detect filter\n");
+        return AVERROR(EINVAL);
+    }
+}
+
 static void free_detect_labels(DnnDetectContext *ctx)
 {
     for (int i = 0; i < ctx->label_count; i++) {
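Note for reviewers: the box conversion done by dnn_detect_post_proc_tf can be sketched outside of FFmpeg as below. This is only an illustration under stated assumptions, not the patch's code: SketchBBox and sketch_parse_tf_detections are hypothetical names standing in for AVDetectionBBox and the filter callback, and the input arrays mimic the detection_scores, detection_classes and detection_boxes outputs, with each box stored as normalized [ymin, xmin, ymax, xmax].

```c
#include <stddef.h>

/* Hypothetical stand-in for AVDetectionBBox; not an FFmpeg type. */
typedef struct SketchBBox {
    int x, y, w, h;   /* top-left corner and size, in pixels */
    int conf_num;     /* confidence numerator over a denominator of 10000 */
    int label;        /* numeric class id reported by the model */
} SketchBBox;

/*
 * Convert TensorFlow-style detection outputs into pixel boxes.
 * boxes holds 4 floats per proposal in [ymin, xmin, ymax, xmax] order,
 * each normalized to [0, 1]. Proposals scoring below conf_threshold are
 * skipped. Returns the number of boxes written to out (at most out_cap).
 */
static size_t sketch_parse_tf_detections(const float *scores,
                                         const float *classes,
                                         const float *boxes,
                                         size_t proposal_count,
                                         float conf_threshold,
                                         int width, int height,
                                         SketchBBox *out, size_t out_cap)
{
    size_t n = 0;

    for (size_t i = 0; i < proposal_count && n < out_cap; i++) {
        float y0, x0, y1, x1;
        SketchBBox *b;

        if (scores[i] < conf_threshold)
            continue;

        y0 = boxes[i * 4];
        x0 = boxes[i * 4 + 1];
        y1 = boxes[i * 4 + 2];
        x1 = boxes[i * 4 + 3];

        /* Index the output with n, not i, so skipped proposals leave no gaps. */
        b = &out[n++];
        b->x = (int)(x0 * width);
        b->w = (int)(x1 * width) - b->x;
        b->y = (int)(y0 * height);
        b->h = (int)(y1 * height) - b->y;
        b->conf_num = (int)(scores[i] * 10000);
        b->label = (int)classes[i];
    }
    return n;
}
```

The sketch writes accepted boxes through a separate output counter, whereas the patch indexes the side data with the proposal number. The two agree whenever every proposal above the threshold precedes every proposal below it.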