From patchwork Thu Sep 21 01:26:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Chen, Wenbin"
X-Patchwork-Id: 43858
From: wenbin.chen-at-intel.com@ffmpeg.org
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 21 Sep 2023 09:26:31 +0800
Message-Id: <20230921012633.16241-1-wenbin.chen@intel.com>
X-Mailer: git-send-email 2.34.1
Subject: [FFmpeg-devel] [PATCH v2 1/3] libavfilter/dnn: add layout option to openvino backend

From: Wenbin Chen

DNN models have different input layouts (NCHW or NHWC), so a "layout" option is added. Use OpenVINO's API to do layout conversion for input data. Use swscale to do layout conversion for output data, as OpenVINO doesn't have a similar C API for output.
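For readers unfamiliar with the two layouts, the index mapping between packed NHWC and planar NCHW data can be sketched in plain C as follows. This is only an illustration for a single image (N == 1) of 8-bit samples. The helper names hwc_to_chw_u8 and chw_to_hwc_u8 are hypothetical and are not part of FFmpeg or OpenVINO. The patch itself does not loop like this; it relies on OpenVINO's preprocessing for input and on swscale (via AV_PIX_FMT_GBRP) for output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg or OpenVINO function):
 * convert one packed HWC image to planar CHW.
 * Packed:  src[(y * width + x) * channels + c]
 * Planar:  dst[c * (width * height) + y * width + x] */
static void hwc_to_chw_u8(const uint8_t *src, uint8_t *dst,
                          int width, int height, int channels)
{
    assert(src && dst && width > 0 && height > 0 && channels > 0);
    size_t plane_size = (size_t)width * height;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            size_t pixel = (size_t)y * width + x;
            for (int c = 0; c < channels; c++)
                dst[c * plane_size + pixel] = src[pixel * channels + c];
        }
    }
}

/* Hypothetical inverse helper: convert planar CHW back to packed HWC. */
static void chw_to_hwc_u8(const uint8_t *src, uint8_t *dst,
                          int width, int height, int channels)
{
    assert(src && dst && width > 0 && height > 0 && channels > 0);
    size_t plane_size = (size_t)width * height;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            size_t pixel = (size_t)y * width + x;
            for (int c = 0; c < channels; c++)
                dst[pixel * channels + c] = src[c * plane_size + pixel];
        }
    }
}
```

Applying the two helpers in sequence returns the original buffer, which is a quick way to sanity-check a layout conversion path.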
Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_openvino.c | 47 +++++++- libavfilter/dnn/dnn_io_proc.c | 151 ++++++++++++++++++++++--- libavfilter/dnn_interface.h | 7 ++ 3 files changed, 185 insertions(+), 20 deletions(-) diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c index 4922833b07..3ba5f5331a 100644 --- a/libavfilter/dnn/dnn_backend_openvino.c +++ b/libavfilter/dnn/dnn_backend_openvino.c @@ -45,6 +45,7 @@ typedef struct OVOptions{ uint8_t async; int batch_size; int input_resizable; + DNNLayout layout; } OVOptions; typedef struct OVContext { @@ -100,6 +101,10 @@ static const AVOption dnn_openvino_options[] = { DNN_BACKEND_COMMON_OPTIONS { "batch_size", "batch size per request", OFFSET(options.batch_size), AV_OPT_TYPE_INT, { .i64 = 1 }, 1, 1000, FLAGS}, { "input_resizable", "can input be resizable or not", OFFSET(options.input_resizable), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, FLAGS }, + { "layout", "input layout of model", OFFSET(options.layout), AV_OPT_TYPE_INT, { .i64 = DL_NONE}, DL_NONE, DL_NHWC, FLAGS, "layout" }, + { "none", "none", 0, AV_OPT_TYPE_CONST, { .i64 = DL_NONE }, 0, 0, FLAGS, "layout"}, + { "nchw", "nchw", 0, AV_OPT_TYPE_CONST, { .i64 = DL_NCHW }, 0, 0, FLAGS, "layout"}, + { "nhwc", "nhwc", 0, AV_OPT_TYPE_CONST, { .i64 = DL_NHWC }, 0, 0, FLAGS, "layout"}, { NULL } }; @@ -231,9 +236,9 @@ static int fill_model_input_ov(OVModel *ov_model, OVRequestItem *request) avpriv_report_missing_feature(ctx, "Do not support dynamic model."); return AVERROR(ENOSYS); } - input.height = dims[2]; - input.width = dims[3]; - input.channels = dims[1]; + input.height = dims[1]; + input.width = dims[2]; + input.channels = dims[3]; input.dt = precision_to_datatype(precision); input.data = av_malloc(input.height * input.width * input.channels * get_datatype_size(input.dt)); if (!input.data) @@ -403,6 +408,7 @@ static void infer_completion_callback(void *args) av_assert0(request->lltask_count <= dims.dims[0]); #endif 
output.dt = precision_to_datatype(precision); + output.layout = ctx->options.layout; av_assert0(request->lltask_count >= 1); for (int i = 0; i < request->lltask_count; ++i) { @@ -521,11 +527,14 @@ static int init_model_ov(OVModel *ov_model, const char *input_name, const char * OVContext *ctx = &ov_model->ctx; #if HAVE_OPENVINO2 ov_status_e status; - ov_preprocess_input_tensor_info_t* input_tensor_info; - ov_preprocess_output_tensor_info_t* output_tensor_info; + ov_preprocess_input_tensor_info_t* input_tensor_info = NULL; + ov_preprocess_output_tensor_info_t* output_tensor_info = NULL; + ov_preprocess_input_model_info_t* input_model_info = NULL; ov_model_t *tmp_ov_model; ov_layout_t* NHWC_layout = NULL; + ov_layout_t* NCHW_layout = NULL; const char* NHWC_desc = "NHWC"; + const char* NCHW_desc = "NCHW"; const char* device = ctx->options.device_type; #else IEStatusCode status; @@ -570,6 +579,7 @@ static int init_model_ov(OVModel *ov_model, const char *input_name, const char * //set input layout status = ov_layout_create(NHWC_desc, &NHWC_layout); + status |= ov_layout_create(NCHW_desc, &NCHW_layout); if (status != OK) { av_log(ctx, AV_LOG_ERROR, "Failed to create layout for input.\n"); ret = ov2_map_error(status, NULL); @@ -583,6 +593,22 @@ static int init_model_ov(OVModel *ov_model, const char *input_name, const char * goto err; } + status = ov_preprocess_input_info_get_model_info(ov_model->input_info, &input_model_info); + if (status != OK) { + av_log(ctx, AV_LOG_ERROR, "Failed to get input model info\n"); + ret = ov2_map_error(status, NULL); + goto err; + } + if (ctx->options.layout == DL_NCHW) + status = ov_preprocess_input_model_info_set_layout(input_model_info, NCHW_layout); + else if (ctx->options.layout == DL_NHWC) + status = ov_preprocess_input_model_info_set_layout(input_model_info, NHWC_layout); + if (status != OK) { + av_log(ctx, AV_LOG_ERROR, "Failed to set input model layout\n"); + ret = ov2_map_error(status, NULL); + goto err; + } + if
(ov_model->model->func_type != DFT_PROCESS_FRAME) //set precision only for detect and classify status = ov_preprocess_input_tensor_info_set_element_type(input_tensor_info, U8); @@ -618,6 +644,9 @@ static int init_model_ov(OVModel *ov_model, const char *input_name, const char * ret = ov2_map_error(status, NULL); goto err; } + ov_preprocess_input_model_info_free(input_model_info); + ov_layout_free(NCHW_layout); + ov_layout_free(NHWC_layout); #else if (ctx->options.batch_size > 1) { input_shapes_t input_shapes; @@ -762,6 +791,14 @@ static int init_model_ov(OVModel *ov_model, const char *input_name, const char * return 0; err: +#if HAVE_OPENVINO2 + if (NCHW_layout) + ov_layout_free(NCHW_layout); + if (NHWC_layout) + ov_layout_free(NHWC_layout); + if (input_model_info) + ov_preprocess_input_model_info_free(input_model_info); +#endif dnn_free_model_ov(&ov_model->model); return ret; } diff --git a/libavfilter/dnn/dnn_io_proc.c b/libavfilter/dnn/dnn_io_proc.c index 7961bf6b95..dfa0d5e5da 100644 --- a/libavfilter/dnn/dnn_io_proc.c +++ b/libavfilter/dnn/dnn_io_proc.c @@ -27,6 +27,12 @@ int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) { struct SwsContext *sws_ctx; + int ret = 0; + int linesize[4] = { 0 }; + void **dst_data = NULL; + void *middle_data = NULL; + uint8_t *planar_data[4] = { 0 }; + int plane_size = frame->width * frame->height * sizeof(uint8_t); int bytewidth = av_image_get_linesize(frame->format, frame->width, 0); if (bytewidth < 0) { return AVERROR(EINVAL); @@ -35,6 +41,17 @@ int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) avpriv_report_missing_feature(log_ctx, "data type rather than DNN_FLOAT"); return AVERROR(ENOSYS); } + dst_data = (void **)frame->data; + linesize[0] = frame->linesize[0]; + if (output->layout == DL_NCHW) { + middle_data = av_malloc(plane_size * output->channels); + if (!middle_data) { + ret = AVERROR(ENOMEM); + goto err; + } + dst_data = &middle_data; + linesize[0] = frame->width * 
3; + } switch (frame->format) { case AV_PIX_FMT_RGB24: @@ -51,18 +68,52 @@ int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", av_get_pix_fmt_name(AV_PIX_FMT_GRAYF32), frame->width * 3, frame->height, av_get_pix_fmt_name(AV_PIX_FMT_GRAY8), frame->width * 3, frame->height); - return AVERROR(EINVAL); + ret = AVERROR(EINVAL); + goto err; } sws_scale(sws_ctx, (const uint8_t *[4]){(const uint8_t *)output->data, 0, 0, 0}, (const int[4]){frame->width * 3 * sizeof(float), 0, 0, 0}, 0, frame->height, - (uint8_t * const*)frame->data, frame->linesize); + (uint8_t * const*)dst_data, linesize); sws_freeContext(sws_ctx); - return 0; + // convert data from planar to packed + if (output->layout == DL_NCHW) { + sws_ctx = sws_getContext(frame->width, + frame->height, + AV_PIX_FMT_GBRP, + frame->width, + frame->height, + frame->format, + 0, NULL, NULL, NULL); + if (!sws_ctx) { + av_log(log_ctx, AV_LOG_ERROR, "Impossible to create scale context for the conversion " + "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", + av_get_pix_fmt_name(AV_PIX_FMT_GBRP), frame->width, frame->height, + av_get_pix_fmt_name(frame->format),frame->width, frame->height); + ret = AVERROR(EINVAL); + goto err; + } + if (frame->format == AV_PIX_FMT_RGB24) { + planar_data[0] = (uint8_t *)middle_data + plane_size; + planar_data[1] = (uint8_t *)middle_data + plane_size * 2; + planar_data[2] = (uint8_t *)middle_data; + } else if (frame->format == AV_PIX_FMT_BGR24) { + planar_data[0] = (uint8_t *)middle_data + plane_size; + planar_data[1] = (uint8_t *)middle_data; + planar_data[2] = (uint8_t *)middle_data + plane_size * 2; + } + sws_scale(sws_ctx, (const uint8_t * const *)planar_data, + (const int [4]){frame->width * sizeof(uint8_t), + frame->width * sizeof(uint8_t), + frame->width * sizeof(uint8_t), 0}, + 0, frame->height, frame->data, frame->linesize); + sws_freeContext(sws_ctx); + } + break; case AV_PIX_FMT_GRAYF32: av_image_copy_plane(frame->data[0], 
frame->linesize[0], output->data, bytewidth, bytewidth, frame->height); - return 0; + break; case AV_PIX_FMT_YUV420P: case AV_PIX_FMT_YUV422P: case AV_PIX_FMT_YUV444P: @@ -82,24 +133,34 @@ int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", av_get_pix_fmt_name(AV_PIX_FMT_GRAYF32), frame->width, frame->height, av_get_pix_fmt_name(AV_PIX_FMT_GRAY8), frame->width, frame->height); - return AVERROR(EINVAL); + ret = AVERROR(EINVAL); + goto err; } sws_scale(sws_ctx, (const uint8_t *[4]){(const uint8_t *)output->data, 0, 0, 0}, (const int[4]){frame->width * sizeof(float), 0, 0, 0}, 0, frame->height, (uint8_t * const*)frame->data, frame->linesize); sws_freeContext(sws_ctx); - return 0; + break; default: avpriv_report_missing_feature(log_ctx, "%s", av_get_pix_fmt_name(frame->format)); - return AVERROR(ENOSYS); + ret = AVERROR(ENOSYS); + goto err; } - return 0; +err: + av_free(middle_data); + return ret; } int ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, void *log_ctx) { struct SwsContext *sws_ctx; + int ret = 0; + int linesize[4] = { 0 }; + void **src_data = NULL; + void *middle_data = NULL; + uint8_t *planar_data[4] = { 0 }; + int plane_size = frame->width * frame->height * sizeof(uint8_t); int bytewidth = av_image_get_linesize(frame->format, frame->width, 0); if (bytewidth < 0) { return AVERROR(EINVAL); @@ -109,9 +170,54 @@ int ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, void *log_ctx) return AVERROR(ENOSYS); } + src_data = (void **)frame->data; + linesize[0] = frame->linesize[0]; + if (input->layout == DL_NCHW) { + middle_data = av_malloc(plane_size * input->channels); + if (!middle_data) { + ret = AVERROR(ENOMEM); + goto err; + } + src_data = &middle_data; + linesize[0] = frame->width * 3; + } + switch (frame->format) { case AV_PIX_FMT_RGB24: case AV_PIX_FMT_BGR24: + // convert data from planar to packed + if (input->layout == DL_NCHW) { + sws_ctx = sws_getContext(frame->width, + 
frame->height, + frame->format, + frame->width, + frame->height, + AV_PIX_FMT_GBRP, + 0, NULL, NULL, NULL); + if (!sws_ctx) { + av_log(log_ctx, AV_LOG_ERROR, "Impossible to create scale context for the conversion " + "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", + av_get_pix_fmt_name(frame->format), frame->width, frame->height, + av_get_pix_fmt_name(AV_PIX_FMT_GBRP),frame->width, frame->height); + ret = AVERROR(EINVAL); + goto err; + } + if (frame->format == AV_PIX_FMT_RGB24) { + planar_data[0] = (uint8_t *)middle_data + plane_size; + planar_data[1] = (uint8_t *)middle_data + plane_size * 2; + planar_data[2] = (uint8_t *)middle_data; + } else if (frame->format == AV_PIX_FMT_BGR24) { + planar_data[0] = (uint8_t *)middle_data + plane_size; + planar_data[1] = (uint8_t *)middle_data; + planar_data[2] = (uint8_t *)middle_data + plane_size * 2; + } + sws_scale(sws_ctx, (const uint8_t * const *)frame->data, + frame->linesize, 0, frame->height, planar_data, + (const int [4]){frame->width * sizeof(uint8_t), + frame->width * sizeof(uint8_t), + frame->width * sizeof(uint8_t), 0}); + sws_freeContext(sws_ctx); + } sws_ctx = sws_getContext(frame->width * 3, frame->height, AV_PIX_FMT_GRAY8, @@ -124,10 +230,11 @@ int ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, void *log_ctx) "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", av_get_pix_fmt_name(AV_PIX_FMT_GRAY8), frame->width * 3, frame->height, av_get_pix_fmt_name(AV_PIX_FMT_GRAYF32),frame->width * 3, frame->height); - return AVERROR(EINVAL); + ret = AVERROR(EINVAL); + goto err; } - sws_scale(sws_ctx, (const uint8_t **)frame->data, - frame->linesize, 0, frame->height, + sws_scale(sws_ctx, (const uint8_t **)src_data, + linesize, 0, frame->height, (uint8_t * const [4]){input->data, 0, 0, 0}, (const int [4]){frame->width * 3 * sizeof(float), 0, 0, 0}); sws_freeContext(sws_ctx); @@ -156,7 +263,8 @@ int ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, void *log_ctx) "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", 
av_get_pix_fmt_name(AV_PIX_FMT_GRAY8), frame->width, frame->height, av_get_pix_fmt_name(AV_PIX_FMT_GRAYF32),frame->width, frame->height); - return AVERROR(EINVAL); + ret = AVERROR(EINVAL); + goto err; } sws_scale(sws_ctx, (const uint8_t **)frame->data, frame->linesize, 0, frame->height, @@ -166,10 +274,12 @@ int ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, void *log_ctx) break; default: avpriv_report_missing_feature(log_ctx, "%s", av_get_pix_fmt_name(frame->format)); - return AVERROR(ENOSYS); + ret = AVERROR(ENOSYS); + goto err; } - - return 0; +err: + av_free(middle_data); + return ret; } static enum AVPixelFormat get_pixel_format(DNNData *data) @@ -205,6 +315,11 @@ int ff_frame_to_dnn_classify(AVFrame *frame, DNNData *input, uint32_t bbox_index AVFrameSideData *sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES); av_assert0(sd); + if (input->layout == DL_NCHW) { + av_log(log_ctx, AV_LOG_ERROR, "dnn_classify input data doesn't support layout: NCHW\n"); + return AVERROR(ENOSYS); + } + header = (const AVDetectionBBoxHeader *)sd->data; bbox = av_get_detection_bbox(header, bbox_index); @@ -257,6 +372,12 @@ int ff_frame_to_dnn_detect(AVFrame *frame, DNNData *input, void *log_ctx) int linesizes[4]; int ret = 0; enum AVPixelFormat fmt = get_pixel_format(input); + + if (input->layout == DL_NCHW) { + av_log(log_ctx, AV_LOG_ERROR, "dnn_detect input data doesn't support layout: NCHW\n"); + return AVERROR(ENOSYS); + } + sws_ctx = sws_getContext(frame->width, frame->height, frame->format, input->width, input->height, fmt, SWS_FAST_BILINEAR, NULL, NULL, NULL); diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h index 20c6a0a896..956a63443a 100644 --- a/libavfilter/dnn_interface.h +++ b/libavfilter/dnn_interface.h @@ -56,12 +56,19 @@ typedef enum { DFT_ANALYTICS_CLASSIFY, // classify for each bounding box }DNNFunctionType; +typedef enum { + DL_NONE, + DL_NCHW, + DL_NHWC, +} DNNLayout; + typedef struct DNNData{ void *data; int width, 
height, channels; // dt and order together decide the color format DNNDataType dt; DNNColorOrder order; + DNNLayout layout; } DNNData; typedef struct DNNExecBaseParams {

From patchwork Thu Sep 21 01:26:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Chen, Wenbin"
X-Patchwork-Id: 43859
From: wenbin.chen-at-intel.com@ffmpeg.org
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 21 Sep 2023 09:26:32 +0800
Message-Id: <20230921012633.16241-2-wenbin.chen@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230921012633.16241-1-wenbin.chen@intel.com>
References: <20230921012633.16241-1-wenbin.chen@intel.com>
Subject: [FFmpeg-devel] [PATCH v2 2/3] libavfilter/dnn: Add scale and mean preprocess to openvino backend

From: Wenbin Chen

DNN models have different data preprocessing requirements. Scale and mean parameters are added to preprocess input data.
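As a rough illustration of what the two options mean, the sketch below applies the operations in the order the patch registers them as OpenVINO preprocess steps: convert to float, subtract the mean, then divide by the scale. It also mirrors the default chosen in init_model_ov, where a scale of 0 is read as "unset" and becomes 255 for frame processing and 1 otherwise. The function names resolve_scale and preprocess_u8 and the is_frame_processing flag are hypothetical; the real work is done inside OpenVINO, not by a loop like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a scale within 1e-6 of zero is treated as unset
 * and replaced by 255 (frame processing) or 1 (detect/classify). */
static float resolve_scale(float scale, int is_frame_processing)
{
    if (scale > -1e-6f && scale < 1e-6f)
        return is_frame_processing ? 255.0f : 1.0f;
    return scale;
}

/* Hypothetical helper: per-element preprocessing of 8-bit input,
 * dst[i] = (src[i] - mean) / scale. The scale must already be resolved
 * to a non-zero value. */
static void preprocess_u8(const uint8_t *src, float *dst, size_t count,
                          float mean, float scale)
{
    assert(src && dst);
    assert(scale <= -1e-6f || scale >= 1e-6f);
    for (size_t i = 0; i < count; i++)
        dst[i] = ((float)src[i] - mean) / scale;
}
```

With mean 0 and the frame-processing default of 255, 8-bit pixels map to the range [0, 1]; with mean 127.5 and scale 127.5 they map to [-1, 1].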
Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_openvino.c | 43 ++++++++++++-- libavfilter/dnn/dnn_io_proc.c | 82 +++++++++++++++++++++----- libavfilter/dnn_interface.h | 2 + 3 files changed, 108 insertions(+), 19 deletions(-) diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c index 3ba5f5331a..4224600f94 100644 --- a/libavfilter/dnn/dnn_backend_openvino.c +++ b/libavfilter/dnn/dnn_backend_openvino.c @@ -46,6 +46,8 @@ typedef struct OVOptions{ int batch_size; int input_resizable; DNNLayout layout; + float scale; + float mean; } OVOptions; typedef struct OVContext { @@ -105,6 +107,8 @@ static const AVOption dnn_openvino_options[] = { { "none", "none", 0, AV_OPT_TYPE_CONST, { .i64 = DL_NONE }, 0, 0, FLAGS, "layout"}, { "nchw", "nchw", 0, AV_OPT_TYPE_CONST, { .i64 = DL_NCHW }, 0, 0, FLAGS, "layout"}, { "nhwc", "nhwc", 0, AV_OPT_TYPE_CONST, { .i64 = DL_NHWC }, 0, 0, FLAGS, "layout"}, + { "scale", "Add scale preprocess operation. Divide each element of input by specified value.", OFFSET(options.scale), AV_OPT_TYPE_FLOAT, { .dbl = 0 }, INT_MIN, INT_MAX, FLAGS}, + { "mean", "Add mean preprocess operation. Subtract specified value from each element of input.", OFFSET(options.mean), AV_OPT_TYPE_FLOAT, { .dbl = 0 }, INT_MIN, INT_MAX, FLAGS}, { NULL } }; @@ -209,6 +213,7 @@ static int fill_model_input_ov(OVModel *ov_model, OVRequestItem *request) ie_blob_t *input_blob = NULL; #endif + memset(&input, 0, sizeof(input)); lltask = ff_queue_peek_front(ov_model->lltask_queue); av_assert0(lltask); task = lltask->task; @@ -274,6 +279,9 @@ static int fill_model_input_ov(OVModel *ov_model, OVRequestItem *request) // all models in openvino open model zoo use BGR as input, // change to be an option when necessary. input.order = DCO_BGR; + // We use preprocess_steps to scale input data, so disable scale and mean here. 
+ input.scale = 1; + input.mean = 0; for (int i = 0; i < ctx->options.batch_size; ++i) { lltask = ff_queue_pop_front(ov_model->lltask_queue); @@ -343,6 +351,7 @@ static void infer_completion_callback(void *args) ov_shape_t output_shape = {0}; ov_element_type_e precision; + memset(&output, 0, sizeof(output)); status = ov_infer_request_get_output_tensor_by_index(request->infer_request, 0, &output_tensor); if (status != OK) { av_log(ctx, AV_LOG_ERROR, @@ -409,6 +418,8 @@ static void infer_completion_callback(void *args) #endif output.dt = precision_to_datatype(precision); output.layout = ctx->options.layout; + output.scale = ctx->options.scale; + output.mean = ctx->options.mean; av_assert0(request->lltask_count >= 1); for (int i = 0; i < request->lltask_count; ++i) { @@ -542,7 +553,9 @@ static int init_model_ov(OVModel *ov_model, const char *input_name, const char * ie_config_t config = {NULL, NULL, NULL}; char *all_dev_names = NULL; #endif - + // We scale pixel by default when do frame processing. + if (fabsf(ctx->options.scale) < 1e-6f) + ctx->options.scale = ov_model->model->func_type == DFT_PROCESS_FRAME ? 
255 : 1; // batch size if (ctx->options.batch_size <= 0) { ctx->options.batch_size = 1; @@ -609,15 +622,37 @@ static int init_model_ov(OVModel *ov_model, const char *input_name, const char * goto err; } + status = ov_preprocess_input_tensor_info_set_element_type(input_tensor_info, U8); if (ov_model->model->func_type != DFT_PROCESS_FRAME) - //set precision only for detect and classify - status = ov_preprocess_input_tensor_info_set_element_type(input_tensor_info, U8); - status |= ov_preprocess_output_set_element_type(output_tensor_info, F32); + status |= ov_preprocess_output_set_element_type(output_tensor_info, F32); + else if (fabsf(ctx->options.scale - 1) > 1e-6f || fabsf(ctx->options.mean) > 1e-6f) + status |= ov_preprocess_output_set_element_type(output_tensor_info, F32); + else + status |= ov_preprocess_output_set_element_type(output_tensor_info, U8); if (status != OK) { av_log(ctx, AV_LOG_ERROR, "Failed to set input/output element type\n"); ret = ov2_map_error(status, NULL); goto err; } + // set preprocess steps. 
+ if (fabsf(ctx->options.scale - 1) > 1e-6f || fabsf(ctx->options.mean) > 1e-6f) { + ov_preprocess_preprocess_steps_t* input_process_steps = NULL; + status = ov_preprocess_input_info_get_preprocess_steps(ov_model->input_info, &input_process_steps); + if (status != OK) { + av_log(ctx, AV_LOG_ERROR, "Failed to get preprocess steps\n"); + ret = ov2_map_error(status, NULL); + goto err; + } + status = ov_preprocess_preprocess_steps_convert_element_type(input_process_steps, F32); + status |= ov_preprocess_preprocess_steps_mean(input_process_steps, ctx->options.mean); + status |= ov_preprocess_preprocess_steps_scale(input_process_steps, ctx->options.scale); + if (status != OK) { + av_log(ctx, AV_LOG_ERROR, "Failed to set preprocess steps\n"); + ret = ov2_map_error(status, NULL); + goto err; + } + ov_preprocess_preprocess_steps_free(input_process_steps); + } //update model if(ov_model->ov_model) diff --git a/libavfilter/dnn/dnn_io_proc.c b/libavfilter/dnn/dnn_io_proc.c index dfa0d5e5da..ab656e8ed7 100644 --- a/libavfilter/dnn/dnn_io_proc.c +++ b/libavfilter/dnn/dnn_io_proc.c @@ -24,6 +24,20 @@ #include "libavutil/avassert.h" #include "libavutil/detection_bbox.h" +static int get_datatype_size(DNNDataType dt) +{ + switch (dt) + { + case DNN_FLOAT: + return sizeof(float); + case DNN_UINT8: + return sizeof(uint8_t); + default: + av_assert0(!"not supported yet."); + return 1; + } +} + int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) { struct SwsContext *sws_ctx; @@ -33,14 +47,26 @@ int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) void *middle_data = NULL; uint8_t *planar_data[4] = { 0 }; int plane_size = frame->width * frame->height * sizeof(uint8_t); + enum AVPixelFormat src_fmt = AV_PIX_FMT_NONE; + int src_datatype_size = get_datatype_size(output->dt); + int bytewidth = av_image_get_linesize(frame->format, frame->width, 0); if (bytewidth < 0) { return AVERROR(EINVAL); } - if (output->dt != DNN_FLOAT) { - 
avpriv_report_missing_feature(log_ctx, "data type rather than DNN_FLOAT"); + /* scale == 1 and mean == 0 and dt == UINT8: passthrough */ + if (fabsf(output->scale - 1) < 1e-6f && fabsf(output->mean) < 1e-6 && output->dt == DNN_UINT8) + src_fmt = AV_PIX_FMT_GRAY8; + /* (scale == 255 or scale == 0) and mean == 0 and dt == FLOAT: normalization */ + else if ((fabsf(output->scale - 255) < 1e-6f || fabsf(output->scale) < 1e-6f) && + fabsf(output->mean) < 1e-6 && output->dt == DNN_FLOAT) + src_fmt = AV_PIX_FMT_GRAYF32; + else { + av_log(log_ctx, AV_LOG_ERROR, "dnn_process output data doesn't support type: UINT8 " + "scale: %f, mean: %f\n", output->scale, output->mean); return AVERROR(ENOSYS); } + dst_data = (void **)frame->data; linesize[0] = frame->linesize[0]; if (output->layout == DL_NCHW) { @@ -58,7 +84,7 @@ int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) case AV_PIX_FMT_BGR24: sws_ctx = sws_getContext(frame->width * 3, frame->height, - AV_PIX_FMT_GRAYF32, + src_fmt, frame->width * 3, frame->height, AV_PIX_FMT_GRAY8, @@ -66,13 +92,13 @@ int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) if (!sws_ctx) { av_log(log_ctx, AV_LOG_ERROR, "Impossible to create scale context for the conversion " "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", - av_get_pix_fmt_name(AV_PIX_FMT_GRAYF32), frame->width * 3, frame->height, + av_get_pix_fmt_name(src_fmt), frame->width * 3, frame->height, av_get_pix_fmt_name(AV_PIX_FMT_GRAY8), frame->width * 3, frame->height); ret = AVERROR(EINVAL); goto err; } sws_scale(sws_ctx, (const uint8_t *[4]){(const uint8_t *)output->data, 0, 0, 0}, - (const int[4]){frame->width * 3 * sizeof(float), 0, 0, 0}, 0, frame->height, + (const int[4]){frame->width * 3 * src_datatype_size, 0, 0, 0}, 0, frame->height, (uint8_t * const*)dst_data, linesize); sws_freeContext(sws_ctx); // convert data from planar to packed @@ -131,13 +157,13 @@ int ff_proc_from_dnn_to_frame(AVFrame *frame, DNNData *output, void *log_ctx) if (!sws_ctx)
{ av_log(log_ctx, AV_LOG_ERROR, "Impossible to create scale context for the conversion " "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", - av_get_pix_fmt_name(AV_PIX_FMT_GRAYF32), frame->width, frame->height, + av_get_pix_fmt_name(src_fmt), frame->width, frame->height, av_get_pix_fmt_name(AV_PIX_FMT_GRAY8), frame->width, frame->height); ret = AVERROR(EINVAL); goto err; } sws_scale(sws_ctx, (const uint8_t *[4]){(const uint8_t *)output->data, 0, 0, 0}, - (const int[4]){frame->width * sizeof(float), 0, 0, 0}, 0, frame->height, + (const int[4]){frame->width * src_datatype_size, 0, 0, 0}, 0, frame->height, (uint8_t * const*)frame->data, frame->linesize); sws_freeContext(sws_ctx); break; @@ -161,12 +187,22 @@ int ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, void *log_ctx) void *middle_data = NULL; uint8_t *planar_data[4] = { 0 }; int plane_size = frame->width * frame->height * sizeof(uint8_t); + enum AVPixelFormat dst_fmt = AV_PIX_FMT_NONE; + int dst_datatype_size = get_datatype_size(input->dt); int bytewidth = av_image_get_linesize(frame->format, frame->width, 0); if (bytewidth < 0) { return AVERROR(EINVAL); } - if (input->dt != DNN_FLOAT) { - avpriv_report_missing_feature(log_ctx, "data type rather than DNN_FLOAT"); + /* scale == 1 and mean == 0 and dt == UINT8: passthrough */ + if (fabsf(input->scale - 1) < 1e-6f && fabsf(input->mean) < 1e-6 && input->dt == DNN_UINT8) + dst_fmt = AV_PIX_FMT_GRAY8; + /* (scale == 255 or scale == 0) and mean == 0 and dt == FLOAT: normalization */ + else if ((fabsf(input->scale - 255) < 1e-6f || fabsf(input->scale) < 1e-6f) && + fabsf(input->mean) < 1e-6 && input->dt == DNN_FLOAT) + dst_fmt = AV_PIX_FMT_GRAYF32; + else { + av_log(log_ctx, AV_LOG_ERROR, "dnn_process input data doesn't support type: UINT8 " + "scale: %f, mean: %f\n", input->scale, input->mean); return AVERROR(ENOSYS); } @@ -223,20 +259,20 @@ int ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, void *log_ctx) AV_PIX_FMT_GRAY8, frame->width * 3, frame->height, - 
AV_PIX_FMT_GRAYF32, + dst_fmt, 0, NULL, NULL, NULL); if (!sws_ctx) { av_log(log_ctx, AV_LOG_ERROR, "Impossible to create scale context for the conversion " "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", av_get_pix_fmt_name(AV_PIX_FMT_GRAY8), frame->width * 3, frame->height, - av_get_pix_fmt_name(AV_PIX_FMT_GRAYF32),frame->width * 3, frame->height); + av_get_pix_fmt_name(dst_fmt),frame->width * 3, frame->height); ret = AVERROR(EINVAL); goto err; } sws_scale(sws_ctx, (const uint8_t **)src_data, linesize, 0, frame->height, (uint8_t * const [4]){input->data, 0, 0, 0}, - (const int [4]){frame->width * 3 * sizeof(float), 0, 0, 0}); + (const int [4]){frame->width * 3 * dst_datatype_size, 0, 0, 0}); sws_freeContext(sws_ctx); break; case AV_PIX_FMT_GRAYF32: @@ -256,20 +292,20 @@ int ff_proc_from_frame_to_dnn(AVFrame *frame, DNNData *input, void *log_ctx) AV_PIX_FMT_GRAY8, frame->width, frame->height, - AV_PIX_FMT_GRAYF32, + dst_fmt, 0, NULL, NULL, NULL); if (!sws_ctx) { av_log(log_ctx, AV_LOG_ERROR, "Impossible to create scale context for the conversion " "fmt:%s s:%dx%d -> fmt:%s s:%dx%d\n", av_get_pix_fmt_name(AV_PIX_FMT_GRAY8), frame->width, frame->height, - av_get_pix_fmt_name(AV_PIX_FMT_GRAYF32),frame->width, frame->height); + av_get_pix_fmt_name(dst_fmt),frame->width, frame->height); ret = AVERROR(EINVAL); goto err; } sws_scale(sws_ctx, (const uint8_t **)frame->data, frame->linesize, 0, frame->height, (uint8_t * const [4]){input->data, 0, 0, 0}, - (const int [4]){frame->width * sizeof(float), 0, 0, 0}); + (const int [4]){frame->width * dst_datatype_size, 0, 0, 0}); sws_freeContext(sws_ctx); break; default: @@ -315,6 +351,14 @@ int ff_frame_to_dnn_classify(AVFrame *frame, DNNData *input, uint32_t bbox_index AVFrameSideData *sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES); av_assert0(sd); + /* (scale != 1 and scale != 0) or mean != 0 */ + if ((fabsf(input->scale - 1) > 1e-6f && fabsf(input->scale) > 1e-6f) || + fabsf(input->mean) > 1e-6f) { + av_log(log_ctx, 
AV_LOG_ERROR, "dnn_classify input data doesn't support " + "scale: %f, mean: %f\n", input->scale, input->mean); + return AVERROR(ENOSYS); + } + if (input->layout == DL_NCHW) { av_log(log_ctx, AV_LOG_ERROR, "dnn_classify input data doesn't support layout: NCHW\n"); return AVERROR(ENOSYS); @@ -373,6 +417,14 @@ int ff_frame_to_dnn_detect(AVFrame *frame, DNNData *input, void *log_ctx) int ret = 0; enum AVPixelFormat fmt = get_pixel_format(input); + /* (scale != 1 and scale != 0) or mean != 0 */ + if ((fabsf(input->scale - 1) > 1e-6f && fabsf(input->scale) > 1e-6f) || + fabsf(input->mean) > 1e-6f) { + av_log(log_ctx, AV_LOG_ERROR, "dnn_detect input data doesn't support " + "scale: %f, mean: %f\n", input->scale, input->mean); + return AVERROR(ENOSYS); + } + if (input->layout == DL_NCHW) { av_log(log_ctx, AV_LOG_ERROR, "dnn_detect input data doesn't support layout: NCHW\n"); return AVERROR(ENOSYS); diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h index 956a63443a..183d8418b2 100644 --- a/libavfilter/dnn_interface.h +++ b/libavfilter/dnn_interface.h @@ -69,6 +69,8 @@ typedef struct DNNData{ DNNDataType dt; DNNColorOrder order; DNNLayout layout; + float scale; + float mean; } DNNData; typedef struct DNNExecBaseParams { From patchwork Thu Sep 21 01:26:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chen, Wenbin" X-Patchwork-Id: 43860 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:1e62:b0:149:dfde:5c0a with SMTP id cy34csp315505pzb; Wed, 20 Sep 2023 18:27:11 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGXvx99DOBNfrF+RjYVbC9y1EuHhvFaLfai6Ws8oOABfyLBOMumxtapNtfD1dEA4j+Nf2Sp X-Received: by 2002:a5d:460b:0:b0:321:63b4:f109 with SMTP id t11-20020a5d460b000000b0032163b4f109mr3874154wrq.41.1695259630793; Wed, 20 Sep 2023 18:27:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695259630; cv=none; d=google.com; s=arc-20160816; 
From: wenbin.chen-at-intel.com@ffmpeg.org
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 21 Sep 2023 09:26:33 +0800
Message-Id: <20230921012633.16241-3-wenbin.chen@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230921012633.16241-1-wenbin.chen@intel.com>
References: <20230921012633.16241-1-wenbin.chen@intel.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH v2 3/3] libavfilter/dnn: Initialize DNNData variables
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

From: Wenbin Chen

Signed-off-by: Wenbin Chen
---
 libavfilter/dnn/dnn_backend_tf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_tf.c b/libavfilter/dnn/dnn_backend_tf.c
index b521de7fbe..25046b58d9 100644
--- a/libavfilter/dnn/dnn_backend_tf.c
+++ b/libavfilter/dnn/dnn_backend_tf.c
@@ -622,7 +622,7 @@ err:
 }
 
 static int fill_model_input_tf(TFModel *tf_model, TFRequestItem *request) {
-    DNNData input;
+    DNNData input = { 0 };
     LastLevelTaskItem *lltask;
     TaskItem *task;
     TFInferRequest *infer_request = NULL;
@@ -724,7 +724,7 @@ static void infer_completion_callback(void *args) {
     TFModel *tf_model = task->model;
     TFContext *ctx = &tf_model->ctx;
 
-    outputs = av_malloc_array(task->nb_output, sizeof(*outputs));
+    outputs = av_calloc(task->nb_output, sizeof(*outputs));
     if (!outputs) {
         av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for *outputs\n");
         goto err;