From patchwork Wed Dec 27 04:16:57 2023
From: wenbin.chen-at-intel.com@ffmpeg.org
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 27 Dec 2023 12:16:57 +0800
Message-Id: <20231227041658.392174-1-wenbin.chen@intel.com>
Subject: [FFmpeg-devel] [PATCH 1/2] libavfilter/dnn_backend_openvino: Add dynamic output support
X-Patchwork-Submitter: "Chen, Wenbin"
X-Patchwork-Id: 45336

From: Wenbin Chen

Add support for dynamic outputs. Some models do not have a fixed output
size; the size changes according to the inference result. With this
change, the OpenVINO backend can run such models.

Signed-off-by: Wenbin Chen
---
 libavfilter/dnn/dnn_backend_openvino.c | 134 +++++++++++--------------
 1 file changed, 59 insertions(+), 75 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn/dnn_backend_openvino.c
index 671a995c70..e207d44584 100644
--- a/libavfilter/dnn/dnn_backend_openvino.c
+++ b/libavfilter/dnn/dnn_backend_openvino.c
@@ -219,31 +219,26 @@ static int fill_model_input_ov(OVModel *ov_model, OVRequestItem *request)
     task = lltask->task;
 
 #if HAVE_OPENVINO2
-    if (!ov_model_is_dynamic(ov_model->ov_model)) {
-        if (ov_model->input_port) {
-            ov_output_const_port_free(ov_model->input_port);
-            ov_model->input_port = NULL;
-        }
-        status = ov_model_const_input_by_name(ov_model->ov_model, task->input_name, &ov_model->input_port);
-        if (status != OK) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to get input port shape.\n");
-            return ov2_map_error(status, NULL);
-        }
-        status = ov_const_port_get_shape(ov_model->input_port, &input_shape);
-        if (status != OK) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to get input port shape.\n");
-            return ov2_map_error(status, NULL);
-        }
-        dims = input_shape.dims;
-        status = ov_port_get_element_type(ov_model->input_port, &precision);
-        if (status != OK) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to get input port data type.\n");
-            ov_shape_free(&input_shape);
-            return ov2_map_error(status, NULL);
-        }
-    } else {
-        avpriv_report_missing_feature(ctx, "Do not support dynamic model.");
-        return AVERROR(ENOSYS);
+    if (ov_model->input_port) {
+        ov_output_const_port_free(ov_model->input_port);
+        ov_model->input_port = NULL;
+    }
+    status = ov_model_const_input_by_name(ov_model->ov_model, task->input_name, &ov_model->input_port);
+    if (status != OK) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to get input port shape.\n");
+        return ov2_map_error(status, NULL);
+    }
+    status = ov_const_port_get_shape(ov_model->input_port, &input_shape);
+    if (status != OK) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to get input port shape.\n");
+        return ov2_map_error(status, NULL);
+    }
+    dims = input_shape.dims;
+    status = ov_port_get_element_type(ov_model->input_port, &precision);
+    if (status != OK) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to get input port data type.\n");
+        ov_shape_free(&input_shape);
+        return ov2_map_error(status, NULL);
     }
     input.height = dims[1];
     input.width = dims[2];
@@ -1049,30 +1044,22 @@ static int get_input_ov(void *model, DNNData *input, const char *input_name)
     ov_element_type_e precision;
     int64_t* dims;
     ov_status_e status;
-    if (!ov_model_is_dynamic(ov_model->ov_model)) {
-        status = ov_model_const_input_by_name(ov_model->ov_model, input_name, &ov_model->input_port);
-        if (status != OK) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to get input port shape.\n");
-            return ov2_map_error(status, NULL);
-        }
-
-        status = ov_const_port_get_shape(ov_model->input_port, &input_shape);
-        if (status != OK) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to get input port shape.\n");
-            return ov2_map_error(status, NULL);
-        }
-        dims = input_shape.dims;
-
-        status = ov_port_get_element_type(ov_model->input_port, &precision);
-        if (status != OK) {
-            av_log(ctx, AV_LOG_ERROR, "Failed to get input port data type.\n");
-            return ov2_map_error(status, NULL);
-        }
-    } else {
-        avpriv_report_missing_feature(ctx, "Do not support dynamic model now.");
-        return AVERROR(ENOSYS);
+    status = ov_model_const_input_by_name(ov_model->ov_model, input_name, &ov_model->input_port);
+    if (status != OK) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to get input port shape.\n");
+        return ov2_map_error(status, NULL);
     }
-
+    status = ov_port_get_element_type(ov_model->input_port, &precision);
+    if (status != OK) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to get input port data type.\n");
+        return ov2_map_error(status, NULL);
+    }
+    status = ov_const_port_get_shape(ov_model->input_port, &input_shape);
+    if (status != OK) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to get input port shape.\n");
+        return ov2_map_error(status, NULL);
+    }
+    dims = input_shape.dims;
     if (dims[1] <= 3) { // NCHW
         input->channels = dims[1];
         input->height = input_resizable ? -1 : dims[2];
@@ -1083,7 +1070,7 @@ static int get_input_ov(void *model, DNNData *input, const char *input_name)
         input->channels = dims[3];
     }
     input->dt = precision_to_datatype(precision);
-
+    ov_shape_free(&input_shape);
     return 0;
 #else
     char *model_input_name = NULL;
@@ -1267,34 +1254,31 @@ static int get_output_ov(void *model, const char *input_name, int input_width, i
 
 #if HAVE_OPENVINO2
     if (ctx->options.input_resizable) {
-        if (!ov_model_is_dynamic(ov_model->ov_model)) {
-            status = ov_partial_shape_create(4, dims, &partial_shape);
-            if (status != OK) {
-                av_log(ctx, AV_LOG_ERROR, "Failed create partial shape.\n");
-                return ov2_map_error(status, NULL);
-            }
-            status = ov_const_port_get_shape(ov_model->input_port, &input_shape);
-            input_shape.dims[2] = input_height;
-            input_shape.dims[3] = input_width;
-            if (status != OK) {
-                av_log(ctx, AV_LOG_ERROR, "Failed create shape for model input resize.\n");
-                return ov2_map_error(status, NULL);
-            }
+        status = ov_partial_shape_create(4, dims, &partial_shape);
+        if (status != OK) {
+            av_log(ctx, AV_LOG_ERROR, "Failed to create partial shape.\n");
+            return ov2_map_error(status, NULL);
+        }
+        status = ov_const_port_get_shape(ov_model->input_port, &input_shape);
+        if (status != OK) {
+            av_log(ctx, AV_LOG_ERROR, "Failed to create shape for model input resize.\n");
+            return ov2_map_error(status, NULL);
+        }
+        input_shape.dims[2] = input_height;
+        input_shape.dims[3] = input_width;
 
-            status = ov_shape_to_partial_shape(input_shape, &partial_shape);
-            if (status != OK) {
-                av_log(ctx, AV_LOG_ERROR, "Failed create partial shape for model input resize.\n");
-                return ov2_map_error(status, NULL);
-            }
+        status = ov_shape_to_partial_shape(input_shape, &partial_shape);
+        ov_shape_free(&input_shape);
+        if (status != OK) {
+            av_log(ctx, AV_LOG_ERROR, "Failed to create partial shape for model input resize.\n");
+            return ov2_map_error(status, NULL);
+        }
 
-            status = ov_model_reshape_single_input(ov_model->ov_model, partial_shape);
-            if (status != OK) {
-                av_log(ctx, AV_LOG_ERROR, "Failed to reszie model input.\n");
-                return ov2_map_error(status, NULL);
-            }
-        } else {
-            avpriv_report_missing_feature(ctx, "Do not support dynamic model.");
-            return AVERROR(ENOTSUP);
+        status = ov_model_reshape_single_input(ov_model->ov_model, partial_shape);
+        ov_partial_shape_free(&partial_shape);
+        if (status != OK) {
+            av_log(ctx, AV_LOG_ERROR, "Failed to resize model input.\n");
+            return ov2_map_error(status, NULL);
         }
     }
 

From patchwork Wed Dec 27 04:16:58 2023
From: wenbin.chen-at-intel.com@ffmpeg.org
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 27 Dec 2023 12:16:58 +0800
Message-Id: <20231227041658.392174-2-wenbin.chen@intel.com>
In-Reply-To: <20231227041658.392174-1-wenbin.chen@intel.com>
References: <20231227041658.392174-1-wenbin.chen@intel.com>
Subject: [FFmpeg-devel] [PATCH 2/2] libavfilter/vf_dnn_detect: Add two outputs ssd support
X-Patchwork-Submitter: "Chen, Wenbin"
X-Patchwork-Id: 45337

From: Wenbin Chen

For this kind of model, the output can be used directly as the final
result, just as with an SSD model. The difference is that the output is
split into two tensors: [x_min, y_min, x_max, y_max, confidence] and
[label_id].
For an example model, see:
https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/person-detection-0106

Signed-off-by: Wenbin Chen
---
 libavfilter/vf_dnn_detect.c | 64 +++++++++++++++++++++++++++++--------
 1 file changed, 50 insertions(+), 14 deletions(-)

diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c
index 88865c8a8e..249cbba0f7 100644
--- a/libavfilter/vf_dnn_detect.c
+++ b/libavfilter/vf_dnn_detect.c
@@ -359,24 +359,48 @@ static int dnn_detect_post_proc_yolov3(AVFrame *frame, DNNData *output,
     return 0;
 }
 
-static int dnn_detect_post_proc_ssd(AVFrame *frame, DNNData *output, AVFilterContext *filter_ctx)
+static int dnn_detect_post_proc_ssd(AVFrame *frame, DNNData *output, int nb_outputs,
+                                    AVFilterContext *filter_ctx)
 {
     DnnDetectContext *ctx = filter_ctx->priv;
     float conf_threshold = ctx->confidence;
-    int proposal_count = output->height;
-    int detect_size = output->width;
-    float *detections = output->data;
+    int proposal_count = 0;
+    int detect_size = 0;
+    float *detections = NULL, *labels = NULL;
     int nb_bboxes = 0;
     AVDetectionBBoxHeader *header;
     AVDetectionBBox *bbox;
-
-    if (output->width != 7) {
+    int scale_w = ctx->scale_width;
+    int scale_h = ctx->scale_height;
+
+    if (nb_outputs == 1 && output->width == 7) {
+        proposal_count = output->height;
+        detect_size = output->width;
+        detections = output->data;
+    } else if (nb_outputs == 2 && output[0].width == 5) {
+        proposal_count = output[0].height;
+        detect_size = output[0].width;
+        detections = output[0].data;
+        labels = output[1].data;
+    } else if (nb_outputs == 2 && output[1].width == 5) {
+        proposal_count = output[1].height;
+        detect_size = output[1].width;
+        detections = output[1].data;
+        labels = output[0].data;
+    } else {
         av_log(filter_ctx, AV_LOG_ERROR, "Model output shape doesn't match ssd requirement.\n");
         return AVERROR(EINVAL);
     }
 
+    if (proposal_count == 0)
+        return 0;
+
     for (int i = 0; i < proposal_count; ++i) {
-        float conf = detections[i * detect_size + 2];
+        float conf;
+        if (nb_outputs == 1)
+            conf = detections[i * detect_size + 2];
+        else
+            conf = detections[i * detect_size + 4];
         if (conf < conf_threshold) {
             continue;
         }
@@ -398,12 +422,24 @@ static int dnn_detect_post_proc_ssd(AVFrame *frame, DNNData *output, AVFilterCon
 
     for (int i = 0; i < proposal_count; ++i) {
         int av_unused image_id = (int)detections[i * detect_size + 0];
-        int label_id = (int)detections[i * detect_size + 1];
-        float conf = detections[i * detect_size + 2];
-        float x0 = detections[i * detect_size + 3];
-        float y0 = detections[i * detect_size + 4];
-        float x1 = detections[i * detect_size + 5];
-        float y1 = detections[i * detect_size + 6];
+        int label_id;
+        float conf, x0, y0, x1, y1;
+
+        if (nb_outputs == 1) {
+            label_id = (int)detections[i * detect_size + 1];
+            conf = detections[i * detect_size + 2];
+            x0 = detections[i * detect_size + 3];
+            y0 = detections[i * detect_size + 4];
+            x1 = detections[i * detect_size + 5];
+            y1 = detections[i * detect_size + 6];
+        } else {
+            label_id = (int)labels[i];
+            x0 = detections[i * detect_size] / scale_w;
+            y0 = detections[i * detect_size + 1] / scale_h;
+            x1 = detections[i * detect_size + 2] / scale_w;
+            y1 = detections[i * detect_size + 3] / scale_h;
+            conf = detections[i * detect_size + 4];
+        }
 
         if (conf < conf_threshold) {
             continue;
@@ -447,7 +483,7 @@ static int dnn_detect_post_proc_ov(AVFrame *frame, DNNData *output, int nb_outpu
 
     switch (ctx->model_type) {
     case DDMT_SSD:
-        ret = dnn_detect_post_proc_ssd(frame, output, filter_ctx);
+        ret = dnn_detect_post_proc_ssd(frame, output, nb_outputs, filter_ctx);
         if (ret < 0)
             return ret;
         break;
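
Reviewer note, not part of the patch: the two-tensor layout described in the commit message of patch 2/2 can be illustrated with a small standalone sketch. It assumes one tensor of shape [count x 5] holding [x_min, y_min, x_max, y_max, confidence] and one tensor of [count] label ids, and it divides coordinates by the scale width and height as the patch does. The names `Box` and `decode_two_output_ssd` are hypothetical and do not exist in FFmpeg; the guard against a non-positive scale is an addition of this sketch, not something the patch performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for one decoded detection; not an FFmpeg type. */
typedef struct Box {
    float x0, y0, x1, y1; /* corners after dividing by the scale */
    float conf;           /* confidence taken from column 4 */
    int   label;          /* label id taken from the second tensor */
} Box;

/*
 * Decode a two-output SSD-style result.
 *   detections: count rows of [x_min, y_min, x_max, y_max, confidence]
 *   labels:     count label ids stored as float
 * Rows whose confidence is below conf_threshold are skipped.
 * Returns the number of boxes written to out (at most max_out).
 */
static size_t decode_two_output_ssd(const float *detections, const float *labels,
                                    size_t count, int scale_w, int scale_h,
                                    float conf_threshold, Box *out, size_t max_out)
{
    size_t n = 0;

    assert(detections && labels && out);

    /* Assumption of this sketch: refuse to divide by a non-positive scale. */
    if (scale_w <= 0 || scale_h <= 0)
        return 0;

    for (size_t i = 0; i < count && n < max_out; i++) {
        const float *d = detections + i * 5;

        if (d[4] < conf_threshold)
            continue;

        out[n].x0    = d[0] / scale_w;
        out[n].y0    = d[1] / scale_h;
        out[n].x1    = d[2] / scale_w;
        out[n].y1    = d[3] / scale_h;
        out[n].conf  = d[4];
        out[n].label = (int)labels[i];
        n++;
    }
    return n;
}
```

A caller would pass the two output buffers together with the scale width and height, then read the first n entries of the out array. Keeping the confidence check ahead of the coordinate conversion mirrors the order of the first counting loop in the patch.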