From patchwork Thu Apr 27 09:43:45 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ting Fu
X-Patchwork-Id: 41346
From: Ting Fu
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 27 Apr 2023 17:43:45 +0800
Message-Id: <20230427094346.25234-2-ting.fu@intel.com>
In-Reply-To: <20230427094346.25234-1-ting.fu@intel.com>
References: <20230427094346.25234-1-ting.fu@intel.com>
Subject: [FFmpeg-devel] [PATCH V7 2/3] lavfi/dnn: Modified DNN native backend related tools and docs.

The native backend will be removed, so change the default backend in the
filters and remove the Python scripts that generate native model files.

Signed-off-by: Ting Fu
---
 doc/filters.texi                        |  39 +-
 libavfilter/vf_derain.c                 |   2 +-
 libavfilter/vf_dnn_processing.c         |   2 +-
 libavfilter/vf_sr.c                     |   2 +-
 tools/python/convert.py                 |  56 ---
 tools/python/convert_from_tensorflow.py | 607 ------------------------
 tools/python/convert_header.py          |  26 -
 7 files changed, 7 insertions(+), 727 deletions(-)
 delete mode 100644 tools/python/convert.py
 delete mode 100644 tools/python/convert_from_tensorflow.py
 delete mode 100644 tools/python/convert_header.py

diff --git a/doc/filters.texi b/doc/filters.texi
index 5dde79919a..f1f87a24fd 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -11338,9 +11338,6 @@ See @url{http://openaccess.thecvf.com/content_ECCV_2018/papers/Xia_Li_Recurrent_
 Training as well as model generation scripts are provided in
 the repository at @url{https://github.com/XueweiMeng/derain_filter.git}.
 
-Native model files (.model) can be generated from TensorFlow model
-files (.pb) by using tools/python/convert.py
-
 The filter accepts the following options:
 
 @table @option
@@ -11361,21 +11358,16 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
 @url{https://www.tensorflow.org/install/lang_c}) and configure FFmpeg with
 @code{--enable-libtensorflow}
 @end table
-Default value is @samp{native}.
 
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow and native
-backend can load files for only its format.
+Note that different backends use different file formats. TensorFlow can load files for only its format.
 @end table
 
 To get full functionality (such as async execution), please use the @ref{dnn_processing} filter.
@@ -11699,9 +11691,6 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
@@ -11717,14 +11706,9 @@ be needed if the header files and libraries are not installed into system path)
 
 @end table
 
-Default value is @samp{native}.
-
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow, OpenVINO and native
-backend can load files for only its format.
-
-Native model file (.model) can be generated from TensorFlow model file (.pb) by using tools/python/convert.py
+Note that different backends use different file formats. TensorFlow, OpenVINO backend can load files for only its format.
 
 @item input
 Set the input name of the dnn network.
@@ -11750,12 +11734,6 @@ Remove rain in rgb24 frame with can.pb (see @ref{derain} filter):
 ./ffmpeg -i rain.jpg -vf format=rgb24,dnn_processing=dnn_backend=tensorflow:model=can.pb:input=x:output=y derain.jpg
 @end example
 
-@item
-Halve the pixel value of the frame with format gray32f:
-@example
-ffmpeg -i input.jpg -vf format=grayf32,dnn_processing=model=halve_gray_float.model:input=dnn_in:output=dnn_out:dnn_backend=native -y out.native.png
-@end example
-
 @item
 Handle the Y channel with srcnn.pb (see @ref{sr} filter) for frame with yuv420p (planar YUV formats supported):
 @example
@@ -21813,9 +21791,6 @@ Training scripts as well as scripts for model file (.pb) saving can be found at
 @url{https://github.com/XueweiMeng/sr/tree/sr_dnn_native}. Original repository
 is at @url{https://github.com/HighVoltageRocknRoll/sr.git}.
 
-Native model files (.model) can be generated from TensorFlow model
-files (.pb) by using tools/python/convert.py
-
 The filter accepts the following options:
 
 @table @option
@@ -21824,9 +21799,6 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
@@ -21834,13 +21806,10 @@ need to install the TensorFlow for C library (see
 @code{--enable-libtensorflow}
 @end table
 
-Default value is @samp{native}.
-
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow backend
-can load files for both formats, while native backend can load files for only
-its format.
+Note that different backends use different file formats. TensorFlow, OpenVINO backend
+can load files for only its format.
 
 @item scale_factor
 Set scale factor for SRCNN model. Allowed values are @code{2}, @code{3} and @code{4}.
diff --git a/libavfilter/vf_derain.c b/libavfilter/vf_derain.c
index 86e9eb8752..7e84cd65a3 100644
--- a/libavfilter/vf_derain.c
+++ b/libavfilter/vf_derain.c
@@ -43,7 +43,7 @@ static const AVOption derain_options[] = {
     { "filter_type", "filter type(derain/dehaze)", OFFSET(filter_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "type" },
     { "derain", "derain filter flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "type" },
     { "dehaze", "dehaze filter flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "type" },
-    { "dnn_backend", "DNN backend", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
index 4462915073..968df666fc 100644
--- a/libavfilter/vf_dnn_processing.c
+++ b/libavfilter/vf_dnn_processing.c
@@ -45,7 +45,7 @@ typedef struct DnnProcessingContext {
 #define OFFSET(x) offsetof(DnnProcessingContext, dnnctx.x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption dnn_processing_options[] = {
-    { "dnn_backend", "DNN backend", OFFSET(backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, INT_MIN, INT_MAX, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend", OFFSET(backend_type), AV_OPT_TYPE_INT, { .i64 = 1 }, INT_MIN, INT_MAX, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
index cb24c096ce..e9fe746bae 100644
--- a/libavfilter/vf_sr.c
+++ b/libavfilter/vf_sr.c
@@ -46,7 +46,7 @@ typedef struct SRContext {
 #define OFFSET(x) offsetof(SRContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption sr_options[] = {
-    { "dnn_backend", "DNN backend used for model execution", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend used for model execution", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/tools/python/convert.py b/tools/python/convert.py
deleted file mode 100644
index 64cf76b2d8..0000000000
--- a/tools/python/convert.py
+++ /dev/null
@@ -1,56 +0,0 @@
-# Copyright (c) 2019 Guo Yejun
-#
-# This file is part of FFmpeg.
-#
-# FFmpeg is free software; you can redistribute it and/or
-# modify it under the terms of the GNU Lesser General Public
-# License as published by the Free Software Foundation; either
-# version 2.1 of the License, or (at your option) any later version.
-#
-# FFmpeg is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-# Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with FFmpeg; if not, write to the Free Software
-# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
-# ==============================================================================
-
-# verified with Python 3.5.2 on Ubuntu 16.04
-import argparse
-import os
-from convert_from_tensorflow import *
-
-def get_arguments():
-    parser = argparse.ArgumentParser(description='generate native mode model with weights from deep learning model')
-    parser.add_argument('--outdir', type=str, default='./', help='where to put generated files')
-    parser.add_argument('--infmt', type=str, default='tensorflow', help='format of the deep learning model')
-    parser.add_argument('infile', help='path to the deep learning model with weights')
-    parser.add_argument('--dump4tb', type=str, default='no', help='dump file for visualization in tensorboard')
-
-    return parser.parse_args()
-
-def main():
-    args = get_arguments()
-
-    if not os.path.isfile(args.infile):
-        print('the specified input file %s does not exist' % args.infile)
-        exit(1)
-
-    if not os.path.exists(args.outdir):
-        print('create output directory %s' % args.outdir)
-        os.mkdir(args.outdir)
-
-    basefile = os.path.split(args.infile)[1]
-    basefile = os.path.splitext(basefile)[0]
-    outfile = os.path.join(args.outdir, basefile) + '.model'
-    dump4tb = False
-    if args.dump4tb.lower() in ('yes', 'true', 't', 'y', '1'):
-        dump4tb = True
-
-    if args.infmt == 'tensorflow':
-        convert_from_tensorflow(args.infile, outfile, dump4tb)
-
-if __name__ == '__main__':
-    main()
diff --git a/tools/python/convert_from_tensorflow.py b/tools/python/convert_from_tensorflow.py
deleted file mode 100644
index 38e64c1c94..0000000000
--- a/tools/python/convert_from_tensorflow.py
+++ /dev/null
@@ -1,607 +0,0 @@
-# Copyright (c) 2019 Guo Yejun
-#
-# This file is part of FFmpeg.
-#
-# FFmpeg is free software; you can redistribute it and/or
-# modify it under the terms of the GNU Lesser General Public
-# License as published by the Free Software Foundation; either
-# version 2.1 of the License, or (at your option) any later version.
-#
-# FFmpeg is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-# Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with FFmpeg; if not, write to the Free Software
-# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
-# ==============================================================================
-
-import tensorflow as tf
-import numpy as np
-import sys, struct
-import convert_header as header
-
-__all__ = ['convert_from_tensorflow']
-
-class Operand(object):
-    IOTYPE_INPUT = 1
-    IOTYPE_OUTPUT = 2
-    IOTYPE_INTERMEDIATE = IOTYPE_INPUT | IOTYPE_OUTPUT
-    DTYPE_FLOAT = 1
-    DTYPE_UINT8 = 4
-    index = 0
-    def __init__(self, name, dtype, dims):
-        self.name = name
-        self.dtype = dtype
-        self.dims = dims
-        self.iotype = 0
-        self.used_count = 0
-        self.index = Operand.index
-        Operand.index = Operand.index + 1
-        self.iotype2str = {Operand.IOTYPE_INPUT: 'in', Operand.IOTYPE_OUTPUT: 'out', Operand.IOTYPE_INTERMEDIATE: 'inout'}
-        self.dtype2str = {Operand.DTYPE_FLOAT: 'DT_FLOAT', Operand.DTYPE_UINT8: 'DT_UINT8'}
-
-    def add_iotype(self, iotype):
-        self.iotype = self.iotype | iotype
-        if iotype == Operand.IOTYPE_INPUT:
-            self.used_count = self.used_count + 1
-
-    def __str__(self):
-        return "{}: (name: {}, iotype: {}, dtype: {}, dims: {}, used_count: {})".format(self.index,
-                            self.name, self.iotype2str[self.iotype], self.dtype2str[self.dtype],
-                            self.dims, self.used_count)
-
-    def __lt__(self, other):
-        return self.index < other.index
-
-class TFConverter:
-    def __init__(self, graph_def, nodes, outfile, dump4tb):
-        self.graph_def = graph_def
-        self.nodes = nodes
-        self.outfile = outfile
-        self.dump4tb = dump4tb
-        self.layer_number = 0
-        self.output_names = []
-        self.name_node_dict = {}
-        self.edges = {}
-        self.conv_activations = {'Relu':0, 'Tanh':1, 'Sigmoid':2, 'None':3, 'LeakyRelu':4}
-        self.conv_paddings = {'VALID':0, 'SAME':1}
-        self.pool_paddings = {'VALID':0, 'SAME':1}
-        self.converted_nodes = set()
-        self.conv2d_scope_names = set()
-        self.conv2d_scopename_inputname_dict = {}
-        self.dense_scope_names = set()
-        self.dense_scopename_inputname_dict = {}
-        self.op2code = {'Conv2D':1, 'DepthToSpace':2, 'MirrorPad':3, 'Maximum':4,
-                        'MathBinary':5, 'MathUnary':6, 'AvgPool':7, 'MatMul':8}
-        self.mathbin2code = {'Sub':0, 'Add':1, 'Mul':2, 'RealDiv':3, 'Minimum':4, 'FloorMod':5}
-        self.mathun2code = {'Abs':0, 'Sin':1, 'Cos':2, 'Tan':3, 'Asin':4,
-                'Acos':5, 'Atan':6, 'Sinh':7, 'Cosh':8, 'Tanh':9, 'Asinh':10,
-                'Acosh':11, 'Atanh':12, 'Ceil':13, 'Floor':14, 'Round':15,
-                'Exp':16}
-        self.mirrorpad_mode = {'CONSTANT':0, 'REFLECT':1, 'SYMMETRIC':2}
-        self.name_operand_dict = {}
-
-
-    def add_operand(self, name, type):
-        node = self.name_node_dict[name]
-        if name not in self.name_operand_dict:
-            dtype = node.attr['dtype'].type
-            if dtype == 0:
-                dtype = node.attr['T'].type
-            dims = [-1,-1,-1,-1]
-            if 'shape' in node.attr:
-                dims[0] = node.attr['shape'].shape.dim[0].size
-                dims[1] = node.attr['shape'].shape.dim[1].size
-                dims[2] = node.attr['shape'].shape.dim[2].size
-                dims[3] = node.attr['shape'].shape.dim[3].size
-            operand = Operand(name, dtype, dims)
-            self.name_operand_dict[name] = operand;
-        self.name_operand_dict[name].add_iotype(type)
-        return self.name_operand_dict[name].index
-
-
-    def dump_for_tensorboard(self):
-        graph = tf.get_default_graph()
-        tf.import_graph_def(self.graph_def, name="")
-        tf.summary.FileWriter('/tmp/graph', graph)
-        print('graph saved, run "tensorboard --logdir=/tmp/graph" to see it')
-
-
-    def get_conv2d_params(self, conv2d_scope_name):
-        knode = self.name_node_dict[conv2d_scope_name + '/kernel']
-        bnode = self.name_node_dict[conv2d_scope_name + '/bias']
-
-        if conv2d_scope_name + '/dilation_rate' in self.name_node_dict:
-            dnode = self.name_node_dict[conv2d_scope_name + '/dilation_rate']
-        else:
-            dnode = None
-
-        # the BiasAdd name is possible be changed into the output name,
-        # if activation is None, and BiasAdd.next is the last op which is Identity
-        if conv2d_scope_name + '/BiasAdd' in self.edges:
-            anode = self.edges[conv2d_scope_name + '/BiasAdd'][0]
-            if anode.op not in self.conv_activations:
-                anode = None
-        else:
-            anode = None
-        return knode, bnode, dnode, anode
-
-
-    def get_dense_params(self, dense_scope_name):
-        knode = self.name_node_dict[dense_scope_name + '/kernel']
-        bnode = self.name_node_dict.get(dense_scope_name + '/bias')
-        # the BiasAdd name is possible be changed into the output name,
-        # if activation is None, and BiasAdd.next is the last op which is Identity
-        anode = None
-        if bnode:
-            if dense_scope_name + '/BiasAdd' in self.edges:
-                anode = self.edges[dense_scope_name + '/BiasAdd'][0]
-                if anode.op not in self.conv_activations:
-                    anode = None
-        else:
-            anode = None
-        return knode, bnode, anode
-
-
-    def dump_complex_conv2d_to_file(self, node, f):
-        assert(node.op == 'Conv2D')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        scope_name = TFConverter.get_scope_name(node.name)
-        #knode for kernel, bnode for bias, dnode for dilation, anode for activation
-        knode, bnode, dnode, anode = self.get_conv2d_params(scope_name)
-
-        if dnode is not None:
-            dilation = struct.unpack('i', dnode.attr['value'].tensor.tensor_content[0:4])[0]
-        else:
-            dilation = 1
-
-        if anode is not None:
-            activation = anode.op
-        else:
-            activation = 'None'
-
-        padding = node.attr['padding'].s.decode("utf-8")
-        # conv2d with dilation > 1 generates tens of nodes, not easy to parse them, so use this tricky method.
-        if dilation > 1 and scope_name + '/stack' in self.name_node_dict:
-            if self.name_node_dict[scope_name + '/stack'].op == "Const":
-                padding = 'SAME'
-        padding = self.conv_paddings[padding]
-
-        ktensor = knode.attr['value'].tensor
-        filter_height = ktensor.tensor_shape.dim[0].size
-        filter_width = ktensor.tensor_shape.dim[1].size
-        in_channels = ktensor.tensor_shape.dim[2].size
-        out_channels = ktensor.tensor_shape.dim[3].size
-        kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(filter_height, filter_width, in_channels, out_channels)
-        kernel = np.transpose(kernel, [3, 0, 1, 2])
-
-        has_bias = 1
-        np.array([self.op2code[node.op], dilation, padding, self.conv_activations[activation], in_channels, out_channels, filter_height, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-
-        btensor = bnode.attr['value'].tensor
-        if btensor.tensor_shape.dim[0].size == 1:
-            bias = struct.pack("f", btensor.float_val[0])
-        else:
-            bias = btensor.tensor_content
-        f.write(bias)
-
-        input_name = self.conv2d_scopename_inputname_dict[scope_name]
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-
-        if anode is not None:
-            output_operand_index = self.add_operand(anode.name, Operand.IOTYPE_OUTPUT)
-        else:
-            output_operand_index = self.add_operand(self.edges[bnode.name][0].name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-    def dump_dense_to_file(self, node, f):
-        assert(node.op == 'MatMul')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        scope_name = TFConverter.get_scope_name(node.name)
-        #knode for kernel, bnode for bias, anode for activation
-        knode, bnode, anode = self.get_dense_params(scope_name.split('/')[0])
-
-        if bnode is not None:
-            has_bias = 1
-            btensor = bnode.attr['value'].tensor
-            if btensor.tensor_shape.dim[0].size == 1:
-                bias = struct.pack("f", btensor.float_val[0])
-            else:
-                bias = btensor.tensor_content
-        else:
-            has_bias = 0
-
-        if anode is not None:
-            activation = anode.op
-        else:
-            activation = 'None'
-
-        ktensor = knode.attr['value'].tensor
-        in_channels = ktensor.tensor_shape.dim[0].size
-        out_channels = ktensor.tensor_shape.dim[1].size
-        if in_channels * out_channels == 1:
-            kernel = np.float32(ktensor.float_val[0])
-        else:
-            kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(in_channels, out_channels)
-        kernel = np.transpose(kernel, [1, 0])
-
-        np.array([self.op2code[node.op], self.conv_activations[activation], in_channels, out_channels, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-        if has_bias:
-            f.write(bias)
-
-        input_name = self.dense_scopename_inputname_dict[scope_name.split('/')[0]]
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-
-        if anode is not None:
-            output_operand_index = self.add_operand(anode.name, Operand.IOTYPE_OUTPUT)
-        else:
-            if bnode is not None:
-                output_operand_index = self.add_operand(self.edges[bnode.name][0].name, Operand.IOTYPE_OUTPUT)
-            else:
-                output_operand_index = self.add_operand(self.edges[scope_name+'/concat_1'][0].name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_simple_conv2d_to_file(self, node, f):
-        assert(node.op == 'Conv2D')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        node0 = self.name_node_dict[node.input[0]]
-        node1 = self.name_node_dict[node.input[1]]
-        if node0.op == 'Const':
-            knode = node0
-            input_name = node.input[1]
-        else:
-            knode = node1
-            input_name = node.input[0]
-
-        ktensor = knode.attr['value'].tensor
-        filter_height = ktensor.tensor_shape.dim[0].size
-        filter_width = ktensor.tensor_shape.dim[1].size
-        in_channels = ktensor.tensor_shape.dim[2].size
-        out_channels = ktensor.tensor_shape.dim[3].size
-        if filter_height * filter_width * in_channels * out_channels == 1:
-            kernel = np.float32(ktensor.float_val[0])
-        else:
-            kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(filter_height, filter_width, in_channels, out_channels)
-        kernel = np.transpose(kernel, [3, 0, 1, 2])
-
-        has_bias = 0
-        dilation = 1
-        padding = node.attr['padding'].s.decode("utf-8")
-        np.array([self.op2code[node.op], dilation, self.conv_paddings[padding], self.conv_activations['None'],
-                  in_channels, out_channels, filter_height, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_depth2space_to_file(self, node, f):
-        assert(node.op == 'DepthToSpace')
-        self.layer_number = self.layer_number + 1
-        block_size = node.attr['block_size'].i
-        np.array([self.op2code[node.op], block_size], dtype=np.uint32).tofile(f)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mirrorpad_to_file(self, node, f):
-        assert(node.op == 'MirrorPad')
-        self.layer_number = self.layer_number + 1
-        mode = node.attr['mode'].s
-        mode = self.mirrorpad_mode[mode.decode("utf-8")]
-        np.array([self.op2code[node.op], mode], dtype=np.uint32).tofile(f)
-        pnode = self.name_node_dict[node.input[1]]
-        self.converted_nodes.add(pnode.name)
-        paddings = pnode.attr['value'].tensor.tensor_content
-        f.write(paddings)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_maximum_to_file(self, node, f):
-        assert(node.op == 'Maximum')
-        self.layer_number = self.layer_number + 1
-        ynode = self.name_node_dict[node.input[1]]
-        y = ynode.attr['value'].tensor.float_val[0]
-        np.array([self.op2code[node.op]], dtype=np.uint32).tofile(f)
-        np.array([y], dtype=np.float32).tofile(f)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mathbinary_to_file(self, node, f):
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        i0_node = self.name_node_dict[node.input[0]]
-        i1_node = self.name_node_dict[node.input[1]]
-        np.array([self.op2code['MathBinary'], self.mathbin2code[node.op]], dtype=np.uint32).tofile(f)
-        if i0_node.op == 'Const':
-            scalar = i0_node.attr['value'].tensor.float_val[0]
-            np.array([1], dtype=np.uint32).tofile(f)            # broadcast: 1
-            np.array([scalar], dtype=np.float32).tofile(f)
-            np.array([0], dtype=np.uint32).tofile(f)            # broadcast: 0
-            input_operand_index = self.add_operand(i1_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        elif i1_node.op == 'Const':
-            scalar = i1_node.attr['value'].tensor.float_val[0]
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-            np.array([1], dtype=np.uint32).tofile(f)
-            np.array([scalar], dtype=np.float32).tofile(f)
-        else:
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i1_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mathunary_to_file(self, node, f):
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        i0_node = self.name_node_dict[node.input[0]]
-        np.array([self.op2code['MathUnary'], self.mathun2code[node.op]], dtype=np.uint32).tofile(f)
-        input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-        np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([output_operand_index],dtype=np.uint32).tofile(f)
-
-
-    def dump_avg_pool_to_file(self, node, f):
-        assert(node.op == 'AvgPool')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        node0 = self.name_node_dict[node.input[0]]
-        strides = node.attr['strides']
-
-        # Tensorflow do not support pooling strides in batch dimension and
-        # current native NN do not support pooling strides in channel dimension, added assert() here.
-        assert(strides.list.i[1]==strides.list.i[2])
-        assert(strides.list.i[0]==1)
-        assert(strides.list.i[3]==1)
-        strides = strides.list.i[1]
-        filter_node = node.attr['ksize']
-        input_name = node.input[0]
-
-        # Tensorflow do not support pooling ksize in batch dimension and channel dimension.
-        assert(filter_node.list.i[0]==1)
-        assert(filter_node.list.i[3]==1)
-        filter_height = filter_node.list.i[1]
-        filter_width = filter_node.list.i[2]
-
-        padding = node.attr['padding'].s.decode("utf-8")
-        np.array([self.op2code[node.op], strides, self.pool_paddings[padding], filter_height],
-                 dtype=np.uint32).tofile(f)
-
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index],dtype=np.uint32).tofile(f)
-
-
-    def dump_layers_to_file(self, f):
-        for node in self.nodes:
-            if node.name in self.converted_nodes:
-                continue
-
-            # conv2d with dilation generates very complex nodes, so handle it in special
-            if self.in_conv2d_scope(node.name):
-                if node.op == 'Conv2D':
-                    self.dump_complex_conv2d_to_file(node, f)
-                continue
-            if self.in_dense_scope(node.name):
-                if node.op == 'MatMul':
-                    self.dump_dense_to_file(node, f)
-                continue
-
-
-            if node.op == 'Conv2D':
-                self.dump_simple_conv2d_to_file(node, f)
-                continue
-            if node.name in self.output_names:
-                input_name = self.id_different_scope_dict[node.name]
-                if TFConverter.get_scope_name(input_name)!=TFConverter.get_scope_name(node.name):
-                    continue
-            if node.op == 'AvgPool':
-                self.dump_avg_pool_to_file(node, f)
-            elif node.op == 'DepthToSpace':
-                self.dump_depth2space_to_file(node, f)
-            elif node.op == 'MirrorPad':
-                self.dump_mirrorpad_to_file(node, f)
-            elif node.op == 'Maximum':
-                self.dump_maximum_to_file(node, f)
-            elif node.op in self.mathbin2code:
-                self.dump_mathbinary_to_file(node, f)
-            elif node.op in self.mathun2code:
-                self.dump_mathunary_to_file(node, f)
-
-
-    def dump_operands_to_file(self, f):
-        operands = sorted(self.name_operand_dict.values())
-        for operand in operands:
-            #print('{}'.format(operand))
-            np.array([operand.index, len(operand.name)], dtype=np.uint32).tofile(f)
-            f.write(operand.name.encode('utf-8'))
-            np.array([operand.iotype, operand.dtype], dtype=np.uint32).tofile(f)
-            np.array(operand.dims, dtype=np.uint32).tofile(f)
-
-
-    def dump_to_file(self):
-        with open(self.outfile, 'wb') as f:
-            f.write(header.str.encode('utf-8'))
-            np.array([header.major, header.minor], dtype=np.uint32).tofile(f)
-            self.dump_layers_to_file(f)
-            self.dump_operands_to_file(f)
-            np.array([self.layer_number, len(self.name_operand_dict)], dtype=np.uint32).tofile(f)
-
-
-    def generate_name_node_dict(self):
-        for node in self.nodes:
-            self.name_node_dict[node.name] = node
-
-
-    def generate_output_names(self):
-        used_names = []
-        for node in self.nodes:
-            for input in node.input:
-                used_names.append(input)
-
-        for node in self.nodes:
-            if node.name not in used_names:
-                self.output_names.append(node.name)
-
-
-    def remove_identity(self):
-        self.id_different_scope_dict = {}
-        id_nodes = []
-        id_dict = {}
-        for node in self.nodes:
-            if node.op == 'Identity':
-                name = node.name
-                input = node.input[0]
-                id_nodes.append(node)
-                # do not change the output name
-                if name in self.output_names:
-                    self.name_node_dict[input].name = name
-                    self.name_node_dict[name] = self.name_node_dict[input]
-                    del self.name_node_dict[input]
-                    self.id_different_scope_dict[name] = input
-                else:
-                    id_dict[name] = input
-
-        for idnode in id_nodes:
-            self.nodes.remove(idnode)
-
-        for node in self.nodes:
-            for i in range(len(node.input)):
-                input = node.input[i]
-                if input in id_dict:
-                    node.input[i] = id_dict[input]
-
-
-    def generate_edges(self):
-        for node in self.nodes:
-            for input in node.input:
-                if input in self.edges:
-                    self.edges[input].append(node)
-                else:
-                    self.edges[input] = [node]
-
-
-    @staticmethod
-    def get_scope_name(name):
-        index = name.rfind('/')
-        if index == -1:
-            return ""
-        return name[0:index]
-
-
-    def in_conv2d_scope(self, name):
-        inner_scope = TFConverter.get_scope_name(name)
-        if inner_scope == "":
-            return False;
-        for scope in self.conv2d_scope_names:
-            index = inner_scope.find(scope)
-            if index == 0:
-                return True
-        return False
-
-
-    def in_dense_scope(self, name):
-        inner_scope = TFConverter.get_scope_name(name)
-        if inner_scope == "":
-            return False;
-        for scope in self.dense_scope_names:
-            index = inner_scope.find(scope)
-            if index == 0:
-                return True
-        return False
-
-    def generate_sub_block_op_scope_info(self):
-        # mostly, conv2d/dense is a sub block in graph, get the scope name
-        for node in self.nodes:
-            if node.op == 'Conv2D':
-                scope = TFConverter.get_scope_name(node.name)
-                # for the case tf.nn.conv2d is called directly
-                if scope == '':
-                    continue
-                # for the case tf.nn.conv2d is called within a scope
-                if scope + '/kernel' not in self.name_node_dict:
-                    continue
-                self.conv2d_scope_names.add(scope)
-            elif node.op == 'MatMul':
-                scope = TFConverter.get_scope_name(node.name)
-                # for the case tf.nn.dense is called directly
-                if scope == '':
-                    continue
-                # for the case tf.nn.dense is called within a scope
-                if scope + '/kernel' not in self.name_node_dict and scope.split('/Tensordot')[0] + '/kernel' not in self.name_node_dict:
-                    continue
-                self.dense_scope_names.add(scope.split('/Tensordot')[0])
-
-        # get the input name to the conv2d/dense sub block
-        for node in self.nodes:
-            scope = TFConverter.get_scope_name(node.name)
-            if scope in self.conv2d_scope_names:
-                if node.op == 'Conv2D' or node.op == 'Shape':
-                    for inp in node.input:
-                        if TFConverter.get_scope_name(inp) != scope:
-                            self.conv2d_scopename_inputname_dict[scope] = inp
-            elif scope in self.dense_scope_names:
-                if node.op == 'MatMul' or node.op == 'Shape':
-                    for inp in node.input:
-                        if TFConverter.get_scope_name(inp) != scope:
-                            self.dense_scopename_inputname_dict[scope] = inp
-            elif scope.split('/Tensordot')[0] in self.dense_scope_names:
-                if node.op == 'Transpose':
-                    for inp in node.input:
-                        if TFConverter.get_scope_name(inp).find(scope)<0 and TFConverter.get_scope_name(inp).find(scope.split('/')[0])<0:
-                            self.dense_scopename_inputname_dict[scope.split('/Tensordot')[0]] = inp
-
-
-    def run(self):
-        self.generate_name_node_dict()
-        self.generate_output_names()
-        self.remove_identity()
-        self.generate_edges()
-        self.generate_sub_block_op_scope_info()
-
-        if self.dump4tb:
-            self.dump_for_tensorboard()
-
-        self.dump_to_file()
-
-
-def convert_from_tensorflow(infile, outfile, dump4tb):
-    with open(infile, 'rb') as f:
-        # read the file in .proto format
-        graph_def = tf.GraphDef()
-        graph_def.ParseFromString(f.read())
-        nodes = graph_def.node
-
-    converter = TFConverter(graph_def, nodes, outfile, dump4tb)
-    converter.run()
diff --git a/tools/python/convert_header.py b/tools/python/convert_header.py
deleted file mode 100644
index 143f92c42e..0000000000
--- a/tools/python/convert_header.py
+++ /dev/null
@@ -1,26 +0,0 @@
-# Copyright (c) 2019
-#
-# This file is part of FFmpeg.
-#
-# FFmpeg is free software; you can redistribute it and/or
-# modify it under the terms of the GNU Lesser General Public
-# License as published by the Free Software Foundation; either
-# version 2.1 of the License, or (at your option) any later version.
-#
-# FFmpeg is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-# Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with FFmpeg; if not, write to the Free Software
-# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
-# ==============================================================================
-
-str = 'FFMPEGDNNNATIVE'
-
-# increase major and reset minor when we have to re-convert the model file
-major = 1
-
-# increase minor when we don't have to re-convert the model file
-minor = 23