From patchwork Mon Feb 6 09:45:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ting Fu
X-Patchwork-Id: 40299
From: Ting Fu
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 Feb 2023 17:45:49 +0800
Message-Id: <20230206094550.7505-2-ting.fu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230206094550.7505-1-ting.fu@intel.com>
References: <20230206094550.7505-1-ting.fu@intel.com>
Subject: [FFmpeg-devel] [PATCH V5 2/3] lavfi/dnn: Modified DNN native backend related tools and docs.
List-Id: FFmpeg development discussions and patches

Deleted the native backend related files in 'tools' dir.
Modify its' docs and codes mentioned in such docs.

Signed-off-by: Ting Fu
---
 doc/filters.texi                        |  43 +-
 libavfilter/vf_derain.c                 |   2 +-
 libavfilter/vf_dnn_processing.c         |   2 +-
 libavfilter/vf_sr.c                     |   2 +-
 tools/python/convert.py                 |  56 ---
 tools/python/convert_from_tensorflow.py | 607 ------------------------
 tools/python/convert_header.py          |  26 -
 7 files changed, 7 insertions(+), 731 deletions(-)
 delete mode 100644 tools/python/convert.py
 delete mode 100644 tools/python/convert_from_tensorflow.py
 delete mode 100644 tools/python/convert_header.py

diff --git a/doc/filters.texi b/doc/filters.texi
index 3a54c68f3e..6fcb506b38 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -11275,9 +11275,6 @@ See @url{http://openaccess.thecvf.com/content_ECCV_2018/papers/Xia_Li_Recurrent_
 Training as well as model generation scripts are provided in
 the repository at @url{https://github.com/XueweiMeng/derain_filter.git}.
 
-Native model files (.model) can be generated from TensorFlow model
-files (.pb) by using tools/python/convert.py
-
 The filter accepts the following options:
 
 @table @option
@@ -11298,21 +11295,16 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
 @url{https://www.tensorflow.org/install/lang_c}) and configure FFmpeg with
 @code{--enable-libtensorflow}
 @end table
-Default value is @samp{native}.
 
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow and native
-backend can load files for only its format.
+Note that different backends use different file formats. TensorFlow can load files for only its format.
 @end table
 
 To get full functionality (such as async execution), please use the @ref{dnn_processing} filter.
@@ -11636,9 +11628,6 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
@@ -11654,14 +11643,9 @@ be needed if the header files and libraries are not installed into system path)
 
 @end table
 
-Default value is @samp{native}.
-
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow, OpenVINO and native
-backend can load files for only its format.
-
-Native model file (.model) can be generated from TensorFlow model file (.pb) by using tools/python/convert.py
+Note that different backends use different file formats. TensorFlow, OpenVINO backend can load files for only its format.
 
 @item input
 Set the input name of the dnn network.
@@ -11687,12 +11671,6 @@ Remove rain in rgb24 frame with can.pb (see @ref{derain} filter):
 ./ffmpeg -i rain.jpg -vf format=rgb24,dnn_processing=dnn_backend=tensorflow:model=can.pb:input=x:output=y derain.jpg
 @end example
 
-@item
-Halve the pixel value of the frame with format gray32f:
-@example
-ffmpeg -i input.jpg -vf format=grayf32,dnn_processing=model=halve_gray_float.model:input=dnn_in:output=dnn_out:dnn_backend=native -y out.native.png
-@end example
-
 @item
 Handle the Y channel with srcnn.pb (see @ref{sr} filter) for frame with yuv420p (planar YUV formats supported):
 @example
@@ -21702,13 +21680,6 @@ Efficient Sub-Pixel Convolutional Neural Network model (ESPCN).
 See @url{https://arxiv.org/abs/1609.05158}.
 @end itemize
 
-Training scripts as well as scripts for model file (.pb) saving can be found at
-@url{https://github.com/XueweiMeng/sr/tree/sr_dnn_native}. Original repository
-is at @url{https://github.com/HighVoltageRocknRoll/sr.git}.
-
-Native model files (.model) can be generated from TensorFlow model
-files (.pb) by using tools/python/convert.py
-
 The filter accepts the following options:
 
 @table @option
@@ -21717,9 +21688,6 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
@@ -21727,13 +21695,10 @@ need to install the TensorFlow for C library (see
 @code{--enable-libtensorflow}
 @end table
 
-Default value is @samp{native}.
-
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow backend
-can load files for both formats, while native backend can load files for only
-its format.
+Note that different backends use different file formats. TensorFlow, OpenVINO backend
+can load files for only its format.
 
 @item scale_factor
 Set scale factor for SRCNN model. Allowed values are @code{2}, @code{3} and @code{4}.
diff --git a/libavfilter/vf_derain.c b/libavfilter/vf_derain.c
index 86e9eb8752..7e84cd65a3 100644
--- a/libavfilter/vf_derain.c
+++ b/libavfilter/vf_derain.c
@@ -43,7 +43,7 @@ static const AVOption derain_options[] = {
     { "filter_type", "filter type(derain/dehaze)", OFFSET(filter_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "type" },
     { "derain", "derain filter flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "type" },
     { "dehaze", "dehaze filter flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "type" },
-    { "dnn_backend", "DNN backend", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
index 4462915073..28937346b5 100644
--- a/libavfilter/vf_dnn_processing.c
+++ b/libavfilter/vf_dnn_processing.c
@@ -45,7 +45,7 @@ typedef struct DnnProcessingContext {
 #define OFFSET(x) offsetof(DnnProcessingContext, dnnctx.x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption dnn_processing_options[] = {
-    { "dnn_backend", "DNN backend", OFFSET(backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, INT_MIN, INT_MAX, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend", OFFSET(backend_type), AV_OPT_TYPE_INT, { .i64 = 2 }, INT_MIN, INT_MAX, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
index cb24c096ce..e9fe746bae 100644
--- a/libavfilter/vf_sr.c
+++ b/libavfilter/vf_sr.c
@@ -46,7 +46,7 @@ typedef struct SRContext {
 #define OFFSET(x) offsetof(SRContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption sr_options[] = {
-    { "dnn_backend", "DNN backend used for model execution", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend used for model execution", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/tools/python/convert.py b/tools/python/convert.py
deleted file mode 100644
index 64cf76b2d8..0000000000
--- a/tools/python/convert.py
+++ /dev/null
@@ -1,56 +0,0 @@
-# Copyright (c) 2019 Guo Yejun
-#
-# This file is part of FFmpeg.
-#
-# FFmpeg is free software; you can redistribute it and/or
-# modify it under the terms of the GNU Lesser General Public
-# License as published by the Free Software Foundation; either
-# version 2.1 of the License, or (at your option) any later version.
-#
-# FFmpeg is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-# Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with FFmpeg; if not, write to the Free Software
-# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
-# ==============================================================================
-
-# verified with Python 3.5.2 on Ubuntu 16.04
-import argparse
-import os
-from convert_from_tensorflow import *
-
-def get_arguments():
-    parser = argparse.ArgumentParser(description='generate native mode model with weights from deep learning model')
-    parser.add_argument('--outdir', type=str, default='./', help='where to put generated files')
-    parser.add_argument('--infmt', type=str, default='tensorflow', help='format of the deep learning model')
-    parser.add_argument('infile', help='path to the deep learning model with weights')
-    parser.add_argument('--dump4tb', type=str, default='no', help='dump file for visualization in tensorboard')
-
-    return parser.parse_args()
-
-def main():
-    args = get_arguments()
-
-    if not os.path.isfile(args.infile):
-        print('the specified input file %s does not exist' % args.infile)
-        exit(1)
-
-    if not os.path.exists(args.outdir):
-        print('create output directory %s' % args.outdir)
-        os.mkdir(args.outdir)
-
-    basefile = os.path.split(args.infile)[1]
-    basefile = os.path.splitext(basefile)[0]
-    outfile = os.path.join(args.outdir, basefile) + '.model'
-    dump4tb = False
-    if args.dump4tb.lower() in ('yes', 'true', 't', 'y', '1'):
-        dump4tb = True
-
-    if args.infmt == 'tensorflow':
-        convert_from_tensorflow(args.infile, outfile, dump4tb)
-
-if __name__ == '__main__':
-    main()
diff --git a/tools/python/convert_from_tensorflow.py b/tools/python/convert_from_tensorflow.py
deleted file mode 100644
index 38e64c1c94..0000000000
--- a/tools/python/convert_from_tensorflow.py
+++ /dev/null
@@ -1,607 +0,0 @@
-# Copyright (c) 2019 Guo Yejun
-#
-# This file is part of FFmpeg.
-#
-# FFmpeg is free software; you can redistribute it and/or
-# modify it under the terms of the GNU Lesser General Public
-# License as published by the Free Software Foundation; either
-# version 2.1 of the License, or (at your option) any later version.
-#
-# FFmpeg is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-# Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with FFmpeg; if not, write to the Free Software
-# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
-# ==============================================================================
-
-import tensorflow as tf
-import numpy as np
-import sys, struct
-import convert_header as header
-
-__all__ = ['convert_from_tensorflow']
-
-class Operand(object):
-    IOTYPE_INPUT = 1
-    IOTYPE_OUTPUT = 2
-    IOTYPE_INTERMEDIATE = IOTYPE_INPUT | IOTYPE_OUTPUT
-    DTYPE_FLOAT = 1
-    DTYPE_UINT8 = 4
-    index = 0
-    def __init__(self, name, dtype, dims):
-        self.name = name
-        self.dtype = dtype
-        self.dims = dims
-        self.iotype = 0
-        self.used_count = 0
-        self.index = Operand.index
-        Operand.index = Operand.index + 1
-        self.iotype2str = {Operand.IOTYPE_INPUT: 'in', Operand.IOTYPE_OUTPUT: 'out', Operand.IOTYPE_INTERMEDIATE: 'inout'}
-        self.dtype2str = {Operand.DTYPE_FLOAT: 'DT_FLOAT', Operand.DTYPE_UINT8: 'DT_UINT8'}
-
-    def add_iotype(self, iotype):
-        self.iotype = self.iotype | iotype
-        if iotype == Operand.IOTYPE_INPUT:
-            self.used_count = self.used_count + 1
-
-    def __str__(self):
-        return "{}: (name: {}, iotype: {}, dtype: {}, dims: {}, used_count: {})".format(self.index,
-                            self.name, self.iotype2str[self.iotype], self.dtype2str[self.dtype],
-                            self.dims, self.used_count)
-
-    def __lt__(self, other):
-        return self.index < other.index
-
-class TFConverter:
-    def __init__(self, graph_def, nodes, outfile, dump4tb):
-        self.graph_def = graph_def
-        self.nodes = nodes
-        self.outfile = outfile
-        self.dump4tb = dump4tb
-        self.layer_number = 0
-        self.output_names = []
-        self.name_node_dict = {}
-        self.edges = {}
-        self.conv_activations = {'Relu':0, 'Tanh':1, 'Sigmoid':2, 'None':3, 'LeakyRelu':4}
-        self.conv_paddings = {'VALID':0, 'SAME':1}
-        self.pool_paddings = {'VALID':0, 'SAME':1}
-        self.converted_nodes = set()
-        self.conv2d_scope_names = set()
-        self.conv2d_scopename_inputname_dict = {}
-        self.dense_scope_names = set()
-        self.dense_scopename_inputname_dict = {}
-        self.op2code = {'Conv2D':1, 'DepthToSpace':2, 'MirrorPad':3, 'Maximum':4,
-                        'MathBinary':5, 'MathUnary':6, 'AvgPool':7, 'MatMul':8}
-        self.mathbin2code = {'Sub':0, 'Add':1, 'Mul':2, 'RealDiv':3, 'Minimum':4, 'FloorMod':5}
-        self.mathun2code = {'Abs':0, 'Sin':1, 'Cos':2, 'Tan':3, 'Asin':4,
-                'Acos':5, 'Atan':6, 'Sinh':7, 'Cosh':8, 'Tanh':9, 'Asinh':10,
-                'Acosh':11, 'Atanh':12, 'Ceil':13, 'Floor':14, 'Round':15,
-                'Exp':16}
-        self.mirrorpad_mode = {'CONSTANT':0, 'REFLECT':1, 'SYMMETRIC':2}
-        self.name_operand_dict = {}
-
-
-    def add_operand(self, name, type):
-        node = self.name_node_dict[name]
-        if name not in self.name_operand_dict:
-            dtype = node.attr['dtype'].type
-            if dtype == 0:
-                dtype = node.attr['T'].type
-            dims = [-1,-1,-1,-1]
-            if 'shape' in node.attr:
-                dims[0] = node.attr['shape'].shape.dim[0].size
-                dims[1] = node.attr['shape'].shape.dim[1].size
-                dims[2] = node.attr['shape'].shape.dim[2].size
-                dims[3] = node.attr['shape'].shape.dim[3].size
-            operand = Operand(name, dtype, dims)
-            self.name_operand_dict[name] = operand;
-        self.name_operand_dict[name].add_iotype(type)
-        return self.name_operand_dict[name].index
-
-
-    def dump_for_tensorboard(self):
-        graph = tf.get_default_graph()
-        tf.import_graph_def(self.graph_def, name="")
-        tf.summary.FileWriter('/tmp/graph', graph)
-        print('graph saved, run "tensorboard --logdir=/tmp/graph" to see it')
-
-
-    def get_conv2d_params(self, conv2d_scope_name):
-        knode = self.name_node_dict[conv2d_scope_name + '/kernel']
-        bnode = self.name_node_dict[conv2d_scope_name + '/bias']
-
-        if conv2d_scope_name + '/dilation_rate' in self.name_node_dict:
-            dnode = self.name_node_dict[conv2d_scope_name + '/dilation_rate']
-        else:
-            dnode = None
-
-        # the BiasAdd name is possible be changed into the output name,
-        # if activation is None, and BiasAdd.next is the last op which is Identity
-        if conv2d_scope_name + '/BiasAdd' in self.edges:
-            anode = self.edges[conv2d_scope_name + '/BiasAdd'][0]
-            if anode.op not in self.conv_activations:
-                anode = None
-        else:
-            anode = None
-        return knode, bnode, dnode, anode
-
-
-    def get_dense_params(self, dense_scope_name):
-        knode = self.name_node_dict[dense_scope_name + '/kernel']
-        bnode = self.name_node_dict.get(dense_scope_name + '/bias')
-        # the BiasAdd name is possible be changed into the output name,
-        # if activation is None, and BiasAdd.next is the last op which is Identity
-        anode = None
-        if bnode:
-            if dense_scope_name + '/BiasAdd' in self.edges:
-                anode = self.edges[dense_scope_name + '/BiasAdd'][0]
-                if anode.op not in self.conv_activations:
-                    anode = None
-            else:
-                anode = None
-        return knode, bnode, anode
-
-
-    def dump_complex_conv2d_to_file(self, node, f):
-        assert(node.op == 'Conv2D')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        scope_name = TFConverter.get_scope_name(node.name)
-        #knode for kernel, bnode for bias, dnode for dilation, anode for activation
-        knode, bnode, dnode, anode = self.get_conv2d_params(scope_name)
-
-        if dnode is not None:
-            dilation = struct.unpack('i', dnode.attr['value'].tensor.tensor_content[0:4])[0]
-        else:
-            dilation = 1
-
-        if anode is not None:
-            activation = anode.op
-        else:
-            activation = 'None'
-
-        padding = node.attr['padding'].s.decode("utf-8")
-        # conv2d with dilation > 1 generates tens of nodes, not easy to parse them, so use this tricky method.
-        if dilation > 1 and scope_name + '/stack' in self.name_node_dict:
-            if self.name_node_dict[scope_name + '/stack'].op == "Const":
-                padding = 'SAME'
-        padding = self.conv_paddings[padding]
-
-        ktensor = knode.attr['value'].tensor
-        filter_height = ktensor.tensor_shape.dim[0].size
-        filter_width = ktensor.tensor_shape.dim[1].size
-        in_channels = ktensor.tensor_shape.dim[2].size
-        out_channels = ktensor.tensor_shape.dim[3].size
-        kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(filter_height, filter_width, in_channels, out_channels)
-        kernel = np.transpose(kernel, [3, 0, 1, 2])
-
-        has_bias = 1
-        np.array([self.op2code[node.op], dilation, padding, self.conv_activations[activation], in_channels, out_channels, filter_height, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-
-        btensor = bnode.attr['value'].tensor
-        if btensor.tensor_shape.dim[0].size == 1:
-            bias = struct.pack("f", btensor.float_val[0])
-        else:
-            bias = btensor.tensor_content
-        f.write(bias)
-
-        input_name = self.conv2d_scopename_inputname_dict[scope_name]
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-
-        if anode is not None:
-            output_operand_index = self.add_operand(anode.name, Operand.IOTYPE_OUTPUT)
-        else:
-            output_operand_index = self.add_operand(self.edges[bnode.name][0].name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-    def dump_dense_to_file(self, node, f):
-        assert(node.op == 'MatMul')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        scope_name = TFConverter.get_scope_name(node.name)
-        #knode for kernel, bnode for bias, anode for activation
-        knode, bnode, anode = self.get_dense_params(scope_name.split('/')[0])
-
-        if bnode is not None:
-            has_bias = 1
-            btensor = bnode.attr['value'].tensor
-            if btensor.tensor_shape.dim[0].size == 1:
-                bias = struct.pack("f", btensor.float_val[0])
-            else:
-                bias = btensor.tensor_content
-        else:
-            has_bias = 0
-
-        if anode is not None:
-            activation = anode.op
-        else:
-            activation = 'None'
-
-        ktensor = knode.attr['value'].tensor
-        in_channels = ktensor.tensor_shape.dim[0].size
-        out_channels = ktensor.tensor_shape.dim[1].size
-        if in_channels * out_channels == 1:
-            kernel = np.float32(ktensor.float_val[0])
-        else:
-            kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(in_channels, out_channels)
-        kernel = np.transpose(kernel, [1, 0])
-
-        np.array([self.op2code[node.op], self.conv_activations[activation], in_channels, out_channels, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-        if has_bias:
-            f.write(bias)
-
-        input_name = self.dense_scopename_inputname_dict[scope_name.split('/')[0]]
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-
-        if anode is not None:
-            output_operand_index = self.add_operand(anode.name, Operand.IOTYPE_OUTPUT)
-        else:
-            if bnode is not None:
-                output_operand_index = self.add_operand(self.edges[bnode.name][0].name, Operand.IOTYPE_OUTPUT)
-            else:
-                output_operand_index = self.add_operand(self.edges[scope_name+'/concat_1'][0].name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_simple_conv2d_to_file(self, node, f):
-        assert(node.op == 'Conv2D')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        node0 = self.name_node_dict[node.input[0]]
-        node1 = self.name_node_dict[node.input[1]]
-        if node0.op == 'Const':
-            knode = node0
-            input_name = node.input[1]
-        else:
-            knode = node1
-            input_name = node.input[0]
-
-        ktensor = knode.attr['value'].tensor
-        filter_height = ktensor.tensor_shape.dim[0].size
-        filter_width = ktensor.tensor_shape.dim[1].size
-        in_channels = ktensor.tensor_shape.dim[2].size
-        out_channels = ktensor.tensor_shape.dim[3].size
-        if filter_height * filter_width * in_channels * out_channels == 1:
-            kernel = np.float32(ktensor.float_val[0])
-        else:
-            kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(filter_height, filter_width, in_channels, out_channels)
-        kernel = np.transpose(kernel, [3, 0, 1, 2])
-
-        has_bias = 0
-        dilation = 1
-        padding = node.attr['padding'].s.decode("utf-8")
-        np.array([self.op2code[node.op], dilation, self.conv_paddings[padding], self.conv_activations['None'],
-                  in_channels, out_channels, filter_height, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_depth2space_to_file(self, node, f):
-        assert(node.op == 'DepthToSpace')
-        self.layer_number = self.layer_number + 1
-        block_size = node.attr['block_size'].i
-        np.array([self.op2code[node.op], block_size], dtype=np.uint32).tofile(f)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mirrorpad_to_file(self, node, f):
-        assert(node.op == 'MirrorPad')
-        self.layer_number = self.layer_number + 1
-        mode = node.attr['mode'].s
-        mode = self.mirrorpad_mode[mode.decode("utf-8")]
-        np.array([self.op2code[node.op], mode], dtype=np.uint32).tofile(f)
-        pnode = self.name_node_dict[node.input[1]]
-        self.converted_nodes.add(pnode.name)
-        paddings = pnode.attr['value'].tensor.tensor_content
-        f.write(paddings)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_maximum_to_file(self, node, f):
-        assert(node.op == 'Maximum')
-        self.layer_number = self.layer_number + 1
-        ynode = self.name_node_dict[node.input[1]]
-        y = ynode.attr['value'].tensor.float_val[0]
-        np.array([self.op2code[node.op]], dtype=np.uint32).tofile(f)
-        np.array([y], dtype=np.float32).tofile(f)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mathbinary_to_file(self, node, f):
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        i0_node = self.name_node_dict[node.input[0]]
-        i1_node = self.name_node_dict[node.input[1]]
-        np.array([self.op2code['MathBinary'], self.mathbin2code[node.op]], dtype=np.uint32).tofile(f)
-        if i0_node.op == 'Const':
-            scalar = i0_node.attr['value'].tensor.float_val[0]
-            np.array([1], dtype=np.uint32).tofile(f) # broadcast: 1
-            np.array([scalar], dtype=np.float32).tofile(f)
-            np.array([0], dtype=np.uint32).tofile(f) # broadcast: 0
-            input_operand_index = self.add_operand(i1_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        elif i1_node.op == 'Const':
-            scalar = i1_node.attr['value'].tensor.float_val[0]
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-            np.array([1], dtype=np.uint32).tofile(f)
-            np.array([scalar], dtype=np.float32).tofile(f)
-        else:
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i1_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mathunary_to_file(self, node, f):
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        i0_node = self.name_node_dict[node.input[0]]
-        np.array([self.op2code['MathUnary'], self.mathun2code[node.op]], dtype=np.uint32).tofile(f)
-        input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-        np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([output_operand_index],dtype=np.uint32).tofile(f)
-
-
-    def dump_avg_pool_to_file(self, node, f):
-        assert(node.op == 'AvgPool')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        node0 = self.name_node_dict[node.input[0]]
-        strides = node.attr['strides']
-
-        # Tensorflow do not support pooling strides in batch dimension and
-        # current native NN do not support pooling strides in channel dimension, added assert() here.
-        assert(strides.list.i[1]==strides.list.i[2])
-        assert(strides.list.i[0]==1)
-        assert(strides.list.i[3]==1)
-        strides = strides.list.i[1]
-        filter_node = node.attr['ksize']
-        input_name = node.input[0]
-
-        # Tensorflow do not support pooling ksize in batch dimension and channel dimension.
-        assert(filter_node.list.i[0]==1)
-        assert(filter_node.list.i[3]==1)
-        filter_height = filter_node.list.i[1]
-        filter_width = filter_node.list.i[2]
-
-        padding = node.attr['padding'].s.decode("utf-8")
-        np.array([self.op2code[node.op], strides, self.pool_paddings[padding], filter_height],
-                 dtype=np.uint32).tofile(f)
-
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index],dtype=np.uint32).tofile(f)
-
-
-    def dump_layers_to_file(self, f):
-        for node in self.nodes:
-            if node.name in self.converted_nodes:
-                continue
-
-            # conv2d with dilation generates very complex nodes, so handle it in special
-            if self.in_conv2d_scope(node.name):
-                if node.op == 'Conv2D':
-                    self.dump_complex_conv2d_to_file(node, f)
-                continue
-            if self.in_dense_scope(node.name):
-                if node.op == 'MatMul':
-                    self.dump_dense_to_file(node, f)
-                continue
-
-
-            if node.op == 'Conv2D':
-                self.dump_simple_conv2d_to_file(node, f)
-                continue
-            if node.name in self.output_names:
-                input_name = self.id_different_scope_dict[node.name]
-                if TFConverter.get_scope_name(input_name)!=TFConverter.get_scope_name(node.name):
-                    continue
-            if node.op == 'AvgPool':
-                self.dump_avg_pool_to_file(node, f)
-            elif node.op == 'DepthToSpace':
-                self.dump_depth2space_to_file(node, f)
-            elif node.op == 'MirrorPad':
-                self.dump_mirrorpad_to_file(node, f)
-            elif node.op == 'Maximum':
-                self.dump_maximum_to_file(node, f)
-            elif node.op in self.mathbin2code:
-                self.dump_mathbinary_to_file(node, f)
-            elif node.op in self.mathun2code:
-                self.dump_mathunary_to_file(node, f)
-
-
-    def dump_operands_to_file(self, f):
-        operands = sorted(self.name_operand_dict.values())
-        for operand in operands:
-            #print('{}'.format(operand))
-            np.array([operand.index, len(operand.name)], dtype=np.uint32).tofile(f)
-            f.write(operand.name.encode('utf-8'))
-            np.array([operand.iotype, operand.dtype], dtype=np.uint32).tofile(f)
-            np.array(operand.dims, dtype=np.uint32).tofile(f)
-
-
-    def dump_to_file(self):
-        with open(self.outfile, 'wb') as f:
-            f.write(header.str.encode('utf-8'))
-            np.array([header.major, header.minor], dtype=np.uint32).tofile(f)
-            self.dump_layers_to_file(f)
-            self.dump_operands_to_file(f)
-            np.array([self.layer_number, len(self.name_operand_dict)], dtype=np.uint32).tofile(f)
-
-
-    def generate_name_node_dict(self):
-        for node in self.nodes:
-            self.name_node_dict[node.name] = node
-
-
-    def generate_output_names(self):
-        used_names = []
-        for node in self.nodes:
-            for input in node.input:
-                used_names.append(input)
-
-        for node in self.nodes:
-            if node.name not in used_names:
-                self.output_names.append(node.name)
-
-
-    def remove_identity(self):
-        self.id_different_scope_dict = {}
-        id_nodes = []
-        id_dict = {}
-        for node in self.nodes:
-            if node.op == 'Identity':
-                name = node.name
-                input = node.input[0]
-                id_nodes.append(node)
-                # do not change the output name
-                if name in self.output_names:
-                    self.name_node_dict[input].name = name
-                    self.name_node_dict[name] = self.name_node_dict[input]
-                    del self.name_node_dict[input]
-                    self.id_different_scope_dict[name] = input
-                else:
-                    id_dict[name] = input
-
-        for idnode in id_nodes:
-            self.nodes.remove(idnode)
-
-        for node in self.nodes:
-            for i in range(len(node.input)):
-                input = node.input[i]
-                if input in id_dict:
-                    node.input[i] = id_dict[input]
-
-
-    def generate_edges(self):
-        for node in self.nodes:
-            for input in node.input:
-                if input in self.edges:
-                    self.edges[input].append(node)
-                else:
-                    self.edges[input] = [node]
-
-
-    @staticmethod
-    def get_scope_name(name):
-        index = name.rfind('/')
-        if index == -1:
-            return ""
-        return name[0:index]
-
-
-    def in_conv2d_scope(self, name):
-        inner_scope = TFConverter.get_scope_name(name)
-        if inner_scope == "":
-            return False;
-        for scope in self.conv2d_scope_names:
-            index = inner_scope.find(scope)
-            if index == 0:
-                return True
-        return False
-
- - def in_dense_scope(self, name): - inner_scope = TFConverter.get_scope_name(name) - if inner_scope == "": - return False; - for scope in self.dense_scope_names: - index = inner_scope.find(scope) - if index == 0: - return True - return False - - def generate_sub_block_op_scope_info(self): - # mostly, conv2d/dense is a sub block in graph, get the scope name - for node in self.nodes: - if node.op == 'Conv2D': - scope = TFConverter.get_scope_name(node.name) - # for the case tf.nn.conv2d is called directly - if scope == '': - continue - # for the case tf.nn.conv2d is called within a scope - if scope + '/kernel' not in self.name_node_dict: - continue - self.conv2d_scope_names.add(scope) - elif node.op == 'MatMul': - scope = TFConverter.get_scope_name(node.name) - # for the case tf.nn.dense is called directly - if scope == '': - continue - # for the case tf.nn.dense is called within a scope - if scope + '/kernel' not in self.name_node_dict and scope.split('/Tensordot')[0] + '/kernel' not in self.name_node_dict: - continue - self.dense_scope_names.add(scope.split('/Tensordot')[0]) - - # get the input name to the conv2d/dense sub block - for node in self.nodes: - scope = TFConverter.get_scope_name(node.name) - if scope in self.conv2d_scope_names: - if node.op == 'Conv2D' or node.op == 'Shape': - for inp in node.input: - if TFConverter.get_scope_name(inp) != scope: - self.conv2d_scopename_inputname_dict[scope] = inp - elif scope in self.dense_scope_names: - if node.op == 'MatMul' or node.op == 'Shape': - for inp in node.input: - if TFConverter.get_scope_name(inp) != scope: - self.dense_scopename_inputname_dict[scope] = inp - elif scope.split('/Tensordot')[0] in self.dense_scope_names: - if node.op == 'Transpose': - for inp in node.input: - if TFConverter.get_scope_name(inp).find(scope)<0 and TFConverter.get_scope_name(inp).find(scope.split('/')[0])<0: - self.dense_scopename_inputname_dict[scope.split('/Tensordot')[0]] = inp - - - def run(self): - 
self.generate_name_node_dict() - self.generate_output_names() - self.remove_identity() - self.generate_edges() - self.generate_sub_block_op_scope_info() - - if self.dump4tb: - self.dump_for_tensorboard() - - self.dump_to_file() - - -def convert_from_tensorflow(infile, outfile, dump4tb): - with open(infile, 'rb') as f: - # read the file in .proto format - graph_def = tf.GraphDef() - graph_def.ParseFromString(f.read()) - nodes = graph_def.node - - converter = TFConverter(graph_def, nodes, outfile, dump4tb) - converter.run() diff --git a/tools/python/convert_header.py b/tools/python/convert_header.py deleted file mode 100644 index 143f92c42e..0000000000 --- a/tools/python/convert_header.py +++ /dev/null @@ -1,26 +0,0 @@ -# Copyright (c) 2019 -# -# This file is part of FFmpeg. -# -# FFmpeg is free software; you can redistribute it and/or -# modify it under the terms of the GNU Lesser General Public -# License as published by the Free Software Foundation; either -# version 2.1 of the License, or (at your option) any later version. -# -# FFmpeg is distributed in the hope that it will be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -# Lesser General Public License for more details. -# -# You should have received a copy of the GNU Lesser General Public -# License along with FFmpeg; if not, write to the Free Software -# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA -# ============================================================================== - -str = 'FFMPEGDNNNATIVE' - -# increase major and reset minor when we have to re-convert the model file -major = 1 - -# increase minor when we don't have to re-convert the model file -minor = 23