[FFmpeg-devel,V6,2/3] lavfi/dnn: Modified DNN native backend related tools and docs.

Message ID	20230306135548.23001-2-ting.fu@intel.com
State	New
Headers
Delivered-To: ffmpegpatchwork2@gmail.com
Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100;
From: Ting Fu <ting.fu-at-intel.com@ffmpeg.org>
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 6 Mar 2023 21:55:47 +0800
Message-Id: <20230306135548.23001-2-ting.fu@intel.com>
In-Reply-To: <20230306135548.23001-1-ting.fu@intel.com>
References: <20230306135548.23001-1-ting.fu@intel.com>
Subject: [FFmpeg-devel] [PATCH V6 2/3] lavfi/dnn: Modified DNN native backend related tools and docs.
Precedence: list
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>
Series	[FFmpeg-devel,V6,1/3] lavfi/dnn: Mark native backend as unsupported
	[FFmpeg-devel,V6,2/3] lavfi/dnn: Modified DNN native backend related tools and docs.
	[FFmpeg-devel,V6,3/3] lavfi/dnn: Remove DNN native backend

Context	Check	Description
yinshiyou/make_loongarch64	success	Make finished
yinshiyou/make_fate_loongarch64	success	Make fate finished
andriy/make_x86	success	Make finished
andriy/make_fate_x86	success	Make fate finished

diff --git a/doc/filters.texi b/doc/filters.texi
index 7a7b2ba4e7..726d2fd7e2 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -11305,9 +11305,6 @@ See @url{http://openaccess.thecvf.com/content_ECCV_2018/papers/Xia_Li_Recurrent_
 Training as well as model generation scripts are provided in
 the repository at @url{https://github.com/XueweiMeng/derain_filter.git}.
 
-Native model files (.model) can be generated from TensorFlow model
-files (.pb) by using tools/python/convert.py
-
 The filter accepts the following options:
 
 @table @option
@@ -11328,21 +11325,16 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
 @url{https://www.tensorflow.org/install/lang_c}) and configure FFmpeg with
 @code{--enable-libtensorflow}
 @end table
-Default value is @samp{native}.
 
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow and native
-backend can load files for only its format.
+Note that different backends use different file formats. TensorFlow can load files for only its format.
 @end table
 
 To get full functionality (such as async execution), please use the @ref{dnn_processing} filter.
@@ -11666,9 +11658,6 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
@@ -11684,14 +11673,9 @@ be needed if the header files and libraries are not installed into system path)
 
 @end table
 
-Default value is @samp{native}.
-
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow, OpenVINO and native
-backend can load files for only its format.
-
-Native model file (.model) can be generated from TensorFlow model file (.pb) by using tools/python/convert.py
+Note that different backends use different file formats. TensorFlow, OpenVINO backend can load files for only its format.
 
 @item input
 Set the input name of the dnn network.
@@ -11717,12 +11701,6 @@ Remove rain in rgb24 frame with can.pb (see @ref{derain} filter):
 ./ffmpeg -i rain.jpg -vf format=rgb24,dnn_processing=dnn_backend=tensorflow:model=can.pb:input=x:output=y derain.jpg
 @end example
 
-@item
-Halve the pixel value of the frame with format gray32f:
-@example
-ffmpeg -i input.jpg -vf format=grayf32,dnn_processing=model=halve_gray_float.model:input=dnn_in:output=dnn_out:dnn_backend=native -y out.native.png
-@end example
-
 @item
 Handle the Y channel with srcnn.pb (see @ref{sr} filter) for frame with yuv420p (planar YUV formats supported):
 @example
@@ -21750,9 +21728,6 @@ Training scripts as well as scripts for model file (.pb) saving can be found at
 @url{https://github.com/XueweiMeng/sr/tree/sr_dnn_native}. Original repository
 is at @url{https://github.com/HighVoltageRocknRoll/sr.git}.
 
-Native model files (.model) can be generated from TensorFlow model
-files (.pb) by using tools/python/convert.py
-
 The filter accepts the following options:
 
 @table @option
@@ -21761,9 +21736,6 @@ Specify which DNN backend to use for model loading and execution. This option ac
 the following values:
 
 @table @samp
-@item native
-Native implementation of DNN loading and execution.
-
 @item tensorflow
 TensorFlow backend. To enable this backend you
 need to install the TensorFlow for C library (see
@@ -21771,13 +21743,10 @@ need to install the TensorFlow for C library (see
 @code{--enable-libtensorflow}
 @end table
 
-Default value is @samp{native}.
-
 @item model
 Set path to model file specifying network architecture and its parameters.
-Note that different backends use different file formats. TensorFlow backend
-can load files for both formats, while native backend can load files for only
-its format.
+Note that different backends use different file formats. TensorFlow, OpenVINO backend
+can load files for only its format.
 
 @item scale_factor
 Set scale factor for SRCNN model. Allowed values are @code{2}, @code{3} and @code{4}.
diff --git a/libavfilter/vf_derain.c b/libavfilter/vf_derain.c
index 86e9eb8752..7e84cd65a3 100644
--- a/libavfilter/vf_derain.c
+++ b/libavfilter/vf_derain.c
@@ -43,7 +43,7 @@ static const AVOption derain_options[] = {
     { "filter_type", "filter type(derain/dehaze)", OFFSET(filter_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "type" },
     { "derain", "derain filter flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "type" },
     { "dehaze", "dehaze filter flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "type" },
-    { "dnn_backend", "DNN backend", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
index 4462915073..28937346b5 100644
--- a/libavfilter/vf_dnn_processing.c
+++ b/libavfilter/vf_dnn_processing.c
@@ -45,7 +45,7 @@ typedef struct DnnProcessingContext {
 #define OFFSET(x) offsetof(DnnProcessingContext, dnnctx.x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption dnn_processing_options[] = {
-    { "dnn_backend", "DNN backend", OFFSET(backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, INT_MIN, INT_MAX, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend", OFFSET(backend_type), AV_OPT_TYPE_INT, { .i64 = 2 }, INT_MIN, INT_MAX, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/libavfilter/vf_sr.c b/libavfilter/vf_sr.c
index cb24c096ce..e9fe746bae 100644
--- a/libavfilter/vf_sr.c
+++ b/libavfilter/vf_sr.c
@@ -46,7 +46,7 @@ typedef struct SRContext {
 #define OFFSET(x) offsetof(SRContext, x)
 #define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
 static const AVOption sr_options[] = {
-    { "dnn_backend", "DNN backend used for model execution", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS, "backend" },
+    { "dnn_backend", "DNN backend used for model execution", OFFSET(dnnctx.backend_type), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, FLAGS, "backend" },
     { "native", "native backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, FLAGS, "backend" },
 #if (CONFIG_LIBTENSORFLOW == 1)
     { "tensorflow", "tensorflow backend flag", 0, AV_OPT_TYPE_CONST, { .i64 = 1 }, 0, 0, FLAGS, "backend" },
diff --git a/tools/python/convert.py b/tools/python/convert.py
deleted file mode 100644
index 64cf76b2d8..0000000000
--- a/tools/python/convert.py
+++ /dev/null
@@ -1,56 +0,0 @@
-# Copyright (c) 2019 Guo Yejun
-#
-# This file is part of FFmpeg.
-#
-# FFmpeg is free software; you can redistribute it and/or
-# modify it under the terms of the GNU Lesser General Public
-# License as published by the Free Software Foundation; either
-# version 2.1 of the License, or (at your option) any later version.
-#
-# FFmpeg is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-# Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with FFmpeg; if not, write to the Free Software
-# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
-# ==============================================================================
-
-# verified with Python 3.5.2 on Ubuntu 16.04
-import argparse
-import os
-from convert_from_tensorflow import *
-
-def get_arguments():
-    parser = argparse.ArgumentParser(description='generate native mode model with weights from deep learning model')
-    parser.add_argument('--outdir', type=str, default='./', help='where to put generated files')
-    parser.add_argument('--infmt', type=str, default='tensorflow', help='format of the deep learning model')
-    parser.add_argument('infile', help='path to the deep learning model with weights')
-    parser.add_argument('--dump4tb', type=str, default='no', help='dump file for visualization in tensorboard')
-
-    return parser.parse_args()
-
-def main():
-    args = get_arguments()
-
-    if not os.path.isfile(args.infile):
-        print('the specified input file %s does not exist' % args.infile)
-        exit(1)
-
-    if not os.path.exists(args.outdir):
-        print('create output directory %s' % args.outdir)
-        os.mkdir(args.outdir)
-
-    basefile = os.path.split(args.infile)[1]
-    basefile = os.path.splitext(basefile)[0]
-    outfile = os.path.join(args.outdir, basefile) + '.model'
-    dump4tb = False
-    if args.dump4tb.lower() in ('yes', 'true', 't', 'y', '1'):
-        dump4tb = True
-
-    if args.infmt == 'tensorflow':
-        convert_from_tensorflow(args.infile, outfile, dump4tb)
-
-if __name__ == '__main__':
-    main()
diff --git a/tools/python/convert_from_tensorflow.py b/tools/python/convert_from_tensorflow.py
deleted file mode 100644
index 38e64c1c94..0000000000
--- a/tools/python/convert_from_tensorflow.py
+++ /dev/null
@@ -1,607 +0,0 @@
-# Copyright (c) 2019 Guo Yejun
-#
-# This file is part of FFmpeg.
-#
-# FFmpeg is free software; you can redistribute it and/or
-# modify it under the terms of the GNU Lesser General Public
-# License as published by the Free Software Foundation; either
-# version 2.1 of the License, or (at your option) any later version.
-#
-# FFmpeg is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-# Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with FFmpeg; if not, write to the Free Software
-# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
-# ==============================================================================
-
-import tensorflow as tf
-import numpy as np
-import sys, struct
-import convert_header as header
-
-__all__ = ['convert_from_tensorflow']
-
-class Operand(object):
-    IOTYPE_INPUT = 1
-    IOTYPE_OUTPUT = 2
-    IOTYPE_INTERMEDIATE = IOTYPE_INPUT | IOTYPE_OUTPUT
-    DTYPE_FLOAT = 1
-    DTYPE_UINT8 = 4
-    index = 0
-    def __init__(self, name, dtype, dims):
-        self.name = name
-        self.dtype = dtype
-        self.dims = dims
-        self.iotype = 0
-        self.used_count = 0
-        self.index = Operand.index
-        Operand.index = Operand.index + 1
-        self.iotype2str = {Operand.IOTYPE_INPUT: 'in', Operand.IOTYPE_OUTPUT: 'out', Operand.IOTYPE_INTERMEDIATE: 'inout'}
-        self.dtype2str = {Operand.DTYPE_FLOAT: 'DT_FLOAT', Operand.DTYPE_UINT8: 'DT_UINT8'}
-
-    def add_iotype(self, iotype):
-        self.iotype = self.iotype | iotype
-        if iotype == Operand.IOTYPE_INPUT:
-            self.used_count = self.used_count + 1
-
-    def __str__(self):
-        return "{}: (name: {}, iotype: {}, dtype: {}, dims: {}, used_count: {})".format(self.index,
-                            self.name, self.iotype2str[self.iotype], self.dtype2str[self.dtype],
-                            self.dims, self.used_count)
-
-    def __lt__(self, other):
-        return self.index < other.index
-
-class TFConverter:
-    def __init__(self, graph_def, nodes, outfile, dump4tb):
-        self.graph_def = graph_def
-        self.nodes = nodes
-        self.outfile = outfile
-        self.dump4tb = dump4tb
-        self.layer_number = 0
-        self.output_names = []
-        self.name_node_dict = {}
-        self.edges = {}
-        self.conv_activations = {'Relu':0, 'Tanh':1, 'Sigmoid':2, 'None':3, 'LeakyRelu':4}
-        self.conv_paddings = {'VALID':0, 'SAME':1}
-        self.pool_paddings = {'VALID':0, 'SAME':1}
-        self.converted_nodes = set()
-        self.conv2d_scope_names = set()
-        self.conv2d_scopename_inputname_dict = {}
-        self.dense_scope_names = set()
-        self.dense_scopename_inputname_dict = {}
-        self.op2code = {'Conv2D':1, 'DepthToSpace':2, 'MirrorPad':3, 'Maximum':4,
-                        'MathBinary':5, 'MathUnary':6, 'AvgPool':7, 'MatMul':8}
-        self.mathbin2code = {'Sub':0, 'Add':1, 'Mul':2, 'RealDiv':3, 'Minimum':4, 'FloorMod':5}
-        self.mathun2code = {'Abs':0, 'Sin':1, 'Cos':2, 'Tan':3, 'Asin':4,
-                'Acos':5, 'Atan':6, 'Sinh':7, 'Cosh':8, 'Tanh':9, 'Asinh':10,
-                'Acosh':11, 'Atanh':12, 'Ceil':13, 'Floor':14, 'Round':15,
-                'Exp':16}
-        self.mirrorpad_mode = {'CONSTANT':0, 'REFLECT':1, 'SYMMETRIC':2}
-        self.name_operand_dict = {}
-
-
-    def add_operand(self, name, type):
-        node = self.name_node_dict[name]
-        if name not in self.name_operand_dict:
-            dtype = node.attr['dtype'].type
-            if dtype == 0:
-                dtype = node.attr['T'].type
-            dims = [-1,-1,-1,-1]
-            if 'shape' in node.attr:
-                dims[0] = node.attr['shape'].shape.dim[0].size
-                dims[1] = node.attr['shape'].shape.dim[1].size
-                dims[2] = node.attr['shape'].shape.dim[2].size
-                dims[3] = node.attr['shape'].shape.dim[3].size
-            operand = Operand(name, dtype, dims)
-            self.name_operand_dict[name] = operand;
-        self.name_operand_dict[name].add_iotype(type)
-        return self.name_operand_dict[name].index
-
-
-    def dump_for_tensorboard(self):
-        graph = tf.get_default_graph()
-        tf.import_graph_def(self.graph_def, name="")
-        tf.summary.FileWriter('/tmp/graph', graph)
-        print('graph saved, run "tensorboard --logdir=/tmp/graph" to see it')
-
-
-    def get_conv2d_params(self, conv2d_scope_name):
-        knode = self.name_node_dict[conv2d_scope_name + '/kernel']
-        bnode = self.name_node_dict[conv2d_scope_name + '/bias']
-
-        if conv2d_scope_name + '/dilation_rate' in self.name_node_dict:
-            dnode = self.name_node_dict[conv2d_scope_name + '/dilation_rate']
-        else:
-            dnode = None
-
-        # the BiasAdd name is possible be changed into the output name,
-        # if activation is None, and BiasAdd.next is the last op which is Identity
-        if conv2d_scope_name + '/BiasAdd' in self.edges:
-            anode = self.edges[conv2d_scope_name + '/BiasAdd'][0]
-            if anode.op not in self.conv_activations:
-                anode = None
-        else:
-            anode = None
-        return knode, bnode, dnode, anode
-
-
-    def get_dense_params(self, dense_scope_name):
-        knode = self.name_node_dict[dense_scope_name + '/kernel']
-        bnode = self.name_node_dict.get(dense_scope_name + '/bias')
-        # the BiasAdd name is possible be changed into the output name,
-        # if activation is None, and BiasAdd.next is the last op which is Identity
-        anode = None
-        if bnode:
-            if dense_scope_name + '/BiasAdd' in self.edges:
-                anode = self.edges[dense_scope_name + '/BiasAdd'][0]
-                if anode.op not in self.conv_activations:
-                    anode = None
-        else:
-            anode = None
-        return knode, bnode, anode
-
-
-    def dump_complex_conv2d_to_file(self, node, f):
-        assert(node.op == 'Conv2D')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        scope_name = TFConverter.get_scope_name(node.name)
-        #knode for kernel, bnode for bias, dnode for dilation, anode for activation
-        knode, bnode, dnode, anode = self.get_conv2d_params(scope_name)
-
-        if dnode is not None:
-            dilation = struct.unpack('i', dnode.attr['value'].tensor.tensor_content[0:4])[0]
-        else:
-            dilation = 1
-
-        if anode is not None:
-            activation = anode.op
-        else:
-            activation = 'None'
-
-        padding = node.attr['padding'].s.decode("utf-8")
-        # conv2d with dilation > 1 generates tens of nodes, not easy to parse them, so use this tricky method.
-        if dilation > 1 and scope_name + '/stack' in self.name_node_dict:
-            if self.name_node_dict[scope_name + '/stack'].op == "Const":
-                padding = 'SAME'
-        padding = self.conv_paddings[padding]
-
-        ktensor = knode.attr['value'].tensor
-        filter_height = ktensor.tensor_shape.dim[0].size
-        filter_width = ktensor.tensor_shape.dim[1].size
-        in_channels = ktensor.tensor_shape.dim[2].size
-        out_channels = ktensor.tensor_shape.dim[3].size
-        kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(filter_height, filter_width, in_channels, out_channels)
-        kernel = np.transpose(kernel, [3, 0, 1, 2])
-
-        has_bias = 1
-        np.array([self.op2code[node.op], dilation, padding, self.conv_activations[activation], in_channels, out_channels, filter_height, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-
-        btensor = bnode.attr['value'].tensor
-        if btensor.tensor_shape.dim[0].size == 1:
-            bias = struct.pack("f", btensor.float_val[0])
-        else:
-            bias = btensor.tensor_content
-        f.write(bias)
-
-        input_name = self.conv2d_scopename_inputname_dict[scope_name]
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-
-        if anode is not None:
-            output_operand_index = self.add_operand(anode.name, Operand.IOTYPE_OUTPUT)
-        else:
-            output_operand_index = self.add_operand(self.edges[bnode.name][0].name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-    def dump_dense_to_file(self, node, f):
-        assert(node.op == 'MatMul')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        scope_name = TFConverter.get_scope_name(node.name)
-        #knode for kernel, bnode for bias, anode for activation
-        knode, bnode, anode = self.get_dense_params(scope_name.split('/')[0])
-
-        if bnode is not None:
-            has_bias = 1
-            btensor = bnode.attr['value'].tensor
-            if btensor.tensor_shape.dim[0].size == 1:
-                bias = struct.pack("f", btensor.float_val[0])
-            else:
-                bias = btensor.tensor_content
-        else:
-            has_bias = 0
-
-        if anode is not None:
-            activation = anode.op
-        else:
-            activation = 'None'
-
-        ktensor = knode.attr['value'].tensor
-        in_channels = ktensor.tensor_shape.dim[0].size
-        out_channels = ktensor.tensor_shape.dim[1].size
-        if in_channels * out_channels == 1:
-            kernel = np.float32(ktensor.float_val[0])
-        else:
-            kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(in_channels, out_channels)
-        kernel = np.transpose(kernel, [1, 0])
-
-        np.array([self.op2code[node.op], self.conv_activations[activation], in_channels, out_channels, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-        if has_bias:
-            f.write(bias)
-
-        input_name = self.dense_scopename_inputname_dict[scope_name.split('/')[0]]
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-
-        if anode is not None:
-            output_operand_index = self.add_operand(anode.name, Operand.IOTYPE_OUTPUT)
-        else:
-            if bnode is not None:
-                output_operand_index = self.add_operand(self.edges[bnode.name][0].name, Operand.IOTYPE_OUTPUT)
-            else:
-                output_operand_index = self.add_operand(self.edges[scope_name+'/concat_1'][0].name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_simple_conv2d_to_file(self, node, f):
-        assert(node.op == 'Conv2D')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-
-        node0 = self.name_node_dict[node.input[0]]
-        node1 = self.name_node_dict[node.input[1]]
-        if node0.op == 'Const':
-            knode = node0
-            input_name = node.input[1]
-        else:
-            knode = node1
-            input_name = node.input[0]
-
-        ktensor = knode.attr['value'].tensor
-        filter_height = ktensor.tensor_shape.dim[0].size
-        filter_width = ktensor.tensor_shape.dim[1].size
-        in_channels = ktensor.tensor_shape.dim[2].size
-        out_channels = ktensor.tensor_shape.dim[3].size
-        if filter_height * filter_width * in_channels * out_channels == 1:
-            kernel = np.float32(ktensor.float_val[0])
-        else:
-            kernel = np.frombuffer(ktensor.tensor_content, dtype=np.float32)
-        kernel = kernel.reshape(filter_height, filter_width, in_channels, out_channels)
-        kernel = np.transpose(kernel, [3, 0, 1, 2])
-
-        has_bias = 0
-        dilation = 1
-        padding = node.attr['padding'].s.decode("utf-8")
-        np.array([self.op2code[node.op], dilation, self.conv_paddings[padding], self.conv_activations['None'],
-                  in_channels, out_channels, filter_height, has_bias], dtype=np.uint32).tofile(f)
-        kernel.tofile(f)
-
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_depth2space_to_file(self, node, f):
-        assert(node.op == 'DepthToSpace')
-        self.layer_number = self.layer_number + 1
-        block_size = node.attr['block_size'].i
-        np.array([self.op2code[node.op], block_size], dtype=np.uint32).tofile(f)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mirrorpad_to_file(self, node, f):
-        assert(node.op == 'MirrorPad')
-        self.layer_number = self.layer_number + 1
-        mode = node.attr['mode'].s
-        mode = self.mirrorpad_mode[mode.decode("utf-8")]
-        np.array([self.op2code[node.op], mode], dtype=np.uint32).tofile(f)
-        pnode = self.name_node_dict[node.input[1]]
-        self.converted_nodes.add(pnode.name)
-        paddings = pnode.attr['value'].tensor.tensor_content
-        f.write(paddings)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_maximum_to_file(self, node, f):
-        assert(node.op == 'Maximum')
-        self.layer_number = self.layer_number + 1
-        ynode = self.name_node_dict[node.input[1]]
-        y = ynode.attr['value'].tensor.float_val[0]
-        np.array([self.op2code[node.op]], dtype=np.uint32).tofile(f)
-        np.array([y], dtype=np.float32).tofile(f)
-        self.converted_nodes.add(node.name)
-        input_operand_index = self.add_operand(node.input[0], Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mathbinary_to_file(self, node, f):
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        i0_node = self.name_node_dict[node.input[0]]
-        i1_node = self.name_node_dict[node.input[1]]
-        np.array([self.op2code['MathBinary'], self.mathbin2code[node.op]], dtype=np.uint32).tofile(f)
-        if i0_node.op == 'Const':
-            scalar = i0_node.attr['value'].tensor.float_val[0]
-            np.array([1], dtype=np.uint32).tofile(f)  # broadcast: 1
-            np.array([scalar], dtype=np.float32).tofile(f)
-            np.array([0], dtype=np.uint32).tofile(f)  # broadcast: 0
-            input_operand_index = self.add_operand(i1_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        elif i1_node.op == 'Const':
-            scalar = i1_node.attr['value'].tensor.float_val[0]
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-            np.array([1], dtype=np.uint32).tofile(f)
-            np.array([scalar], dtype=np.float32).tofile(f)
-        else:
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-            np.array([0], dtype=np.uint32).tofile(f)
-            input_operand_index = self.add_operand(i1_node.name, Operand.IOTYPE_INPUT)
-            np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([output_operand_index], dtype=np.uint32).tofile(f)
-
-
-    def dump_mathunary_to_file(self, node, f):
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        i0_node = self.name_node_dict[node.input[0]]
-        np.array([self.op2code['MathUnary'], self.mathun2code[node.op]], dtype=np.uint32).tofile(f)
-        input_operand_index = self.add_operand(i0_node.name, Operand.IOTYPE_INPUT)
-        np.array([input_operand_index], dtype=np.uint32).tofile(f)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([output_operand_index],dtype=np.uint32).tofile(f)
-
-
-    def dump_avg_pool_to_file(self, node, f):
-        assert(node.op == 'AvgPool')
-        self.layer_number = self.layer_number + 1
-        self.converted_nodes.add(node.name)
-        node0 = self.name_node_dict[node.input[0]]
-        strides = node.attr['strides']
-
-        # Tensorflow do not support pooling strides in batch dimension and
-        # current native NN do not support pooling strides in channel dimension, added assert() here.
-        assert(strides.list.i[1]==strides.list.i[2])
-        assert(strides.list.i[0]==1)
-        assert(strides.list.i[3]==1)
-        strides = strides.list.i[1]
-        filter_node = node.attr['ksize']
-        input_name = node.input[0]
-
-        # Tensorflow do not support pooling ksize in batch dimension and channel dimension.
-        assert(filter_node.list.i[0]==1)
-        assert(filter_node.list.i[3]==1)
-        filter_height = filter_node.list.i[1]
-        filter_width = filter_node.list.i[2]
-
-        padding = node.attr['padding'].s.decode("utf-8")
-        np.array([self.op2code[node.op], strides, self.pool_paddings[padding], filter_height],
-                 dtype=np.uint32).tofile(f)
-
-        input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT)
-        output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT)
-        np.array([input_operand_index, output_operand_index],dtype=np.uint32).tofile(f)
-
-
-    def dump_layers_to_file(self, f):
-        for node in self.nodes:
-            if node.name in self.converted_nodes:
-                continue
-
-            # conv2d with dilation generates very complex nodes, so handle it in special
-            if self.in_conv2d_scope(node.name):
-                if node.op == 'Conv2D':
-                    self.dump_complex_conv2d_to_file(node, f)
-                continue
-            if self.in_dense_scope(node.name):
-                if node.op == 'MatMul':
-                    self.dump_dense_to_file(node, f)
-                continue
-
-
-            if node.op == 'Conv2D':
-                self.dump_simple_conv2d_to_file(node, f)
-                continue
-            if node.name in self.output_names:
-                input_name = self.id_different_scope_dict[node.name]
-                if TFConverter.get_scope_name(input_name)!=TFConverter.get_scope_name(node.name):
-                    continue
-            if node.op == 'AvgPool':
-                self.dump_avg_pool_to_file(node, f)
-            elif node.op == 'DepthToSpace':
-                self.dump_depth2space_to_file(node, f)
-            elif node.op == 'MirrorPad':
-                self.dump_mirrorpad_to_file(node, f)
-            elif node.op == 'Maximum':
-                self.dump_maximum_to_file(node, f)
-            elif node.op in self.mathbin2code:
-                self.dump_mathbinary_to_file(node, f)
-            elif node.op in self.mathun2code:
-                self.dump_mathunary_to_file(node, f)
-
-
-    def dump_operands_to_file(self, f):
-        operands = sorted(self.name_operand_dict.values())
-        for operand in operands:
-            #print('{}'.format(operand))
-            np.array([operand.index, len(operand.name)], dtype=np.uint32).tofile(f)
-            f.write(operand.name.encode('utf-8'))
-            np.array([operand.iotype, operand.dtype], dtype=np.uint32).tofile(f)
-            np.array(operand.dims, dtype=np.uint32).tofile(f)
-
-
-    def dump_to_file(self):
-        with open(self.outfile, 'wb') as f:
-            f.write(header.str.encode('utf-8'))
-            np.array([header.major, header.minor], dtype=np.uint32).tofile(f)
-            self.dump_layers_to_file(f)
-            self.dump_operands_to_file(f)
-            np.array([self.layer_number, len(self.name_operand_dict)], dtype=np.uint32).tofile(f)
-
-
-    def generate_name_node_dict(self):
-        for node in self.nodes:
-            self.name_node_dict[node.name] = node
-
-
-    def generate_output_names(self):
-        used_names = []
-        for node in self.nodes:
-            for input in node.input:
-                used_names.append(input)
-
-        for node in self.nodes:
-            if node.name not in used_names:
-                self.output_names.append(node.name)
-
-
-    def remove_identity(self):
-        self.id_different_scope_dict = {}
-        id_nodes = []
-        id_dict = {}
-        for node in self.nodes:
-            if node.op == 'Identity':
-                name = node.name
-                input = node.input[0]
-                id_nodes.append(node)
-                # do not change the output name
-                if name in self.output_names:
-                    self.name_node_dict[input].name = name
-                    self.name_node_dict[name] = self.name_node_dict[input]
-                    del self.name_node_dict[input]
-                    self.id_different_scope_dict[name] = input
-                else:
-                    id_dict[name] = input
-
-        for idnode in id_nodes:
-            self.nodes.remove(idnode)
-
-        for node in self.nodes:
-            for i in range(len(node.input)):
-                input = node.input[i]
-                if input in id_dict:
-                    node.input[i] = id_dict[input]
-
-
-    def generate_edges(self):
-        for node in self.nodes:
-            for input in node.input:
-                if input in self.edges:
-                    self.edges[input].append(node)
-                else:
-                    self.edges[input] = [node]
-
-
-    @staticmethod
-    def get_scope_name(name):
-        index = name.rfind('/')
-        if index == -1:
-            return ""
-        return name[0:index]
-
-
-    def in_conv2d_scope(self, name):
-        inner_scope = TFConverter.get_scope_name(name)
-        if inner_scope == "":
-            return False;
-        for scope in self.conv2d_scope_names:
-            index = inner_scope.find(scope)
-            if index == 0:
-                return True
-        return False
-
-
-    def in_dense_scope(self, name):
-        inner_scope = TFConverter.get_scope_name(name)
-        if inner_scope == "":
-            return False;
-        for scope in self.dense_scope_names:
-            index = inner_scope.find(scope)
-            if index == 0:
-                return True
-        return False
-
-    def generate_sub_block_op_scope_info(self):
-        # mostly, conv2d/dense is a sub block in graph, get the scope name
-        for node in self.nodes:
-            if node.op == 'Conv2D':
-                scope = TFConverter.get_scope_name(node.name)
-                # for the case tf.nn.conv2d is called directly
-                if scope == '':
-                    continue
-                # for the case tf.nn.conv2d is called within a scope
-                if scope + '/kernel' not in self.name_node_dict:
-                    continue
-                self.conv2d_scope_names.add(scope)
-            elif node.op == 'MatMul':
-                scope = TFConverter.get_scope_name(node.name)
-                # for the case tf.nn.dense is called directly
-                if scope == '':
-                    continue
-                # for the case tf.nn.dense is called within a scope
-                if scope + '/kernel' not in self.name_node_dict and scope.split('/Tensordot')[0] + '/kernel' not in self.name_node_dict:
-                    continue
-                self.dense_scope_names.add(scope.split('/Tensordot')[0])
-
-        # get the input name to the conv2d/dense sub block
-        for node in self.nodes:
-            scope = TFConverter.get_scope_name(node.name)
-            if scope in self.conv2d_scope_names:
-                if node.op == 'Conv2D' or node.op == 'Shape':
-                    for inp in node.input:
-                        if TFConverter.get_scope_name(inp) != scope:
-                            self.conv2d_scopename_inputname_dict[scope] = inp
-            elif scope in self.dense_scope_names:
-                if node.op == 'MatMul' or node.op == 'Shape':
-                    for inp in node.input:
-                        if TFConverter.get_scope_name(inp) != scope:
-                            self.dense_scopename_inputname_dict[scope] = inp
-            elif scope.split('/Tensordot')[0] in self.dense_scope_names:
-                if node.op == 'Transpose':
-                    for inp in node.input:
-                        if TFConverter.get_scope_name(inp).find(scope)<0 and TFConverter.get_scope_name(inp).find(scope.split('/')[0])<0:
-                            self.dense_scopename_inputname_dict[scope.split('/Tensordot')[0]] = inp
-
-
-    def run(self):
-        self.generate_name_node_dict()
-        self.generate_output_names()
-        self.remove_identity()
-        self.generate_edges()
-        self.generate_sub_block_op_scope_info()
-
-        if self.dump4tb:
-            self.dump_for_tensorboard()
-
-        self.dump_to_file()
-
-
-def convert_from_tensorflow(infile, outfile, dump4tb):
-    with open(infile, 'rb') as f:
-        # read the file in .proto format
-        graph_def = tf.GraphDef()
-        graph_def.ParseFromString(f.read())
-        nodes = graph_def.node
-
-    converter = TFConverter(graph_def, nodes, outfile, dump4tb)
-    converter.run()
diff --git a/tools/python/convert_header.py b/tools/python/convert_header.py
deleted file mode 100644
index 143f92c42e..0000000000
--- a/tools/python/convert_header.py
+++ /dev/null
@@ -1,26 +0,0 @@
-# Copyright (c) 2019
-#
-# This file is part of FFmpeg.
-#
-# FFmpeg is free software; you can redistribute it and/or
-# modify it under the terms of the GNU Lesser General Public
-# License as published by the Free Software Foundation; either
-# version 2.1 of the License, or (at your option) any later version.
-#
-# FFmpeg is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-# Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with FFmpeg; if not, write to the Free Software
-# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
-# ==============================================================================
-
-str = 'FFMPEGDNNNATIVE'
-
-# increase major and reset minor when we have to re-convert the model file
-major = 1
-
-# increase minor when we don't have to re-convert the model file
-minor = 23
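
Note for readers who still have native .model files around: the removed dump_to_file() above writes the magic string from convert_header.py, then two uint32 values (major, minor), then the layers and operands, and finally two uint32 values (layer count, operand count). The following is a minimal sketch, not part of the patch, of how one might check whether a file is a legacy native model and which converter version produced it. The function names are hypothetical, and little-endian byte order is an assumption (numpy's tofile() writes in native byte order, which is little-endian on x86).

```python
import struct

# Magic string and version taken from the deleted convert_header.py.
MAGIC = b"FFMPEGDNNNATIVE"


def read_native_model_summary(data: bytes) -> dict:
    """Return version and trailer counts of a legacy native .model blob.

    Hypothetical helper: assumes little-endian uint32 fields, mirroring
    the order used by the removed TFConverter.dump_to_file().
    """
    if not data.startswith(MAGIC):
        raise ValueError("not a native DNN model: bad magic")
    # Need at least magic + (major, minor) + (layers, operands).
    if len(data) < len(MAGIC) + 16:
        raise ValueError("native DNN model is truncated")
    major, minor = struct.unpack_from("<II", data, len(MAGIC))
    # The trailer is the last 8 bytes of the file.
    layers, operands = struct.unpack_from("<II", data, len(data) - 8)
    return {"major": major, "minor": minor,
            "layers": layers, "operands": operands}


def build_empty_model(major: int = 1, minor: int = 23) -> bytes:
    """Build a layer-less blob with the same header/trailer layout (for testing)."""
    return MAGIC + struct.pack("<II", major, minor) + struct.pack("<II", 0, 0)


if __name__ == "__main__":
    print(read_native_model_summary(build_empty_model()))
```

With something like this, a script can flag leftover native models before moving to a build that no longer ships the native backend; such files cannot be loaded by the tensorflow or openvino backends.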
