
[FFmpeg-devel,2/2] convert_from_tensorflow.py: support conv2d with dilation

Message ID 1564449978-4370-1-git-send-email-yejun.guo@intel.com
State Accepted
Commit ddd92ba2c6c6d0b5f3d5b4496ab07fbcf73b58a2
Headers show

Commit Message

Guo, Yejun July 30, 2019, 1:26 a.m. UTC
conv2d with dilation > 1 generates tens of nodes in the graph; it is not
easy to parse each node one by one, so we use special tricks to parse
the conv2d layer.

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
---
 tools/python/convert_from_tensorflow.py | 80 ++++++++++++++++++++++++---------
 1 file changed, 59 insertions(+), 21 deletions(-)
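
As context for the "special tricks": TensorFlow stores a conv2d layer's dilation_rate as a Const node whose raw tensor_content is packed native int32s, which is why the patch unpacks the first 4 bytes. A minimal sketch (the byte string here is an illustrative stand-in for dnode.attr['value'].tensor.tensor_content, not data from a real graph):

```python
import struct

# Stand-in for the Const node's raw bytes: a dilation_rate of (2, 2)
# serialized as two native int32 values.
tensor_content = struct.pack('ii', 2, 2)

# The patch reads only the first 4 bytes to recover the dilation factor.
dilation = struct.unpack('i', tensor_content[0:4])[0]
print(dilation)  # 2
```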

Comments

Guo, Yejun Aug. 9, 2019, 3:25 p.m. UTC | #1
This patch set is awaiting review, thanks.

I've locally finished more patches to improve the dnn module, and plan to send them out set by set, since the patches have dependencies.

Just in case you are interested in these new patches, I've uploaded them to https://github.com/guoyejun/ffmpeg/tree/dnn0809.
For your convenience, I also copy the one-line log for each patch here (from newest to oldest), grouped into 4 patch sets.

7eced90 libavfilter/dnn: support multiple outputs for native mode
28a7054 libavfilter/dnn/dnn_backend_native: find the input operand according to input name

256e657 FATE/dnn: add unit test for layer maximum
8c616a0 libavfilter/dnn: add layer maximum for native mode.

8ec6c0c FATE/dnn: add unit test for dnn depth_to_space layer
09ef108 libavfilter/dnn: separate depth_to_space layer from dnn_backend_native.c to a new file
c65b59d FATE/dnn: add unit test for dnn conv2d layer
a5d69a7 libavfilter/dnn: separate conv2d layer from dnn_backend_native.c to a new file

202d323 dnn: export operand info in python script and load in c code
3c706a0 dnn: change .model file format to put layer number at the end of file
0256731 dnn: introduce dnn operand (in c code) to hold operand infos within network


Besides continuous dnn improvements, I also plan to add two generic video filters for dnn.
- a generic filter to process the content of an AVFrame with different dnn networks.
The current specific filters such as vf_sr (with some changes needed) and vf_derain would then no longer be needed, since they can be covered by this generic filter. Of course, in practice I will not remove them.

- a generic filter to analyze the content of an AVFrame and generate side data with different dnn networks; the content of the AVFrame does not change.
The application, which invokes the filter with a given dnn network, is responsible for parsing the side data (the analysis result).
Pedro Arthur Aug. 13, 2019, 4:09 p.m. UTC | #2
LGTM.
Should push soon.

BTW I just noticed that the tensorflow backend is failing to load SR
filter models.

$ python tools/python/convert.py sr_models/srcnn.pb
$ ./ffmpeg -i input.jpg -vf
sr=model=srcnn.model:dnn_backend=tensorflow out_srcnn_tf.png

The above command fails.
It seems commit ccbab41039af424237eaac5c302c293ab97540f8 is the
problem. I thought I had tested it but clearly I made a mistake
somewhere in the process.
I suppose you have the .pb files to test it, but let me know if you need them.

Guo, Yejun Aug. 14, 2019, 6:36 a.m. UTC | #3
> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces@ffmpeg.org] On Behalf Of
> Pedro Arthur
> Sent: Wednesday, August 14, 2019 12:09 AM
> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] convert_from_tensorflow.py: support
> conv2d with dilation
>
> LGTM.
> Should push soon.

thanks.

> BTW I just noticed that the tensorflow backend is failing to load SR
> filter models.
>
> $ python tools/python/convert.py sr_models/srcnn.pb
> $ ./ffmpeg -i input.jpg -vf
> sr=model=srcnn.model:dnn_backend=tensorflow out_srcnn_tf.png
>
> The above command fails.
> It seems commit ccbab41039af424237eaac5c302c293ab97540f8 is the
> problem. I thought I had tested it but clearly I made a mistake
> somewhere in the process.
> I suppose you have the .pb files to test it, but let me know if you need them.

yes, I have the .pb files. I missed the patch for such support. Will refine and send it out soon.
Pedro Arthur Aug. 15, 2019, 6 p.m. UTC | #4
Pushed.


Patch

diff --git a/tools/python/convert_from_tensorflow.py b/tools/python/convert_from_tensorflow.py
index 804c14f..34454b8 100644
--- a/tools/python/convert_from_tensorflow.py
+++ b/tools/python/convert_from_tensorflow.py
@@ -33,9 +33,10 @@  class TFConverter:
         self.output_names = []
         self.name_node_dict = {}
         self.edges = {}
-        self.conv_activations = {'Relu':0, 'Tanh':1, 'Sigmoid':2, 'LeakyRelu':4}
+        self.conv_activations = {'Relu':0, 'Tanh':1, 'Sigmoid':2, 'None':3, 'LeakyRelu':4}
         self.conv_paddings = {'VALID':0, 'SAME':1}
         self.converted_nodes = set()
+        self.conv2d_scope_names = set()
         self.op2code = {'Conv2D':1, 'DepthToSpace':2, 'MirrorPad':3}
         self.mirrorpad_mode = {'CONSTANT':0, 'REFLECT':1, 'SYMMETRIC':2}
 
@@ -47,30 +48,45 @@  class TFConverter:
         print('graph saved, run "tensorboard --logdir=/tmp/graph" to see it')
 
 
-    def get_conv2d_params(self, node):
-        knode = self.name_node_dict[node.input[1]]
-        bnode = None
-        activation = 'None'
-        next = self.edges[node.name][0]
-        if next.op == 'BiasAdd':
-            self.converted_nodes.add(next.name)
-            bnode = self.name_node_dict[next.input[1]]
-            next = self.edges[next.name][0]
-        if next.op in self.conv_activations:
-            self.converted_nodes.add(next.name)
-            activation = next.op
-        return knode, bnode, activation
+    def get_conv2d_params(self, conv2d_scope_name):
+        knode = self.name_node_dict[conv2d_scope_name + '/kernel']
+        bnode = self.name_node_dict[conv2d_scope_name + '/bias']
+
+        if conv2d_scope_name + '/dilation_rate' in self.name_node_dict:
+            dnode = self.name_node_dict[conv2d_scope_name + '/dilation_rate']
+        else:
+            dnode = None
+
+        # the BiasAdd name may have been changed into the output name,
+        # if activation is None and BiasAdd.next is the last op, which is Identity
+        if conv2d_scope_name + '/BiasAdd' in self.edges:
+            activation = self.edges[conv2d_scope_name + '/BiasAdd'][0]
+            activation = activation.op
+        else:
+            activation = 'None'
+        return knode, bnode, dnode, activation
 
 
     def dump_conv2d_to_file(self, node, f):
         assert(node.op == 'Conv2D')
         self.layer_number = self.layer_number + 1
         self.converted_nodes.add(node.name)
-        knode, bnode, activation = self.get_conv2d_params(node)
 
-        dilation = node.attr['dilations'].list.i[0]
-        padding = node.attr['padding'].s
-        padding = self.conv_paddings[padding.decode("utf-8")]
+        scope_name = TFConverter.get_scope_name(node.name)
+        #knode for kernel, bnode for bias, dnode for dilation
+        knode, bnode, dnode, activation = self.get_conv2d_params(scope_name)
+
+        if dnode is not None:
+            dilation = struct.unpack('i', dnode.attr['value'].tensor.tensor_content[0:4])[0]
+        else:
+            dilation = 1
+
+        padding = node.attr['padding'].s.decode("utf-8")
+        # conv2d with dilation > 1 generates tens of nodes, which are not easy to parse, so use a trick.
+        if dilation > 1 and scope_name + '/stack' in self.name_node_dict:
+            if self.name_node_dict[scope_name + '/stack'].op == "Const":
+                padding = 'SAME'
+        padding = self.conv_paddings[padding]
 
         ktensor = knode.attr['value'].tensor
         filter_height = ktensor.tensor_shape.dim[0].size
@@ -126,9 +142,15 @@  class TFConverter:
         for node in self.nodes:
             if node.name in self.converted_nodes:
                 continue
-            if node.op == 'Conv2D':
-                self.dump_conv2d_to_file(node, f)
-            elif node.op == 'DepthToSpace':
+
+            # conv2d with dilation generates very complex nodes, so handle it specially
+            scope_name = TFConverter.get_scope_name(node.name)
+            if scope_name in self.conv2d_scope_names:
+                if node.op == 'Conv2D':
+                    self.dump_conv2d_to_file(node, f)
+                continue
+
+            if node.op == 'DepthToSpace':
                 self.dump_depth2space_to_file(node, f)
             elif node.op == 'MirrorPad':
                 self.dump_mirrorpad_to_file(node, f)
@@ -192,11 +214,27 @@  class TFConverter:
                     self.edges[input] = [node]
 
 
+    @staticmethod
+    def get_scope_name(name):
+        index = name.rfind('/')
+        if index == -1:
+            return ""
+        return name[0:index]
+
+
+    def generate_conv2d_scope_names(self):
+        for node in self.nodes:
+            if node.op == 'Conv2D':
+                scope = TFConverter.get_scope_name(node.name)
+                self.conv2d_scope_names.add(scope)
+
+
     def run(self):
         self.generate_name_node_dict()
         self.generate_output_names()
         self.remove_identity()
         self.generate_edges()
+        self.generate_conv2d_scope_names()
 
         if self.dump4tb:
             self.dump_for_tensorboard()
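
For reference, the scope-name helper added by this patch can be exercised standalone; the node names below are illustrative, not taken from a real graph:

```python
def get_scope_name(name):
    # Everything before the last '/' is the node's scope name;
    # a name with no '/' has no scope.
    index = name.rfind('/')
    if index == -1:
        return ""
    return name[0:index]

print(get_scope_name('srcnn/conv2d_1/kernel'))  # srcnn/conv2d_1
print(get_scope_name('conv2d_1/BiasAdd'))       # conv2d_1
print(get_scope_name('input'))                  # (empty string)
```

This is what lets dump_layers_to_file skip every node that falls inside a conv2d scope and emit a single layer for the whole dilated-conv subgraph, instead of parsing its many generated nodes one by one.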