From patchwork Thu Jul 30 10:02:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Fu, Ting" X-Patchwork-Id: 21377 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 2F89344A892 for ; Thu, 30 Jul 2020 13:08:38 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0BFB768BAE6; Thu, 30 Jul 2020 13:08:38 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C19B568B87D for ; Thu, 30 Jul 2020 13:08:31 +0300 (EEST) IronPort-SDR: 1OWyCFRVH/6Waaf1mi0CJYKzyPTH8m03oFpEIA87ND4+UHhzmTjLC7+75LO5/yMKCcqN2TUJFB spr0600YlQ7A== X-IronPort-AV: E=McAfee;i="6000,8403,9697"; a="149405165" X-IronPort-AV: E=Sophos;i="5.75,414,1589266800"; d="scan'208";a="149405165" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jul 2020 03:08:30 -0700 IronPort-SDR: EquoBYOCcvL2AS3YAyx5fieKiRGJqpoC/vUlMuNTNEENvb5xCca7VEDpL2E/Pn4YoLO4I4pUHt 7t6+CEjOwoxg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,414,1589266800"; d="scan'208";a="313362946" Received: from semmer-ubuntu.sh.intel.com ([10.239.159.54]) by fmsmga004.fm.intel.com with ESMTP; 30 Jul 2020 03:08:28 -0700 From: Ting Fu To: ffmpeg-devel@ffmpeg.org Date: Thu, 30 Jul 2020 18:02:58 +0800 Message-Id: <20200730100259.7511-1-ting.fu@intel.com> X-Mailer: git-send-email 2.17.1 Subject: [FFmpeg-devel] [PATCH V3 1/2] dnn/native: add native support for avg_pool X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg 
development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches MIME-Version: 1.0 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" Pooling strides in the channel dimension are not supported yet. The layer can be tested with a model generated by the Python script below: import tensorflow as tf import numpy as np import imageio in_img = imageio.imread('input_odd.jpg') in_img = in_img.astype(np.float32)/255.0 in_data = in_img[np.newaxis, :] x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in') x_pool = tf.nn.avg_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME') #please alter the params as needed y = tf.identity(x_pool, name='dnn_out') sess=tf.Session() sess.run(tf.global_variables_initializer()) graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out']) tf.train.write_graph(graph_def, '.', 'image_process.pb', as_text=False) print("image_process.pb generated, please use \ path_to_ffmpeg/tools/python/convert.py to generate image_process.model\n") output = sess.run(y, feed_dict={x: in_data}) imageio.imsave("out.jpg", np.squeeze(output)) Signed-off-by: Ting Fu --- libavfilter/dnn/Makefile | 1 + libavfilter/dnn/dnn_backend_native.h | 2 + .../dnn/dnn_backend_native_layer_avgpool.c | 147 ++++++++++++++++++ .../dnn/dnn_backend_native_layer_avgpool.h | 35 +++++ .../dnn/dnn_backend_native_layer_conv2d.h | 3 +- libavfilter/dnn/dnn_backend_native_layers.c | 2 + tools/python/convert_from_tensorflow.py | 39 ++++- 7 files changed, 226 insertions(+), 3 deletions(-) create mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.c create mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.h diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile index d90137ec42..e0957073ee 100644 --- a/libavfilter/dnn/Makefile +++ b/libavfilter/dnn/Makefile @@ -1,6 +1,7 @@ OBJS-$(CONFIG_DNN) += dnn/dnn_interface.o 
OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native.o OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layers.o +OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_avgpool.o OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_pad.o OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_conv2d.o OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_depth2space.o diff --git a/libavfilter/dnn/dnn_backend_native.h b/libavfilter/dnn/dnn_backend_native.h index 62191ffe88..26e9a33387 100644 --- a/libavfilter/dnn/dnn_backend_native.h +++ b/libavfilter/dnn/dnn_backend_native.h @@ -43,10 +43,12 @@ typedef enum { DLT_MAXIMUM = 4, DLT_MATH_BINARY = 5, DLT_MATH_UNARY = 6, + DLT_AVG_POOL = 7, DLT_COUNT } DNNLayerType; typedef enum {DOT_INPUT = 1, DOT_OUTPUT = 2, DOT_INTERMEDIATE = DOT_INPUT | DOT_OUTPUT} DNNOperandType; +typedef enum {VALID, SAME, SAME_CLAMP_TO_EDGE} DNNPaddingParam; typedef struct Layer{ DNNLayerType type; diff --git a/libavfilter/dnn/dnn_backend_native_layer_avgpool.c b/libavfilter/dnn/dnn_backend_native_layer_avgpool.c new file mode 100644 index 0000000000..a6ebb0db8f --- /dev/null +++ b/libavfilter/dnn/dnn_backend_native_layer_avgpool.c @@ -0,0 +1,147 @@ +/* + * Copyright (c) 2020 + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/** + * @file + * DNN native backend implementation. + */ + +#include "libavutil/avassert.h" +#include "dnn_backend_native_layer_avgpool.h" + +int dnn_load_layer_avg_pool(Layer *layer, AVIOContext *model_file_context, int file_size, int operands_num) +{ + AvgPoolParams *avgpool_params; + int dnn_size = 0; + avgpool_params = av_malloc(sizeof(*avgpool_params)); + if(!avgpool_params) + return 0; + + avgpool_params->strides = (int32_t)avio_rl32(model_file_context); + avgpool_params->padding_method = (int32_t)avio_rl32(model_file_context); + avgpool_params->in_channels = (int32_t)avio_rl32(model_file_context); + avgpool_params->out_channels = (int32_t)avio_rl32(model_file_context); + avgpool_params->kernel_size = (int32_t)avio_rl32(model_file_context); + dnn_size += 20; + + if (dnn_size > file_size || avgpool_params->in_channels <= 0 || + avgpool_params->out_channels <= 0 || avgpool_params->kernel_size <= 0 || + avgpool_params->strides <=0){ + av_freep(&avgpool_params); + return 0; + } + + layer->params = avgpool_params; + layer->input_operand_indexes[0] = (int32_t)avio_rl32(model_file_context); + layer->output_operand_index = (int32_t)avio_rl32(model_file_context); + dnn_size += 8; + + if (layer->input_operand_indexes[0] >= operands_num || layer->output_operand_index >= operands_num) { + return 0; + } + return dnn_size; +} + +int dnn_execute_layer_avg_pool(DnnOperand *operands, const int32_t *input_operand_indexes, + int32_t output_operand_index, const void *parameters) +{ + float *output; + int height_end, width_end, height_radius, width_radius, output_height, output_width, kernel_area; + int32_t input_operand_index = input_operand_indexes[0]; + int number = operands[input_operand_index].dims[0]; + int height = 
operands[input_operand_index].dims[1]; + int width = operands[input_operand_index].dims[2]; + int channel = operands[input_operand_index].dims[3]; + const float *input = operands[input_operand_index].data; + const AvgPoolParams *avgpool_params = (const AvgPoolParams *)parameters; + + int kernel_strides = avgpool_params->strides; + int src_linesize = width * channel; + DnnOperand *output_operand = &operands[output_operand_index]; + + /** + * When padding_method = SAME, TensorFlow places only half of the zero-padding pixels + * (the kernel size minus the remainder, halved and rounded down) before the input; the rest goes after it. + * E.g.: assuming the input height = 1080 and strides = 11, the remainder = 1080 % 11 = 2. + * If ksize = 5: it fills (5 - 2) >> 1 = 1 line before the first line of the input image, + * and 5 - 2 - 1 = 2 lines after the last line of the input image. + * If ksize = 7: it fills (7 - 2) >> 1 = 2 lines before the first line of the input image, + * and 7 - 2 - 2 = 3 lines after the last line of the input image. + */ + if (avgpool_params->padding_method == SAME) { + height_end = height; + width_end = width; + height_radius = avgpool_params->kernel_size - ((height - 1) % kernel_strides + 1); + width_radius = avgpool_params->kernel_size - ((width - 1) % kernel_strides + 1); + height_radius = height_radius < 0 ? 0 : height_radius >> 1; + width_radius = width_radius < 0 ? 
0 : width_radius >> 1; + output_height = ceil(height / (kernel_strides * 1.0)); + output_width = ceil(width / (kernel_strides * 1.0)); + } else { + height_end = height - avgpool_params->kernel_size + 1; + width_end = width - avgpool_params->kernel_size + 1; + height_radius = 0; + width_radius = 0; + output_height = ceil((height - avgpool_params->kernel_size + 1) / (kernel_strides * 1.0)); + output_width = ceil((width - avgpool_params->kernel_size + 1) / (kernel_strides * 1.0)); + } + + output_operand->dims[0] = number; + output_operand->dims[1] = output_height; + output_operand->dims[2] = output_width; + // pooling in the channel dimension is not supported yet + output_operand->dims[3] = channel; + output_operand->data_type = operands[input_operand_index].data_type; + output_operand->length = calculate_operand_data_length(output_operand); + output_operand->data = av_realloc(output_operand->data, output_operand->length); + if (!output_operand->data) + return -1; + output = output_operand->data; + + av_assert0(channel == avgpool_params->in_channels); + av_assert0(channel == avgpool_params->out_channels); + + for (int y = 0; y < height_end; y += kernel_strides) { + for (int x = 0; x < width_end; x += kernel_strides) { + for (int n_channel = 0; n_channel < channel; ++n_channel) { + output[n_channel] = 0.0; + kernel_area = 0; + for (int kernel_y = 0; kernel_y < avgpool_params->kernel_size; ++kernel_y) { + for (int kernel_x = 0; kernel_x < avgpool_params->kernel_size; ++kernel_x) { + float input_pel; + int y_pos = y + (kernel_y - height_radius); + int x_pos = x + (kernel_x - width_radius); + if (x_pos < 0 || x_pos >= width || y_pos < 0 || y_pos >= height) { + input_pel = 0.0; + } else { + kernel_area++; + input_pel = input[y_pos * src_linesize + x_pos * channel + n_channel]; + } + output[n_channel] += input_pel; + } + } + output[n_channel] /= kernel_area; + } + output += avgpool_params->out_channels; + } + } + + return 0; +} diff --git 
a/libavfilter/dnn/dnn_backend_native_layer_avgpool.h b/libavfilter/dnn/dnn_backend_native_layer_avgpool.h new file mode 100644 index 0000000000..0b37a8f64b --- /dev/null +++ b/libavfilter/dnn/dnn_backend_native_layer_avgpool.h @@ -0,0 +1,35 @@ +/* + * Copyright (c) 2018 Sergey Lavrushkin + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFILTER_DNN_DNN_BACKEND_NATIVE_LAYER_AVGPOOL_H +#define AVFILTER_DNN_DNN_BACKEND_NATIVE_LAYER_AVGPOOL_H + +#include "dnn_backend_native.h" + +typedef struct AvgPoolParams{ + int32_t strides, in_channels, out_channels, kernel_size; + DNNPaddingParam padding_method; +} AvgPoolParams; + +int dnn_load_layer_avg_pool(Layer *layer, AVIOContext *model_file_context, int file_size, int operands_num); +int dnn_execute_layer_avg_pool(DnnOperand *operands, const int32_t *input_operand_indexes, + int32_t output_operand_index, const void *parameters); + +#endif diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.h b/libavfilter/dnn/dnn_backend_native_layer_conv2d.h index eeb15fdf01..b240b7ef6b 100644 --- a/libavfilter/dnn/dnn_backend_native_layer_conv2d.h +++ b/libavfilter/dnn/dnn_backend_native_layer_conv2d.h @@ -24,12 +24,11 @@ #include "dnn_backend_native.h" typedef enum {RELU, TANH, SIGMOID, NONE, 
LEAKY_RELU} DNNActivationFunc; -typedef enum {VALID, SAME, SAME_CLAMP_TO_EDGE} DNNConvPaddingParam; typedef struct ConvolutionalParams{ int32_t input_num, output_num, kernel_size; DNNActivationFunc activation; - DNNConvPaddingParam padding_method; + DNNPaddingParam padding_method; int32_t dilation; int32_t has_bias; float *kernel; diff --git a/libavfilter/dnn/dnn_backend_native_layers.c b/libavfilter/dnn/dnn_backend_native_layers.c index 70f9a5f958..4f42f62abb 100644 --- a/libavfilter/dnn/dnn_backend_native_layers.c +++ b/libavfilter/dnn/dnn_backend_native_layers.c @@ -26,6 +26,7 @@ #include "dnn_backend_native_layer_maximum.h" #include "dnn_backend_native_layer_mathbinary.h" #include "dnn_backend_native_layer_mathunary.h" +#include "dnn_backend_native_layer_avgpool.h" LayerFunc layer_funcs[DLT_COUNT] = { {NULL, NULL}, @@ -35,4 +36,5 @@ LayerFunc layer_funcs[DLT_COUNT] = { {dnn_execute_layer_maximum, dnn_load_layer_maximum}, {dnn_execute_layer_math_binary, dnn_load_layer_math_binary}, {dnn_execute_layer_math_unary, dnn_load_layer_math_unary}, + {dnn_execute_layer_avg_pool, dnn_load_layer_avg_pool}, }; diff --git a/tools/python/convert_from_tensorflow.py b/tools/python/convert_from_tensorflow.py index 85db7bf710..f0bfb3e37b 100644 --- a/tools/python/convert_from_tensorflow.py +++ b/tools/python/convert_from_tensorflow.py @@ -67,10 +67,12 @@ class TFConverter: self.edges = {} self.conv_activations = {'Relu':0, 'Tanh':1, 'Sigmoid':2, 'None':3, 'LeakyRelu':4} self.conv_paddings = {'VALID':0, 'SAME':1} + self.pool_paddings = {'VALID':0, 'SAME':1} self.converted_nodes = set() self.conv2d_scope_names = set() self.conv2d_scopename_inputname_dict = {} - self.op2code = {'Conv2D':1, 'DepthToSpace':2, 'MirrorPad':3, 'Maximum':4, 'MathBinary':5, 'MathUnary':6} + self.op2code = {'Conv2D':1, 'DepthToSpace':2, 'MirrorPad':3, 'Maximum':4, + 'MathBinary':5, 'MathUnary':6, 'AvgPool':7} self.mathbin2code = {'Sub':0, 'Add':1, 'Mul':2, 'RealDiv':3, 'Minimum':4} self.mathun2code = 
{'Abs':0, 'Sin':1, 'Cos':2, 'Tan':3, 'Asin':4, 'Acos':5, 'Atan':6, 'Sinh':7, 'Cosh':8, 'Tanh':9, 'Asinh':10, 'Acosh':11, 'Atanh':12} self.mirrorpad_mode = {'CONSTANT':0, 'REFLECT':1, 'SYMMETRIC':2} @@ -298,6 +300,39 @@ class TFConverter: np.array([output_operand_index],dtype=np.uint32).tofile(f) + def dump_avg_pool_to_file(self, node, f): + assert(node.op == 'AvgPool') + self.layer_number = self.layer_number + 1 + self.converted_nodes.add(node.name) + node0 = self.name_node_dict[node.input[0]] + strides = node.attr['strides'] + + # TensorFlow does not support pooling strides in the batch dimension, and + # the current native NN does not support pooling strides in the channel dimension, hence the assert() calls here. + assert(strides.list.i[1]==strides.list.i[2]) + assert(strides.list.i[0]==1) + assert(strides.list.i[3]==1) + strides = strides.list.i[1] + filter_node = node.attr['ksize'] + input_name = node.input[0] + + # TensorFlow does not support pooling ksize in the batch and channel dimensions. + assert(filter_node.list.i[0]==1) + assert(filter_node.list.i[3]==1) + filter_height = filter_node.list.i[1] + filter_width = filter_node.list.i[2] + + in_channels = node0.attr['shape'].shape.dim[3].size + out_channels = in_channels + padding = node.attr['padding'].s.decode("utf-8") + np.array([self.op2code[node.op], strides, self.pool_paddings[padding], in_channels, out_channels, + filter_height],dtype=np.uint32).tofile(f) + + input_operand_index = self.add_operand(input_name, Operand.IOTYPE_INPUT) + output_operand_index = self.add_operand(node.name, Operand.IOTYPE_OUTPUT) + np.array([input_operand_index, output_operand_index],dtype=np.uint32).tofile(f) + + def dump_layers_to_file(self, f): for node in self.nodes: if node.name in self.converted_nodes: @@ -311,6 +346,8 @@ class TFConverter: if node.op == 'Conv2D': self.dump_simple_conv2d_to_file(node, f) + if node.op == 'AvgPool': + self.dump_avg_pool_to_file(node, f) elif node.op == 'DepthToSpace': self.dump_depth2space_to_file(node, f) elif 
node.op == 'MirrorPad': From patchwork Thu Jul 30 10:02:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Fu, Ting" X-Patchwork-Id: 21378 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 3EEA544A892 for ; Thu, 30 Jul 2020 13:08:41 +0300 (EEST) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 273DA68BAFA; Thu, 30 Jul 2020 13:08:41 +0300 (EEST) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id F3C2368BA6A for ; Thu, 30 Jul 2020 13:08:32 +0300 (EEST) IronPort-SDR: KIBej6v3MhoYHGRLAPb6upxb32fASxtZyZBwl9Ux2ikuwmPgvEffur5oKEobZWSBLLfPTidtkI BVAKX4pqlhhg== X-IronPort-AV: E=McAfee;i="6000,8403,9697"; a="149405169" X-IronPort-AV: E=Sophos;i="5.75,414,1589266800"; d="scan'208";a="149405169" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jul 2020 03:08:31 -0700 IronPort-SDR: ojMxD8vSShge9No+e18cHvGzLT3zbvQQD62v3A1I3nmfnNM4waa1U2Bb8D1j6ATN7hcfeWdWq+ 2DHD2uilz2xg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,414,1589266800"; d="scan'208";a="313362952" Received: from semmer-ubuntu.sh.intel.com ([10.239.159.54]) by fmsmga004.fm.intel.com with ESMTP; 30 Jul 2020 03:08:30 -0700 From: Ting Fu To: ffmpeg-devel@ffmpeg.org Date: Thu, 30 Jul 2020 18:02:59 +0800 Message-Id: <20200730100259.7511-2-ting.fu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200730100259.7511-1-ting.fu@intel.com> References: <20200730100259.7511-1-ting.fu@intel.com> Subject: [FFmpeg-devel] [PATCH V3 2/2] 
FATE/dnn: add unit test for dnn avgpool layer X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches MIME-Version: 1.0 Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" 'make fate-dnn-layer-avgpool' to run the test Signed-off-by: Ting Fu --- tests/dnn/.gitignore | 1 + tests/dnn/Makefile | 1 + tests/dnn/dnn-layer-avgpool-test.c | 202 +++++++++++++++++++++++++++++ tests/fate/dnn.mak | 5 + 4 files changed, 209 insertions(+) create mode 100644 tests/dnn/dnn-layer-avgpool-test.c diff --git a/tests/dnn/.gitignore b/tests/dnn/.gitignore index 1fcd2410b4..b847a01177 100644 --- a/tests/dnn/.gitignore +++ b/tests/dnn/.gitignore @@ -4,3 +4,4 @@ /dnn-layer-pad-test /dnn-layer-mathbinary-test /dnn-layer-mathunary-test +/dnn-layer-avgpool-test diff --git a/tests/dnn/Makefile b/tests/dnn/Makefile index 64591b7851..8afdfab5d3 100644 --- a/tests/dnn/Makefile +++ b/tests/dnn/Makefile @@ -4,6 +4,7 @@ DNNTESTPROGS += dnn-layer-depth2space DNNTESTPROGS += dnn-layer-mathbinary DNNTESTPROGS += dnn-layer-maximum DNNTESTPROGS += dnn-layer-mathunary +DNNTESTPROGS += dnn-layer-avgpool DNNTESTOBJS := $(DNNTESTOBJS:%=$(DNNTESTSDIR)%) $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test.o) DNNTESTPROGS := $(DNNTESTPROGS:%=$(DNNTESTSDIR)/%-test$(EXESUF)) diff --git a/tests/dnn/dnn-layer-avgpool-test.c b/tests/dnn/dnn-layer-avgpool-test.c new file mode 100644 index 0000000000..1c47f9330d --- /dev/null +++ b/tests/dnn/dnn-layer-avgpool-test.c @@ -0,0 +1,202 @@ +/* + * Copyright (c) 2020 + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include //? +#include "libavfilter/dnn/dnn_backend_native_layer_avgpool.h" + +#define EPSON 0.00001 + +static int test_with_same(void) +{ + // the input data and expected data are generated with the Python code below. + /* + import tensorflow as tf + import numpy as np + + x = tf.placeholder(tf.float32, shape=[1, None, None, 3]) + y = tf.layers.average_pooling2d(x, pool_size=[2,2], strides=[1,1], padding='SAME') + data = np.random.rand(1, 5, 6, 3); + + sess=tf.Session() + sess.run(tf.global_variables_initializer()) + + output = sess.run(y, feed_dict={x: data}) + + print("input:") + print(data.shape) + print(list(data.flatten())) + + print("output:") + print(output.shape) + print(list(output.flatten())) + */ + + AvgPoolParams params; + DnnOperand operands[2]; + int32_t input_indexes[1]; + float input[1*5*6*3] = { + 0.7461309859908424, 0.7567538372797069, 0.07662743569678687, 0.8882112610336333, 0.9720443314026668, 0.3337200343220823, 0.4421032129780248, + 0.14940809044964876, 0.6773177061961277, 0.9778844630669781, 0.6522650522626998, 0.0317651530878591, 0.31259897552911364, 0.6235936821891896, + 0.40016094349542775, 0.4599222930032276, 0.7893807222960093, 0.8475986363538283, 0.5058802717647394, 0.7827005363222633, 0.3032188123727916, + 0.8983728631302361, 0.20622408444965523, 0.22966072303869878, 0.09535751273161308, 0.8760709100995375, 0.9982324154558745, 0.7904595468621013, + 0.13883671508879347, 0.9332751439533138, 0.0010861680752152214, 0.3607210449251048, 0.6600652759586171, 
0.7629572058138805, 0.29441975810476106, + 0.2683471432889405, 0.22574580829831536, 0.8893251976212904, 0.3907737043801005, 0.6421829842863968, 0.6670373870457297, 0.9383850793160277, + 0.4120458907436003, 0.3589847212711481, 0.48047736550128983, 0.6428192648418949, 0.0313661686292348, 0.429357100401472, 0.5123413386514056, + 0.8492446404097114, 0.9045286128486804, 0.8123708563814285, 0.3943245008451698, 0.9576713003177785, 0.5985610965938726, 0.9350833279543561, + 0.8010079897491659, 0.45882114217642866, 0.35275037908941487, 0.4555844661432271, 0.12352455940255314, 0.37801756635035544, 0.2824056214573083, + 0.6229462823245029, 0.7235305681391472, 0.5408259266122064, 0.12142224381781208, 0.34431198802873686, 0.7112823816321276, 0.6307144385115417, + 0.8136734589018082, 0.842095618140585, 0.8602767724004784, 0.6649236853766185, 0.5184782829419623, 0.9119607270982825, 0.3084111974561645, + 0.39460705638161364, 0.17710447526170836, 0.1715485945814199, 0.17277563576521882, 0.40188232428735704, 0.22847985411491878, 0.4135361701550696, + 0.24621846601980057, 0.6576588108454774, 0.6063336087333997, 0.6452342242996931, 0.7071689702737508, 0.1973416063225648 + }; + float expected_output[] = { + 0.75964886, 0.6794307, 0.23580676, 0.5810112, 0.5509369, 0.55973274, 0.5764512, 0.45414522, 0.6601476, 0.52050734, 0.44385415, + 0.50631666, 0.38414115, 0.5170288, 0.544043, 0.61143976, 0.5419003, 0.5579729, 0.5680455, 0.6363218, 0.4655096, 0.51198983, + 0.5270792, 0.66168886, 0.48517057, 0.3513146, 0.7103355, 0.48667657, 0.34504217, 0.7318065, 0.5221889, 0.4746775, 0.69765306, + 0.78766406, 0.34437215, 0.6130092, 0.48132777, 0.7110491, 0.6464378, 0.40914366, 0.4391975, 0.5392131, 0.45033398, 0.37297475, + 0.43326652, 0.4748823, 0.48711336, 0.64649844, 0.51921225, 0.60038865, 0.8538945, 0.7215426, 0.60399896, 0.89988345, 0.707405, + 0.5652921, 0.54241943, 0.41785273, 0.30268195, 0.3263432, 0.3313644, 0.37539417, 0.35238582, 0.34811732, 0.48849532, 0.56799453, + 0.41089734, 
0.63070333, 0.5892633, 0.6379743, 0.7604212, 0.5197186, 0.88611877, 0.48666745, 0.45654267, 0.5445326, 0.2399799, + 0.28369135, 0.28949338, 0.20001422, 0.2931559, 0.3240504, 0.44306934, 0.5099349, 0.44572634, 0.68241394, 0.40183762, 0.6452342, + 0.707169, 0.1973416 + }; + float *output; + + params.strides = 1; + params.kernel_size = 2; + params.in_channels = 3; + params.out_channels = 3; + params.padding_method = SAME; + + operands[0].data = input; + operands[0].dims[0] = 1; + operands[0].dims[1] = 5; + operands[0].dims[2] = 6; + operands[0].dims[3] = 3; + operands[1].data = NULL; + + input_indexes[0] = 0; + dnn_execute_layer_avg_pool(operands, input_indexes, 1, &params); + + output = operands[1].data; + for (int i = 0; i < sizeof(expected_output) / sizeof(float); ++i) { + if (fabs(output[i] - expected_output[i]) > EPSON) { + printf("at index %d, output: %f, expected_output: %f\n", i, output[i], expected_output[i]); + av_freep(&output); + return 1; + } + } + + av_freep(&output); + return 0; +} + +static int test_with_valid(void) +{ + // the input data and expected data are generated with the Python code below. 
+ /* + import tensorflow as tf + import numpy as np + + x = tf.placeholder(tf.float32, shape=[1, None, None, 3]) + y = tf.layers.average_pooling2d(x, pool_size=[2,2], strides=[1,1], padding='VALID') + data = np.random.rand(1, 5, 6, 3); + + sess=tf.Session() + sess.run(tf.global_variables_initializer()) + + output = sess.run(y, feed_dict={x: data}) + + print("input:") + print(data.shape) + print(list(data.flatten())) + + print("output:") + print(output.shape) + print(list(output.flatten())) + */ + + AvgPoolParams params; + DnnOperand operands[2]; + int32_t input_indexes[1]; + float input[1*5*6*3] = { + 0.5046741692941682, 0.9273653202485155, 0.8193878359859937, 0.1904059431360905, 0.8664919633253656, 0.7484625128286059, 0.984534184632278, + 0.31900804890072254, 0.3259426099940872, 0.05388974903570376, 0.7356610151331133, 0.46710858713311965, 0.718553768817036, 0.062478421853278676, + 0.7813224786584609, 0.4826837517658389, 0.9748095400220147, 0.8078547703898341, 0.11976750668368585, 0.8713586777195065, 0.41447321551284355, + 0.9818788239089807, 0.4335715767584073, 0.4059793452147419, 0.3677205907204525, 0.47919995923571, 0.8341395256258882, 0.7059726374074609, + 0.5478504551919791, 0.8622900484790175, 0.8343709722511167, 0.05089827275068537, 0.6465283980840416, 0.544539116066677, 0.39812057257884337, + 0.9578115576866337, 0.25012888117580145, 0.579333516024662, 0.5556732133051457, 0.6119862111181243, 0.0018736758772316398, 0.9795490254040474, + 0.4488085008883018, 0.28947489777011737, 0.4834108668633247, 0.9280490084385024, 0.9895821458049648, 0.31777618554697606, 0.42679693258977847, + 0.74447844466923, 0.9752225305081498, 0.17564130841849335, 0.22382692067314292, 0.009602884447469373, 0.5144884415025782, 0.031622570708844555, + 0.8277532752502512, 0.4111593210409763, 0.5272084646575664, 0.28856508082905297, 0.11317726946036655, 0.7203328275540273, 0.8310055019972384, + 0.8535951508685228, 0.40230347305233227, 0.2819703265132867, 0.6243143957791139, 
0.7512463693822311, 0.7523056340495644, 0.8838077258040928, + 0.5472240664033092, 0.2550538284454935, 0.5560317774456567, 0.8966847087518931, 0.6728358284165321, 0.30361297147530875, 0.464343925441822, + 0.34507695659461224, 0.6333175615390685, 0.26661369038523497, 0.9926748632253231, 0.9994267301382666, 0.8684917986974414, 0.3598754806113009, + 0.49550268625464666, 0.03652458679973214, 0.13469081713137177, 0.4579424049273835, 0.48641107969110353, 0.9670250266945365 + }; + float expected_output[1*4*5*3] = { + 0.44918162, 0.7746969, 0.5970757, 0.63113487, 0.5245679, 0.578631, 0.52802926, 0.52042985, 0.6223702, 0.57819676, 0.34922206, + 0.6893124, 0.64503694, 0.37157673, 0.7983793, 0.49094033, 0.47153437, 0.5889187, 0.6025985, 0.30103004, 0.6757697, 0.6126377, + 0.5765268, 0.62440413, 0.7237974, 0.5832023, 0.7004543, 0.49533707, 0.35433105, 0.6472913, 0.44694072, 0.28500956, 0.6628852, + 0.39628282, 0.38472247, 0.6456326, 0.58590746, 0.60042334, 0.47854072, 0.7081889, 0.7219026, 0.5818187, 0.5276401, 0.56669396, + 0.49804622, 0.4463231, 0.4799649, 0.5335578, 0.36531678, 0.4946247, 0.6143306, 0.6498792, 0.5644355, 0.6163815, 0.7432098, + 0.5146416, 0.38221055, 0.6153918, 0.45535153, 0.5272688 + }; + float *output; + + params.strides = 1; + params.kernel_size = 2; + params.in_channels = 3; + params.out_channels = 3; + params.padding_method = VALID; + + operands[0].data = input; + operands[0].dims[0] = 1; + operands[0].dims[1] = 5; + operands[0].dims[2] = 6; + operands[0].dims[3] = 3; + operands[1].data = NULL; + + input_indexes[0] = 0; + dnn_execute_layer_avg_pool(operands, input_indexes, 1, &params); + + output = operands[1].data; + for (int i = 0; i < sizeof(expected_output) / sizeof(float); ++i) { + if (fabs(output[i] - expected_output[i]) > EPSON) { + printf("at index %d, output: %f, expected_output: %f\n", i, output[i], expected_output[i]); + av_freep(&output); + return 1; + } + } + + av_freep(&output); + return 0; +} + +int main(int argc, char **argv) +{ + if 
(test_with_same()) + return 1; + if (test_with_valid()) + return 1; + + return 0; +} diff --git a/tests/fate/dnn.mak b/tests/fate/dnn.mak index 4a50b16382..90a1bb3cac 100644 --- a/tests/fate/dnn.mak +++ b/tests/fate/dnn.mak @@ -28,6 +28,11 @@ fate-dnn-layer-mathunary: $(DNNTESTSDIR)/dnn-layer-mathunary-test$(EXESUF) fate-dnn-layer-mathunary: CMD = run $(DNNTESTSDIR)/dnn-layer-mathunary-test$(EXESUF) fate-dnn-layer-mathunary: CMP = null +FATE_DNN += fate-dnn-layer-avgpool +fate-dnn-layer-avgpool: $(DNNTESTSDIR)/dnn-layer-avgpool-test$(EXESUF) +fate-dnn-layer-avgpool: CMD = run $(DNNTESTSDIR)/dnn-layer-avgpool-test$(EXESUF) +fate-dnn-layer-avgpool: CMP = null + FATE-yes += $(FATE_DNN) fate-dnn: $(FATE_DNN)
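
For reviewers who want to check the SAME/VALID arithmetic without a TensorFlow 1.x install, the following is a minimal NumPy sketch that mirrors the loop structure of dnn_execute_layer_avg_pool in patch 1/2. The name avg_pool_reference is hypothetical and not part of the series. The sketch assumes NHWC input with one kernel size and one stride shared by height and width, as the native layer uses. Unlike the C loop, it also iterates over the batch dimension, which is an addition made here for illustration.

```python
import math

import numpy as np


def avg_pool_reference(x, ksize, stride, padding):
    """Average-pool an NHWC array the way the native layer's loop does.

    Hypothetical helper for illustration only; not part of the patch.
    """
    n, h, w, c = x.shape
    if padding == "SAME":
        h_end, w_end = h, w
        # Zero rows/columns placed before the input (the "radius" in the C code).
        h_rad = max(ksize - ((h - 1) % stride + 1), 0) >> 1
        w_rad = max(ksize - ((w - 1) % stride + 1), 0) >> 1
    elif padding == "VALID":
        h_end, w_end = h - ksize + 1, w - ksize + 1
        h_rad = w_rad = 0
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")

    out_h = math.ceil(h_end / stride)
    out_w = math.ceil(w_end / stride)
    out = np.zeros((n, out_h, out_w, c), dtype=np.float32)
    for b in range(n):
        for oy, y in enumerate(range(0, h_end, stride)):
            for ox, x0 in enumerate(range(0, w_end, stride)):
                top, left = y - h_rad, x0 - w_rad
                # Clip the window to the image: padded positions add nothing
                # to the sum and are not counted in the divisor.
                window = x[b,
                           max(top, 0):min(top + ksize, h),
                           max(left, 0):min(left + ksize, w), :]
                out[b, oy, ox] = window.mean(axis=(0, 1))
    return out


if __name__ == "__main__":
    data = np.random.rand(1, 5, 6, 3)
    print(avg_pool_reference(data, ksize=2, stride=1, padding="SAME").shape)
    print(avg_pool_reference(data, ksize=2, stride=1, padding="VALID").shape)
```

Dividing by the number of in-bounds pixels rather than by ksize * ksize matches the kernel_area counter in the C code. It is also why the last output pixel of the SAME test equals the last input pixel.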