From patchwork Wed May 22 13:02:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xuewei Meng
X-Patchwork-Id: 13249
From: Xuewei Meng
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 22 May 2019 21:02:58 +0800
Message-Id: <20190522130258.24931-1-xwmeng96@gmail.com>
X-Mailer: git-send-email 2.17.1
Subject: [FFmpeg-devel] [PATCH] libavfilter/dnn_native: Add support of dilated convolution in dnn_native.
List-Id: FFmpeg development discussions and patches

Add a dilation parameter to the native DNN backend to support dilated
convolution. The parameter is read from the model file as the first
32-bit field of each convolutional layer, so the fixed-size part of a
convolutional layer record grows from 20 to 24 bytes.

Signed-off-by: Xuewei Meng
---
 libavfilter/dnn_backend_native.c | 17 +++++++++--------
 libavfilter/dnn_backend_native.h |  1 +
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/libavfilter/dnn_backend_native.c b/libavfilter/dnn_backend_native.c
index 3c8465a283..82e900bd8c 100644
--- a/libavfilter/dnn_backend_native.c
+++ b/libavfilter/dnn_backend_native.c
@@ -63,7 +63,7 @@ static DNNReturnType set_input_output_native(void *model, DNNInputData *input, c
             cur_channels = conv_params->output_num;
 
             if (conv_params->padding_method == VALID) {
-                int pad_size = conv_params->kernel_size - 1;
+                int pad_size = (conv_params->kernel_size - 1) * conv_params->dilation;
                 cur_height -= pad_size;
                 cur_width -= pad_size;
             }
@@ -164,6 +164,7 @@ DNNModel *ff_dnn_load_model_native(const char *model_filename)
                 ff_dnn_free_model_native(&model);
                 return NULL;
             }
+            conv_params->dilation = (int32_t)avio_rl32(model_file_context);
             conv_params->padding_method = (int32_t)avio_rl32(model_file_context);
             conv_params->activation = (int32_t)avio_rl32(model_file_context);
             conv_params->input_num = (int32_t)avio_rl32(model_file_context);
@@ -171,7 +172,7 @@ DNNModel *ff_dnn_load_model_native(const char *model_filename)
             conv_params->kernel_size = (int32_t)avio_rl32(model_file_context);
             kernel_size = conv_params->input_num * conv_params->output_num *
                           conv_params->kernel_size * conv_params->kernel_size;
-            dnn_size += 20 + (kernel_size + conv_params->output_num << 2);
+            dnn_size += 24 + (kernel_size + conv_params->output_num << 2);
             if (dnn_size > file_size || conv_params->input_num <= 0 ||
                 conv_params->output_num <= 0 || conv_params->kernel_size <= 0){
                 avio_closep(&model_file_context);
@@ -233,7 +234,7 @@ static void convolve(const float *input, float *output, const ConvolutionalParam
     int src_linesize = width * conv_params->input_num;
     int filter_linesize = conv_params->kernel_size * conv_params->input_num;
     int filter_size = conv_params->kernel_size * filter_linesize;
-    int pad_size = (conv_params->padding_method == VALID) ? (conv_params->kernel_size - 1) / 2 : 0;
+    int pad_size = (conv_params->padding_method == VALID) ? (conv_params->kernel_size - 1) / 2 * conv_params->dilation : 0;
 
     for (int y = pad_size; y < height - pad_size; ++y) {
         for (int x = pad_size; x < width - pad_size; ++x) {
@@ -245,12 +246,12 @@ static void convolve(const float *input, float *output, const ConvolutionalParam
                         for (int kernel_x = 0; kernel_x < conv_params->kernel_size; ++kernel_x) {
                             float input_pel;
                             if (conv_params->padding_method == SAME_CLAMP_TO_EDGE) {
-                                int y_pos = CLAMP_TO_EDGE(y + kernel_y - radius, height);
-                                int x_pos = CLAMP_TO_EDGE(x + kernel_x - radius, width);
+                                int y_pos = CLAMP_TO_EDGE(y + (kernel_y - radius) * conv_params->dilation, height);
+                                int x_pos = CLAMP_TO_EDGE(x + (kernel_x - radius) * conv_params->dilation, width);
                                 input_pel = input[y_pos * src_linesize + x_pos * conv_params->input_num + ch];
                             } else {
-                                int y_pos = y + kernel_y - radius;
-                                int x_pos = x + kernel_x - radius;
+                                int y_pos = y + (kernel_y - radius) * conv_params->dilation;
+                                int x_pos = x + (kernel_x - radius) * conv_params->dilation;
                                 input_pel = (x_pos < 0 || x_pos >= width || y_pos < 0 || y_pos >= height) ? 0.0 :
                                                    input[y_pos * src_linesize + x_pos * conv_params->input_num + ch];
                             }
@@ -334,7 +335,7 @@ DNNReturnType ff_dnn_execute_model_native(const DNNModel *model, DNNData *output
             convolve(network->layers[layer - 1].output, network->layers[layer].output, conv_params, cur_width, cur_height);
             cur_channels = conv_params->output_num;
             if (conv_params->padding_method == VALID) {
-                int pad_size = conv_params->kernel_size - 1;
+                int pad_size = (conv_params->kernel_size - 1) * conv_params->dilation;
                 cur_height -= pad_size;
                 cur_width -= pad_size;
             }
diff --git a/libavfilter/dnn_backend_native.h b/libavfilter/dnn_backend_native.h
index 7e4e943137..5917955733 100644
--- a/libavfilter/dnn_backend_native.h
+++ b/libavfilter/dnn_backend_native.h
@@ -46,6 +46,7 @@ typedef struct ConvolutionalParams{
     int32_t input_num, output_num, kernel_size;
     DNNActivationFunc activation;
     DNNConvPaddingParam padding_method;
+    int32_t dilation;
     float *kernel;
     float *biases;
 } ConvolutionalParams;
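
Illustrative note, not part of the patch: the arithmetic the diff introduces can be checked in isolation. The sketch below is a minimal stand-alone version of the two expressions that change, the VALID-padding size reduction and the per-tap input offset. The helper names valid_pad_size and tap_offset are hypothetical and do not exist in dnn_backend_native.c, and radius is assumed to be kernel_size / 2, since its definition lies outside the hunks shown.

```c
#include <assert.h>

/* Hypothetical helper (not in the patch): how much one axis shrinks under
 * VALID padding, mirroring (kernel_size - 1) * dilation from the diff. */
static int valid_pad_size(int kernel_size, int dilation)
{
    assert(kernel_size > 0 && dilation > 0);
    return (kernel_size - 1) * dilation;
}

/* Hypothetical helper (not in the patch): offset of kernel tap kernel_idx
 * from the centre pixel, mirroring (kernel_x - radius) * dilation.
 * Assumption: radius is kernel_size / 2. */
static int tap_offset(int kernel_idx, int kernel_size, int dilation)
{
    int radius = kernel_size / 2;
    assert(kernel_idx >= 0 && kernel_idx < kernel_size && dilation > 0);
    return (kernel_idx - radius) * dilation;
}
```

For a 3x3 kernel with dilation 2, valid_pad_size returns 4, so a 16-pixel-wide input yields 12 output columns, and the three taps along each axis read at offsets -2, 0 and +2 from the centre pixel. With dilation 1 both helpers reduce to the pre-patch expressions.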