From patchwork Mon Sep 14 11:31:54 2020
X-Patchwork-Submitter: Xu Jun
X-Patchwork-Id: 22375
From: xujunzz@sjtu.edu.cn
To: ffmpeg-devel@ffmpeg.org
Cc: xujunzz@sjtu.edu.cn
Date: Mon, 14 Sep 2020 19:31:54 +0800
Message-Id: <20200914113154.61946-1-xujunzz@sjtu.edu.cn>
X-Mailer: git-send-email 2.28.0
Subject: [FFmpeg-devel] [PATCH 1/2] dnn_backend_native_layer_conv2d.c: fix
 memory allocation bug in multithread function.

From: Xu Jun

Before this patch, the output buffer was allocated inside each thread
function, so several worker threads could reallocate the same operand
at the same time, causing redundant allocations and crashes.

After this patch, the buffer is allocated once in the main thread and a
pointer to it is passed into the thread functions, which fixes the
crash.
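For reference, the shape of the fix is the usual allocate-once,
write-disjoint-slices pattern. The following is a minimal standalone
sketch of that pattern (illustrative names and sizes, not the FFmpeg
code itself):

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct worker_param {
        float *output;   /* shared buffer, allocated once by the main thread */
        int start, end;  /* half-open row range owned by this worker */
    } worker_param;

    static void *worker(void *arg)
    {
        worker_param *p = arg;
        /* Each worker writes only inside [start, end), so workers never
         * overlap and need neither allocation nor locking. */
        for (int y = p->start; y < p->end; y++)
            p->output[y] = (float)y;
        return NULL;
    }

    int main(void)
    {
        enum { ROWS = 8, THREADS = 2 };
        pthread_t tid[THREADS];
        worker_param param[THREADS];
        int stride = ROWS / THREADS;

        /* Allocate once, before any worker starts; nothing is
         * (re)allocated inside the thread function anymore. */
        float *output = malloc(ROWS * sizeof(*output));
        if (!output)
            return 1;

        for (int i = 0; i < THREADS; i++) {
            param[i].output = output;
            param[i].start  = stride * i;
            param[i].end    = (i == THREADS - 1) ? ROWS : stride * (i + 1);
            pthread_create(&tid[i], NULL, worker, &param[i]);
        }
        for (int i = 0; i < THREADS; i++)
            pthread_join(tid[i], NULL);

        for (int i = 0; i < ROWS; i++)
            printf("%g ", output[i]);
        printf("\n");
        free(output);
        return 0;
    }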
Signed-off-by: Xu Jun
---
 .../dnn/dnn_backend_native_layer_conv2d.c     | 55 +++++++++----------
 1 file changed, 27 insertions(+), 28 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
index c52725aa2b..5ed1851512 100644
--- a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
+++ b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
@@ -32,6 +32,7 @@ typedef struct thread_common_param{
     int32_t output_operand_index;
     const void *parameters;
     NativeContext *ctx;
+    float *output_data;
     int thread_num;
 } thread_common_param;
 
@@ -111,7 +112,6 @@ static void * dnn_execute_layer_conv2d_thread(void *threadarg)
     thread_param *thread_param = (struct thread_param *)threadarg;
     thread_common_param *thread_common_param = thread_param->thread_common_param;
     DnnOperand *operands = thread_common_param->operands;
-    float *output;
     int32_t input_operand_index = thread_common_param->input_operand_indexes[0];
     int number = operands[input_operand_index].dims[0];
     int height = operands[input_operand_index].dims[1];
@@ -130,24 +130,7 @@ static void * dnn_execute_layer_conv2d_thread(void *threadarg)
     int thread_start = thread_stride * thread_param->thread_index + pad_size;
     int thread_end = (thread_param->thread_index == thread_common_param->thread_num - 1) ? (height - pad_size) : (thread_start + thread_stride);
 
-    DnnOperand *output_operand = &operands[thread_common_param->output_operand_index];
-    output_operand->dims[0] = number;
-    output_operand->dims[1] = height - pad_size * 2;
-    output_operand->dims[2] = width - pad_size * 2;
-    output_operand->dims[3] = conv_params->output_num;
-    output_operand->data_type = operands[input_operand_index].data_type;
-    output_operand->length = calculate_operand_data_length(output_operand);
-    if (output_operand->length <= 0) {
-        av_log(thread_common_param->ctx, AV_LOG_ERROR, "The output data length overflow\n");
-        return (void *)DNN_ERROR;
-    }
-    output_operand->data = av_realloc(output_operand->data, output_operand->length);
-    if (!output_operand->data) {
-        av_log(thread_common_param->ctx, AV_LOG_ERROR, "Failed to reallocate memory for output\n");
-        return (void *)DNN_ERROR;
-    }
-
-    output = output_operand->data;
+    float *output = thread_common_param->output_data;
     output += (conv_params->output_num) * (width - 2 * pad_size) * (thread_start - pad_size);
 
     av_assert0(channel == conv_params->input_num);
@@ -213,8 +196,6 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
     pthread_t *thread_id = av_malloc(thread_num * sizeof(pthread_t));
 #endif
     thread_param **thread_param = av_malloc(thread_num * sizeof(*thread_param));
-    void *res;
-    int error_flag = DNN_SUCCESS;
 
     //struct used to pass parameters
     thread_common_param thread_common_param;
@@ -223,6 +204,28 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
     thread_common_param.output_operand_index = output_operand_index;
     thread_common_param.parameters = parameters;
     thread_common_param.ctx = ctx;
+
+    //alloc memory
+    const ConvolutionalParams *conv_params = (const ConvolutionalParams *)(parameters);
+    int pad_size = (conv_params->padding_method == VALID) ? (conv_params->kernel_size - 1) / 2 * conv_params->dilation : 0;
+    DnnOperand *output_operand = &operands[output_operand_index];
+    output_operand->dims[0] = operands[input_operand_indexes[0]].dims[0];
+    output_operand->dims[1] = operands[input_operand_indexes[0]].dims[1] - pad_size * 2;
+    output_operand->dims[2] = operands[input_operand_indexes[0]].dims[2] - pad_size * 2;
+    output_operand->dims[3] = conv_params->output_num;
+    output_operand->data_type = operands[input_operand_indexes[0]].data_type;
+    output_operand->length = calculate_operand_data_length(output_operand);
+    if (output_operand->length <= 0) {
+        av_log(ctx, AV_LOG_ERROR, "The output data length overflow\n");
+        return DNN_ERROR;
+    }
+    output_operand->data = av_realloc(output_operand->data, output_operand->length);
+    if (!output_operand->data) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to reallocate memory for output\n");
+        return DNN_ERROR;
+    }
+    thread_common_param.output_data = output_operand->data;
+
 #if HAVE_PTHREAD_CANCEL
     thread_common_param.thread_num = thread_num;
 
@@ -236,9 +239,7 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
 
     //join threads, res gets function return
     for (int i = 0; i < thread_num; i++){
-        pthread_join(thread_id[i], &res);
-        if ((int)res != DNN_SUCCESS)
-            error_flag = (int)res;
+        pthread_join(thread_id[i], NULL);
     }
 
     //release memory
@@ -252,12 +253,10 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
     thread_param[0] = av_malloc(sizeof(thread_param));
     thread_param[0]->thread_common_param = &thread_common_param;
     thread_param[0]->thread_index = 0;
-    res = dnn_execute_layer_conv2d_thread((void *)thread_param[0]);
-    if ((int)res != DNN_SUCCESS)
-        error_flag = (int)res;
+    dnn_execute_layer_conv2d_thread((void *)thread_param[0]);
     av_free(thread_param[0]);
 #endif
 
     av_free(thread_param);
-    return error_flag;
+    return DNN_SUCCESS;
 }

From patchwork Mon Sep 14 11:31:56 2020
X-Patchwork-Submitter: Xu Jun
X-Patchwork-Id: 22376
From: xujunzz@sjtu.edu.cn
To: ffmpeg-devel@ffmpeg.org
Cc: xujunzz@sjtu.edu.cn
Date: Mon, 14 Sep 2020 19:31:56 +0800
Message-Id: <20200914113154.61946-2-xujunzz@sjtu.edu.cn>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20200914113154.61946-1-xujunzz@sjtu.edu.cn>
References: <20200914113154.61946-1-xujunzz@sjtu.edu.cn>
Subject: [FFmpeg-devel] [PATCH 2/2] dnn_backend_native_layer_conv2d.c: refine
 code.

From: Xu Jun

Compute each thread's work range (thread_start and thread_end) once in
the main thread and pass it in via thread_param, instead of deriving it
from a thread index inside each thread function.
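The range computation that now happens once in the main thread works as
in the following standalone sketch (made-up numbers; it mirrors the
thread_stride/thread_start/thread_end formula in the diff below):

    #include <stdio.h>

    int main(void)
    {
        int height = 100, pad_size = 1, thread_num = 4;
        int thread_stride = (height - pad_size * 2) / thread_num;

        for (int i = 0; i < thread_num; i++) {
            int start = thread_stride * i + pad_size;
            /* the last thread absorbs the rows left over by the
             * integer division */
            int end = (i == thread_num - 1) ? (height - pad_size)
                                            : (start + thread_stride);
            printf("thread %d: rows [%d, %d)\n", i, start, end);
        }
        return 0;
    }

With these values the interior rows 1..98 are split into [1, 25),
[25, 49), [49, 73) and [73, 99), so each row is handled by exactly one
thread.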
Signed-off-by: Xu Jun
---
 .../dnn/dnn_backend_native_layer_conv2d.c     | 29 +++++++++----------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
index 5ed1851512..57659a1283 100644
--- a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
+++ b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
@@ -33,12 +33,11 @@ typedef struct thread_common_param{
     const void *parameters;
     NativeContext *ctx;
     float *output_data;
-    int thread_num;
 } thread_common_param;
 
 typedef struct thread_param{
     thread_common_param *thread_common_param;
-    int thread_index;
+    int thread_start, thread_end;
 } thread_param;
 
 int dnn_load_layer_conv2d(Layer *layer, AVIOContext *model_file_context, int file_size, int operands_num)
@@ -126,16 +125,12 @@ static void * dnn_execute_layer_conv2d_thread(void *threadarg)
     int filter_size = conv_params->kernel_size * filter_linesize;
     int pad_size = (conv_params->padding_method == VALID) ? (conv_params->kernel_size - 1) / 2 * conv_params->dilation : 0;
 
-    int thread_stride = (height - pad_size * 2) / thread_common_param->thread_num;
-    int thread_start = thread_stride * thread_param->thread_index + pad_size;
-    int thread_end = (thread_param->thread_index == thread_common_param->thread_num - 1) ? (height - pad_size) : (thread_start + thread_stride);
-
     float *output = thread_common_param->output_data;
-    output += (conv_params->output_num) * (width - 2 * pad_size) * (thread_start - pad_size);
+    output += (conv_params->output_num) * (width - 2 * pad_size) * (thread_param->thread_start - pad_size);
 
     av_assert0(channel == conv_params->input_num);
 
-    for (int y = thread_start; y < thread_end; ++y) {
+    for (int y = thread_param->thread_start; y < thread_param->thread_end; ++y) {
         for (int x = pad_size; x < width - pad_size; ++x) {
             for (int n_filter = 0; n_filter < conv_params->output_num; ++n_filter) {
                 if (conv_params->has_bias)
@@ -207,11 +202,13 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
 
     //alloc memory
     const ConvolutionalParams *conv_params = (const ConvolutionalParams *)(parameters);
+    int height = operands[input_operand_indexes[0]].dims[1];
+    int width = operands[input_operand_indexes[0]].dims[2];
     int pad_size = (conv_params->padding_method == VALID) ? (conv_params->kernel_size - 1) / 2 * conv_params->dilation : 0;
     DnnOperand *output_operand = &operands[output_operand_index];
     output_operand->dims[0] = operands[input_operand_indexes[0]].dims[0];
-    output_operand->dims[1] = operands[input_operand_indexes[0]].dims[1] - pad_size * 2;
-    output_operand->dims[2] = operands[input_operand_indexes[0]].dims[2] - pad_size * 2;
+    output_operand->dims[1] = height - pad_size * 2;
+    output_operand->dims[2] = width - pad_size * 2;
     output_operand->dims[3] = conv_params->output_num;
     output_operand->data_type = operands[input_operand_indexes[0]].data_type;
     output_operand->length = calculate_operand_data_length(output_operand);
@@ -227,13 +224,13 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
     thread_common_param.output_data = output_operand->data;
 
 #if HAVE_PTHREAD_CANCEL
-    thread_common_param.thread_num = thread_num;
-
+    int thread_stride = (height - pad_size * 2) / thread_num;
     //create threads
     for (int i = 0; i < thread_num; i++){
         thread_param[i] = av_malloc(sizeof(**thread_param));
         thread_param[i]->thread_common_param = &thread_common_param;
-        thread_param[i]->thread_index = i;
+        thread_param[i]->thread_start = thread_stride * i + pad_size;
+        thread_param[i]->thread_end = (i == thread_num - 1) ? (height - pad_size) : (thread_param[i]->thread_start + thread_stride);
         pthread_create(&thread_id[i], NULL, dnn_execute_layer_conv2d_thread, (void *)thread_param[i]);
     }
 
@@ -249,10 +246,10 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
         av_free(thread_param[i]);
     }
 #else
-    thread_common_param.thread_num = 1;
-    thread_param[0] = av_malloc(sizeof(thread_param));
+    thread_param[0] = av_malloc(sizeof(**thread_param));
     thread_param[0]->thread_common_param = &thread_common_param;
-    thread_param[0]->thread_index = 0;
+    thread_param[0]->thread_start = 0;
+    thread_param[0]->thread_end = height - pad_size;
     dnn_execute_layer_conv2d_thread((void *)thread_param[0]);
     av_free(thread_param[0]);
 #endif