From patchwork Wed Sep 16 10:07:19 2020
X-Patchwork-Submitter: Xu Jun
X-Patchwork-Id: 22443
From: xujunzz@sjtu.edu.cn
To: ffmpeg-devel@ffmpeg.org
Date: Wed, 16 Sep 2020 18:07:19 +0800
Message-Id: <20200916100717.3142217-2-xujunzz@sjtu.edu.cn>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20200916100717.3142217-1-xujunzz@sjtu.edu.cn>
References: <20200916100717.3142217-1-xujunzz@sjtu.edu.cn>
Subject: [FFmpeg-devel] [PATCH v3
2/2] dnn_backend_native_layer_conv2d.c: refine code.

From: Xu Jun

Move the computation of each thread's work range (thread_start,
thread_end) out of the thread function into the main thread, so it is
done once instead of being repeated in every worker thread.

Signed-off-by: Xu Jun
---
 .../dnn/dnn_backend_native_layer_conv2d.c | 30 +++++++++----------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
index 5c313454f7..2aaa4162df 100644
--- a/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
+++ b/libavfilter/dnn/dnn_backend_native_layer_conv2d.c
@@ -33,12 +33,11 @@ typedef struct thread_common_param{
     const void *parameters;
     NativeContext *ctx;
     float *output_data;
-    int thread_num;
 } thread_common_param;
 
 typedef struct thread_param{
     thread_common_param *thread_common_param;
-    int thread_index;
+    int thread_start, thread_end;
 } thread_param;
 
 int dnn_load_layer_conv2d(Layer *layer, AVIOContext *model_file_context, int file_size, int operands_num)
@@ -125,16 +124,12 @@ static void * dnn_execute_layer_conv2d_thread(void *threadarg)
     int filter_size = conv_params->kernel_size * filter_linesize;
     int pad_size = (conv_params->padding_method == VALID) ? (conv_params->kernel_size - 1) / 2 * conv_params->dilation : 0;
 
-    int thread_stride = (height - pad_size * 2) / thread_common_param->thread_num;
-    int thread_start = thread_stride * thread_param->thread_index + pad_size;
-    int thread_end = (thread_param->thread_index == thread_common_param->thread_num - 1) ? (height - pad_size) : (thread_start + thread_stride);
-
     float *output = thread_common_param->output_data;
-    output += (conv_params->output_num) * (width - 2 * pad_size) * (thread_start - pad_size);
+    output += (conv_params->output_num) * (width - 2 * pad_size) * (thread_param->thread_start - pad_size);
 
     av_assert0(channel == conv_params->input_num);
 
-    for (int y = thread_start; y < thread_end; ++y) {
+    for (int y = thread_param->thread_start; y < thread_param->thread_end; ++y) {
         for (int x = pad_size; x < width - pad_size; ++x) {
             for (int n_filter = 0; n_filter < conv_params->output_num; ++n_filter) {
                 if (conv_params->has_bias)
@@ -193,16 +188,19 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
         ? (av_cpu_count() + 1) : (ctx->options.conv2d_threads);
 #if HAVE_PTHREAD_CANCEL
     pthread_t *thread_id = av_malloc(thread_num * sizeof(pthread_t));
+    int thread_stride;
 #endif
     thread_param **thread_param = av_malloc(thread_num * sizeof(*thread_param));
     thread_common_param thread_common_param;
     const ConvolutionalParams *conv_params = (const ConvolutionalParams *)(parameters);
+    int height = operands[input_operand_indexes[0]].dims[1];
+    int width = operands[input_operand_indexes[0]].dims[2];
     int pad_size = (conv_params->padding_method == VALID) ? (conv_params->kernel_size - 1) / 2 * conv_params->dilation : 0;
     DnnOperand *output_operand = &operands[output_operand_index];
 
     output_operand->dims[0] = operands[input_operand_indexes[0]].dims[0];
-    output_operand->dims[1] = operands[input_operand_indexes[0]].dims[1] - pad_size * 2;
-    output_operand->dims[2] = operands[input_operand_indexes[0]].dims[2] - pad_size * 2;
+    output_operand->dims[1] = height - pad_size * 2;
+    output_operand->dims[2] = width - pad_size * 2;
     output_operand->dims[3] = conv_params->output_num;
     output_operand->data_type = operands[input_operand_indexes[0]].data_type;
     output_operand->length = calculate_operand_data_length(output_operand);
@@ -223,13 +221,13 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
     thread_common_param.ctx = ctx;
 
 #if HAVE_PTHREAD_CANCEL
-    thread_common_param.thread_num = thread_num;
-
+    thread_stride = (height - pad_size * 2) / thread_num;
     //create threads
     for (int i = 0; i < thread_num; i++){
         thread_param[i] = av_malloc(sizeof(**thread_param));
        thread_param[i]->thread_common_param = &thread_common_param;
-        thread_param[i]->thread_index = i;
+        thread_param[i]->thread_start = thread_stride * i + pad_size;
+        thread_param[i]->thread_end = (i == thread_num - 1) ? (height - pad_size) : (thread_param[i]->thread_start + thread_stride);
         pthread_create(&thread_id[i], NULL, dnn_execute_layer_conv2d_thread, (void *)thread_param[i]);
     }
 
@@ -245,10 +243,10 @@ int dnn_execute_layer_conv2d(DnnOperand *operands, const int32_t *input_operand_
         av_free(thread_param[i]);
     }
 #else
-    thread_common_param.thread_num = 1;
-    thread_param[0] = av_malloc(sizeof(thread_param));
+    thread_param[0] = av_malloc(sizeof(**thread_param));
     thread_param[0]->thread_common_param = &thread_common_param;
-    thread_param[0]->thread_index = 0;
+    thread_param[0]->thread_start = pad_size;
+    thread_param[0]->thread_end = height - pad_size;
     dnn_execute_layer_conv2d_thread((void *)thread_param[0]);
     av_free(thread_param[0]);
 #endif