diff mbox series

[FFmpeg-devel,v1] libavfi/dnn: add Paddle Inference as one of DNN backend

Message ID tencent_68C538A3615885020246632C2D9C13D79706@qq.com
State New
Headers show
Series [FFmpeg-devel,v1] libavfi/dnn: add Paddle Inference as one of DNN backend | expand

Checks

Context Check Description
andriy/make_x86 success Make finished
andriy/make_fate_x86 fail Make fate failed

Commit Message

WenzheWang April 6, 2023, 10:36 a.m. UTC
From: "wenzhe.wang" <wongwwz@foxmail.com>

PaddlePaddle (PArallel Distributed Deep LEarning) is a simple, efficient and extensible deep learning framework that accelerates the path from research prototyping to production deployment and provides many industrial AI solutions. Official website: https://www.paddlepaddle.org.cn/en. We use Paddle's inference library, Paddle Inference, to provide high-performance inference capability. To build FFmpeg with Paddle Inference, take the following steps as a reference:

1. Download the Paddle C library from https://www.paddlepaddle.org.cn/inference/v2.4/guides/install/download_lib.html#id1; select the options that match your setup.
2. Unpack the archive into your own directory, e.g. tar xzf paddle_inference_c.tgz -C your_dir.
3. Add paddle_root/paddle/include to $C_INCLUDE_PATH and paddle_root/paddle/lib/ to $LD_LIBRARY_PATH.
4. Configure FFmpeg with ../configure --enable-libpaddle --extra-cflags="-I/paddle_root/paddle/include/ /paddle_root/third_party/install/paddle2onnx/lib/libpaddle2onnx.so /paddle_root/third_party/install/onnxruntime/lib/libonnxruntime.so /paddle_root/third_party/install/mklml/lib/libiomp5.so /paddle_root/third_party/install/mkldnn/lib/libdnnl.so.2 /paddle_root/paddle/lib/libpaddle_inference_c.so"
5. make

To run FFmpeg DNN inference with the Paddle backend: ./ffmpeg -i /input.jpg -vf dnn_processing=dnn_backend=3:model=paddle_model/model:input=input_name:output=output_name:options="input_layout=NCHW" -y output.jpg
The Paddle model directory must contain the model.pdmodel and model.pdiparams files.
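
For reference, a minimal sketch combining steps 1-5 above (assuming the archive was unpacked to /paddle_root, the same placeholder used in the configure line):

  export C_INCLUDE_PATH=/paddle_root/paddle/include:$C_INCLUDE_PATH
  export LD_LIBRARY_PATH=/paddle_root/paddle/lib:$LD_LIBRARY_PATH
  ../configure --enable-libpaddle --extra-cflags="-I/paddle_root/paddle/include/ /paddle_root/third_party/install/paddle2onnx/lib/libpaddle2onnx.so /paddle_root/third_party/install/onnxruntime/lib/libonnxruntime.so /paddle_root/third_party/install/mklml/lib/libiomp5.so /paddle_root/third_party/install/mkldnn/lib/libdnnl.so.2 /paddle_root/paddle/lib/libpaddle_inference_c.so"
  make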
---
 configure                            |   6 +-
 libavfilter/dnn/Makefile             |   1 +
 libavfilter/dnn/dnn_backend_common.c |   2 +-
 libavfilter/dnn/dnn_backend_pd.c     | 840 +++++++++++++++++++++++++++
 libavfilter/dnn/dnn_backend_pd.h     |  38 ++
 libavfilter/dnn/dnn_interface.c      |  15 +-
 libavfilter/dnn/dnn_io_proc.c        |   8 +
 libavfilter/dnn_interface.h          |   2 +-
 libavfilter/vf_dnn_detect.c          |  82 +++
 libavfilter/vf_dnn_processing.c      |   3 +
 10 files changed, 993 insertions(+), 4 deletions(-)
 create mode 100644 libavfilter/dnn/dnn_backend_pd.c
 create mode 100644 libavfilter/dnn/dnn_backend_pd.h

Comments

Jean-Baptiste Kempf April 8, 2023, 9:31 p.m. UTC | #1
On Thu, 6 Apr 2023, at 12:36, wongwwz@foxmail.com wrote:
> PaddlePaddle (PArallel Distributed Deep LEarning) is a simple, 
> efficient and extensible deep learning framework that accelerates the 

Please don't add another DNN backend.
WenzheWang April 11, 2023, 3:03 p.m. UTC | #2
Could you please briefly explain the reason for not adding any more DNN backends?




Do you have any plans for the maintenance and development of the DNN backends in the future? From my understanding, the current DNN backends are TensorFlow, OpenVINO and native, but this cannot meet users' needs.




Thus, I believe adding other DNN backends would be great for user experience, user growth, and industrial applications. In particular, different DNN backends can be adapted to different application environments, and there are emerging inference engines that are faster and stronger, such as PyTorch and Paddle. In addition, from a practical point of view, it is not difficult for a deep learning practitioner to learn and use a framework; when choosing a framework and applying it in practice, people pay more attention to the results (recall and precision) and to easy deployment, that is, high inference efficiency. The main reason why Paddle is relatively mainstream, and why I want to add a Paddle backend, is that it has very high efficiency and performance. Several projects maintained by Paddle, such as paddleDetection, paddleSeg, paddleGAN, paddleOCR and paddleCls, provide many good pre-trained models that transfer well to one's own data and perform excellently. Secondly, in terms of inference efficiency, Paddle supports many platforms and chips. Models trained with the Paddle framework can be deployed directly, and custom device interfaces are open for independent development based on one's own hardware.

FFmpeg itself already has very extensive codec support. If FFmpeg could also support deploying more inference backends, it would find even wider application.




In general, I hope that FFmpeg could support the Paddle backend, or more backends. In case my code is not mature or proper, I would be grateful if professionals like you could offer suggestions and comments. I would be absolutely honored to contribute to this project :)




Best,

Wenzhe Wang





WenzheWang
wongwwz@foxmail.com



WenzheWang May 10, 2023, 2:25 a.m. UTC | #3
Dear Madam or Sir,



Hope this email finds you well.


I am writing this email since I recently found that FFmpeg removed the DNN native backend, and I would be really grateful if you could let me know whether there is any new plan for libavfilter/dnn.


I would like to explain again why I propose adding the DNN Paddle backend.

At present, FFmpeg only supports the OpenVINO and TensorFlow backends. Among the current deep learning frameworks, TensorFlow is the most active in development: TensorFlow has 174k stars on GitHub and PyTorch has 66.5k, while OpenVINO has 4.2k and the models OpenVINO can run are relatively few. In terms of attention on GitHub, there is no doubt that TensorFlow and PyTorch are more promising. Currently, the Paddle framework has reached 20.2k stars on GitHub, which makes it much more widely used and active than frameworks such as MXNet and Caffe.

TensorFlow has a very rich ecosystem. The TensorFlow models library is updated very quickly and has ready examples of deep learning applications for image classification, object detection, image-to-text generation, and generative adversarial network models. TensorFlow backend support is undoubtedly necessary for the DNN libavfilter module. But the complexity of the TensorFlow API and of training with it is almost prohibitive, making it a love-hate framework.

The PyTorch framework tends to be used for fast academic prototyping, and its performance in industrial applications is not as good. For example, when a PyTorch model is made to run on a server, an Android phone or an embedded system, its performance is poor compared with other deep learning frameworks.



PaddlePaddle is an open-source framework from Baidu, which is also used by many people in China. It matches developers' usage habits well, although the practicality of the API still needs to be strengthened further. However, Paddle is the only deep learning framework I have used that does not require configuring any third-party libraries and can be used directly after cloning and running make. Besides, Paddle occupies a small amount of memory and is fast. It also serves a considerable number of projects inside Baidu, so it is very strong in industrial applications, and PaddlePaddle supports multi-machine, multi-card training.



The choice of deep learning framework is a personal one; the reason most of us chose Paddle is its better support for embedded development and different hardware platforms, and because the community is very active and has proposed industrial improvements and implementations for some advanced models. Especially for the GPU, it supports CUDA and OpenCL, which means we can optimize the model no matter what kind of graphics card is used. In my opinion, more backend support would further improve the DNN libavfilter module.

If there are any new changes in the DNN libavfilter module, I will be very willing to adapt our implementation to the new plan and provide continuous maintenance.




Best Regards,
Wenzhe Wang






WenzheWang
wongwwz@foxmail.com



Zhao Zhili May 10, 2023, 4:08 a.m. UTC | #4
> On May 10, 2023, at 10:25, WenzheWang <wongwwz@foxmail.com> wrote:
> 
> Dear Madam or Sir,
> 
> 
> Hope this email finds you well.
> 
> 
> I am writing this email since i recently found FFmepg remove DNN native  backend, and i will be really grateful if you let me know if there is  any new plan on libavfilter/dnn.
> 
> 
> I would like to explain to you again about the addition of dnn paddle backend.
> 
> At  present, ffmpeg only supports openvino and tensorflow backend. Among  the current deep learning frameworks, TensorFlow is the most active in  development. TensorFlow has 174k stars and pytorch has 66.5k. openvino  is 4.2k, and the models that openvino can implement are relatively few.  But in terms of attention on GitHub, there's no doubt that TensorFlow  and pytorch are more promising. Currently, the paddle framework has  reached 20.2k stars on github, which is much more widely used and active  than frameworks such as mxnet and caffe.

Stars don't matter much here.

Just for reference, there is a thread before:

https://patchwork.ffmpeg.org/project/ffmpeg/patch/20220523092918.9548-2-ting.fu@intel.com/

Guo, Yejun May 10, 2023, 11:30 a.m. UTC | #5
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces@ffmpeg.org> On Behalf Of
> "zhilizhao(赵志立)"
> Sent: Wednesday, May 10, 2023 12:09 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v1] libavfi/dnn: add Paddle Inference as
> one of DNN backend
> 
> 
> 
> > On May 10, 2023, at 10:25, WenzheWang <wongwwz@foxmail.com> wrote:
> >
> > Dear Madam or Sir,
> >
> >
> > Hope this email finds you well.
> >
> >
> > I am writing this email since i recently found FFmepg remove DNN native
> backend, and i will be really grateful if you let me know if there is  any new
> plan on libavfilter/dnn.
> >
> >
> > I would like to explain to you again about the addition of dnn paddle
> backend.
> >
> > At  present, ffmpeg only supports openvino and tensorflow backend.
> Among  the current deep learning frameworks, TensorFlow is the most active
> in  development. TensorFlow has 174k stars and pytorch has 66.5k. openvino
> is 4.2k, and the models that openvino can implement are relatively few.  But
> in terms of attention on GitHub, there's no doubt that TensorFlow  and
> pytorch are more promising. Currently, the paddle framework has  reached
> 20.2k stars on github, which is much more widely used and active  than
> frameworks such as mxnet and caffe.
> 
> Stars don't matter much here.
> 
> Just for reference, there is a thread before:
> 
> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20220523092918.9548-
> 2-ting.fu@intel.com/
> 
> >
> > Tensoflow has a very  rich ecosystem. The TensorFlow models library
> updates very quickly and  has existing examples of deep learning applications
> for image  classification, object detection, image generation text, and
> generation  of adversus-network models. The dnn libavfilter module is
> undoubtedly very necessary for tensorflow  backend to support. But the
> complexity of the TensorFlow API and the  complexity of the training are
> almost prohibitive, making it a love-hate  framework.
> >
> > PyTorch framework tends to be applied to academic  fast implementation,
> and its industrial application performance is not  good. For example, Pytorch
> framework makes a model to run on a server,  Android phone or embedded
> system, and its performance is poor compared  with other deep learning
> frameworks.
> >
> >
> > PaddlePadddle  is an open source framework of Baidu, which is also used
> by many people  in China. It is very consistent with the usage habits of
> developers,  but the practicability of the API still needs to be further
> strengthened. However, Paddle is the only deep learning framework I  have
> ever used, which does not configure any third-party libraries and  can be
> used directly by cloning make. Besides, Paddle occupies a small  amount of
> memory and is fast. It also serves a considerable number of  projects inside
> Baidu, which is very strong in industrial application.  And PaddlePaddle
> supports multiple machine and multiple card training.

Imo, we can add one or two more DNN backends, as discussed at
http://ffmpeg.org/pipermail/ffmpeg-devel/2022-December/304534.html

The background is that we see good models coming from different deep learning
frameworks, and most frameworks do not support models developed with other
frameworks due to the different model formats. Imo, we should support several popular frameworks.
WenzheWang May 11, 2023, 7:53 a.m. UTC | #6
Thank you for your reply and I am very glad to receive your opinion.

For me, GitHub stars are not very important; I just wanted to use them to show how many people are using these deep learning frameworks right now. However, I have now learned about the DNN module's plans for other frameworks.

I think it would be a good idea to add a glue layer, and I hope to see it implemented soon. Would I have the opportunity to participate in the development of this module?

Thank you again for answering my question.


best,
Wenzhe



WenzheWang
wongwwz@foxmail.com



diff mbox series

Patch

diff --git a/configure b/configure
index 6e363eb470..41cc8f99e2 100755
--- a/configure
+++ b/configure
@@ -276,6 +276,8 @@  External library support:
   --enable-libsvtav1       enable AV1 encoding via SVT [no]
   --enable-libtensorflow   enable TensorFlow as a DNN module backend
                            for DNN based filters like sr [no]
+  --enable-libpaddle       enable Paddlepaddle as a DNN module backend
+                           for DNN based filters like sr [no]
   --enable-libtesseract    enable Tesseract, needed for ocr filter [no]
   --enable-libtheora       enable Theora encoding via libtheora [no]
   --enable-libtls          enable LibreSSL (via libtls), needed for https support
@@ -1855,6 +1857,7 @@  EXTERNAL_LIBRARY_LIST="
     libssh
     libsvtav1
     libtensorflow
+    libpaddle
     libtesseract
     libtheora
     libtwolame
@@ -2717,7 +2720,7 @@  dct_select="rdft"
 deflate_wrapper_deps="zlib"
 dirac_parse_select="golomb"
 dovi_rpu_select="golomb"
-dnn_suggest="libtensorflow libopenvino"
+dnn_suggest="libtensorflow libopenvino libpaddle"
 dnn_deps="avformat swscale"
 error_resilience_select="me_cmp"
 faandct_deps="faan"
@@ -6695,6 +6698,7 @@  enabled libspeex          && require_pkg_config libspeex speex speex/speex.h spe
 enabled libsrt            && require_pkg_config libsrt "srt >= 1.3.0" srt/srt.h srt_socket
 enabled libsvtav1         && require_pkg_config libsvtav1 "SvtAv1Enc >= 0.9.0" EbSvtAv1Enc.h svt_av1_enc_init_handle
 enabled libtensorflow     && require libtensorflow tensorflow/c/c_api.h TF_Version -ltensorflow
+enabled libpaddle         && require libpaddle pd_inference_api.h PD_GetVersion $CFLAGS
 enabled libtesseract      && require_pkg_config libtesseract tesseract tesseract/capi.h TessBaseAPICreate
 enabled libtheora         && require libtheora theora/theoraenc.h th_info_init -ltheoraenc -ltheoradec -logg
 enabled libtls            && require_pkg_config libtls libtls tls.h tls_configure
diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
index 4cfbce0efc..b9fc8e4a30 100644
--- a/libavfilter/dnn/Makefile
+++ b/libavfilter/dnn/Makefile
@@ -16,5 +16,6 @@  OBJS-$(CONFIG_DNN)                           += dnn/dnn_backend_native_layer_mat
 
 DNN-OBJS-$(CONFIG_LIBTENSORFLOW)             += dnn/dnn_backend_tf.o
 DNN-OBJS-$(CONFIG_LIBOPENVINO)               += dnn/dnn_backend_openvino.o
+DNN-OBJS-$(CONFIG_LIBPADDLE)                 += dnn/dnn_backend_pd.o
 
 OBJS-$(CONFIG_DNN)                           += $(DNN-OBJS-yes)
diff --git a/libavfilter/dnn/dnn_backend_common.c b/libavfilter/dnn/dnn_backend_common.c
index 91a4a3c4bf..5adeb6bb3b 100644
--- a/libavfilter/dnn/dnn_backend_common.c
+++ b/libavfilter/dnn/dnn_backend_common.c
@@ -43,7 +43,7 @@  int ff_check_exec_params(void *ctx, DNNBackendType backend, DNNFunctionType func
         return AVERROR(EINVAL);
     }
 
-    if (exec_params->nb_output != 1 && backend != DNN_TF) {
+    if (exec_params->nb_output != 1 && backend != DNN_TF && backend != DNN_PD) {
         // currently, the filter does not need multiple outputs,
         // so we just pending the support until we really need it.
         avpriv_report_missing_feature(ctx, "multiple outputs");
diff --git a/libavfilter/dnn/dnn_backend_pd.c b/libavfilter/dnn/dnn_backend_pd.c
new file mode 100644
index 0000000000..b397b945b1
--- /dev/null
+++ b/libavfilter/dnn/dnn_backend_pd.c
@@ -0,0 +1,840 @@ 
+/*
+ * Copyright (c) 2023
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * DNN paddle backend implementation.
+ */
+
+#include "dnn_backend_pd.h"
+#include "dnn_backend_native.h"
+#include "libavutil/avassert.h"
+#include "libavutil/avstring.h"
+#include "libavutil/cpu.h"
+#include "../internal.h"
+#include "dnn_io_proc.h"
+#include "dnn_backend_common.h"
+#include "safe_queue.h"
+#include <paddle/pd_inference_api.h>
+
+typedef struct PDOptions {
+    float im_height, im_width;
+    float scale_factorH, scale_factorW;
+    char *input_layout;
+    uint8_t async;
+    uint32_t nireq;
+} PDOptions;
+
+typedef struct PDContext {
+    const AVClass *class;
+    PDOptions options;
+} PDContext;
+
+typedef struct PDModel {
+    PDContext ctx;
+    DNNModel *model;
+    PD_Config *config;
+    PD_Predictor *predictor;
+    PD_Bool status;
+    SafeQueue *request_queue;
+    Queue *lltask_queue;
+    Queue *task_queue;
+} PDModel;
+/**
+ * Stores execution parameters for single
+ * call to the Paddlepaddle C API
+ */
+typedef struct PDInferRequest {
+
+    PD_OneDimArrayCstr *input_names;
+    PD_OneDimArrayCstr *output_names;
+    PD_Tensor **output_tensors;
+    PD_Tensor *input_tensor;
+} PDInferRequest;
+
+typedef struct PDRequestItem {
+    PDInferRequest *infer_request;
+    LastLevelTaskItem *lltask;
+    PD_Bool status;
+    DNNAsyncExecModule exec_module;
+} PDRequestItem;
+
+#define OFFSET(x) offsetof(PDContext, x)
+#define FLAGS AV_OPT_FLAG_FILTERING_PARAM
+static const AVOption dnn_paddle_options[] = {
+        {"im_height", "image shape(H,W)", OFFSET(options.im_height), AV_OPT_TYPE_FLOAT, {.dbl = 320}, 0, 10000,
+         FLAGS},
+        {"im_width", "image shape(H,W)", OFFSET(options.im_width), AV_OPT_TYPE_FLOAT, {.dbl = 320}, 0, 10000,
+         FLAGS},
+        {"scale_factorH", "scalar factor for height", OFFSET(options.scale_factorH), AV_OPT_TYPE_FLOAT, {.dbl = 1.0},
+         0, 10000,
+         FLAGS},
+        {"scale_factorW", "scalar factor for height", OFFSET(options.scale_factorW), AV_OPT_TYPE_FLOAT, {.dbl = 1.0},
+         0, 10000,
+         FLAGS},
+        {"input_layout", "NHWC or NCHW", OFFSET(options.input_layout), AV_OPT_TYPE_STRING, {.str = "NCHW"}, 0, 0,
+         FLAGS},
+        DNN_BACKEND_COMMON_OPTIONS
+        {NULL}
+};
+
+AVFILTER_DEFINE_CLASS(dnn_paddle);
+
+static int execute_model_pd(PDRequestItem *request, Queue *lltask_queue);
+
+static void infer_completion_callback(void *args);
+
+static inline void destroy_request_item(PDRequestItem **arg);
+
+
+/**
+ * Free the contents of Paddle inference request.
+ * It does not free the PDInferRequest instance.
+ *
+ * @param request pointer to PDInferRequest instance.
+ * NULL pointer is allowed.
+ */
+static void pd_free_request(PDInferRequest *request) {
+    if (!request)
+        return;
+    if (request->input_tensor) {
+        PD_TensorDestroy(request->input_tensor);
+        request->input_tensor = NULL;
+    }
+    av_freep(&request->input_names);
+    av_freep(&request->output_names);
+    if (request->output_tensors) {
+        int nb_output = sizeof(*request->output_tensors) / sizeof(request->output_tensors[0]);
+        for (uint32_t i = 0; i < nb_output; ++i) {
+            if (request->output_tensors[i]) {
+                PD_TensorDestroy(request->output_tensors[i]);
+                request->output_tensors[i] = NULL;
+            }
+        }
+        av_freep(&request->output_tensors);
+    }
+}
+
+/**
+ * Free the PDRequestItem completely.
+ *
+ * @param arg Address of the PDRequestItem instance.
+ */
+static inline void destroy_request_item(PDRequestItem **arg) {
+    PDRequestItem *request;
+    if (!arg) {
+        return;
+    }
+    request = *arg;
+    pd_free_request(request->infer_request);
+    av_freep(&request->infer_request);
+    av_freep(&request->lltask);
+    ff_dnn_async_module_cleanup(&request->exec_module);
+    av_freep(arg);
+}
+
+/**
+ * Create a Paddle inference request. All properties
+ * are initially unallocated and set as NULL.
+ *
+ * @return pointer to the allocated PDInferRequest instance.
+ */
+static PDInferRequest *pd_create_inference_request(void) {
+    PDInferRequest *infer_request = av_malloc(sizeof(PDInferRequest));
+    if (!infer_request) {
+        return NULL;
+    }
+    infer_request->input_names = NULL;
+    infer_request->output_names = NULL;
+    infer_request->input_tensor = NULL;
+    infer_request->output_tensors = NULL;
+    return infer_request;
+}
+
+static int load_pd_model(PDModel *pd_model, const char *model_filename) {
+
+    PDContext *ctx = &pd_model->ctx;
+    char *model_path = (char *) malloc(strlen(model_filename) + strlen(".pdmodel")+1);
+    char *params_path = (char *) malloc(strlen(model_filename) + strlen(".pdiparams")+1);
+    pd_model->config = PD_ConfigCreate();
+    strcpy(model_path, model_filename);
+    strcat(model_path, ".pdmodel");
+    strcpy(params_path, model_filename);
+    strcat(params_path, ".pdiparams");
+    PD_ConfigSetModel(pd_model->config, model_path, params_path);
+    free(model_path);
+    free(params_path);
+    pd_model->status = PD_ConfigIsValid(pd_model->config);
+    pd_model->predictor = PD_PredictorCreate(pd_model->config);
+    if (!pd_model->status) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to read model \"%s\" graph\n", model_filename);
+        PD_ConfigDestroy(pd_model->config);
+        PD_PredictorDestroy(pd_model->predictor);
+        return DNN_GENERIC_ERROR;
+    }
+    return 0;
+}
+
+static float *transposeNHWC2NCHW(float *data, const int32_t shape[4]) {
+    // the shape layout is NCHW
+    int N = shape[0];
+    int H = shape[2];
+    int W = shape[3];
+    int C = shape[1];
+    float *transposed = calloc(shape[0] * shape[1] * shape[2] * shape[3], sizeof(float));
+    // [N,H,W,C] -> [N,C,H,W]
+    for (int n = 0; n < N; ++n) {
+        for (int c = 0; c < C; ++c) {
+            for (int h = 0; h < H; ++h) {
+                for (int w = 0; w < W; ++w) {
+                    int old_index = n * H * W * C + h * W * C + w * C + c;
+                    int new_index = n * C * H * W + c * H * W + h * W + w;
+                    transposed[new_index] = data[old_index];
+                }
+            }
+        }
+    }
+    memcpy(data, transposed, shape[0] * shape[1] * shape[2] * shape[3] * sizeof(float));
+    free(transposed);
+    return data;
+}
+
+static float *transposeNCHW2NHWC(float *data, const int32_t shape[4]) {
+    // the shape layout is NCHW
+    int N = shape[0];
+    int C = shape[1];
+    int H = shape[2];
+    int W = shape[3];
+    float *transposed = calloc(shape[0] * shape[1] * shape[2] * shape[3], sizeof(float));
+    // [N,C,H,W] -> [N,H,W,C]
+    for (int n = 0; n < N; ++n) {
+        for (int h = 0; h < H; ++h) {
+            for (int w = 0; w < W; ++w) {
+                for (int c = 0; c < C; ++c) {
+                    int old_index = n * C * H * W + c * H * W + h * W + w;
+                    int new_index = n * H * W * C + h * W * C + w * C + c;
+                    transposed[new_index] = data[old_index];
+                }
+            }
+        }
+    }
+    memcpy(data, transposed, shape[0] * shape[1] * shape[2] * shape[3] * sizeof(float));
+    free(transposed);
+    return data;
+}
+
+static int get_name_index(PDModel *pd_model, TaskItem *task) {
+    int name_index = -1;
+    PD_OneDimArrayCstr *pd_input_names = PD_PredictorGetInputNames(pd_model->predictor);
+    for (int i = 0; i < pd_input_names->size; ++i) {
+        if (strcmp(pd_input_names->data[i], task->input_name) == 0) {
+            name_index = i;
+        }
+    }
+    PD_OneDimArrayCstrDestroy(pd_input_names);
+    if (name_index == -1) {
+        av_log(&pd_model->ctx, AV_LOG_ERROR, "Could not find \"%s\" in model\n", task->input_name);
+        return AVERROR(EINVAL);
+    }
+    return name_index;
+}
+
+static int pd_start_inference(void *args) {
+    DNNData input;
+    PDRequestItem *request = args;
+    PDInferRequest *infer_request = request->infer_request;
+    LastLevelTaskItem *lltask = request->lltask;
+    TaskItem *task = lltask->task;
+    PDModel *pd_model = task->model;
+    // get input data nhwc
+    PD_Tensor *input_tensor = infer_request->input_tensor;
+    int32_t input_shape[4] = {1, -1, -1, -1};
+
+    for (int i = 0; i < infer_request->input_names->size; ++i) {
+
+        if (strcmp(infer_request->input_names->data[i], "im_shape") == 0) {
+            PD_Tensor *im_shape_tensor = PD_PredictorGetInputHandle(pd_model->predictor,
+                                                                    infer_request->input_names->data[i]);
+            int32_t im_shape_shape[2] = {1, 2};
+            float im_shape_data[2] = {pd_model->ctx.options.im_height, pd_model->ctx.options.im_width};
+            PD_TensorReshape(im_shape_tensor, 2, im_shape_shape);
+            PD_TensorCopyFromCpuFloat(im_shape_tensor, im_shape_data);
+        } else if (strcmp(infer_request->input_names->data[i], "scale_factor") == 0) {
+            PD_Tensor *scale_factor_tensor = PD_PredictorGetInputHandle(pd_model->predictor,
+                                                                        infer_request->input_names->data[i]);
+            int32_t scale_factor_shape[2] = {1, 2};
+            float scale_factor_data[2] = {pd_model->ctx.options.scale_factorH, pd_model->ctx.options.scale_factorW};
+            PD_TensorReshape(scale_factor_tensor, 2, scale_factor_shape);
+            PD_TensorCopyFromCpuFloat(scale_factor_tensor, scale_factor_data);
+        }
+    }
+
+    if (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0) {
+        input_shape[1] = 3;
+        input_shape[2] = task->in_frame->height;
+        input_shape[3] = task->in_frame->width;
+    } else if (strcmp(pd_model->ctx.options.input_layout, "NHWC") == 0) {
+        input_shape[1] = task->in_frame->height;
+        input_shape[2] = task->in_frame->width;
+        input_shape[3] = 3;
+    } else {
+        av_log(&pd_model->ctx, AV_LOG_ERROR, "The input layout should be NCHW or NHWC\n");
+    }
+    float *in_data = (float *) calloc(1 * input_shape[1] * input_shape[2] * input_shape[3], sizeof(float));
+    PD_TensorCopyToCpuFloat(input_tensor, in_data);
+    if (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0) {
+        in_data = transposeNHWC2NCHW(in_data, input_shape);
+    }
+
+    PD_TensorReshape(input_tensor, 4, input_shape);
+    PD_TensorCopyFromCpuFloat(input_tensor, in_data);
+    free(in_data);
+
+    request->status = PD_PredictorRun(pd_model->predictor);
+
+    if (!request->status) {
+        av_log(&pd_model->ctx, AV_LOG_ERROR, "%s", "PaddlePaddle predictor run failed!");
+        pd_free_request(infer_request);
+        if (ff_safe_queue_push_back(pd_model->request_queue, request) < 0) {
+            destroy_request_item(&request);
+        }
+        return DNN_GENERIC_ERROR;
+    }
+    return 0;
+}
+
+static void infer_completion_callback(void *args) {
+    PDRequestItem *request = args;
+    LastLevelTaskItem *lltask = request->lltask;
+    TaskItem *task = lltask->task;
+    DNNData *outputs;
+    PDInferRequest *infer_request = request->infer_request;
+    PDModel *pd_model = task->model;
+    PDContext *ctx = &pd_model->ctx;
+
+    outputs = av_malloc_array(task->nb_output, sizeof(*outputs));
+    if (!outputs) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for *outputs\n");
+        goto err;
+    }
+
+    for (uint32_t i = 0; i < task->nb_output; ++i) {
+        const size_t shape_size = PD_TensorGetShape(infer_request->output_tensors[i])->size;
+        int32_t length = 1;
+        PD_DataType out_dt = PD_TensorGetDataType(infer_request->output_tensors[i]);
+        size_t size;
+        float *out_data;
+
+        if (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0) {
+            outputs[i].height = PD_TensorGetShape(infer_request->output_tensors[i])->data[2];
+            outputs[i].width = PD_TensorGetShape(infer_request->output_tensors[i])->data[3];
+            outputs[i].channels = PD_TensorGetShape(infer_request->output_tensors[i])->data[1];
+        } else {
+            outputs[i].height = PD_TensorGetShape(infer_request->output_tensors[i])->data[1];
+            outputs[i].width = PD_TensorGetShape(infer_request->output_tensors[i])->data[2];
+            outputs[i].channels = PD_TensorGetShape(infer_request->output_tensors[i])->data[3];
+        }
+
+        for (int j = 0; j < shape_size; ++j) {
+            length *= PD_TensorGetShape(infer_request->output_tensors[i])->data[j];
+        }
+
+        if (out_dt != PD_DATA_FLOAT32){
+            av_log(&pd_model->ctx, AV_LOG_ERROR, "The model output datatype has to be float.\n");
+        } else {
+            outputs[i].dt = DNN_FLOAT;
+            size = sizeof(float);
+            out_data = (float *) malloc(length * size);
+            PD_TensorCopyToCpuFloat(infer_request->output_tensors[i], out_data);
+        }
+
+        if (shape_size == 4 && (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0)) {
+            int32_t output_shape[4] = {PD_TensorGetShape(infer_request->output_tensors[i])->data[0],
+                                       PD_TensorGetShape(infer_request->output_tensors[i])->data[1],
+                                       PD_TensorGetShape(infer_request->output_tensors[i])->data[2],
+                                       PD_TensorGetShape(infer_request->output_tensors[i])->data[3]};
+            out_data = transposeNCHW2NHWC(out_data, output_shape);
+        }
+
+        outputs[i].order = DCO_BGR;
+        outputs[i].data = out_data;
+    }
+    switch (pd_model->model->func_type) {
+        case DFT_PROCESS_FRAME:
+            // it only supports 1 output if it's frame in & frame out
+            if (task->do_ioproc) {
+                if (pd_model->model->frame_post_proc != NULL) {
+                    pd_model->model->frame_post_proc(task->out_frame, outputs, pd_model->model->filter_ctx);
+                } else {
+                    ff_proc_from_dnn_to_frame(task->out_frame, outputs, ctx);
+                }
+            } else {
+                task->out_frame->width = outputs[0].width;
+                task->out_frame->height = outputs[0].height;
+            }
+            break;
+        case DFT_ANALYTICS_DETECT:
+            if (!pd_model->model->detect_post_proc) {
+                av_log(ctx, AV_LOG_ERROR, "Detect filter needs provide post proc\n");
+                return;
+            }
+            pd_model->model->detect_post_proc(task->in_frame, outputs, task->nb_output, pd_model->model->filter_ctx);
+            break;
+        default:
+            av_log(ctx, AV_LOG_ERROR, "Paddle Inference backend does not support this kind of dnn filter now\n");
+            goto err;
+    }
+    task->inference_done++;
+    err:
+    pd_free_request(infer_request);
+    av_freep(&outputs);
+    if (ff_safe_queue_push_back(pd_model->request_queue, request) < 0) {
+        destroy_request_item(&request);
+        av_log(ctx, AV_LOG_ERROR, "Failed to push back request_queue.\n");
+    }
+}
+
+static int extract_lltask_from_task(TaskItem *task, Queue *lltask_queue) {
+    PDModel *pd_model = task->model;
+    PDContext *ctx = &pd_model->ctx;
+    LastLevelTaskItem *lltask = av_malloc(sizeof(*lltask));
+    if (!lltask) {
+        av_log(ctx, AV_LOG_ERROR, "Unable to allocate space for LastLevelTaskItem\n");
+        return AVERROR(ENOMEM);
+    }
+    task->inference_todo = 1;
+    task->inference_done = 0;
+    lltask->task = task;
+    if (ff_queue_push_back(lltask_queue, lltask) < 0) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to push back lltask_queue.\n");
+        av_freep(&lltask);
+        return AVERROR(ENOMEM);
+    }
+    return 0;
+}
+
+static int get_input_pd(void *model, DNNData *input, const char *input_name) {
+    PDModel *pd_model = model;
+    PDContext *ctx = &pd_model->ctx;
+    int has_name = -1;
+    PD_OneDimArrayCstr *pd_input_names = PD_PredictorGetInputNames(pd_model->predictor);
+    for (int i = 0; i < pd_input_names->size; ++i) {
+        if (strcmp(pd_input_names->data[i], input_name) == 0) {
+            has_name = i;
+            break;
+        }
+    }
+    PD_OneDimArrayCstrDestroy(pd_input_names);
+    if (has_name == -1) {
+        av_log(ctx, AV_LOG_ERROR, "Could not find \"%s\" in model\n", input_name);
+        return AVERROR(EINVAL);
+    }
+    input->dt = DNN_FLOAT;
+    input->order = DCO_RGB;
+    input->height = -1;
+    input->width = -1;
+    input->channels = 3;
+    return 0;
+}
+
+static int get_output_pd(void *model, const char *input_name, int input_width, int input_height,
+                         const char *output_name, int *output_width, int *output_height) {
+    int ret = 0;
+    PDModel *pd_model = model;
+    PDContext *ctx = &pd_model->ctx;
+    TaskItem task;
+    PDRequestItem *request;
+    DNNExecBaseParams exec_params = {
+            .input_name     = input_name,
+            .output_names   = &output_name,
+            .nb_output      = 1,
+            .in_frame       = NULL,
+            .out_frame      = NULL,
+    };
+
+    ret = ff_dnn_fill_gettingoutput_task(&task, &exec_params, pd_model, input_height, input_width, ctx);
+    if (ret != 0) {
+        goto err;
+    }
+
+    ret = extract_lltask_from_task(&task, pd_model->lltask_queue);
+    if (ret != 0) {
+        av_log(ctx, AV_LOG_ERROR, "unable to extract inference from task.\n");
+        goto err;
+    }
+
+    request = ff_safe_queue_pop_front(pd_model->request_queue);
+    if (!request) {
+        av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+        ret = AVERROR(EINVAL);
+        goto err;
+    }
+
+    ret = execute_model_pd(request, pd_model->lltask_queue);
+    *output_width = task.out_frame->width;
+    *output_height = task.out_frame->height;
+
+    err:
+    av_frame_free(&task.out_frame);
+    av_frame_free(&task.in_frame);
+    return ret;
+}
+
+DNNModel *ff_dnn_load_model_pd(const char *model_filename, DNNFunctionType func_type, const char *options,
+                               AVFilterContext *filter_ctx) {
+    DNNModel *model = NULL;
+    PDModel *pd_model = NULL;
+    PDRequestItem *item = NULL;
+    PDContext *ctx = NULL;
+
+    model = av_mallocz(sizeof(DNNModel));
+    if (!model) {
+        return NULL;
+    }
+
+    pd_model = av_mallocz(sizeof(PDModel));
+    if (!pd_model) {
+        av_freep(&model);
+        return NULL;
+    }
+    pd_model->model = model;
+    ctx = &pd_model->ctx;
+    ctx->class = &dnn_paddle_class;
+
+    //parse options
+    av_opt_set_defaults(ctx);
+    if (av_opt_set_from_string(ctx, options, NULL, "=", "&") < 0) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to parse options \"%s\"\n", options);
+        goto err;
+    }
+
+    if (load_pd_model(pd_model, model_filename) != 0) {
+        goto err;
+    }
+
+    if (ctx->options.nireq <= 0) {
+        ctx->options.nireq = av_cpu_count() / 2 + 1;
+    }
+
+#if !HAVE_PTHREAD_CANCEL
+    if (ctx->options.async) {
+        ctx->options.async = 0;
+        av_log(filter_ctx, AV_LOG_WARNING, "pthread is not supported, roll back to sync.\n");
+    }
+#endif
+
+    pd_model->request_queue = ff_safe_queue_create();
+    if (!pd_model->request_queue) {
+        goto err;
+    }
+
+    for (int i = 0; i < ctx->options.nireq; i++) {
+        PDRequestItem *item = av_mallocz(sizeof(*item));
+        if (!item) {
+            goto err;
+        }
+        item->lltask = NULL;
+        item->infer_request = pd_create_inference_request();
+        if (!item->infer_request) {
+            av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for Paddle inference request\n");
+            av_freep(&item);
+            goto err;
+        }
+        item->exec_module.start_inference = &pd_start_inference;
+        item->exec_module.callback = &infer_completion_callback;
+        item->exec_module.args = item;
+
+        if (ff_safe_queue_push_back(pd_model->request_queue, item) < 0) {
+            destroy_request_item(&item);
+            goto err;
+        }
+    }
+
+    pd_model->lltask_queue = ff_queue_create();
+    if (!pd_model->lltask_queue) {
+        goto err;
+    }
+
+    pd_model->task_queue = ff_queue_create();
+    if (!pd_model->task_queue) {
+        goto err;
+    }
+
+    model->model = pd_model;
+    model->get_input = &get_input_pd;
+    model->get_output = &get_output_pd;
+    model->options = options;
+    model->filter_ctx = filter_ctx;
+    model->func_type = func_type;
+
+    return model;
+    err:
+    ff_dnn_free_model_pd(&model);
+    return NULL;
+}
+
+static int fill_model_input_pd(PDModel *pd_model, PDRequestItem *request) {
+    DNNData input;
+    LastLevelTaskItem *lltask;
+    TaskItem *task;
+    PDInferRequest *infer_request;
+    PDContext *ctx = &pd_model->ctx;
+    int ret = 0;
+    int32_t input_shape[4] = {1, -1, -1, -1};
+
+    lltask = ff_queue_pop_front(pd_model->lltask_queue);
+    av_assert0(lltask);
+    task = lltask->task;
+    request->lltask = lltask;
+
+    ret = get_input_pd(pd_model, &input, task->input_name);
+    if (ret != 0) {
+        goto err;
+    }
+
+    infer_request = request->infer_request;
+    input.height = task->in_frame->height;
+    input.width = task->in_frame->width;
+
+    infer_request->input_names = av_malloc(sizeof(PD_OneDimArrayCstr));
+    if (!infer_request->input_names) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for input tensor\n");
+        ret = AVERROR(ENOMEM);
+        goto err;
+    }
+
+    int name_index = get_name_index(pd_model, task);
+    infer_request->input_names = PD_PredictorGetInputNames(pd_model->predictor);
+    infer_request->input_tensor = PD_PredictorGetInputHandle(pd_model->predictor,
+                                                             infer_request->input_names->data[name_index]);
+    if (!infer_request->input_tensor) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for input tensor\n");
+        ret = AVERROR(ENOMEM);
+        goto err;
+    }
+
+    if (strcmp(pd_model->ctx.options.input_layout, "NCHW") == 0) {
+        input_shape[1] = input.channels;
+        input_shape[2] = input.height;
+        input_shape[3] = input.width;
+    } else if (strcmp(pd_model->ctx.options.input_layout, "NHWC") == 0) {
+        input_shape[1] = input.height;
+        input_shape[2] = input.width;
+        input_shape[3] = input.channels;
+    } else {
+        av_log(ctx, AV_LOG_ERROR, "The input layout should be NCHW or NHWC\n");
+    }
+    float *in_data = (float *) calloc(1 * input_shape[1] * input_shape[2] * input_shape[3], sizeof(float));
+    PD_TensorReshape(infer_request->input_tensor, 4, input_shape);
+    input.data = in_data;
+    PD_TensorCopyFromCpuFloat(infer_request->input_tensor, input.data);
+
+    switch (pd_model->model->func_type) {
+        case DFT_PROCESS_FRAME:
+            if (task->do_ioproc) {
+                if (pd_model->model->frame_pre_proc != NULL) {
+                    pd_model->model->frame_pre_proc(task->in_frame, &input, pd_model->model->filter_ctx);
+                } else {
+                    ff_proc_from_frame_to_dnn(task->in_frame, &input, ctx);
+                }
+                PD_TensorCopyFromCpuFloat(infer_request->input_tensor, input.data);
+            }
+            break;
+        case DFT_ANALYTICS_DETECT:
+            ff_proc_from_frame_to_dnn(task->in_frame, &input, ctx);
+            PD_TensorCopyFromCpuFloat(infer_request->input_tensor, input.data);
+            break;
+        default:
+            avpriv_report_missing_feature(ctx, "model function type %d", pd_model->model->func_type);
+            break;
+    }
+
+    infer_request->output_names = PD_PredictorGetOutputNames(pd_model->predictor);
+    if (infer_request->output_names == NULL) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for *pd_outputs\n");
+        ret = AVERROR(ENOMEM);
+        goto err;
+    }
+
+    infer_request->output_tensors = av_calloc(task->nb_output, sizeof(*infer_request->output_tensors));
+    if (!infer_request->output_tensors) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to allocate memory for output tensor\n");
+        ret = AVERROR(ENOMEM);
+        goto err;
+    }
+
+
+    for (int i = 0; i < task->nb_output; ++i) {
+        infer_request->output_tensors[i] = PD_PredictorGetOutputHandle(pd_model->predictor,
+                                                                       infer_request->output_names->data[i]);
+        if (strcmp(infer_request->output_names->data[i], task->output_names[i]) != 0) {
+            av_log(ctx, AV_LOG_ERROR, "Could not find output \"%s\" in model\n", task->output_names[i]);
+            ret = DNN_GENERIC_ERROR;
+            goto err;
+        }
+    }
+    return 0;
+err:
+    pd_free_request(infer_request);
+    return ret;
+}
+
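+/* Run one synchronous inference for the next pending task in lltask_queue and
+ * deliver the result through the completion callback. */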
+static int execute_model_pd(PDRequestItem *request, Queue *lltask_queue) {
+    PDModel *pd_model;
+    PDContext *ctx;
+    LastLevelTaskItem *lltask;
+    TaskItem *task;
+    int ret;
+
+    if (ff_queue_size(lltask_queue) == 0) {
+        destroy_request_item(&request);
+        return 0;
+    }
+
+    lltask = ff_queue_peek_front(lltask_queue);
+    task = lltask->task;
+    pd_model = task->model;
+    ctx = &pd_model->ctx;
+
+    ret = fill_model_input_pd(pd_model, request);
+    if (ret != 0) {
+        goto err;
+    }
+
+    ret = pd_start_inference(request);
+    if (ret != 0) {
+        goto err;
+    }
+    infer_completion_callback(request);
+    return (task->inference_done == task->inference_todo) ? 0 : DNN_GENERIC_ERROR;
+
+err:
+    pd_free_request(request->infer_request);
+    if (ff_safe_queue_push_back(pd_model->request_queue, request) < 0) {
+        destroy_request_item(&request);
+    }
+    return ret;
+}
+
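+/* Filter-facing entry point: wrap the execution parameters in a TaskItem,
+ * queue it and run it synchronously on a free request item. */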
+int ff_dnn_execute_model_pd(const DNNModel *model, DNNExecBaseParams *exec_params) {
+    PDModel *pd_model = model->model;
+    PDContext *ctx = &pd_model->ctx;
+    TaskItem *task;
+    PDRequestItem *request;
+    int ret = 0;
+
+    ret = ff_check_exec_params(ctx, DNN_PD, model->func_type, exec_params);
+    if (ret != 0) {
+        return ret;
+    }
+
+    task = av_malloc(sizeof(*task));
+    if (!task) {
+        av_log(ctx, AV_LOG_ERROR, "unable to alloc memory for task item.\n");
+        return AVERROR(ENOMEM);
+    }
+
+    ret = ff_dnn_fill_task(task, exec_params, pd_model, ctx->options.async, 1);
+    if (ret != 0) {
+        av_freep(&task);
+        return ret;
+    }
+
+    if (ff_queue_push_back(pd_model->task_queue, task) < 0) {
+        av_freep(&task);
+        av_log(ctx, AV_LOG_ERROR, "unable to push back task_queue.\n");
+        return AVERROR(ENOMEM);
+    }
+
+    ret = extract_lltask_from_task(task, pd_model->lltask_queue);
+    if (ret != 0) {
+        av_log(ctx, AV_LOG_ERROR, "unable to extract last level task from task.\n");
+        return ret;
+    }
+
+    request = ff_safe_queue_pop_front(pd_model->request_queue);
+    if (!request) {
+        av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+        return AVERROR(EINVAL);
+    }
+    return execute_model_pd(request, pd_model->lltask_queue);
+}
+
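+/* Flush any task still pending in lltask_queue by executing it synchronously. */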
+int ff_dnn_flush_pd(const DNNModel *model) {
+    PDModel *pd_model = model->model;
+    PDContext *ctx = &pd_model->ctx;
+    PDRequestItem *request;
+    int ret;
+
+    if (ff_queue_size(pd_model->lltask_queue) == 0) {
+        // no pending task need to flush
+        return 0;
+    }
+
+    request = ff_safe_queue_pop_front(pd_model->request_queue);
+    if (!request) {
+        av_log(ctx, AV_LOG_ERROR, "unable to get infer request.\n");
+        return AVERROR(EINVAL);
+    }
+
+    ret = fill_model_input_pd(pd_model, request);
+    if (ret != 0) {
+        av_log(ctx, AV_LOG_ERROR, "Failed to fill model input.\n");
+        if (ff_safe_queue_push_back(pd_model->request_queue, request) < 0) {
+            destroy_request_item(&request);
+        }
+        return ret;
+    }
+    return execute_model_pd(request, pd_model->lltask_queue);
+}
+
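+/* Drain and destroy every queue owned by the model, then free the model itself. */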
+void ff_dnn_free_model_pd(DNNModel **model) {
+    PDModel *pd_model;
+
+    if (*model) {
+        pd_model = (*model)->model;
+        while (ff_safe_queue_size(pd_model->request_queue) != 0) {
+            PDRequestItem *item = ff_safe_queue_pop_front(pd_model->request_queue);
+            destroy_request_item(&item);
+        }
+        ff_safe_queue_destroy(pd_model->request_queue);
+
+        while (ff_queue_size(pd_model->lltask_queue) != 0) {
+            LastLevelTaskItem *item = (LastLevelTaskItem *)ff_queue_pop_front(pd_model->lltask_queue);
+            av_freep(&item);
+        }
+        ff_queue_destroy(pd_model->lltask_queue);
+
+        while (ff_queue_size(pd_model->task_queue) != 0) {
+            TaskItem *item = ff_queue_pop_front(pd_model->task_queue);
+            av_frame_free(&item->in_frame);
+            av_frame_free(&item->out_frame);
+            av_freep(&item);
+        }
+        ff_queue_destroy(pd_model->task_queue);
+        av_freep(&pd_model);
+        av_freep(model);
+    }
+}
+
+DNNAsyncStatusType ff_dnn_get_result_pd(const DNNModel *model, AVFrame **in, AVFrame **out) {
+    PDModel *pd_model = model->model;
+    return ff_dnn_get_result_common(pd_model->task_queue, in, out);
+}
diff --git a/libavfilter/dnn/dnn_backend_pd.h b/libavfilter/dnn/dnn_backend_pd.h
new file mode 100644
index 0000000000..67dd3c986f
--- /dev/null
+++ b/libavfilter/dnn/dnn_backend_pd.h
@@ -0,0 +1,38 @@ 
+/*
+ * Copyright (c) 2023
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * DNN inference functions interface for Paddle backend.
+ */
+
+#ifndef AVFILTER_DNN_DNN_BACKEND_PD_H
+#define AVFILTER_DNN_DNN_BACKEND_PD_H
+#include "../dnn_interface.h"
+
+DNNModel *ff_dnn_load_model_pd(const char *model_filename, DNNFunctionType func_type, const char *options, AVFilterContext *filter_ctx);
+
+int ff_dnn_execute_model_pd(const DNNModel *model, DNNExecBaseParams *exec_params);
+DNNAsyncStatusType ff_dnn_get_result_pd(const DNNModel *model, AVFrame **in, AVFrame **out);
+int ff_dnn_flush_pd(const DNNModel *model);
+
+void ff_dnn_free_model_pd(DNNModel **model);
+
+#endif /* AVFILTER_DNN_DNN_BACKEND_PD_H */
diff --git a/libavfilter/dnn/dnn_interface.c b/libavfilter/dnn/dnn_interface.c
index 554a36b0dc..e8d86bbc3a 100644
--- a/libavfilter/dnn/dnn_interface.c
+++ b/libavfilter/dnn/dnn_interface.c
@@ -27,6 +27,7 @@ 
 #include "dnn_backend_native.h"
 #include "dnn_backend_tf.h"
 #include "dnn_backend_openvino.h"
+#include "dnn_backend_pd.h"
 #include "libavutil/mem.h"
 
 DNNModule *ff_get_dnn_module(DNNBackendType backend_type)
@@ -70,8 +71,20 @@  DNNModule *ff_get_dnn_module(DNNBackendType backend_type)
         return NULL;
     #endif
         break;
+    case DNN_PD:
+    #if (CONFIG_LIBPADDLE == 1)
+        dnn_module->load_model = &ff_dnn_load_model_pd;
+        dnn_module->execute_model = &ff_dnn_execute_model_pd;
+        dnn_module->get_result = &ff_dnn_get_result_pd;
+        dnn_module->flush = &ff_dnn_flush_pd;
+        dnn_module->free_model = &ff_dnn_free_model_pd;
+    #else
+        av_freep(&dnn_module);
+        return NULL;
+    #endif
+        break;
     default:
-        av_log(NULL, AV_LOG_ERROR, "Module backend_type is not native or tensorflow\n");
+        av_log(NULL, AV_LOG_ERROR, "Module backend_type is not native or tensorflow or paddlepaddle\n");
         av_freep(&dnn_module);
         return NULL;
     }
diff --git a/libavfilter/dnn/dnn_io_proc.c b/libavfilter/dnn/dnn_io_proc.c
index 7961bf6b95..d7a8904e9c 100644
--- a/libavfilter/dnn/dnn_io_proc.c
+++ b/libavfilter/dnn/dnn_io_proc.c
@@ -184,6 +184,14 @@  static enum AVPixelFormat get_pixel_format(DNNData *data)
             av_assert0(!"unsupported data pixel format.\n");
             return AV_PIX_FMT_BGR24;
         }
+    } else if (data->dt == DNN_FLOAT) {
+        switch (data->order) {
+            case DCO_RGB:
+                return AV_PIX_FMT_BGR24;
+            default:
+                av_assert0(!"unsupported data pixel format.\n");
+                return AV_PIX_FMT_BGR24;
+        }
     }
 
     av_assert0(!"unsupported data type.\n");
diff --git a/libavfilter/dnn_interface.h b/libavfilter/dnn_interface.h
index ef8d7ae66f..22fe721e4c 100644
--- a/libavfilter/dnn_interface.h
+++ b/libavfilter/dnn_interface.h
@@ -32,7 +32,7 @@ 
 
 #define DNN_GENERIC_ERROR FFERRTAG('D','N','N','!')
 
-typedef enum {DNN_NATIVE, DNN_TF, DNN_OV} DNNBackendType;
+typedef enum {DNN_NATIVE, DNN_TF, DNN_OV, DNN_PD} DNNBackendType;
 
 typedef enum {DNN_FLOAT = 1, DNN_UINT8 = 4} DNNDataType;
 
diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c
index 7e133f6af5..6658dfbfc7 100644
--- a/libavfilter/vf_dnn_detect.c
+++ b/libavfilter/vf_dnn_detect.c
@@ -210,6 +210,79 @@  static int dnn_detect_post_proc_tf(AVFrame *frame, DNNData *output, AVFilterCont
     return 0;
 }
 
+static int dnn_detect_post_proc_pd(AVFrame *frame, DNNData *output, AVFilterContext *filter_ctx)
+{
+    DnnDetectContext *ctx = filter_ctx->priv;
+    int proposal_count;
+    float conf_threshold = ctx->confidence;
+    float *box_info;
+    float x0, y0, x1, y1;
+    int nb_bboxes = 0;
+    AVFrameSideData *sd;
+    AVDetectionBBox *bbox;
+    AVDetectionBBoxHeader *header;
+
+    proposal_count = *(int *) (output[1].data);
+    box_info = output[0].data;
+
+    sd = av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES);
+    if (sd) {
+        av_log(filter_ctx, AV_LOG_ERROR, "already have dnn bounding boxes in side data.\n");
+        return -1;
+    }
+
+    for (int i = 0; i < proposal_count; ++i) {
+        if (box_info[i * 6 + 1] < conf_threshold)
+            continue;
+        nb_bboxes++;
+    }
+
+    if (nb_bboxes == 0) {
+        av_log(filter_ctx, AV_LOG_VERBOSE, "nothing detected in this frame.\n");
+        return 0;
+    }
+
+    header = av_detection_bbox_create_side_data(frame, nb_bboxes);
+    if (!header) {
+        av_log(filter_ctx, AV_LOG_ERROR, "failed to create side data with %d bounding boxes\n", nb_bboxes);
+        return -1;
+    }
+
+    av_strlcpy(header->source, ctx->dnnctx.model_filename, sizeof(header->source));
+
+    for (int i = 0; i < proposal_count; ++i) {
+        if (box_info[i * 6 + 1] < conf_threshold) {
+            continue;
+        }
+        x0 = box_info[i * 6 + 2];
+        y0 = box_info[i * 6 + 3];
+        x1 = box_info[i * 6 + 4];
+        y1 = box_info[i * 6 + 5];
+
+        bbox = av_get_detection_bbox(header, nb_bboxes - 1);
+
+        bbox->x = (int)x0;
+        bbox->w = (int)x1 - bbox->x;
+        bbox->y = (int)y0;
+        bbox->h = (int)y1 - bbox->y;
+
+        bbox->detect_confidence = av_make_q((int) (box_info[i * 6 + 1] * 10000), 10000);
+        bbox->classify_count = 0;
+
+        if (ctx->labels && box_info[i * 6] < ctx->label_count) {
+            av_strlcpy(bbox->detect_label, ctx->labels[(int) box_info[i * 6]], sizeof(bbox->detect_label));
+        } else {
+            snprintf(bbox->detect_label, sizeof(bbox->detect_label), "%d", (int) box_info[i * 6]);
+        }
+
+        nb_bboxes--;
+        if (nb_bboxes == 0) {
+            break;
+        }
+    }
+    return 0;
+}
+
 static int dnn_detect_post_proc(AVFrame *frame, DNNData *output, uint32_t nb, AVFilterContext *filter_ctx)
 {
     DnnDetectContext *ctx = filter_ctx->priv;
@@ -219,6 +292,8 @@  static int dnn_detect_post_proc(AVFrame *frame, DNNData *output, uint32_t nb, AV
         return dnn_detect_post_proc_ov(frame, output, filter_ctx);
     case DNN_TF:
         return dnn_detect_post_proc_tf(frame, output, filter_ctx);
+    case DNN_PD:
+        return dnn_detect_post_proc_pd(frame, output, filter_ctx);
     default:
         avpriv_report_missing_feature(filter_ctx, "Current dnn backend does not support detect filter\n");
         return AVERROR(EINVAL);
@@ -309,6 +384,13 @@  static int check_output_nb(DnnDetectContext *ctx, DNNBackendType backend_type, i
             return AVERROR(EINVAL);
         }
         return 0;
+    case DNN_PD:
+        if (output_nb != 2) {
+            av_log(ctx, AV_LOG_ERROR, "Dnn detect filter with paddle backend needs 2 output only, \
+                       but get %d instead\n", output_nb);
+            return AVERROR(EINVAL);
+        }
+        return 0;
     default:
         avpriv_report_missing_feature(ctx, "Dnn detect filter does not support current backend\n");
         return AVERROR(EINVAL);
diff --git a/libavfilter/vf_dnn_processing.c b/libavfilter/vf_dnn_processing.c
index 4462915073..260e62c59b 100644
--- a/libavfilter/vf_dnn_processing.c
+++ b/libavfilter/vf_dnn_processing.c
@@ -52,6 +52,9 @@  static const AVOption dnn_processing_options[] = {
 #endif
 #if (CONFIG_LIBOPENVINO == 1)
     { "openvino",    "openvino backend flag",      0,                        AV_OPT_TYPE_CONST,     { .i64 = 2 },    0, 0, FLAGS, "backend" },
+#endif
+#if (CONFIG_LIBPADDLE == 1)
+        { "paddle",   "paddle backend flag",       0,                        AV_OPT_TYPE_CONST,     { .i64 = 3 },    0, 0, FLAGS, "backend" },
 #endif
     DNN_COMMON_OPTIONS
     { NULL }