From patchwork Mon May 18 10:36:24 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lance Wang <lance.lmwang@gmail.com>
X-Patchwork-Id: 19737
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by ffaux.localdomain (Postfix) with ESMTP id 1378644B70D
	for <patchwork@ffaux-bg.ffmpeg.org>; Mon, 18 May 2020 13:36:38 +0300 (EEST)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id EC28168AA7E;
	Mon, 18 May 2020 13:36:37 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mail-pl1-f177.google.com (mail-pl1-f177.google.com
 [209.85.214.177])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id A7E1E680353
 for <ffmpeg-devel@ffmpeg.org>; Mon, 18 May 2020 13:36:31 +0300 (EEST)
Received: by mail-pl1-f177.google.com with SMTP id s20so4063928plp.6
 for <ffmpeg-devel@ffmpeg.org>; Mon, 18 May 2020 03:36:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=isLI5dG+TsL5rbl5wnlZZXO0Jp3CzgaOjRJriAEGORs=;
 b=hFawH+dVmE6jSskBFq+fEiOLADBhNEhBliX+2H96xAbQTXVU4R49syzipT1AGqXGGq
 mXZJKs4Z2p5EBcFrzGpJB3vycw7vgbmeMU1a9bTCTqZc3Tt+9ecljT0qjRhm+tSlsZr0
 PkTR5XDod/jSm8R+sYRvCtnoahKPVP78GLvRz8LzEPQc/AwBnz9Ekzyi3O9oYTuRBjtY
 OcyZAUKyo0OltyFIsjWktTShgUuRDfKh7wXmxR1SvGQXWUnivx7wCnM2H0sSFhJRh8g2
 BHwI1stQZqNVZ3fR2yPcSqIfU6TvpziXMdyDIwCDs860MqDnFkOSzPN2hKd7//Vo3Fxh
 1R2w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=isLI5dG+TsL5rbl5wnlZZXO0Jp3CzgaOjRJriAEGORs=;
 b=QRT/ESNKAkU5YgVcz41GA+WdV/h6oYHLX3O22gfPwgGD9TcL3lqGdS86TQGgWI5hTF
 9Kkry/mYxGc8gXgsjFZ0/GVrvG0BpJN0memSRwhRlicGsKFQnP5DAIZauVt1Xc4jJj3R
 gwYnnYSZ2juTT/tVAo05g+210GqF7lwjsJhmtdzDmaSGrU35hZR+hnpaG3bLdCMpp7Uk
 YEPmg3M8YeRiFLm5wOWzHsxyDCzg16oI0kMCOohUIlfZ8vbu9pe7r4hUS59fxJckWN9S
 r4e3pQrJAv8Eu84ehSPb/Sj5BWmuAFLmdT6wfaraGz+Pu723nzEVR90X/+OjiksjX+Aa
 DAVA==
X-Gm-Message-State: AOAM533xAb4yCbhSN9lBdKQNsDz/KH9JDxSqpSfmLuoK9y5Y5KP5rpQR
 LuMQhs5r6tRxCH7UIW0RYjkw1hi7
X-Google-Smtp-Source: 
 ABdhPJxJUiAH9Af6W198QOa+qDfp3i2QB5XlSg5WNhUXqdwpekVpDoAHFIRXcmHkj34JSzOIfAsG9g==
X-Received: by 2002:a17:90a:d3d3:: with SMTP id
 d19mr14044120pjw.42.1589798189413;
 Mon, 18 May 2020 03:36:29 -0700 (PDT)
Received: from vpn2.localdomain ([161.117.202.209])
 by smtp.gmail.com with ESMTPSA id mn19sm8183919pjb.8.2020.05.18.03.36.27
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 18 May 2020 03:36:28 -0700 (PDT)
From: lance.lmwang@gmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 18 May 2020 18:36:24 +0800
Message-Id: <1589798184-22170-1-git-send-email-lance.lmwang@gmail.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1589382763-28061-2-git-send-email-lance.lmwang@gmail.com>
References: <1589382763-28061-2-git-send-email-lance.lmwang@gmail.com>
Subject: [FFmpeg-devel] [PATCH v3 2/5] avfilter/vf_libopencv: add opencv
	HaarCascade classifier simple face detection filter
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Limin Wang <lance.lmwang@gmail.com>
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

From: Limin Wang <lance.lmwang@gmail.com>

Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
---
Rename update_metadata() to postprocess(). I'll add an opencv drawbox filter
that needs a matching preprocess() to read the metadata, so the new name is
more readable and leaves room for processing other than metadata in the
future.
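
For reviewers, a minimal standalone sketch of the metadata key scheme and the
qoffset range check used by the facedetect mode. The helper names
facedetect_key() and qoffset_is_valid() are hypothetical and are not part of
the patch; only the key format and the [-1.0, 1.0] range come from it.

```c
#include <stdio.h>

/* Hypothetical helper: format the key for one field ("x", "y", "w", "h")
 * of one face, following the "lavfi.facedetect.<face_id>.<field>" scheme.
 * Returns 0 on success, -1 if the key did not fit into dst. */
static int facedetect_key(char *dst, size_t size, int face_id, const char *field)
{
    int n = snprintf(dst, size, "lavfi.facedetect.%d.%s", face_id, field);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: the same range check that facedetect_init() applies. */
static int qoffset_is_valid(double qoffset)
{
    return qoffset >= -1.0 && qoffset <= 1.0;
}
```

A downstream consumer would look up lavfi.facedetect.nb_faces first and then
build one key per field for each index below that count.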

 configure                  |   1 +
 doc/filters.texi           |  29 +++++++
 libavfilter/vf_libopencv.c | 164 ++++++++++++++++++++++++++++++++++++-
 3 files changed, 191 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 34afdaad28..281b67efc4 100755
--- a/configure
+++ b/configure
@@ -2123,6 +2123,7 @@ HEADERS_LIST="
     machine_ioctl_meteor_h
     malloc_h
     opencv2_core_core_c_h
+    opencv2_objdetect_objdetect_c_h
     OpenGL_gl3_h
     poll_h
     sys_param_h
diff --git a/doc/filters.texi b/doc/filters.texi
index d9ba0fffa1..f938dd04de 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -14177,6 +14177,35 @@ other parameters is 0.
 These parameters correspond to the parameters assigned to the
 libopencv function @code{cvSmooth}.
 
+@subsection facedetect
+Face detection using Haar Feature-based Cascade Classifiers.
+
+The filter takes the following parameters:
+@var{xml_model}|@var{qoffset}.
+
+@var{xml_model} is the path to a pre-trained classifier file. The C API does
+not support the newer cascade format, so use an old-format file such as
+haarcascade_frontalface_alt.xml, whose type_id is opencv-haar-classifier.
+
+@var{qoffset}
+Set this parameter to export the detected faces as ROI side data on the frame.
+The value must be in the range [-1.0, 1.0]. See also the @ref{addroi} filter.
+
+By default the filter reports these metadata values when faces are
+detected:
+@table @option
+@item lavfi.facedetect.nb_faces
+Display the number of detected faces
+
+@item lavfi.facedetect.face_id.x, lavfi.facedetect.face_id.y
+Display the x and y position of each face, face_id is the face index, which
+ranges over [0, nb_faces-1]
+
+@item lavfi.facedetect.face_id.w, lavfi.facedetect.face_id.h
+Display width and height of every faces, face_id is the face index
+which is range from [0, nb_faces-1]
+@end table
+
 @section oscilloscope
 
 2D Video Oscilloscope.
diff --git a/libavfilter/vf_libopencv.c b/libavfilter/vf_libopencv.c
index 8128030b8c..b2d19bb241 100644
--- a/libavfilter/vf_libopencv.c
+++ b/libavfilter/vf_libopencv.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2010 Stefano Sabatini
+ * Copyright (c) 2020 Limin Wang
  *
  * This file is part of FFmpeg.
  *
@@ -27,10 +28,16 @@
 #if HAVE_OPENCV2_CORE_CORE_C_H
 #include <opencv2/core/core_c.h>
 #include <opencv2/imgproc/imgproc_c.h>
+#if HAVE_OPENCV2_OBJDETECT_OBJDETECT_C_H
+#include <opencv2/objdetect/objdetect_c.h>
+#else
+#include <opencv/cv.h>
+#endif
 #else
 #include <opencv/cv.h>
 #include <opencv/cxcore.h>
 #endif
+
 #include "libavutil/avstring.h"
 #include "libavutil/common.h"
 #include "libavutil/file.h"
@@ -82,6 +89,7 @@ typedef struct OCVContext {
     int (*init)(AVFilterContext *ctx, const char *args);
     void (*uninit)(AVFilterContext *ctx);
     void (*end_frame_filter)(AVFilterContext *ctx, IplImage *inimg, IplImage *outimg);
+    void (*postprocess)(AVFilterContext *ctx, AVFrame *out);
     void *priv;
 } OCVContext;
 
@@ -326,18 +334,152 @@ static void erode_end_frame_filter(AVFilterContext *ctx, IplImage *inimg, IplIma
     cvErode(inimg, outimg, dilate->kernel, dilate->nb_iterations);
 }
 
+typedef struct FaceDetectContext {
+    char *xml_model;
+    CvHaarClassifierCascade* cascade;
+    CvMemStorage* storage;
+    int nb_faces;
+    CvSeq *faces;
+    int add_roi;
+    AVRational qoffset;
+} FaceDetectContext;
+
+static av_cold int facedetect_init(AVFilterContext *ctx, const char *args)
+{
+    OCVContext *s = ctx->priv;
+    FaceDetectContext *facedetect = s->priv;
+    const char *buf = args;
+    double qoffset;
+
+    if (args) {
+        facedetect->xml_model = av_get_token(&buf, "|");
+        if (!facedetect->xml_model) {
+            av_log(ctx, AV_LOG_ERROR, "failed to get xml_model from '%s'\n", args);
+            return AVERROR(EINVAL);
+        }
+
+        if (buf && sscanf(buf, "|%lf", &qoffset) == 1) {
+            if (qoffset < -1.0 || qoffset > 1.0) {
+                av_log(ctx, AV_LOG_ERROR, "invalid qoffset %f, must be in [-1.0, 1.0]\n", qoffset);
+                return AVERROR(EINVAL);
+            }
+            facedetect->add_roi = 1;
+            facedetect->qoffset = av_d2q(qoffset, 255);
+        }
+    } else {
+        av_log(ctx, AV_LOG_ERROR, "failed to get haarcascade_frontalface_alt.xml model file\n");
+        return AVERROR(EINVAL);
+    }
+
+    av_log(ctx, AV_LOG_VERBOSE, "xml_model: %s add_roi: %d qoffset: %d/%d\n",
+           facedetect->xml_model, facedetect->add_roi, facedetect->qoffset.num, facedetect->qoffset.den);
+
+    facedetect->storage = cvCreateMemStorage(0);
+    if (!facedetect->storage) {
+        av_log(ctx, AV_LOG_ERROR, "cvCreateMemStorage() failed\n");
+        return AVERROR(EINVAL);
+    }
+    cvClearMemStorage(facedetect->storage);
+
+    facedetect->cascade = (CvHaarClassifierCascade*)cvLoad( facedetect->xml_model, NULL, NULL, NULL );
+    if (!facedetect->cascade) {
+        av_log(ctx, AV_LOG_ERROR, "failed to load classifier cascade: %s\n", facedetect->xml_model);
+        return AVERROR(EINVAL);
+    }
+
+    return 0;
+}
+
+static av_cold void facedetect_uninit(AVFilterContext *ctx)
+{
+    OCVContext *s = ctx->priv;
+    FaceDetectContext *facedetect = s->priv;
+
+    if (facedetect->cascade)
+        cvReleaseHaarClassifierCascade(&facedetect->cascade);
+    if (facedetect->storage)
+        cvReleaseMemStorage(&facedetect->storage);
+}
+
+static void set_meta_int(AVDictionary **metadata, const char *key, int idx, int d)
+{
+    char value[128];
+    char key2[128];
+
+    snprintf(value, sizeof(value), "%d", d);
+    snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", idx, key);
+    av_dict_set(metadata, key2, value, 0);
+}
+
+static void facedetect_end_frame_filter(AVFilterContext *ctx, IplImage *inimg, IplImage *outimg)
+{
+    OCVContext *s = ctx->priv;
+    FaceDetectContext *facedetect = s->priv;
+
+    facedetect->faces = cvHaarDetectObjects(inimg, facedetect->cascade, facedetect->storage,
+            1.25, 3, CV_HAAR_DO_CANNY_PRUNING,
+            cvSize(inimg->width/16,inimg->height/16), cvSize(0,0));
+
+    facedetect->nb_faces = facedetect->faces ? facedetect->faces->total : 0;
+}
+
+static void facedetect_postprocess(AVFilterContext *ctx, AVFrame *out)
+{
+    OCVContext *s = ctx->priv;
+    FaceDetectContext *facedetect = s->priv;
+    AVRegionOfInterest *roi;
+    AVFrameSideData *sd;
+    AVBufferRef *roi_buf;
+    int i;
+
+    if (facedetect->add_roi && facedetect->nb_faces > 0) {
+        sd = av_frame_new_side_data(out, AV_FRAME_DATA_REGIONS_OF_INTEREST,
+                facedetect->nb_faces * sizeof(AVRegionOfInterest));
+        if (!sd) {
+            return; /* void callback: skip ROI and metadata export on allocation failure */
+        }
+        roi = (AVRegionOfInterest*)sd->data;
+        for(i = 0; i < facedetect->nb_faces; i++ ) {
+            CvRect *r = (CvRect*) cvGetSeqElem(facedetect->faces, i);
+
+            roi[i] = (AVRegionOfInterest) {
+                .self_size = sizeof(*roi),
+                .top       = r->y,
+                .bottom    = r->y + r->height,
+                .left      = r->x,
+                .right     = r->x + r->width,
+                .qoffset   = facedetect->qoffset,
+            };
+        }
+    }
+
+    if (facedetect->nb_faces > 0)
+        av_dict_set_int(&out->metadata, "lavfi.facedetect.nb_faces", facedetect->nb_faces, 0);
+
+    for(i = 0; i < facedetect->nb_faces; i++ ) {
+        CvRect *r = (CvRect*) cvGetSeqElem(facedetect->faces, i);
+
+        set_meta_int(&out->metadata, "x", i, r->x);
+        set_meta_int(&out->metadata, "y", i, r->y);
+        set_meta_int(&out->metadata, "w", i, r->width);
+        set_meta_int(&out->metadata, "h", i, r->height);
+    }
+}
+
 typedef struct OCVFilterEntry {
     const char *name;
     size_t priv_size;
     int  (*init)(AVFilterContext *ctx, const char *args);
     void (*uninit)(AVFilterContext *ctx);
     void (*end_frame_filter)(AVFilterContext *ctx, IplImage *inimg, IplImage *outimg);
+    void (*postprocess)(AVFilterContext *ctx, AVFrame *out);
 } OCVFilterEntry;
 
 static const OCVFilterEntry ocv_filter_entries[] = {
-    { "dilate", sizeof(DilateContext), dilate_init, dilate_uninit, dilate_end_frame_filter },
-    { "erode",  sizeof(DilateContext), dilate_init, dilate_uninit, erode_end_frame_filter  },
-    { "smooth", sizeof(SmoothContext), smooth_init, NULL, smooth_end_frame_filter },
+    { "dilate", sizeof(DilateContext), dilate_init, dilate_uninit, dilate_end_frame_filter, NULL },
+    { "erode",  sizeof(DilateContext), dilate_init, dilate_uninit, erode_end_frame_filter, NULL },
+    { "smooth", sizeof(SmoothContext), smooth_init, NULL, smooth_end_frame_filter, NULL },
+    { "facedetect", sizeof(FaceDetectContext), facedetect_init, facedetect_uninit, facedetect_end_frame_filter, facedetect_postprocess },
 };
 
 static av_cold int init(AVFilterContext *ctx)
@@ -355,6 +497,7 @@ static av_cold int init(AVFilterContext *ctx)
             s->init             = entry->init;
             s->uninit           = entry->uninit;
             s->end_frame_filter = entry->end_frame_filter;
+            s->postprocess      = entry->postprocess;
 
             if (!(s->priv = av_mallocz(entry->priv_size)))
                 return AVERROR(ENOMEM);
@@ -383,18 +526,33 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
     AVFrame *out;
     IplImage inimg, outimg;
 
+    /* the facedetect filter passes the input frame through */
+    if (strcmp(s->name, "facedetect")) {
     out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
     if (!out) {
         av_frame_free(&in);
         return AVERROR(ENOMEM);
     }
     av_frame_copy_props(out, in);
+    } else {
+        out = in;
+    }
 
     fill_iplimage_from_frame(&inimg , in , inlink->format);
+
+    if (strcmp(s->name, "facedetect")) {
     fill_iplimage_from_frame(&outimg, out, inlink->format);
     s->end_frame_filter(ctx, &inimg, &outimg);
     fill_frame_from_iplimage(out, &outimg, inlink->format);
+    } else {
+        s->end_frame_filter(ctx, &inimg, NULL);
+    }
+
+    if (s->postprocess) {
+        s->postprocess(ctx, out);
+    }
 
+    if (out != in)
     av_frame_free(&in);
 
     return ff_filter_frame(outlink, out);

From patchwork Mon May 18 14:22:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lance Wang <lance.lmwang@gmail.com>
X-Patchwork-Id: 19742
Return-Path: <ffmpeg-devel-bounces@ffmpeg.org>
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
	by ffaux.localdomain (Postfix) with ESMTP id DEA6944ABE1
	for <patchwork@ffaux-bg.ffmpeg.org>; Mon, 18 May 2020 17:23:00 +0300 (EEST)
Received: from [127.0.1.1] (localhost [127.0.0.1])
	by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id BA35668A45F;
	Mon, 18 May 2020 17:23:00 +0300 (EEST)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mail-pf1-f193.google.com (mail-pf1-f193.google.com
 [209.85.210.193])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id A51D16808B8
 for <ffmpeg-devel@ffmpeg.org>; Mon, 18 May 2020 17:22:53 +0300 (EEST)
Received: by mail-pf1-f193.google.com with SMTP id n18so5015875pfa.2
 for <ffmpeg-devel@ffmpeg.org>; Mon, 18 May 2020 07:22:53 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=lMZV7R+ySDBBb9+bpaURYK9vk7pbbR2PyArgRanxlek=;
 b=gbGLOX/BPNFHDEcNpTw1Y+axwcY2aRPy8VMc4TryfAgyuKF1QFtFEfJUnSX757ectN
 pyYXXezstZs0mwmLf3uFDviRr0y7x/qEykCz8XyfKWaUH/5eBpJltxWoaRqXpUwRwnid
 eaE4La9UBk/mAE3on7MO84p/aTZLdynbNldpBxqHSR+e1+AS19dyjVydv3m4Oq4ZzHz6
 oXRF1vD/YQGMp1cfufz3EVyZqtTkuBhrgPnSoHJWfHn+uyezf//pUiA1IlbwiMEpMrCI
 esZp7OUK2chiDyUsL0OwlKmXxmu2I6iannMw9eCe2nQkdjmaIPGt/Suv4xvC/7Nwx1L5
 V2CQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=lMZV7R+ySDBBb9+bpaURYK9vk7pbbR2PyArgRanxlek=;
 b=Z5zZuDsbW+mppzeqLbVBmzExw3Uui2B06umk4Nvthg3yIWN62MlLTaTy05xVjEs+wK
 VJZjtLA9DMJPiqD9bz5ED1JRHJetgWGSwmcHJFJoTrU+VJdIoRwUWlcu8k3SmwD8OYgZ
 aRJ7EvAEkDkWtLdnvaqLU6pdaW0Ul9fOjBS4XQNeHKXo8aUEqe9bWKq0I+SQ1bHLqTMm
 nVZVb9mkdNMliKw/JezZ7ivo3B3od50xUvzxvS0xmt6g/cg85gNKjq7wBWJ4Pq/gdDqr
 xCo/BIjmups14ij6Y4bZOGIAdCAa0I1wHs8WfECEU0gQ/sKuFNwDW+rDwgxzOlKWsftY
 qTQg==
X-Gm-Message-State: AOAM530SB9t/ezgWxhiuw97u/3XmhhJycDHOZxIMOnz3EJw+icKSx4MX
 zRG+Nkv6QH1AR1rtZsw1Y3oTUUhn
X-Google-Smtp-Source: 
 ABdhPJz5NwE9jNt29T04XMlcOul9uY9/AZXtP1AueW0cajApI+fvWMbJ/YQFGSpS9X0jD97o2A9fPw==
X-Received: by 2002:a63:d148:: with SMTP id c8mr8159677pgj.51.1589811771328;
 Mon, 18 May 2020 07:22:51 -0700 (PDT)
Received: from vpn2.localdomain ([161.117.202.209])
 by smtp.gmail.com with ESMTPSA id v22sm3878137pfu.172.2020.05.18.07.22.49
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 18 May 2020 07:22:50 -0700 (PDT)
From: lance.lmwang@gmail.com
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 18 May 2020 22:22:46 +0800
Message-Id: <1589811766-32338-1-git-send-email-lance.lmwang@gmail.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1589798184-22170-1-git-send-email-lance.lmwang@gmail.com>
References: <1589798184-22170-1-git-send-email-lance.lmwang@gmail.com>
Subject: [FFmpeg-devel] [PATCH v3 5/5] avfilter/vf_libopencv: add opencv
	drawbox filter
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches <ffmpeg-devel.ffmpeg.org>
List-Unsubscribe: <https://ffmpeg.org/mailman/options/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=unsubscribe>
List-Archive: <https://ffmpeg.org/pipermail/ffmpeg-devel>
List-Post: <mailto:ffmpeg-devel@ffmpeg.org>
List-Help: <mailto:ffmpeg-devel-request@ffmpeg.org?subject=help>
List-Subscribe: <https://ffmpeg.org/mailman/listinfo/ffmpeg-devel>,
 <mailto:ffmpeg-devel-request@ffmpeg.org?subject=subscribe>
Reply-To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
Cc: Limin Wang <lance.lmwang@gmail.com>
MIME-Version: 1.0
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel" <ffmpeg-devel-bounces@ffmpeg.org>

From: Limin Wang <lance.lmwang@gmail.com>

Signed-off-by: Limin Wang <lance.lmwang@gmail.com>
---
Depends on the patch set below:
https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=1212
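
For reviewers, a minimal standalone sketch of how a
color|thickness|line_type|shift argument string can be split with sscanf()
while keeping the documented defaults. BoxArgs and parse_box_args() are
hypothetical names and not part of the patch. The field width is one less
than the buffer size so that the terminating NUL always fits.

```c
#include <stdio.h>

/* Hypothetical container for the four drawbox parameters. */
typedef struct BoxArgs {
    char color[32];
    int thickness;
    int line_type;
    int shift;
} BoxArgs;

/* Hypothetical helper: start from the defaults (Red, 1, 8, 0), then
 * overwrite whatever fields are present in args. Returns the number of
 * fields that sscanf() converted, or 0 when args is NULL. */
static int parse_box_args(BoxArgs *a, const char *args)
{
    *a = (BoxArgs){ "Red", 1, 8, 0 };
    if (!args)
        return 0;
    /* %31[^|] reads at most 31 characters and leaves room for the NUL. */
    return sscanf(args, "%31[^|]|%d|%d|%d",
                  a->color, &a->thickness, &a->line_type, &a->shift);
}
```

Fields omitted at the end of the string keep their defaults, so "Blue|3"
changes only the color and the thickness.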

 doc/filters.texi           |  21 ++++++
 libavfilter/vf_libopencv.c | 147 +++++++++++++++++++++++++++++++++++--
 2 files changed, 162 insertions(+), 6 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index e12c667348..cac204fd81 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -14219,6 +14219,27 @@ Display width and height of every faces, face_id is the face index
 which is range from [0, nb_faces-1]
 @end table
 
+@subsection drawbox
+Draw a box around each face described by the metadata that the ocv filter's
+facedetect mode attaches to the frame. If a frame carries no face detection
+metadata, the filter does nothing.
+
+The filter takes the following parameters:
+@var{color}|@var{thickness}|@var{line_type}|@var{shift}.
+
+@var{color} Specify the color of the box to draw. Default value is @code{Red}. For the syntax of this option,
+check the @ref{color syntax,,"Color" section in the ffmpeg-utils manual,ffmpeg-utils}.
+
+@var{thickness}
+Set the thickness of the box edge. Default value is @code{1}.
+A negative value, such as -1, draws a filled rectangle.
+
+@var{line_type}
+Set the line type of the box boundary. Default value is @code{8}.
+
+@var{shift}
+Set the number of fractional bits in the point coordinates. Default value is @code{0}.
+
 @section oscilloscope
 
 2D Video Oscilloscope.
diff --git a/libavfilter/vf_libopencv.c b/libavfilter/vf_libopencv.c
index c70c4dc8b9..2e23222cb4 100644
--- a/libavfilter/vf_libopencv.c
+++ b/libavfilter/vf_libopencv.c
@@ -42,6 +42,7 @@
 #include "libavutil/common.h"
 #include "libavutil/file.h"
 #include "libavutil/opt.h"
+#include "libavutil/parseutils.h"
 #include "avfilter.h"
 #include "formats.h"
 #include "internal.h"
@@ -90,6 +91,7 @@ typedef struct OCVContext {
     void (*uninit)(AVFilterContext *ctx);
     void (*end_frame_filter)(AVFilterContext *ctx, IplImage *inimg, IplImage *outimg);
     void (*postprocess)(AVFilterContext *ctx, AVFrame *out);
+    void (*preprocess)(AVFilterContext *ctx, AVFrame *in);
     void *priv;
 } OCVContext;
 
@@ -466,6 +468,127 @@ static void facedetect_postprocess(AVFilterContext *ctx, AVFrame *out)
     }
 }
 
+typedef struct DrawboxContext {
+    CvScalar color;
+    int thickness;
+    int line_type;
+    int shift;
+
+    int nb_faces;
+    CvRect *faces[1024];
+} DrawboxContext;
+
+static av_cold int drawbox_init(AVFilterContext *ctx, const char *args)
+{
+    OCVContext *s = ctx->priv;
+    DrawboxContext *drawbox = s->priv;
+    const char *buf = args;
+    int ret;
+    uint8_t rgba[4] = { 255, 0, 0, 255};
+    char color_str[32] = "Red";
+
+    drawbox->thickness = 1;
+    drawbox->line_type = 8;
+    drawbox->shift = 0;
+    if (args) {
+        sscanf(args, "%31[^|]|%d|%d|%d", color_str, &drawbox->thickness, &drawbox->line_type, &drawbox->shift);
+    }
+
+    if (av_parse_color(rgba, color_str, -1, ctx) < 0) {
+        av_log(ctx, AV_LOG_ERROR, "failed to parse color '%s'\n", color_str);
+        return AVERROR(EINVAL);
+    }
+    drawbox->color = cvScalar(rgba[0], rgba[1], rgba[2], rgba[3]);
+
+    av_log(ctx, AV_LOG_TRACE, "rgba: %d:%d:%d:%d, thickness: %d, line_type: %d, shift: %d \n",
+            rgba[0], rgba[1], rgba[2], rgba[3], drawbox->thickness, drawbox->line_type, drawbox->shift);
+
+    return 0;
+}
+
+static void drawbox_preprocess(AVFilterContext *ctx, AVFrame *in)
+{
+    OCVContext *s = ctx->priv;
+    DrawboxContext *drawbox = s->priv;
+    AVDictionaryEntry *ef, *ex, *ey, *ew, *eh;
+    char key2[128];
+    AVDictionary *metadata = in->metadata;
+    int nb_faces = 0;
+
+    ef = av_dict_get(metadata, "lavfi.facedetect.nb_faces", NULL, AV_DICT_MATCH_CASE);
+    if (ef)
+        nb_faces = strtol(ef->value, NULL, 10);
+
+    /* Clamp to the array capacity; a frame without metadata draws nothing. */
+    drawbox->nb_faces = FFMIN(FFMAX(nb_faces, 0), (int)FF_ARRAY_ELEMS(drawbox->faces));
+    if (drawbox->nb_faces > 0) {
+        for (int i = 0; i < drawbox->nb_faces; i++) {
+            CvRect *tmp = av_realloc(drawbox->faces[i], sizeof(*tmp));
+            if (!tmp) {
+                drawbox->nb_faces = i; /* void callback: keep the boxes parsed so far */
+                return;
+            }
+            drawbox->faces[i] = tmp;
+
+            snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", i, "x");
+            ex = av_dict_get(metadata, key2, NULL, AV_DICT_MATCH_CASE);
+
+            snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", i, "y");
+            ey = av_dict_get(metadata, key2, NULL, AV_DICT_MATCH_CASE);
+
+            snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", i, "w");
+            ew = av_dict_get(metadata, key2, NULL, AV_DICT_MATCH_CASE);
+
+            snprintf(key2, sizeof(key2), "lavfi.facedetect.%d.%s", i, "h");
+            eh = av_dict_get(metadata, key2, NULL, AV_DICT_MATCH_CASE);
+
+            if (ex && ey && ew && eh) {
+                tmp->x      = strtol(ex->value, NULL, 10);
+                tmp->y      = strtol(ey->value, NULL, 10);
+                tmp->width  = strtol(ew->value, NULL, 10);
+                tmp->height = strtol(eh->value, NULL, 10);
+            }
+        }
+    }
+}
+
+static av_cold void drawbox_uninit(AVFilterContext *ctx)
+{
+    OCVContext *s = ctx->priv;
+    DrawboxContext *drawbox = s->priv;
+
+    for (int i = 0; i < FF_ARRAY_ELEMS(drawbox->faces); i++) {
+        av_freep(&drawbox->faces[i]); /* free every slot, not only the last frame's count */
+    }
+}
+
+static void draw_rectangle(AVFilterContext *ctx, IplImage *img, CvPoint pt1, CvPoint pt2) {
+    OCVContext *s = ctx->priv;
+    DrawboxContext *drawbox = s->priv;
+
+    cvRectangle(img, pt1, pt2, drawbox->color, drawbox->thickness, drawbox->line_type, drawbox->shift);
+}
+
+static void drawbox_end_frame_filter(AVFilterContext *ctx, IplImage *inimg, IplImage *outimg)
+{
+    OCVContext *s = ctx->priv;
+    DrawboxContext *drawbox = s->priv;
+
+    for (int i = 0; i < drawbox->nb_faces; i++ ) {
+        CvPoint pt1, pt2;
+        CvRect *face = drawbox->faces[i];
+
+        if (face) {
+            pt1.x = face->x;
+            pt1.y = face->y;
+            pt2.x = face->x + face->width;
+            pt2.y = face->y + face->height;
+
+            draw_rectangle(ctx, inimg, pt1, pt2);
+        }
+    }
+}
+
 typedef struct OCVFilterEntry {
     const char *name;
     size_t priv_size;
@@ -473,13 +596,15 @@ typedef struct OCVFilterEntry {
     void (*uninit)(AVFilterContext *ctx);
     void (*end_frame_filter)(AVFilterContext *ctx, IplImage *inimg, IplImage *outimg);
     void (*postprocess)(AVFilterContext *ctx, AVFrame *out);
+    void (*preprocess)(AVFilterContext *ctx, AVFrame *in);
 } OCVFilterEntry;
 
 static const OCVFilterEntry ocv_filter_entries[] = {
-    { "dilate", sizeof(DilateContext), dilate_init, dilate_uninit, dilate_end_frame_filter, NULL },
-    { "erode",  sizeof(DilateContext), dilate_init, dilate_uninit, erode_end_frame_filter, NULL },
-    { "smooth", sizeof(SmoothContext), smooth_init, NULL, smooth_end_frame_filter, NULL },
-    { "facedetect", sizeof(FaceDetectContext), facedetect_init, facedetect_uninit, facedetect_end_frame_filter, facedetect_postprocess },
+    { "dilate", sizeof(DilateContext), dilate_init, dilate_uninit, dilate_end_frame_filter, NULL, NULL },
+    { "erode",  sizeof(DilateContext), dilate_init, dilate_uninit, erode_end_frame_filter, NULL, NULL},
+    { "smooth", sizeof(SmoothContext), smooth_init, NULL, smooth_end_frame_filter, NULL, NULL},
+    { "facedetect", sizeof(FaceDetectContext), facedetect_init, facedetect_uninit, facedetect_end_frame_filter, facedetect_postprocess, NULL },
+    { "drawbox", sizeof(DrawboxContext), drawbox_init, drawbox_uninit, drawbox_end_frame_filter, NULL, drawbox_preprocess },
 };
 
 static av_cold int init(AVFilterContext *ctx)
@@ -498,6 +623,7 @@ static av_cold int init(AVFilterContext *ctx)
             s->uninit           = entry->uninit;
             s->end_frame_filter = entry->end_frame_filter;
             s->postprocess      = entry->postprocess;
+            s->preprocess       = entry->preprocess;
 
             if (!(s->priv = av_mallocz(entry->priv_size)))
                 return AVERROR(ENOMEM);
@@ -538,12 +664,21 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
         out = in;
     }
 
+    if (s->preprocess) {
+        s->preprocess(ctx, in);
+    }
+
     fill_iplimage_from_frame(&inimg , in , inlink->format);
 
     if (strcmp(s->name, "facedetect")) {
         fill_iplimage_from_frame(&outimg, out, inlink->format);
-        s->end_frame_filter(ctx, &inimg, &outimg);
-        fill_frame_from_iplimage(out, &outimg, inlink->format);
+        if (strcmp(s->name, "drawbox")) {
+            s->end_frame_filter(ctx, &inimg, &outimg);
+            fill_frame_from_iplimage(out, &outimg, inlink->format);
+        } else {
+            av_frame_copy(out, in); /* draw on a copy; out must not alias in, which is freed below */
+            s->end_frame_filter(ctx, &outimg, NULL);
+        }
     } else {
         s->end_frame_filter(ctx, &inimg, NULL);
     }