From patchwork Sat Jul 2 09:55:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thilo Borgmann
X-Patchwork-Id: 36590
Date: Sat, 2 Jul 2022 11:55:06 +0200
To: FFmpeg development discussions and patches
From: Thilo Borgmann
Subject: [FFmpeg-devel] [PATCH] lavfi: Add cropdetect_video filter
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches

Hi,

$subject allows crop detection even if the video is embedded in non-black areas.

It shares logic and purpose with lavfi/vf_cropdetect, though its edge detection supports 8-bit formats only. Therefore this ends up in a separate filter. It would also be new GPL code if it lives in lavfi/vf_cropdetect.c - if we'd like to add this as LGPL for some reason, it could go into its own file without sharing (useful) logic with the existing cropdetect.
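To make the motion-vector part of the idea concrete, here is a minimal, self-contained sketch of how a bounding box can be grown around vectors that move by at least a threshold. It is not the code from the patch: `Vec`, `Box` and `mv_bounding_box` are hypothetical names used only for illustration, whereas the patch reads AVMotionVector side data from the frame and afterwards refines the box using the edge image.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a motion vector: source and destination
 * positions in pixels. Not the libavutil AVMotionVector type. */
typedef struct Vec {
    int src_x, src_y;
    int dst_x, dst_y;
} Vec;

/* Inclusive bounding box. */
typedef struct Box {
    int x1, y1, x2, y2;
} Box;

/* Grow a bounding box around the destinations of all vectors that lie
 * fully inside a w x h frame and whose length is at least `threshold`
 * pixels. Returns 1 if at least one vector qualified, 0 otherwise. */
static int mv_bounding_box(const Vec *mvs, size_t n, int w, int h,
                           int threshold, Box *out)
{
    Box b = { w - 1, h - 1, 0, 0 }; /* start inverted, then grow */
    int found = 0;

    assert(w > 0 && h > 0 && threshold >= 0);

    for (size_t i = 0; i < n; i++) {
        const Vec *mv = &mvs[i];
        const int mx = mv->dst_x - mv->src_x;
        const int my = mv->dst_y - mv->src_y;

        /* skip vectors pointing from or to positions outside the frame */
        if (mv->dst_x < 0 || mv->dst_x >= w || mv->dst_y < 0 || mv->dst_y >= h ||
            mv->src_x < 0 || mv->src_x >= w || mv->src_y < 0 || mv->src_y >= h)
            continue;

        /* compare squared lengths to avoid a square root */
        if (mx * mx + my * my < threshold * threshold)
            continue;

        if (mv->dst_x < b.x1) b.x1 = mv->dst_x;
        if (mv->dst_y < b.y1) b.y1 = mv->dst_y;
        if (mv->dst_x > b.x2) b.x2 = mv->dst_x;
        if (mv->dst_y > b.y2) b.y2 = mv->dst_y;
        found = 1;
    }

    *out = b;
    return found;
}
```

The return value lets a caller tell "no qualifying motion" apart from a real box, since the box is left in its inverted starting state in that case.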
Thanks, Thilo From 9050d15c2f1bcb3b2a628c8b6f04ea3a5f7e69d1 Mon Sep 17 00:00:00 2001 From: Thilo Borgmann Date: Sat, 2 Jul 2022 11:42:47 +0200 Subject: [PATCH] lavfi: Add cropdetect_video filter This filter allows crop detection even if the video is embedded in non-black areas. --- Changelog | 1 + doc/filters.texi | 69 +++++ libavfilter/Makefile | 1 + libavfilter/allfilters.c | 1 + libavfilter/version.h | 2 +- libavfilter/vf_cropdetect.c | 245 +++++++++++++++++- tests/fate/filter-video.mak | 8 + .../fate/filter-metadata-cropdetect_video1 | 9 + .../fate/filter-metadata-cropdetect_video2 | 9 + 9 files changed, 343 insertions(+), 2 deletions(-) create mode 100644 tests/ref/fate/filter-metadata-cropdetect_video1 create mode 100644 tests/ref/fate/filter-metadata-cropdetect_video2 diff --git a/Changelog b/Changelog index d4ca674b1b..3b5a4880cb 100644 --- a/Changelog +++ b/Changelog @@ -19,6 +19,7 @@ version 5.1: - blurdetect filter - tiltshelf audio filter - QOI image format support +- cropdetect_video video filter version 5.0: diff --git a/doc/filters.texi b/doc/filters.texi index d65e83d4d0..5117e12623 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -10108,6 +10108,75 @@ indicates 'never reset', and returns the largest area encountered during playback. @end table +@anchor{cropdetect_video} +@section cropdetect_video + +Auto-detect the crop size. + +It calculates the necessary cropping parameters and prints the +recommended parameters via the logging system. The detected dimensions +correspond to the video playback area of the input video. +It can detect videos embedded even in non-black areas although it supports only 8 bit pixel formats. + +It accepts the following parameters: + +@table @option + +@item mv_threshold +Set motion in pixel units as threshold for motion detection. It defaults to 8. + +@item low +@item high +Set low and high threshold values used by the Canny thresholding +algorithm. 
+ +The high threshold selects the "strong" edge pixels, which are then +connected through 8-connectivity with the "weak" edge pixels selected +by the low threshold. + +@var{low} and @var{high} threshold values must be chosen in the range +[0,1], and @var{low} should be less than or equal to @var{high}. + +Default value for @var{low} is @code{15/255}, and default value for @var{high} +is @code{25/255}. + +@item round +The value which the width/height should be divisible by. It defaults to +16. The offset is automatically adjusted to center the video. Use 2 to +get only even dimensions (needed for 4:2:2 video). 16 is best when +encoding to most video codecs. + +@item skip +Set the number of initial frames for which evaluation is skipped. +Default is 2. Range is 0 to INT_MAX. + +@item reset_count, reset +Set the counter that determines after how many frames cropdetect will +reset the previously detected largest video area and start over to +detect the current optimal crop area. Default value is 0. + +This can be useful when channel logos distort the video area. 0 +indicates 'never reset', and returns the largest area encountered during +playback. 
+@end table + +@subsection Examples + +@itemize +@item +Find an embedded video area, generate motion vectors beforehand: +@example +ffmpeg -i file.mp4 -vf mestimate,cropdetect_video,metadata=mode=print -f null - +@end example + +@item +Find an embedded video area, use motion vectors from decoder: +@example +ffmpeg -flags2 +export_mvs -i file.mp4 -vf cropdetect_video,metadata=mode=print -f null - +@end example +@end itemize + + @anchor{cue} @section cue diff --git a/libavfilter/Makefile b/libavfilter/Makefile index e0e4d0de2c..8e4b4d33b1 100644 --- a/libavfilter/Makefile +++ b/libavfilter/Makefile @@ -235,6 +235,7 @@ OBJS-$(CONFIG_COREIMAGE_FILTER) += vf_coreimage.o OBJS-$(CONFIG_COVER_RECT_FILTER) += vf_cover_rect.o lavfutils.o OBJS-$(CONFIG_CROP_FILTER) += vf_crop.o OBJS-$(CONFIG_CROPDETECT_FILTER) += vf_cropdetect.o +OBJS-$(CONFIG_CROPDETECT_VIDEO_FILTER) += vf_cropdetect.o OBJS-$(CONFIG_CUE_FILTER) += f_cue.o OBJS-$(CONFIG_CURVES_FILTER) += vf_curves.o OBJS-$(CONFIG_DATASCOPE_FILTER) += vf_datascope.o diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c index 2f72477523..d68746a4b0 100644 --- a/libavfilter/allfilters.c +++ b/libavfilter/allfilters.c @@ -219,6 +219,7 @@ extern const AVFilter ff_vf_coreimage; extern const AVFilter ff_vf_cover_rect; extern const AVFilter ff_vf_crop; extern const AVFilter ff_vf_cropdetect; +extern const AVFilter ff_vf_cropdetect_video; extern const AVFilter ff_vf_cue; extern const AVFilter ff_vf_curves; extern const AVFilter ff_vf_datascope; diff --git a/libavfilter/version.h b/libavfilter/version.h index 86b33c4174..814ab071da 100644 --- a/libavfilter/version.h +++ b/libavfilter/version.h @@ -31,7 +31,7 @@ #include "version_major.h" -#define LIBAVFILTER_VERSION_MINOR 40 +#define LIBAVFILTER_VERSION_MINOR 41 #define LIBAVFILTER_VERSION_MICRO 100 diff --git a/libavfilter/vf_cropdetect.c b/libavfilter/vf_cropdetect.c index b887b9ecb1..f2e25ff90d 100644 --- a/libavfilter/vf_cropdetect.c +++ b/libavfilter/vf_cropdetect.c @@ 
-26,11 +26,14 @@ #include "libavutil/imgutils.h" #include "libavutil/internal.h" #include "libavutil/opt.h" +#include "libavutil/motion_vector.h" +#include "libavutil/qsort.h" #include "avfilter.h" #include "formats.h" #include "internal.h" #include "video.h" +#include "edge_common.h" typedef struct CropDetectContext { const AVClass *class; @@ -42,6 +45,16 @@ typedef struct CropDetectContext { int frame_nb; int max_pixsteps[4]; int max_outliers; + int mode; + int window_size; + int mv_threshold; + float low, high; + uint8_t low_u8, high_u8; + uint8_t *filterbuf; + uint8_t *tmpbuf; + uint16_t *gradients; + int8_t *directions; + int *bboxes[4]; } CropDetectContext; static const enum AVPixelFormat pix_fmts[] = { @@ -61,6 +74,29 @@ static const enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_NONE }; +static const enum AVPixelFormat pix_fmts_video[] = { + AV_PIX_FMT_GRAY8, + AV_PIX_FMT_GBRP, AV_PIX_FMT_GBRAP, + AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV420P, + AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV440P, + AV_PIX_FMT_YUV411P, AV_PIX_FMT_YUV410P, + AV_PIX_FMT_YUVJ440P, AV_PIX_FMT_YUVJ411P, AV_PIX_FMT_YUVJ420P, + AV_PIX_FMT_YUVJ422P, AV_PIX_FMT_YUVJ444P, + AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUVA422P, AV_PIX_FMT_YUVA420P, + AV_PIX_FMT_NONE +}; + +enum CropMode { + MODE_BELOW_TH, + MODE_MV_EDGES, + MODE_NB +}; + +static int comp(const int *a, const int *b) +{ + return FFDIFFSIGN(*a, *b); +} + static int checkline(void *ctx, const unsigned char *src, int stride, int len, int bpp) { int total = 0; @@ -116,6 +152,36 @@ static int checkline(void *ctx, const unsigned char *src, int stride, int len, i return total; } +static int checkline_edge(void *ctx, const unsigned char *src, int stride, int len, int bpp) +{ + const uint16_t *src16 = (const uint16_t *)src; + + switch (bpp) { + case 1: + while (--len >= 0) { + if (src[0]) return 0; + src += stride; + } + break; + case 2: + stride >>= 1; + while (--len >= 0) { + if (src16[0]) return 0; + src16 += stride; + } + break; + case 3: + case 4: + while (--len 
>= 0) { + if(src[0] || src[1] || src[2]) return 0; + src += stride; + } + break; + } + + return 1; +} + static av_cold int init(AVFilterContext *ctx) { CropDetectContext *s = ctx->priv; @@ -128,6 +194,31 @@ static av_cold int init(AVFilterContext *ctx) return 0; } +static av_cold int init_video(AVFilterContext *ctx) +{ + CropDetectContext *s = ctx->priv; + + s->mode = MODE_MV_EDGES; + s->low_u8 = s->low * 255. + .5; + s->high_u8 = s->high * 255. + .5; + + return init(ctx); +} + +static av_cold void uninit_video(AVFilterContext *ctx) +{ + CropDetectContext *s = ctx->priv; + + av_freep(&s->tmpbuf); + av_freep(&s->filterbuf); + av_freep(&s->gradients); + av_freep(&s->directions); + av_freep(&s->bboxes[0]); + av_freep(&s->bboxes[1]); + av_freep(&s->bboxes[2]); + av_freep(&s->bboxes[3]); +} + static int config_input(AVFilterLink *inlink) { AVFilterContext *ctx = inlink->dst; @@ -147,6 +238,29 @@ static int config_input(AVFilterLink *inlink) return 0; } +static int config_input_video(AVFilterLink *inlink) +{ + AVFilterContext *ctx = inlink->dst; + CropDetectContext *s = ctx->priv; + const int bufsize = inlink->w * inlink->h; + + s->window_size = FFMAX(s->reset_count, 15); + s->tmpbuf = av_malloc(bufsize); + s->filterbuf = av_malloc(bufsize); + s->gradients = av_calloc(bufsize, sizeof(*s->gradients)); + s->directions = av_malloc(bufsize); + s->bboxes[0] = av_malloc(s->window_size * sizeof(*s->bboxes[0])); + s->bboxes[1] = av_malloc(s->window_size * sizeof(*s->bboxes[1])); + s->bboxes[2] = av_malloc(s->window_size * sizeof(*s->bboxes[2])); + s->bboxes[3] = av_malloc(s->window_size * sizeof(*s->bboxes[3])); + + if (!s->tmpbuf || !s->filterbuf || !s->gradients || !s->directions || + !s->bboxes[0] || !s->bboxes[1] || !s->bboxes[2] || !s->bboxes[3]) + return AVERROR(ENOMEM); + + return config_input(inlink); +} + #define SET_META(key, value) \ av_dict_set_int(metadata, key, value, 0) @@ -155,11 +269,20 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) 
 AVFilterContext *ctx = inlink->dst; CropDetectContext *s = ctx->priv; int bpp = s->max_pixsteps[0]; - int w, h, x, y, shrink_by; + int w, h, x, y, shrink_by, i; AVDictionary **metadata; int outliers, last_y; int limit = lrint(s->limit); + const int inw = inlink->w; + const int inh = inlink->h; + uint8_t *tmpbuf = s->tmpbuf; + uint8_t *filterbuf = s->filterbuf; + uint16_t *gradients = s->gradients; + int8_t *directions = s->directions; + const AVFrameSideData *sd = NULL; + int scan_w, scan_h, bboff; + // ignore first s->skip frames if (++s->frame_nb > 0) { metadata = &frame->metadata; @@ -185,11 +308,105 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) last_y = y INC;\ } + if (s->mode == MODE_BELOW_TH) { FIND(s->y1, 0, y < s->y1, +1, frame->linesize[0], bpp, frame->width); FIND(s->y2, frame->height - 1, y > FFMAX(s->y2, s->y1), -1, frame->linesize[0], bpp, frame->width); FIND(s->x1, 0, y < s->x1, +1, bpp, frame->linesize[0], frame->height); FIND(s->x2, frame->width - 1, y > FFMAX(s->x2, s->x1), -1, bpp, frame->linesize[0], frame->height); + } else { // MODE_MV_EDGES + sd = av_frame_get_side_data(frame, AV_FRAME_DATA_MOTION_VECTORS); + s->x1 = 0; + s->y1 = 0; + s->x2 = inw - 1; + s->y2 = inh - 1; + + if (!sd) { + av_log(ctx, AV_LOG_WARNING, "Cannot detect: no motion vectors available\n"); + } else { + // gaussian filter to reduce noise + ff_gaussian_blur(inw, inh, + filterbuf, inw, + frame->data[0], frame->linesize[0]); + + // compute the 16-bit gradients and directions for the next step + ff_sobel(inw, inh, gradients, inw, directions, inw, filterbuf, inw); + + // non_maximum_suppression() will actually keep & clip what's necessary and + // ignore the rest, so we need a clean output buffer + memset(tmpbuf, 0, inw * inh); + ff_non_maximum_suppression(inw, inh, tmpbuf, inw, directions, inw, gradients, inw); + + + // keep high values, or low values surrounded by high values + ff_double_threshold(s->low_u8, s->high_u8, inw, inh, + tmpbuf, inw, tmpbuf, 
inw); + + // scan all MVs and store bounding box + s->x1 = inw - 1; + s->y1 = inh - 1; + s->x2 = 0; + s->y2 = 0; + for (i = 0; i < sd->size / sizeof(AVMotionVector); i++) { + const AVMotionVector *mv = (const AVMotionVector*)sd->data + i; + const int mx = mv->dst_x - mv->src_x; + const int my = mv->dst_y - mv->src_y; + + if (mv->dst_x >= 0 && mv->dst_x < inw && + mv->dst_y >= 0 && mv->dst_y < inh && + mv->src_x >= 0 && mv->src_x < inw && + mv->src_y >= 0 && mv->src_y < inh && + mx * mx + my * my >= s->mv_threshold * s->mv_threshold) { + s->x1 = mv->dst_x < s->x1 ? mv->dst_x : s->x1; + s->y1 = mv->dst_y < s->y1 ? mv->dst_y : s->y1; + s->x2 = mv->dst_x > s->x2 ? mv->dst_x : s->x2; + s->y2 = mv->dst_y > s->y2 ? mv->dst_y : s->y2; + } + } + + // scan outward looking for 0-edge-lines in edge image + scan_w = s->x2 - s->x1; + scan_h = s->y2 - s->y1; + +#define FIND_EDGE(DST, FROM, NOEND, INC, STEP0, STEP1, LEN) \ + for (last_y = y = FROM; NOEND; y = y INC) { \ + if (checkline_edge(ctx, tmpbuf + STEP0 * y, STEP1, LEN, bpp)) { \ + if (last_y INC == y) { \ + DST = y; \ + break; \ + } else \ + last_y = y; \ + } \ + } \ + if (!(NOEND)) { \ + DST = y -(INC); \ + } + FIND_EDGE(s->y1, s->y1, y >= 0, -1, inw, bpp, scan_w); + FIND_EDGE(s->y2, s->y2, y < inh, +1, inw, bpp, scan_w); + FIND_EDGE(s->x1, s->x1, y >= 0, -1, bpp, inw, scan_h); + FIND_EDGE(s->x2, s->x2, y < inw, +1, bpp, inw, scan_h); + + // queue bboxes + bboff = (s->frame_nb - 1) % s->window_size; + s->bboxes[0][bboff] = s->x1; + s->bboxes[1][bboff] = s->x2; + s->bboxes[2][bboff] = s->y1; + s->bboxes[3][bboff] = s->y2; + + // sort queue + bboff = FFMIN(s->frame_nb, s->window_size); + AV_QSORT(s->bboxes[0], bboff, int, comp); + AV_QSORT(s->bboxes[1], bboff, int, comp); + AV_QSORT(s->bboxes[2], bboff, int, comp); + AV_QSORT(s->bboxes[3], bboff, int, comp); + + // return median of window_size elems + s->x1 = s->bboxes[0][bboff/2]; + s->x2 = s->bboxes[1][bboff/2]; + s->y1 = s->bboxes[2][bboff/2]; + s->y2 = 
s->bboxes[3][bboff/2]; + } + } // round x and y (up), important for yuv colorspaces // make sure they stay rounded! @@ -243,10 +460,14 @@ static const AVOption cropdetect_options[] = { { "skip", "Number of initial frames to skip", OFFSET(skip), AV_OPT_TYPE_INT, { .i64 = 2 }, 0, INT_MAX, FLAGS }, { "reset_count", "Recalculate the crop area after this many frames",OFFSET(reset_count),AV_OPT_TYPE_INT,{ .i64 = 0 }, 0, INT_MAX, FLAGS }, { "max_outliers", "Threshold count of outliers", OFFSET(max_outliers),AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, FLAGS }, + { "high", "Set high threshold for edge detection", OFFSET(high), AV_OPT_TYPE_FLOAT, {.dbl=25/255.}, 0, 1, FLAGS }, + { "low", "Set low threshold for edge detection", OFFSET(low), AV_OPT_TYPE_FLOAT, {.dbl=15/255.}, 0, 1, FLAGS }, + { "mv_threshold", "motion vector threshold when estimating video window size", OFFSET(mv_threshold), AV_OPT_TYPE_INT, {.i64=8}, 0, 100, FLAGS}, { NULL } }; AVFILTER_DEFINE_CLASS(cropdetect); +AVFILTER_DEFINE_CLASS_EXT(cropdetect_video, "cropdetect_video", cropdetect_options); static const AVFilterPad avfilter_vf_cropdetect_inputs[] = { { @@ -257,6 +478,15 @@ static const AVFilterPad avfilter_vf_cropdetect_inputs[] = { }, }; +static const AVFilterPad avfilter_vf_cropdetect_video_inputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_VIDEO, + .config_props = config_input_video, + .filter_frame = filter_frame, + }, +}; + static const AVFilterPad avfilter_vf_cropdetect_outputs[] = { { .name = "default", @@ -275,3 +505,16 @@ const AVFilter ff_vf_cropdetect = { FILTER_PIXFMTS_ARRAY(pix_fmts), .flags = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC | AVFILTER_FLAG_METADATA_ONLY, }; + +const AVFilter ff_vf_cropdetect_video = { + .name = "cropdetect_video", + .description = NULL_IF_CONFIG_SMALL("Auto-detect crop size of an embedded video area."), + .priv_size = sizeof(CropDetectContext), + .priv_class = &cropdetect_video_class, + .init = init_video, + .uninit = uninit_video, + 
FILTER_INPUTS(avfilter_vf_cropdetect_video_inputs), + FILTER_OUTPUTS(avfilter_vf_cropdetect_outputs), + FILTER_PIXFMTS_ARRAY(pix_fmts_video), + .flags = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC | AVFILTER_FLAG_METADATA_ONLY, +}; diff --git a/tests/fate/filter-video.mak b/tests/fate/filter-video.mak index faed832cd4..2da4018785 100644 --- a/tests/fate/filter-video.mak +++ b/tests/fate/filter-video.mak @@ -647,6 +647,14 @@ FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_DEPS)) += fate-filter-metadata- fate-filter-metadata-cropdetect: SRC = $(TARGET_SAMPLES)/filter/cropdetect.mp4 fate-filter-metadata-cropdetect: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',cropdetect=max_outliers=3" +CROPDETECT_VIDEO_DEPS = LAVFI_INDEV FILE_PROTOCOL MOVIE_FILTER MESTIMATE_FILTER CROPDETECT_VIDEO_FILTER \ + SCALE_FILTER MOV_DEMUXER H264_DECODER +FATE_METADATA_FILTER-$(call ALLYES, $(CROPDETECT_VIDEO_DEPS)) += fate-filter-metadata-cropdetect_video1 fate-filter-metadata-cropdetect_video2 +fate-filter-metadata-cropdetect_video1: SRC = $(TARGET_SAMPLES)/filter/cropdetect_video1.mp4 +fate-filter-metadata-cropdetect_video1: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect_video,metadata=mode=print" +fate-filter-metadata-cropdetect_video2: SRC = $(TARGET_SAMPLES)/filter/cropdetect_video2.mp4 +fate-filter-metadata-cropdetect_video2: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;movie='$(SRC)',mestimate,cropdetect_video,metadata=mode=print" + FREEZEDETECT_DEPS = LAVFI_INDEV MPTESTSRC_FILTER SCALE_FILTER FREEZEDETECT_FILTER FATE_METADATA_FILTER-$(call ALLYES, $(FREEZEDETECT_DEPS)) += fate-filter-metadata-freezedetect fate-filter-metadata-freezedetect: CMD = run $(FILTER_METADATA_COMMAND) "sws_flags=+accurate_rnd+bitexact;mptestsrc=r=25:d=10:m=51,freezedetect" diff --git a/tests/ref/fate/filter-metadata-cropdetect_video1 b/tests/ref/fate/filter-metadata-cropdetect_video1 new 
file mode 100644 index 0000000000..892373cc11 --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect_video1 @@ -0,0 +1,9 @@ +pts=0 +pts=1001 +pts=2002|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=851|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=601|tag:lavfi.cropdetect.w=832|tag:lavfi.cropdetect.h=288|tag:lavfi.cropdetect.x=20|tag:lavfi.cropdetect.y=314 +pts=3003|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=4004|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=5005|tag:lavfi.cropdetect.x1=20|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=311|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=864|tag:lavfi.cropdetect.h=304|tag:lavfi.cropdetect.x=22|tag:lavfi.cropdetect.y=316 +pts=6006|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=7007|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 +pts=8008|tag:lavfi.cropdetect.x1=0|tag:lavfi.cropdetect.x2=885|tag:lavfi.cropdetect.y1=115|tag:lavfi.cropdetect.y2=621|tag:lavfi.cropdetect.w=880|tag:lavfi.cropdetect.h=496|tag:lavfi.cropdetect.x=4|tag:lavfi.cropdetect.y=122 diff --git a/tests/ref/fate/filter-metadata-cropdetect_video2 b/tests/ref/fate/filter-metadata-cropdetect_video2 new file mode 100644 index 0000000000..6b433d17cb --- /dev/null +++ b/tests/ref/fate/filter-metadata-cropdetect_video2 @@ -0,0 +1,9 @@ +pts=0 +pts=512 
+pts=1024|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34 +pts=1536|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=33|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=34 +pts=2048|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=2560|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=3072|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=3584|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32 +pts=4096|tag:lavfi.cropdetect.x1=21|tag:lavfi.cropdetect.x2=817|tag:lavfi.cropdetect.y1=29|tag:lavfi.cropdetect.y2=465|tag:lavfi.cropdetect.w=784|tag:lavfi.cropdetect.h=432|tag:lavfi.cropdetect.x=28|tag:lavfi.cropdetect.y=32
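As a cross-check on the reference files above, the w/h/x/y tags can be derived from x1/x2/y1/y2 with the default round=16. The rounding code sits outside the hunks of this patch, so the following is only a sketch of what that arithmetic appears to be, checked by hand against the rows above; `Crop` and `crop_from_bbox` are hypothetical names, and the sketch assumes `round` is already even and at least 2.

```c
#include <assert.h>

/* Hypothetical result type: crop size and top-left offset. */
typedef struct Crop {
    int w, h, x, y;
} Crop;

/* Turn an inclusive bounding box into crop parameters whose width and
 * height are multiples of `round`, keeping the crop roughly centered
 * and the offsets even. Assumes `round` is even and >= 2. */
static Crop crop_from_bbox(int x1, int x2, int y1, int y2, int round)
{
    Crop c;
    int shrink_by;

    assert(round >= 2 && round % 2 == 0);
    assert(x2 >= x1 && y2 >= y1);

    c.w = x2 - x1 + 1;
    c.h = y2 - y1 + 1;

    /* round the offsets up to even values */
    c.x = (x1 + 1) & ~1;
    c.y = (y1 + 1) & ~1;

    /* shrink the width to a multiple of `round`, shift x to re-center */
    shrink_by = c.w % round;
    c.w -= shrink_by;
    c.x += (shrink_by / 2 + 1) & ~1;

    /* same for the height */
    shrink_by = c.h % round;
    c.h -= shrink_by;
    c.y += (shrink_by / 2 + 1) & ~1;

    return c;
}
```

For the pts=2002 row of cropdetect_video1 (x1=20, x2=851, y1=311, y2=601) this gives w=832, h=288, x=20, y=314, which matches the tags in the reference file.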