From patchwork Fri Jun 18 15:56:56 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lingjiang Fang
X-Patchwork-Id: 28577
From: Lingjiang Fang
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 18 Jun 2021 23:56:56 +0800
X-OQ-MSGID: <20210618155656.10278-1-vacingfang@foxmail.com>
X-Mailer: git-send-email 2.29.2
Subject: [FFmpeg-devel] [PATCH V3] lavf/vf_ocr: add subregion support
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Lingjiang Fang

fix doc errors, ping for review, thanks :)
---
 doc/filters.texi     |  8 ++++++++
 libavfilter/vf_ocr.c | 35 ++++++++++++++++++++++++++++++++++-
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index da8f7d7726..041fd28c57 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -15451,6 +15451,14 @@ Set character whitelist.
 
 @item blacklist
 Set character blacklist.
+
+@item x, y
+Set top-left corner of the subregion, in pixels, default is (0,0).
+
+@item w, h
+Set the width and height of the subregion, in pixels. The default (0)
+extends the subregion from the top-left corner to the bottom-right of the frame.
+
 @end table
 
 The filter exports recognized text as the frame metadata @code{lavfi.ocr.text}.
diff --git a/libavfilter/vf_ocr.c b/libavfilter/vf_ocr.c
index 6de474025a..e96dce2d87 100644
--- a/libavfilter/vf_ocr.c
+++ b/libavfilter/vf_ocr.c
@@ -33,6 +33,8 @@ typedef struct OCRContext {
     char *language;
     char *whitelist;
     char *blacklist;
+    int x, y;
+    int w, h;
     TessBaseAPI *tess;
 } OCRContext;
 
@@ -45,6 +47,10 @@ static const AVOption ocr_options[] = {
     { "language", "set language", OFFSET(language), AV_OPT_TYPE_STRING, {.str="eng"}, 0, 0, FLAGS },
     { "whitelist", "set character whitelist", OFFSET(whitelist), AV_OPT_TYPE_STRING, {.str="0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.:;,-+_!?\"'[]{}()<>|/\\=*&%$#@!~ "}, 0, 0, FLAGS },
     { "blacklist", "set character blacklist", OFFSET(blacklist), AV_OPT_TYPE_STRING, {.str=""}, 0, 0, FLAGS },
+    { "x", "left x of subregion", OFFSET(x), AV_OPT_TYPE_INT, {.i64=0}, 0, INT_MAX, FLAGS },
+    { "y", "top y of subregion", OFFSET(y), AV_OPT_TYPE_INT, {.i64=0}, 0, INT_MAX, FLAGS },
+    { "w", "width of subregion", OFFSET(w), AV_OPT_TYPE_INT, {.i64=0}, 0, INT_MAX, FLAGS },
+    { "h", "height of subregion", OFFSET(h), AV_OPT_TYPE_INT, {.i64=0}, 0, INT_MAX, FLAGS },
     { NULL }
 };
 
@@ -93,6 +99,21 @@ static int query_formats(AVFilterContext *ctx)
     return ff_set_common_formats(ctx, fmts_list);
 }
 
+static void check_fix(int *x, int *y, int *w, int *h, int pic_w, int pic_h)
+{
+    // 0 <= x < pic_w, otherwise fall back to 0
+    if (*x >= pic_w)
+        *x = 0;
+    // 0 <= y < pic_h, otherwise fall back to 0
+    if (*y >= pic_h)
+        *y = 0;
+
+    if (*w == 0 || *w > pic_w - *x)
+        *w = pic_w - *x;
+    if (*h == 0 || *h > pic_h - *y)
+        *h = pic_h - *y;
+}
+
 static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
     AVDictionary **metadata = &in->metadata;
@@ -102,8 +123,20 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
     char *result;
     int *confs;
 
+    // TODO(vacing): support expression
+    int x = s->x;
+    int y = s->y;
+    int w = s->w;
+    int h = s->h;
+    check_fix(&x, &y, &w, &h, in->width, in->height);
+    if (x != s->x || y != s->y ||
+        (s->w != 0 && w != s->w) || (s->h != 0 && h != s->h)) {
+        av_log(s, AV_LOG_WARNING, "config error, subregion changed to x=%d, y=%d, w=%d, h=%d\n",
+               x, y, w, h);
+    }
+
     result = TessBaseAPIRect(s->tess, in->data[0], 1,
-                             in->linesize[0], 0, 0, in->width, in->height);
+                             in->linesize[0], x, y, w, h);
     confs = TessBaseAPIAllWordConfidences(s->tess);
     av_dict_set(metadata, "lavfi.ocr.text", result, 0);
     for (int i = 0; confs[i] != -1; i++) {