From patchwork Tue Mar 2 09:00:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= X-Patchwork-Id: 26052 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id CB50D44AFB3 for ; Tue, 2 Mar 2021 11:06:46 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B527968AB94; Tue, 2 Mar 2021 11:06:46 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f43.google.com (mail-lf1-f43.google.com [209.85.167.43]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1A3BD680397 for ; Tue, 2 Mar 2021 11:06:40 +0200 (EET) Received: by mail-lf1-f43.google.com with SMTP id d3so30093335lfg.10 for ; Tue, 02 Mar 2021 01:06:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=b/faBFZf2ymJthz3qtsDEoa2Q1AQIyZcoY/PDk0PSSM=; b=pshvjHkx7T/LlHLEkZ3slJlVBemdCYQUNYEk/oqBhYIqCz70PKrlr1CCc0KKiD57As aJjvSt1ooAhGO6avK4mP4mGQYJpq19peWVn+CVvz5Wg5r4ZUs3Tt9cfgPj4btUUgZDaq TXoYivVEaAsbOdwiWpNGXqb2PKORbigiG7k1bV4kaCnOf9vsgQtxHLw8P1aeqB2ZqImW gF5Rn6EZJcNH3uvMCkhxzHNNEbkl8P/yZfTNg/46zyJDX5WcwjJ18IEBXMkrN+gj9/xk I/iVbLzP4rwqLXbj5OJgBdqYolOIgD0WShoRkB5/fGyOP3xn/Y+m2Pc4xFtG/NoKU1bD IDZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=b/faBFZf2ymJthz3qtsDEoa2Q1AQIyZcoY/PDk0PSSM=; b=K0/Fp1v+22yBzqEpEWaxsQxByNvgWfokoa9ZAOldXq04rdD+DE868U9VN3LxYdoMco /K+sxEgBhdE6IEG5dClEpooHv+AgpF90MoSgyKi9p+cK4ipsjaKkU9cN3X7AUWd4vsqi 
wH6nYuQjTA0tMsB3tWhuLZdtiG4PTjN+ak2kajcxqBBW3hpdMWZ1KskhmSVx+suuSC5T Dve4tFtln7JEV/Qs0kDCRupaJOteJFUk0coE2HJkwedG22nOGp32dKlrbl7/nBvLxKQK y+CavtN84eSHMvlSH4TWmWRWg1aj7YIrjbQ6+Kde9jGO1djKtxE5evp/xihr4l9ouTpE J40g== X-Gm-Message-State: AOAM530Gy+/Y7lOU6zWsvgNw/MRtbOBtHCjIaeyUn62Go+3Bx8e1Fd7T FcBcXtII6SR5ZouVy3JFKOSe/MUKIHA= X-Google-Smtp-Source: ABdhPJxGA19XhVf+DKFyM5H2XzqSN6RgXN1bjGnsbfvkCTzHsH83Cj9BmejxtuI+ViyaO07TGLT8Ew== X-Received: by 2002:a2e:9047:: with SMTP id n7mr5136670ljg.291.1614675644506; Tue, 02 Mar 2021 01:00:44 -0800 (PST) Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. [91.159.194.103]) by smtp.gmail.com with ESMTPSA id u9sm1791626ljj.0.2021.03.02.01.00.43 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Mar 2021 01:00:44 -0800 (PST) From: =?utf-8?q?Jan_Ekstr=C3=B6m?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 2 Mar 2021 11:00:37 +0200 Message-Id: <20210302090040.10484-2-jeebjp@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210302090040.10484-1-jeebjp@gmail.com> References: <20210302090040.10484-1-jeebjp@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v6 1/4] avutil/{avstring, bprint}: add XML escaping from ffprobe to avutil X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Stefano Sabatini Base escaping only escapes values required for base character data according to part 2.4 of XML, and if additional flags are added single and double quotes can additionally be escaped in order to handle single and double quoted attributes. 
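For readers following along, the escaping rules described above can be sketched as a standalone function. This is only an illustration of the behaviour and not the libavutil code: the names sketch_xml_escape, sketch_append and the SKETCH_XML_* flags are hypothetical, and a fixed-size output buffer stands in for AVBPrint.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag names for this sketch; they mirror the intent of the
 * single-quote and double-quote flags proposed in the patch. */
#define SKETCH_XML_SINGLE_QUOTES (1 << 0)
#define SKETCH_XML_DOUBLE_QUOTES (1 << 1)

/* Append a NUL-terminated string at *pos, keeping dst NUL-terminated.
 * Returns 0 on success, -1 if the result would not fit in dst_size bytes. */
static int sketch_append(char *dst, size_t dst_size, size_t *pos, const char *s)
{
    size_t len = strlen(s);

    if (*pos + len >= dst_size)
        return -1;
    memcpy(dst + *pos, s, len);
    *pos += len;
    dst[*pos] = '\0';
    return 0;
}

/* Escape src as XML character data: '&', '<' and '>' are always escaped,
 * quotes only when the matching flag is set.
 * Returns 0 on success, -1 if dst is too small. */
static int sketch_xml_escape(char *dst, size_t dst_size,
                             const char *src, unsigned flags)
{
    size_t pos = 0;

    assert(dst && src && dst_size > 0);
    dst[0] = '\0';

    for (; *src; src++) {
        char raw[2] = { *src, '\0' };
        const char *out = raw; /* default: copy the character unchanged */

        switch (*src) {
        case '&':  out = "&amp;"; break;
        case '<':  out = "&lt;";  break;
        case '>':  out = "&gt;";  break;
        case '\'': if (flags & SKETCH_XML_SINGLE_QUOTES) out = "&apos;"; break;
        case '"':  if (flags & SKETCH_XML_DOUBLE_QUOTES) out = "&quot;"; break;
        default:   break;
        }

        if (sketch_append(dst, dst_size, &pos, out) < 0)
            return -1;
    }
    return 0;
}
```

With no flags set, quotes pass through untouched, which is enough for element content. A caller writing a double-quoted attribute value would pass the double-quote flag, as the ffprobe change later in this series does.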
Co-Authored-By: Jan Ekström Signed-off-by: Jan Ekström --- libavutil/avstring.h | 14 ++++++++++++++ libavutil/bprint.c | 29 +++++++++++++++++++++++++++++ libavutil/version.h | 2 +- tools/ffescape.c | 7 +++++-- 4 files changed, 49 insertions(+), 3 deletions(-) diff --git a/libavutil/avstring.h b/libavutil/avstring.h index ee225585b3..fae446c302 100644 --- a/libavutil/avstring.h +++ b/libavutil/avstring.h @@ -324,6 +324,7 @@ enum AVEscapeMode { AV_ESCAPE_MODE_AUTO, ///< Use auto-selected escaping mode. AV_ESCAPE_MODE_BACKSLASH, ///< Use backslash escaping. AV_ESCAPE_MODE_QUOTE, ///< Use single-quote escaping. + AV_ESCAPE_MODE_XML, ///< Use XML non-markup character data escaping. }; /** @@ -343,6 +344,19 @@ enum AVEscapeMode { */ #define AV_ESCAPE_FLAG_STRICT (1 << 1) +/** + * Within AV_ESCAPE_MODE_XML, additionally escape single quotes for single + * quoted attributes. + */ +#define AV_ESCAPE_FLAG_XML_SINGLE_QUOTES (1 << 2) + +/** + * Within AV_ESCAPE_MODE_XML, additionally escape double quotes for double + * quoted attributes. + */ +#define AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES (1 << 3) + + /** * Escape string in src, and put the escaped string in an allocated * string in *dst, which must be freed with av_free(). diff --git a/libavutil/bprint.c b/libavutil/bprint.c index 2f059c5ba6..e12fb263fe 100644 --- a/libavutil/bprint.c +++ b/libavutil/bprint.c @@ -283,6 +283,35 @@ void av_bprint_escape(AVBPrint *dstbuf, const char *src, const char *special_cha av_bprint_chars(dstbuf, '\'', 1); break; + case AV_ESCAPE_MODE_XML: + /* escape XML non-markup character data as per 2.4 by default: */ + /* [^<&]* - ([^<&]* ']]>' [^<&]*) */ + + /* additionally, given one of the AV_ESCAPE_FLAG_XML_* flags, */ + /* escape those specific characters as required. 
*/ + for (; *src; src++) { + switch (*src) { + case '&' : av_bprintf(dstbuf, "%s", "&amp;"); break; + case '<' : av_bprintf(dstbuf, "%s", "&lt;"); break; + case '>' : av_bprintf(dstbuf, "%s", "&gt;"); break; + case '\'': + if (!(flags & AV_ESCAPE_FLAG_XML_SINGLE_QUOTES)) + goto XML_DEFAULT_HANDLING; + + av_bprintf(dstbuf, "%s", "&apos;"); + break; + case '"' : + if (!(flags & AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES)) + goto XML_DEFAULT_HANDLING; + + av_bprintf(dstbuf, "%s", "&quot;"); + break; +XML_DEFAULT_HANDLING: + default: av_bprint_chars(dstbuf, *src, 1); + } + } + break; + /* case AV_ESCAPE_MODE_BACKSLASH or unknown mode */ default: /* \-escape characters */ diff --git a/libavutil/version.h b/libavutil/version.h index b7c5892a37..356c54d633 100644 --- a/libavutil/version.h +++ b/libavutil/version.h @@ -79,7 +79,7 @@ */ #define LIBAVUTIL_VERSION_MAJOR 56 -#define LIBAVUTIL_VERSION_MINOR 66 +#define LIBAVUTIL_VERSION_MINOR 67 #define LIBAVUTIL_VERSION_MICRO 100 #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \ diff --git a/tools/ffescape.c b/tools/ffescape.c index 0530d28c6d..1ed8daa801 100644 --- a/tools/ffescape.c +++ b/tools/ffescape.c @@ -78,8 +78,10 @@ int main(int argc, char **argv) infilename = optarg; break; case 'f': - if (!strcmp(optarg, "whitespace")) escape_flags |= AV_ESCAPE_FLAG_WHITESPACE; - else if (!strcmp(optarg, "strict")) escape_flags |= AV_ESCAPE_FLAG_STRICT; + if (!strcmp(optarg, "whitespace")) escape_flags |= AV_ESCAPE_FLAG_WHITESPACE; + else if (!strcmp(optarg, "strict")) escape_flags |= AV_ESCAPE_FLAG_STRICT; + else if (!strcmp(optarg, "xml_single_quotes")) escape_flags |= AV_ESCAPE_FLAG_XML_SINGLE_QUOTES; + else if (!strcmp(optarg, "xml_double_quotes")) escape_flags |= AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES; else { av_log(NULL, AV_LOG_ERROR, "Invalid value '%s' for option -f, " @@ -104,6 +106,7 @@ int main(int argc, char **argv) if (!strcmp(optarg, "auto")) escape_mode = AV_ESCAPE_MODE_AUTO; else if (!strcmp(optarg, "backslash")) escape_mode = 
AV_ESCAPE_MODE_BACKSLASH; else if (!strcmp(optarg, "quote")) escape_mode = AV_ESCAPE_MODE_QUOTE; + else if (!strcmp(optarg, "xml")) escape_mode = AV_ESCAPE_MODE_XML; else { av_log(NULL, AV_LOG_ERROR, "Invalid value '%s' for option -m, " From patchwork Tue Mar 2 09:00:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= X-Patchwork-Id: 26054 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 1A46044B7BE for ; Tue, 2 Mar 2021 11:08:12 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id F32CB68AB25; Tue, 2 Mar 2021 11:08:11 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f50.google.com (mail-lf1-f50.google.com [209.85.167.50]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 0CC6268A831 for ; Tue, 2 Mar 2021 11:08:05 +0200 (EET) Received: by mail-lf1-f50.google.com with SMTP id m22so30151349lfg.5 for ; Tue, 02 Mar 2021 01:08:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=CO757h3pud/O4zoxvoiGCGcSzpz6g83NKGO9D9oiwVo=; b=SbpF70AL+vMyOQNDFL31b1kBnvnS7M/6j8cMasexalz1rWIBaZI7u7ANyTWjWRFtpk XaWh1M0WVZL3yAqSdgckW6o9eX2dBVNao1Q258tJoeCqTofmZ4raL9dUOyRR8Emm9J8T +dLFNLSmx0zBfhNZQhCSqnOEztXQvART3jfMpRB8YJPtWksY3kfUfnwVJaHFYxld1UDj Cm8dc+0wJntdUJ8DYuxflQaDKwIlZ2CzyRrVN1T7wlFt6OejZX+Je3f7fx/Or8N4z+Nk egacbNVPHj4Bz03TF+fhaI7KCWewOKuX/RhN2KwLO2KvDkG6jbAf5oC9S2JZQi+nEA68 CwEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to 
:references:mime-version:content-transfer-encoding; bh=CO757h3pud/O4zoxvoiGCGcSzpz6g83NKGO9D9oiwVo=; b=FWAbbgfAtT5lfystC5uITBukLhv0xCLo5mBatb0XFvwtfbSfTJ4LCdRNAq6NEzAXUy FRxiHadBR1Dwa5GsJKeMzp+J6tgQi1Vr0dX26aWYf8LMBb1+sbfGj45Huild1ESnQL3a 9Y6MfVSPA+/7YR8tBd0F+TV6onncFvdOh9XgfT3dXILfnJ2iOK1XQryFfW32dRP/zjmy O74CBxPSNgsAnMiqXbb0KY0cNj2Tx42JZIrSz2Xni7Ke5NqGMU3NCXYg9nNRIydeSTvT /9049VMgDqeBUHrcW1dGcMjtvXKpFpNUSEp1YHd/axwswYtLV6gebC9Q7Q4l3h+FXGu7 tzSw== X-Gm-Message-State: AOAM5322Zl0l8rSPsJaoD7bTQ9fOqdH+sDEj58Dd5GKjyj0f9piarOCS ng5yojok1drBEJJb/2zPBT+HJT5/43g= X-Google-Smtp-Source: ABdhPJxYkEW20qXMywCWvTtyUnF6p5hcn9EW+Awc5tq0MMj/KNSBYEeXMX7YyqWD4qbUz0rASeeaDg== X-Received: by 2002:ac2:5e26:: with SMTP id o6mr12202242lfg.355.1614675645284; Tue, 02 Mar 2021 01:00:45 -0800 (PST) Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. [91.159.194.103]) by smtp.gmail.com with ESMTPSA id u9sm1791626ljj.0.2021.03.02.01.00.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Mar 2021 01:00:44 -0800 (PST) From: =?utf-8?q?Jan_Ekstr=C3=B6m?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 2 Mar 2021 11:00:38 +0200 Message-Id: <20210302090040.10484-3-jeebjp@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210302090040.10484-1-jeebjp@gmail.com> References: <20210302090040.10484-1-jeebjp@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v6 2/4] ffprobe: switch to av_bprint_escape for XML escaping X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Jan Ekström Additionally update the result of the ffprobe XML writing test. 
Signed-off-by: Jan Ekström --- fftools/ffprobe.c | 32 +++++++++++--------------------- tests/ref/fate/ffprobe_xml | 2 +- 2 files changed, 12 insertions(+), 22 deletions(-) diff --git a/fftools/ffprobe.c b/fftools/ffprobe.c index 740e759958..1eb9d88b5e 100644 --- a/fftools/ffprobe.c +++ b/fftools/ffprobe.c @@ -1672,24 +1672,6 @@ static av_cold int xml_init(WriterContext *wctx) return 0; } -static const char *xml_escape_str(AVBPrint *dst, const char *src, void *log_ctx) -{ - const char *p; - - for (p = src; *p; p++) { - switch (*p) { - case '&' : av_bprintf(dst, "%s", "&amp;"); break; - case '<' : av_bprintf(dst, "%s", "&lt;"); break; - case '>' : av_bprintf(dst, "%s", "&gt;"); break; - case '"' : av_bprintf(dst, "%s", "&quot;"); break; - case '\'': av_bprintf(dst, "%s", "&apos;"); break; - default: av_bprint_chars(dst, *p, 1); - } - } - - return dst->str; -} - #define XML_INDENT() printf("%*c", xml->indent_level * 4, ' ') static void xml_print_section_header(WriterContext *wctx) @@ -1761,14 +1743,22 @@ static void xml_print_str(WriterContext *wctx, const char *key, const char *valu if (section->flags & SECTION_FLAG_HAS_VARIABLE_FIELDS) { XML_INDENT(); + av_bprint_escape(&buf, key, NULL, + AV_ESCAPE_MODE_XML, AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES); printf("<%s key=\"%s\"", - section->element_name, xml_escape_str(&buf, key, wctx)); + section->element_name, buf.str); av_bprint_clear(&buf); - printf(" value=\"%s\"/>\n", xml_escape_str(&buf, value, wctx)); + + av_bprint_escape(&buf, value, NULL, + AV_ESCAPE_MODE_XML, AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES); + printf(" value=\"%s\"/>\n", buf.str); } else { if (wctx->nb_item[wctx->level]) printf(" "); - printf("%s=\"%s\"", key, xml_escape_str(&buf, value, wctx)); + + av_bprint_escape(&buf, value, NULL, + AV_ESCAPE_MODE_XML, AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES); + printf("%s=\"%s\"", key, buf.str); } av_bprint_finalize(&buf, NULL); diff --git a/tests/ref/fate/ffprobe_xml b/tests/ref/fate/ffprobe_xml index 1e99158021..04261ed693 100644 --- 
a/tests/ref/fate/ffprobe_xml +++ b/tests/ref/fate/ffprobe_xml @@ -51,7 +51,7 @@ - + From patchwork Tue Mar 2 09:00:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= X-Patchwork-Id: 26053 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 1486144B171 for ; Tue, 2 Mar 2021 11:07:49 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id EAB4F68AB7C; Tue, 2 Mar 2021 11:07:48 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f41.google.com (mail-lf1-f41.google.com [209.85.167.41]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 94A1E68A831 for ; Tue, 2 Mar 2021 11:07:42 +0200 (EET) Received: by mail-lf1-f41.google.com with SMTP id k9so10208996lfo.12 for ; Tue, 02 Mar 2021 01:07:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=zLhFRpaqPdzi2eBAPKZ7wli7MTHV1yiQ8gbxwR/SV5A=; b=o+fr8vPiSwDs7XOKCaE/YwOcZeWjmGR+wayFbdFrDy61siicyF6b45kJnMJN9IVP+1 KVZfqfC3n6KOIhDDGzyHpOlZF32PkfbuiEyTPCYK4ZdYfJiLMOcg+p4IIaoDHJvflLO1 XqIfBeycfWH4lEEAs5wFAmNa5chO30GabXRh/bG4J0wjX3ocbkkYB+23i387JV2C5J4Q UN0qs/C+1n+267zT1VnvorgKrS8E6CSnNZFtA/wd/bmS8u0hSu/A5L9hO34g2WRXGXAA +bhgp+TfjI+NL3tPWkHXp1iJPmeuZLBpJoIfKLJLv8Tx+5s3tzos4w//0UX9gzh2OosC S/jA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=zLhFRpaqPdzi2eBAPKZ7wli7MTHV1yiQ8gbxwR/SV5A=; b=SUQKVDVJgFEQIuSiT3ubpKx8HKSe/PXhnpw5X7iOIn2IolledOo+2qL9LRm7q1xTxj 
Q2BWlb0PB1HFhy9+KEm7dy1Q5tkjLxdHr7kZ1B0PIGS5owbJGYoHc/ZShz5ZRQD8365a cTxEpJHQ8J4o0gPYopH5s7VXvgP6+xMugrUGWc6q8GlfI1F01SeLhtnyB2bwQAziYaUf N5SQko9RTxnyTMXTg7Mukiecv7lwUuTDsJSuu/K6Dgpaz0PAMkzfvS2xKz4IHZ/WgDQb hk3O3j83YrzaZtooJfVqCXHQ+5PKWjLMZeH+V6487rGtUX67KVlymqadzK0rbZ64lVQP pbYQ== X-Gm-Message-State: AOAM530yAUIjh2JefeB2fEKnHEjJULNklDtjkEMCcMvEoi6mHk637Pka uey4vdGUwJVA9cjKy8rFFskekrD8cDE= X-Google-Smtp-Source: ABdhPJySVZStWwTcqecc/ptgvWmtkR8ARyyrF3ypiJkY4ICNC56xs2fpDLJ5/cu6aF4id2dWNYyH7Q== X-Received: by 2002:a05:651c:2050:: with SMTP id t16mr11579565ljo.109.1614675645981; Tue, 02 Mar 2021 01:00:45 -0800 (PST) Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. [91.159.194.103]) by smtp.gmail.com with ESMTPSA id u9sm1791626ljj.0.2021.03.02.01.00.45 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Mar 2021 01:00:45 -0800 (PST) From: =?utf-8?q?Jan_Ekstr=C3=B6m?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 2 Mar 2021 11:00:39 +0200 Message-Id: <20210302090040.10484-4-jeebjp@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210302090040.10484-1-jeebjp@gmail.com> References: <20210302090040.10484-1-jeebjp@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v6 3/4] avcodec: enable usage of err_recognition for encoders X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Jan Ekström Enables the usage of such values as AV_EF_EXPLODE in encoders, which can be useful in cases such as subtitle encoders where they have the responsibility to validate the correctness of an incoming ASS dialog line. 
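As a rough illustration of how an encoder might act on this, the decision can be reduced to a small helper: hard errors always abort, while invalid input aborts only when the user asked for "explode". Everything named sketch_* or SKETCH_* below is hypothetical and defined locally so the snippet stands alone; the constants are stand-ins chosen for the sketch, not the real libavcodec definitions.

```c
#include <assert.h>

/* Stand-in values for this sketch only. */
#define SKETCH_EF_EXPLODE   (1 << 3) /* plays the role of AV_EF_EXPLODE */
#define SKETCH_ERR_INVALID  (-1)     /* plays the role of AVERROR_INVALIDDATA */
#define SKETCH_ERR_NOMEM    (-2)     /* plays the role of AVERROR(ENOMEM) */

/* Decide what to do with the result of validating one input line.
 * Returns 0 if encoding should continue, or the negative error code
 * if the encoder should give up. */
static int sketch_handle_line_result(int ret, int err_recognition)
{
    if (ret >= 0)
        return 0;      /* success, nothing to report */
    if (ret != SKETCH_ERR_INVALID)
        return ret;    /* hard failures are always fatal */
    if (err_recognition & SKETCH_EF_EXPLODE)
        return ret;    /* the user asked to abort on invalid input */
    return 0;          /* otherwise warn and carry on */
}

/* Small usage example covering the four cases. */
static void sketch_self_check(void)
{
    assert(sketch_handle_line_result(0, 0) == 0);
    assert(sketch_handle_line_result(SKETCH_ERR_INVALID, 0) == 0);
    assert(sketch_handle_line_result(SKETCH_ERR_INVALID, SKETCH_EF_EXPLODE)
           == SKETCH_ERR_INVALID);
    assert(sketch_handle_line_result(SKETCH_ERR_NOMEM, 0) == SKETCH_ERR_NOMEM);
}
```

The TTML encoder in the last patch of this series follows the same shape, choosing between a warning and an error log level before deciding whether to return.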
Signed-off-by: Jan Ekström --- doc/APIchanges | 3 +++ libavcodec/avcodec.h | 2 +- libavcodec/options_table.h | 18 +++++++++--------- libavcodec/version.h | 2 +- 4 files changed, 14 insertions(+), 11 deletions(-) diff --git a/doc/APIchanges b/doc/APIchanges index a49e181c13..a4cb3a5d63 100644 --- a/doc/APIchanges +++ b/doc/APIchanges @@ -15,6 +15,9 @@ libavutil: 2017-10-21 API changes, most recent first: +2021-03-02 - xxxxxxxxxx - lavc 58.128.101 - avcodec.h + Enable err_recognition to be set for encoders. + 2021-02-27 - xxxxxxxxxx - lavc 58.126.100 - avcodec.h Deprecated avcodec_get_frame_class(). diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h index cd6e6d19bc..ecc665677a 100644 --- a/libavcodec/avcodec.h +++ b/libavcodec/avcodec.h @@ -1634,7 +1634,7 @@ typedef struct AVCodecContext { /** * Error recognition; may misdetect some more or less valid parts as errors. - * - encoding: unused + * - encoding: Set by user. * - decoding: Set by user. */ int err_recognition; diff --git a/libavcodec/options_table.h b/libavcodec/options_table.h index ded9de4d67..e12159f734 100644 --- a/libavcodec/options_table.h +++ b/libavcodec/options_table.h @@ -140,15 +140,15 @@ static const AVOption avcodec_options[] = { {"unofficial", "allow unofficial extensions", 0, AV_OPT_TYPE_CONST, {.i64 = FF_COMPLIANCE_UNOFFICIAL }, INT_MIN, INT_MAX, A|V|D|E, "strict"}, {"experimental", "allow non-standardized experimental things", 0, AV_OPT_TYPE_CONST, {.i64 = FF_COMPLIANCE_EXPERIMENTAL }, INT_MIN, INT_MAX, A|V|D|E, "strict"}, {"b_qoffset", "QP offset between P- and B-frames", OFFSET(b_quant_offset), AV_OPT_TYPE_FLOAT, {.dbl = 1.25 }, -FLT_MAX, FLT_MAX, V|E}, -{"err_detect", "set error detection flags", OFFSET(err_recognition), AV_OPT_TYPE_FLAGS, {.i64 = 0 }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"crccheck", "verify embedded CRCs", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_CRCCHECK }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"bitstream", "detect bitstream specification deviations", 0, 
AV_OPT_TYPE_CONST, {.i64 = AV_EF_BITSTREAM }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"buffer", "detect improper bitstream length", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_BUFFER }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"explode", "abort decoding on minor error detection", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_EXPLODE }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"ignore_err", "ignore errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_IGNORE_ERR }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"careful", "consider things that violate the spec, are fast to check and have not been seen in the wild as errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_CAREFUL }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"compliant", "consider all spec non compliancies as errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_COMPLIANT | AV_EF_CAREFUL }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"aggressive", "consider things that a sane encoder should not do as an error", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_AGGRESSIVE | AV_EF_COMPLIANT | AV_EF_CAREFUL}, INT_MIN, INT_MAX, A|V|D, "err_detect"}, +{"err_detect", "set error detection flags", OFFSET(err_recognition), AV_OPT_TYPE_FLAGS, {.i64 = 0 }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"crccheck", "verify embedded CRCs", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_CRCCHECK }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"bitstream", "detect bitstream specification deviations", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_BITSTREAM }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"buffer", "detect improper bitstream length", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_BUFFER }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"explode", "abort decoding on minor error detection", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_EXPLODE }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"ignore_err", "ignore errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_IGNORE_ERR }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"careful", "consider things that violate the spec, are fast to check and have not been seen 
in the wild as errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_CAREFUL }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"compliant", "consider all spec non compliancies as errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_COMPLIANT | AV_EF_CAREFUL }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"aggressive", "consider things that a sane encoder should not do as an error", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_AGGRESSIVE | AV_EF_COMPLIANT | AV_EF_CAREFUL}, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, {"has_b_frames", NULL, OFFSET(has_b_frames), AV_OPT_TYPE_INT, {.i64 = DEFAULT }, 0, INT_MAX}, {"block_align", NULL, OFFSET(block_align), AV_OPT_TYPE_INT, {.i64 = DEFAULT }, 0, INT_MAX}, #if FF_API_PRIVATE_OPT diff --git a/libavcodec/version.h b/libavcodec/version.h index 2b3757fa07..dd15ae341e 100644 --- a/libavcodec/version.h +++ b/libavcodec/version.h @@ -29,7 +29,7 @@ #define LIBAVCODEC_VERSION_MAJOR 58 #define LIBAVCODEC_VERSION_MINOR 128 -#define LIBAVCODEC_VERSION_MICRO 100 +#define LIBAVCODEC_VERSION_MICRO 101 #define LIBAVCODEC_VERSION_INT AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \ LIBAVCODEC_VERSION_MINOR, \ From patchwork Tue Mar 2 09:00:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= X-Patchwork-Id: 26055 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 1618644B7BE for ; Tue, 2 Mar 2021 11:08:27 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id E955768ABB1; Tue, 2 Mar 2021 11:08:26 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lj1-f174.google.com (mail-lj1-f174.google.com [209.85.208.174]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id BCEFB68ABAC for ; Tue, 2 Mar 
2021 11:08:25 +0200 (EET) Received: by mail-lj1-f174.google.com with SMTP id u18so9366733ljd.3 for ; Tue, 02 Mar 2021 01:08:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=XLZ5RSKsLC2bvytnoYUEuKWufrGfSmc8GUXiVJIAMDg=; b=mb6e0p93CBN3M5kh/KfbK33D929c0ouP+1+jx6FedhjHykUjuQjBAPgeXWgkXVHjCj ClodT/jWqCLuesxs6e8TIErehLPR1M4LddHbtfa7as4ZCPetKn+uEenLFSFMGGQbep1B 0k8a7uu9WpRnIrVPNlwU5YLgPpCC0pL1TK6ZiOFLtHADN/mltUCnoEcJyAlQUzT0a2fj D/W1KIRxZUGvMtFBqKBfChkKyRVzeUjfjEsYvlKhjczlh6DGGdVh6+rK5sdU5FrXevRA SZG2wD7pschlB5G0y22M5p9jF7W9TSEE7lxYiMfWh4dm3tXQi1/Jt/gKW7xZKe/fvVGI 4rCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XLZ5RSKsLC2bvytnoYUEuKWufrGfSmc8GUXiVJIAMDg=; b=E2JMKiezBC99kUgsWhZymvdCRsnhy/IDokX2I9QX8nm0fvweoGqS7p5LywJNWI9Trb Y4ZjFJA8isYKO43PM8QJVsSqn84RENiEokvMsjuJjx1WmvFmBV8l3owDzxIp4aH0BQkU VXCMvQfR1M6LFTLIPHfFFWEMVEdVU83pB76hL7p4DejYDya8F6MQkffyaRidxt1e2w2M FtoPNUd9KUV9ZAoxEgmyVnrn2tyIt+y0V8eXIWDtt2T05SWK1mTOJLHvArv5/ZE34Nmb NwaE59aH/b/KrMBLzsT3zXBS1eIvvEWmNUjWkUPyZYH67pA+zr6ZYqEWRBjC5BpLH1Nt SqiQ== X-Gm-Message-State: AOAM533noS9g7cxSqqjwKrFIVw780kE+OmzRDHect7G5G9d9uiQqeFxx oz3gAtM8Xb4Qa4ndoZmDVmokFtB7C64= X-Google-Smtp-Source: ABdhPJxrf0ChGLslgmL1F9092QjWjI5aGjsmJ3una179au83U4IzjnLZLODC119M2hgfLxcKgMSSqA== X-Received: by 2002:a2e:504d:: with SMTP id v13mr6512682ljd.92.1614675646824; Tue, 02 Mar 2021 01:00:46 -0800 (PST) Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. 
[91.159.194.103]) by smtp.gmail.com with ESMTPSA id u9sm1791626ljj.0.2021.03.02.01.00.46 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Mar 2021 01:00:46 -0800 (PST) From: =?utf-8?q?Jan_Ekstr=C3=B6m?= To: ffmpeg-devel@ffmpeg.org Date: Tue, 2 Mar 2021 11:00:40 +0200 Message-Id: <20210302090040.10484-5-jeebjp@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210302090040.10484-1-jeebjp@gmail.com> References: <20210302090040.10484-1-jeebjp@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v6 4/4] {avcodec, avformat}: add TTML encoder and muxer X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Jan Ekström Enables encoding of other subtitle formats into TTML and writing them out as such documents. 
Signed-off-by: Jan Ekström --- Changelog | 1 + doc/general_contents.texi | 1 + libavcodec/Makefile | 1 + libavcodec/allcodecs.c | 1 + libavcodec/ttmlenc.c | 210 +++++++++++++++++++++++++++++++++++++ libavcodec/ttmlenc.h | 28 +++++ libavcodec/version.h | 4 +- libavformat/Makefile | 1 + libavformat/allformats.c | 1 + libavformat/ttmlenc.c | 174 ++++++++++++++++++++++++++++++ libavformat/version.h | 2 +- tests/fate/subtitles.mak | 3 + tests/ref/fate/sub-ttmlenc | 122 +++++++++++++++++++++ 13 files changed, 546 insertions(+), 3 deletions(-) create mode 100644 libavcodec/ttmlenc.c create mode 100644 libavcodec/ttmlenc.h create mode 100644 libavformat/ttmlenc.c create mode 100644 tests/ref/fate/sub-ttmlenc diff --git a/Changelog b/Changelog index 9e7f67cc19..f0b2995444 100644 --- a/Changelog +++ b/Changelog @@ -78,6 +78,7 @@ version : - Simbiosis IMX decoder - Simbiosis IMX demuxer - Digital Pictures SGA demuxer and decoders +- TTML subtitle encoder and muxer version 4.3: diff --git a/doc/general_contents.texi b/doc/general_contents.texi index 6acdf441d6..58c9bcf747 100644 --- a/doc/general_contents.texi +++ b/doc/general_contents.texi @@ -1352,6 +1352,7 @@ performance on systems without hardware floating point support). 
@item SubViewer v1 @tab @tab X @tab @tab X @item SubViewer @tab @tab X @tab @tab X @item TED Talks captions @tab @tab X @tab @tab X +@item TTML @tab X @tab @tab X @tab @item VobSub (IDX+SUB) @tab @tab X @tab @tab X @item VPlayer @tab @tab X @tab @tab X @item WebVTT @tab X @tab X @tab X @tab X diff --git a/libavcodec/Makefile b/libavcodec/Makefile index b7e456b59f..d1b1125a30 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -672,6 +672,7 @@ OBJS-$(CONFIG_TSCC_DECODER) += tscc.o msrledec.o OBJS-$(CONFIG_TSCC2_DECODER) += tscc2.o OBJS-$(CONFIG_TTA_DECODER) += tta.o ttadata.o ttadsp.o OBJS-$(CONFIG_TTA_ENCODER) += ttaenc.o ttaencdsp.o ttadata.o +OBJS-$(CONFIG_TTML_ENCODER) += ttmlenc.o ass_split.o OBJS-$(CONFIG_TWINVQ_DECODER) += twinvqdec.o twinvq.o metasound_data.o OBJS-$(CONFIG_TXD_DECODER) += txd.o OBJS-$(CONFIG_ULTI_DECODER) += ulti.o diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c index a04faead16..2e9a3581de 100644 --- a/libavcodec/allcodecs.c +++ b/libavcodec/allcodecs.c @@ -691,6 +691,7 @@ extern AVCodec ff_subviewer_decoder; extern AVCodec ff_subviewer1_decoder; extern AVCodec ff_text_encoder; extern AVCodec ff_text_decoder; +extern AVCodec ff_ttml_encoder; extern AVCodec ff_vplayer_decoder; extern AVCodec ff_webvtt_encoder; extern AVCodec ff_webvtt_decoder; diff --git a/libavcodec/ttmlenc.c b/libavcodec/ttmlenc.c new file mode 100644 index 0000000000..3972b4368c --- /dev/null +++ b/libavcodec/ttmlenc.c @@ -0,0 +1,210 @@ +/* + * TTML subtitle encoder + * Copyright (c) 2020 24i + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. 
+ * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/** + * @file + * TTML subtitle encoder + * @see https://www.w3.org/TR/ttml1/ + * @see https://www.w3.org/TR/ttml2/ + * @see https://www.w3.org/TR/ttml-imsc/rec + */ + +#include "avcodec.h" +#include "internal.h" +#include "libavutil/avstring.h" +#include "libavutil/bprint.h" +#include "libavutil/internal.h" +#include "ass_split.h" +#include "ass.h" +#include "ttmlenc.h" + +typedef struct { + AVCodecContext *avctx; + ASSSplitContext *ass_ctx; + AVBPrint buffer; +} TTMLContext; + +static void ttml_text_cb(void *priv, const char *text, int len) +{ + TTMLContext *s = priv; + AVBPrint cur_line = { 0 }; + AVBPrint *buffer = &s->buffer; + + av_bprint_init(&cur_line, len, AV_BPRINT_SIZE_UNLIMITED); + + av_bprint_append_data(&cur_line, text, len); + if (!av_bprint_is_complete(&cur_line)) { + av_log(s->avctx, AV_LOG_ERROR, + "Failed to move the current subtitle dialog to AVBPrint!\n"); + av_bprint_finalize(&cur_line, NULL); + return; + } + + + av_bprint_escape(buffer, cur_line.str, NULL, AV_ESCAPE_MODE_XML, + 0); + + av_bprint_finalize(&cur_line, NULL); +} + +static void ttml_new_line_cb(void *priv, int forced) +{ + TTMLContext *s = priv; + + av_bprintf(&s->buffer, "<br/>");
+} + +static const ASSCodesCallbacks ttml_callbacks = { + .text = ttml_text_cb, + .new_line = ttml_new_line_cb, +}; + +static int ttml_encode_frame(AVCodecContext *avctx, uint8_t *buf, + int bufsize, const AVSubtitle *sub) +{ + TTMLContext *s = avctx->priv_data; + ASSDialog *dialog; + int i; + + av_bprint_clear(&s->buffer); + + for (i=0; i<sub->num_rects; i++) { + const char *ass = sub->rects[i]->ass; + + if (sub->rects[i]->type != SUBTITLE_ASS) { + av_log(avctx, AV_LOG_ERROR, "Only SUBTITLE_ASS type supported.\n"); + return AVERROR(EINVAL); + } + +#if FF_API_ASS_TIMING + if (!strncmp(ass, "Dialogue: ", 10)) { + int num; + dialog = ff_ass_split_dialog(s->ass_ctx, ass, 0, &num); + + for (; dialog && num--; dialog++) { + int ret = ff_ass_split_override_codes(&ttml_callbacks, s, + dialog->text); + int log_level = (ret != AVERROR_INVALIDDATA || + avctx->err_recognition & AV_EF_EXPLODE) ? + AV_LOG_ERROR : AV_LOG_WARNING; + + if (ret < 0) { + av_log(avctx, log_level, + "Splitting received ASS dialog failed: %s\n", + av_err2str(ret)); + + if (log_level == AV_LOG_ERROR) + return ret; + } + } + } else { +#endif + dialog = ff_ass_split_dialog2(s->ass_ctx, ass); + if (!dialog) + return AVERROR(ENOMEM); + + { + int ret = ff_ass_split_override_codes(&ttml_callbacks, s, + dialog->text); + int log_level = (ret != AVERROR_INVALIDDATA || + avctx->err_recognition & AV_EF_EXPLODE) ? + AV_LOG_ERROR : AV_LOG_WARNING; + + if (ret < 0) { + av_log(avctx, log_level, + "Splitting received ASS dialog text %s failed: %s\n", + dialog->text, + av_err2str(ret)); + + if (log_level == AV_LOG_ERROR) { + ff_ass_free_dialog(&dialog); + return ret; + } + } + + ff_ass_free_dialog(&dialog); + } +#if FF_API_ASS_TIMING + } +#endif + } + + if (!av_bprint_is_complete(&s->buffer)) + return AVERROR(ENOMEM); + if (!s->buffer.len) + return 0; + + // force null-termination, so in case our destination buffer is + // too small, the return value is larger than bufsize minus null. 
+    if (av_strlcpy(buf, s->buffer.str, bufsize) > bufsize - 1) {
+        av_log(avctx, AV_LOG_ERROR, "Buffer too small for TTML event.\n");
+        return AVERROR_BUFFER_TOO_SMALL;
+    }
+
+    return s->buffer.len;
+}
+
+static av_cold int ttml_encode_close(AVCodecContext *avctx)
+{
+    TTMLContext *s = avctx->priv_data;
+
+    ff_ass_split_free(s->ass_ctx);
+
+    av_bprint_finalize(&s->buffer, NULL);
+
+    return 0;
+}
+
+static av_cold int ttml_encode_init(AVCodecContext *avctx)
+{
+    TTMLContext *s = avctx->priv_data;
+
+    s->avctx = avctx;
+
+    if (!(s->ass_ctx = ff_ass_split(avctx->subtitle_header))) {
+        return AVERROR_INVALIDDATA;
+    }
+
+    if (!(avctx->extradata = av_mallocz(TTMLENC_EXTRADATA_SIGNATURE_SIZE +
+                                        1 + AV_INPUT_BUFFER_PADDING_SIZE))) {
+        return AVERROR(ENOMEM);
+    }
+
+    avctx->extradata_size = TTMLENC_EXTRADATA_SIGNATURE_SIZE;
+    memcpy(avctx->extradata, TTMLENC_EXTRADATA_SIGNATURE,
+           TTMLENC_EXTRADATA_SIGNATURE_SIZE);
+
+    av_bprint_init(&s->buffer, 0, AV_BPRINT_SIZE_UNLIMITED);
+
+    return 0;
+}
+
+AVCodec ff_ttml_encoder = {
+    .name = "ttml",
+    .long_name = NULL_IF_CONFIG_SMALL("TTML subtitle"),
+    .type = AVMEDIA_TYPE_SUBTITLE,
+    .id = AV_CODEC_ID_TTML,
+    .priv_data_size = sizeof(TTMLContext),
+    .init = ttml_encode_init,
+    .encode_sub = ttml_encode_frame,
+    .close = ttml_encode_close,
+    .capabilities = FF_CODEC_CAP_INIT_CLEANUP,
+};
diff --git a/libavcodec/ttmlenc.h b/libavcodec/ttmlenc.h
new file mode 100644
index 0000000000..c1dd5ec990
--- /dev/null
+++ b/libavcodec/ttmlenc.h
@@ -0,0 +1,28 @@
+/*
+ * TTML subtitle encoder shared functionality
+ * Copyright (c) 2020 24i
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_TTMLENC_H
+#define AVCODEC_TTMLENC_H
+
+#define TTMLENC_EXTRADATA_SIGNATURE "lavc-ttmlenc"
+#define TTMLENC_EXTRADATA_SIGNATURE_SIZE (sizeof(TTMLENC_EXTRADATA_SIGNATURE) - 1)
+
+#endif /* AVCODEC_TTMLENC_H */
diff --git a/libavcodec/version.h b/libavcodec/version.h
index dd15ae341e..d7ccf9943e 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -28,8 +28,8 @@
 #include "libavutil/version.h"
 
 #define LIBAVCODEC_VERSION_MAJOR 58
-#define LIBAVCODEC_VERSION_MINOR 128
-#define LIBAVCODEC_VERSION_MICRO 101
+#define LIBAVCODEC_VERSION_MINOR 129
+#define LIBAVCODEC_VERSION_MICRO 100
 
 #define LIBAVCODEC_VERSION_INT AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
                                               LIBAVCODEC_VERSION_MINOR, \
diff --git a/libavformat/Makefile b/libavformat/Makefile
index 48b91ea4d0..0504f47f88 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -546,6 +546,7 @@ OBJS-$(CONFIG_TRUEHD_DEMUXER) += rawdec.o mlpdec.o
 OBJS-$(CONFIG_TRUEHD_MUXER) += rawenc.o
 OBJS-$(CONFIG_TTA_DEMUXER) += tta.o apetag.o img2.o
 OBJS-$(CONFIG_TTA_MUXER) += ttaenc.o apetag.o img2.o
+OBJS-$(CONFIG_TTML_MUXER) += ttmlenc.o
 OBJS-$(CONFIG_TTY_DEMUXER) += tty.o sauce.o
 OBJS-$(CONFIG_TY_DEMUXER) += ty.o
 OBJS-$(CONFIG_TXD_DEMUXER) += txd.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index ade247640c..a38fd1f583 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -445,6 +445,7 @@ extern AVInputFormat ff_truehd_demuxer;
 extern AVOutputFormat ff_truehd_muxer;
 extern AVInputFormat ff_tta_demuxer;
 extern AVOutputFormat ff_tta_muxer;
+extern AVOutputFormat ff_ttml_muxer;
 extern AVInputFormat ff_txd_demuxer;
 extern AVInputFormat ff_tty_demuxer;
 extern AVInputFormat ff_ty_demuxer;
diff --git a/libavformat/ttmlenc.c b/libavformat/ttmlenc.c
new file mode 100644
index 0000000000..940f8bbd4e
--- /dev/null
+++ b/libavformat/ttmlenc.c
@@ -0,0 +1,174 @@
+/*
+ * TTML subtitle muxer
+ * Copyright (c) 2020 24i
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * TTML subtitle muxer
+ * @see https://www.w3.org/TR/ttml1/
+ * @see https://www.w3.org/TR/ttml2/
+ * @see https://www.w3.org/TR/ttml-imsc/rec
+ */
+
+#include "avformat.h"
+#include "internal.h"
+#include "libavcodec/ttmlenc.h"
+#include "libavutil/internal.h"
+
+enum TTMLPacketType {
+    PACKET_TYPE_PARAGRAPH,
+    PACKET_TYPE_DOCUMENT,
+};
+
+typedef struct TTMLMuxContext {
+    enum TTMLPacketType input_type;
+    unsigned int document_written;
+} TTMLMuxContext;
+
+static const char ttml_header_text[] =
+"<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
+"<tt\n"
+"  xmlns=\"http://www.w3.org/ns/ttml\"\n"
+"  xmlns:ttm=\"http://www.w3.org/ns/ttml#metadata\"\n"
+"  xmlns:tts=\"http://www.w3.org/ns/ttml#styling\"\n"
+"  xml:lang=\"%s\">\n"
+"  <body>\n"
+"    <div>\n";
+
+static const char ttml_footer_text[] =
+"    </div>\n"
+"  </body>\n"
+"</tt>\n";
+
+static void ttml_write_time(AVIOContext *pb, const char tag[],
+                            int64_t millisec)
+{
+    int64_t sec, min, hour;
+    sec = millisec / 1000;
+    millisec -= 1000 * sec;
+    min = sec / 60;
+    sec -= 60 * min;
+    hour = min / 60;
+    min -= 60 * hour;
+
+    avio_printf(pb, "%s=\"%02"PRId64":%02"PRId64":%02"PRId64".%03"PRId64"\"",
+                tag, hour, min, sec, millisec);
+}
+
+static int ttml_write_header(AVFormatContext *ctx)
+{
+    TTMLMuxContext *ttml_ctx = ctx->priv_data;
+    ttml_ctx->document_written = 0;
+
+    if (ctx->nb_streams != 1 ||
+        ctx->streams[0]->codecpar->codec_id != AV_CODEC_ID_TTML) {
+        av_log(ctx, AV_LOG_ERROR, "Exactly one TTML stream is required!\n");
+        return AVERROR(EINVAL);
+    }
+
+    {
+        AVStream *st = ctx->streams[0];
+        AVIOContext *pb = ctx->pb;
+
+        AVDictionaryEntry *lang = av_dict_get(st->metadata, "language", NULL,
+                                              0);
+        const char *printed_lang = (lang && lang->value) ? lang->value : "";
+
+        // Not perfect, but decide whether the packet is a document or not
+        // by the existence of the lavc ttmlenc extradata.
+        ttml_ctx->input_type = (st->codecpar->extradata &&
+                                st->codecpar->extradata_size >= TTMLENC_EXTRADATA_SIGNATURE_SIZE &&
+                                !memcmp(st->codecpar->extradata,
+                                        TTMLENC_EXTRADATA_SIGNATURE,
+                                        TTMLENC_EXTRADATA_SIGNATURE_SIZE)) ?
+                               PACKET_TYPE_PARAGRAPH :
+                               PACKET_TYPE_DOCUMENT;
+
+        avpriv_set_pts_info(st, 64, 1, 1000);
+
+        if (ttml_ctx->input_type == PACKET_TYPE_PARAGRAPH)
+            avio_printf(pb, ttml_header_text, printed_lang);
+    }
+
+    return 0;
+}
+
+static int ttml_write_packet(AVFormatContext *ctx, AVPacket *pkt)
+{
+    TTMLMuxContext *ttml_ctx = ctx->priv_data;
+    AVIOContext *pb = ctx->pb;
+
+    switch (ttml_ctx->input_type) {
+    case PACKET_TYPE_PARAGRAPH:
+        // write out a paragraph element with the given contents.
+        avio_printf(pb, "      <p\n");
+        ttml_write_time(pb, "        begin", pkt->pts);
+        avio_w8(pb, '\n');
+        ttml_write_time(pb, "        end", pkt->pts + pkt->duration);
+        avio_printf(pb, ">");
+        avio_write(pb, pkt->data, pkt->size);
+        avio_printf(pb, "</p>\n");
+        break;
+    case PACKET_TYPE_DOCUMENT:
+        // dump the given document out as-is.
+        if (ttml_ctx->document_written) {
+            av_log(ctx, AV_LOG_ERROR,
+                   "Attempting to write multiple TTML documents into a "
+                   "single document! The XML specification forbids this "
+                   "as there has to be a single root tag.\n");
+            return AVERROR(EINVAL);
+        }
+        avio_write(pb, pkt->data, pkt->size);
+        ttml_ctx->document_written = 1;
+        break;
+    default:
+        av_log(ctx, AV_LOG_ERROR,
+               "Internal error: invalid TTML input packet type: %d!\n",
+               ttml_ctx->input_type);
+        return AVERROR_BUG;
+    }
+
+    return 0;
+}
+
+static int ttml_write_trailer(AVFormatContext *ctx)
+{
+    TTMLMuxContext *ttml_ctx = ctx->priv_data;
+    AVIOContext *pb = ctx->pb;
+
+    if (ttml_ctx->input_type == PACKET_TYPE_PARAGRAPH)
+        avio_printf(pb, ttml_footer_text);
+
+    return 0;
+}
+
+AVOutputFormat ff_ttml_muxer = {
+    .name = "ttml",
+    .long_name = NULL_IF_CONFIG_SMALL("TTML subtitle"),
+    .extensions = "ttml",
+    .mime_type = "text/ttml",
+    .priv_data_size = sizeof(TTMLMuxContext),
+    .flags = AVFMT_GLOBALHEADER | AVFMT_VARIABLE_FPS |
+             AVFMT_TS_NONSTRICT,
+    .subtitle_codec = AV_CODEC_ID_TTML,
+    .write_header = ttml_write_header,
+    .write_packet = ttml_write_packet,
+    .write_trailer = ttml_write_trailer,
+};
diff --git a/libavformat/version.h b/libavformat/version.h
index 6ce2135ee1..8d4715f31d 100644
--- a/libavformat/version.h
+++ b/libavformat/version.h
@@ -32,7 +32,7 @@
 // Major bumping may affect Ticket5467, 5421, 5451(compatibility with Chromium)
 // Also please add any ticket numbers that you believe might be affected here
 #define LIBAVFORMAT_VERSION_MAJOR 58
-#define LIBAVFORMAT_VERSION_MINOR 69
+#define LIBAVFORMAT_VERSION_MINOR 70
 #define LIBAVFORMAT_VERSION_MICRO 100
 
 #define LIBAVFORMAT_VERSION_INT AV_VERSION_INT(LIBAVFORMAT_VERSION_MAJOR, \
diff --git a/tests/fate/subtitles.mak b/tests/fate/subtitles.mak
index 6323d0f93d..ee65afe35b 100644
--- a/tests/fate/subtitles.mak
+++ b/tests/fate/subtitles.mak
@@ -106,6 +106,9 @@ fate-sub-scc: CMD = fmtstdout ass -ss 57 -i $(TARGET_SAMPLES)/sub/witch.scc
 FATE_SUBTITLES-$(call ALLYES, MPEGTS_DEMUXER DVBSUB_DECODER DVBSUB_ENCODER) += fate-sub-dvb
 fate-sub-dvb: CMD = framecrc -i $(TARGET_SAMPLES)/sub/dvbsubtest_filter.ts -map s:0 -c dvbsub
 
+FATE_SUBTITLES-$(call ALLYES, FILE_PROTOCOL PIPE_PROTOCOL SRT_DEMUXER SUBRIP_DECODER TTML_ENCODER TTML_MUXER) += fate-sub-ttmlenc
+fate-sub-ttmlenc: CMD = fmtstdout ttml -i $(TARGET_SAMPLES)/sub/SubRip_capability_tester.srt
+
 FATE_SUBTITLES-$(call ENCMUX, ASS, ASS) += $(FATE_SUBTITLES_ASS-yes)
 FATE_SUBTITLES += $(FATE_SUBTITLES-yes)
 
diff --git a/tests/ref/fate/sub-ttmlenc b/tests/ref/fate/sub-ttmlenc
new file mode 100644
index 0000000000..51eab97817
--- /dev/null
+++ b/tests/ref/fate/sub-ttmlenc
@@ -0,0 +1,122 @@
+
+
+
+

Don't show this text it may be used to insert hidden data

+

SubRip subtitles capability tester 1.3o by ale5000
Use VLC 1.1 or higher as reference for most things and MPC Home Cinema for others
This text should be blue
This text should be red
This text should be black
If you see this with the normal font, the player don't (fully) support font face

+

Hidden

+

This text should be small
This text should be normal
This text should be big

+

This should be an E with an accent: È
日本語
This text should be bold, italics and underline
This text should be small and green
This text should be small and red
This text should be big and brown

+

This line should be bold
This line should be italics
This line should be underline
This line should be strikethrough
Both lines
should be underline

+

>
It would be a good thing to
hide invalid html tags that are closed and show the text in them
but show un-closed invalid html tags
Show not opened tags
<

+

and also
hide invalid html tags with parameters that are closed and show the text in them
but show un-closed invalid html tags
This text should be showed underlined without problems also: 2<3,5>1,4<6
This shouldn't be underlined

+

This text should be in the normal position...

+

This text should NOT be in the normal position

+

Implementation is the same of the ASS tag
This text should be at the
top and horizontally centered

+

This text should be at the
middle and horizontally centered

+

This text should be at the
bottom and horizontally centered

+

This text should be at the
top and horizontally at the left

+

This text should be at the
middle and horizontally at the left
(The second position must be ignored)

+

This text should be at the
bottom and horizontally at the left

+

This text should be at the
top and horizontally at the right

+

This text should be at the
middle and horizontally at the right

+

This text should be at the
bottom and horizontally at the right

+

This could be the most difficult thing to implement

+

First text

+

Second, it shouldn't overlap first

+

Third, it should replace second

+

Fourth, it shouldn't overlap first and third

+

Fifth, it should replace third

+

Sixth, it shouldn't be
showed overlapped

+

TEXT 1 (bottom)

+

text 2

+

Hide these tags:
also hide these tags:
but show this: {normal text}

+


\ N is a forced line break
\ h is a hard space
Normal spaces at the start and at the end of the line are trimmed while hard spaces are not trimmed.
The\hline\hwill\hnever\hbreak\hautomatically\hright\hbefore\hor\hafter\ha\hhard\hspace.\h:-D

+


\h\h\h\h\hA (05 hard spaces followed by a letter)
A (Normal spaces followed by a letter)
A (No hard spaces followed by a letter)

+

\h\h\h\h\hA (05 hard spaces followed by a letter)
A (Normal spaces followed by a letter)
A (No hard spaces followed by a letter)
Show this: \TEST and this: \-)

+


A letter followed by 05 hard spaces: A\h\h\h\h\h
A letter followed by normal spaces: A
A letter followed by no hard spaces: A
05 hard spaces between letters: A\h\h\h\h\hA
5 normal spaces between letters: A A

^--Forced line break

+

Both line should be strikethrough,
yes.
Correctly closed tags
should be hidden.

+

It shouldn't be strikethrough,
not opened tag showed as text.
Not opened tag showed as text.

+

Three lines should be strikethrough,
yes.
Not closed tags showed as text

+

Both line should be strikethrough but
the wrong closing tag should be showed

+
+ +
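
Illustration only, not part of the patch: the begin/end attributes written by ttml_write_time() come from splitting a millisecond timestamp into hours, minutes, seconds and milliseconds. The sketch below repeats that arithmetic in a standalone helper so it can be checked without an AVIOContext. The name format_ttml_time() and the snprintf()-based output buffer are assumptions made for this example; the muxer itself writes through avio_printf().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not in the patch): writes tag="HH:MM:SS.mmm" into
 * out, using the same arithmetic as ttml_write_time() in the muxer.
 * Returns whatever snprintf() returns. */
static int format_ttml_time(char *out, size_t out_size, const char *tag,
                            int64_t millisec)
{
    int64_t sec, min, hour;

    sec       = millisec / 1000;   /* whole seconds */
    millisec -= 1000 * sec;        /* leftover milliseconds */
    min       = sec / 60;          /* whole minutes */
    sec      -= 60 * min;          /* leftover seconds */
    hour      = min / 60;          /* whole hours */
    min      -= 60 * hour;         /* leftover minutes */

    return snprintf(out, out_size,
                    "%s=\"%02" PRId64 ":%02" PRId64 ":%02" PRId64
                    ".%03" PRId64 "\"",
                    tag, hour, min, sec, millisec);
}
```

For instance, calling format_ttml_time(buf, sizeof(buf), "begin", 3723004) fills buf with begin="01:02:03.004", since 3723004 ms is 1 h, 2 min, 3 s and 4 ms. The hour field is not wrapped, so timestamps past 99 hours simply print more digits.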