From patchwork Thu Mar 4 17:48:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= X-Patchwork-Id: 26097 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 1C80844A146 for ; Thu, 4 Mar 2021 19:48:41 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id ED88868ABDA; Thu, 4 Mar 2021 19:48:40 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f47.google.com (mail-lf1-f47.google.com [209.85.167.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 9590D68ABA6 for ; Thu, 4 Mar 2021 19:48:34 +0200 (EET) Received: by mail-lf1-f47.google.com with SMTP id z11so44576366lfb.9 for ; Thu, 04 Mar 2021 09:48:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=ryjMOi7xYtO0792+k3SYlteq7LNAki4JMPJzoFbnK6A=; b=p8a2A6ONLRiBN5JKiAVvgMFwwsarx2CPawTju3vNg5D88IUt1+dSmXNBSMtTDHv9ci Qd7gereWHq9kGWm47o6dzln4iTE+IDO5BdiH6r+C6uvoFXojfElEVZYHw4Hu4ICtHW1Y GQ1lwJjDl05PY0GCvgXsax4LhDkoSQeBvayBW7MJmQBSsEFuIX2ZZZO3e39AvnTMGfv+ 1jeIn1gCiCwcSQOFIt1CByyF6804uK59ly1/vxh/FtqFWrJYXiuRy6sfvp/wICu7I6Ra rbZZIclWAmgoBWmbAP2vIFrhrsD4K+BFyAqLbb2xV3/SDVqwkuIcf7F5wDrAw9r8M7d4 acEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ryjMOi7xYtO0792+k3SYlteq7LNAki4JMPJzoFbnK6A=; b=PrXDb2ghmDdRqp6VD0xZKI5LzLQ/+V3k0O7nymifOxKwlNYf7rGXr77OvkYyLHD9Mh brzSe2cSFb6sOuyEkNiboev4gwE1wDQ7Kt6+VMXAxpgvHQ7TC9bESJvL/HvQMT8G/3qa 
Gj9mOsYoIwz4ryRrFYSfBVJj9RqHG8Frgs66t0OyWD36AJn/PKkfwH60ZUo2Oo0L/ebx vIk2YJBxTD8G/kEzKWtUd2AAgAUwwz6wBLMub2S66ww2gs+4fq93H5aKOjqn11f6leA0 K1AmTkqzikFh6B55mR/pj9xxaCmpqxkf52PksG1qbT1uIFGoF6AJnszeDu16vdEXbhhi WkRA== X-Gm-Message-State: AOAM5328mfv0pXXgA97EJVZQ/eeOoyo1bl3ozmWqjlWbmtXPGKy6onYm PaklwG6ipvZqdXzeIlcmaUKccg3/8Fg= X-Google-Smtp-Source: ABdhPJyQ4MuwJHzF/AiICHp2up6GCXVo1+OS+Fy7SNJREnDSHebP2MGsZE4pkYlogAA7VRXFEnB2nQ== X-Received: by 2002:ac2:5ed0:: with SMTP id d16mr2907818lfq.569.1614880113923; Thu, 04 Mar 2021 09:48:33 -0800 (PST) Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. [91.159.194.103]) by smtp.gmail.com with ESMTPSA id n25sm9549lfe.86.2021.03.04.09.48.33 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Mar 2021 09:48:33 -0800 (PST) From: =?utf-8?q?Jan_Ekstr=C3=B6m?= To: ffmpeg-devel@ffmpeg.org Date: Thu, 4 Mar 2021 19:48:26 +0200 Message-Id: <20210304174830.53798-2-jeebjp@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210304174830.53798-1-jeebjp@gmail.com> References: <20210304174830.53798-1-jeebjp@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v7 1/5] avutil/{avstring, bprint}: add XML escaping from ffprobe to avutil X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Stefano Sabatini Base escaping only escapes values required for base character data according to part 2.4 of XML, and if additional flags are added single and double quotes can additionally be escaped in order to handle single and double quoted attributes. 
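The escaping rules described above can be sketched as a small standalone C function. This is only an illustration that does not link against libavutil: the helper name `xml_escape_sketch`, its buffer-based signature, and the `XML_*` macro names are assumptions made for this example (the flag values copy the `(1 << 2)` and `(1 << 3)` bits from the diff). In the patch itself the work is done by `av_bprint_escape()` writing into an `AVBPrint`.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for AV_ESCAPE_FLAG_XML_SINGLE_QUOTES / _DOUBLE_QUOTES.
 * Hypothetical names; only the bit positions are taken from the patch. */
#define XML_SINGLE_QUOTES (1 << 2)
#define XML_DOUBLE_QUOTES (1 << 3)

/* Hypothetical helper: escape src into dst (dst_size bytes, terminator
 * included). Returns the length the complete output needs, so a return
 * value >= dst_size means the output was cut short. */
static size_t xml_escape_sketch(char *dst, size_t dst_size,
                                const char *src, unsigned flags)
{
    size_t need = 0; /* bytes the full escaped string requires */
    size_t used = 0; /* bytes actually written to dst */
    int full = 0;    /* set once a replacement no longer fits */

    for (; *src; src++) {
        const char *rep = NULL;
        char one[2] = { *src, '\0' };

        switch (*src) {
        /* Base escaping: always applied. */
        case '&':  rep = "&amp;"; break;
        case '<':  rep = "&lt;";  break;
        case '>':  rep = "&gt;";  break;
        /* Quotes are escaped only when the matching flag is set. */
        case '\'': if (flags & XML_SINGLE_QUOTES) rep = "&apos;"; break;
        case '"':  if (flags & XML_DOUBLE_QUOTES) rep = "&quot;"; break;
        }
        if (!rep)
            rep = one; /* pass every other byte through unchanged */

        size_t n = strlen(rep);
        if (!full && used + n < dst_size) {
            memcpy(dst + used, rep, n);
            used += n;
        } else {
            full = 1;
        }
        need += n;
    }
    if (dst_size)
        dst[used] = '\0';
    return need;
}
```

With no flags, quotes pass through untouched, which suits element content; a caller that writes a double-quoted attribute would pass `XML_DOUBLE_QUOTES` so that `"` becomes `&quot;`.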
Co-authored-by: Jan Ekström Signed-off-by: Jan Ekström --- libavutil/avstring.h | 14 ++++++++++++++ libavutil/bprint.c | 29 +++++++++++++++++++++++++++++ libavutil/version.h | 2 +- tools/ffescape.c | 7 +++++-- 4 files changed, 49 insertions(+), 3 deletions(-) diff --git a/libavutil/avstring.h b/libavutil/avstring.h index ee225585b3..fae446c302 100644 --- a/libavutil/avstring.h +++ b/libavutil/avstring.h @@ -324,6 +324,7 @@ enum AVEscapeMode { AV_ESCAPE_MODE_AUTO, ///< Use auto-selected escaping mode. AV_ESCAPE_MODE_BACKSLASH, ///< Use backslash escaping. AV_ESCAPE_MODE_QUOTE, ///< Use single-quote escaping. + AV_ESCAPE_MODE_XML, ///< Use XML non-markup character data escaping. }; /** @@ -343,6 +344,19 @@ enum AVEscapeMode { */ #define AV_ESCAPE_FLAG_STRICT (1 << 1) +/** + * Within AV_ESCAPE_MODE_XML, additionally escape single quotes for single + * quoted attributes. + */ +#define AV_ESCAPE_FLAG_XML_SINGLE_QUOTES (1 << 2) + +/** + * Within AV_ESCAPE_MODE_XML, additionally escape double quotes for double + * quoted attributes. + */ +#define AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES (1 << 3) + + /** * Escape string in src, and put the escaped string in an allocated * string in *dst, which must be freed with av_free(). diff --git a/libavutil/bprint.c b/libavutil/bprint.c index 2f059c5ba6..e12fb263fe 100644 --- a/libavutil/bprint.c +++ b/libavutil/bprint.c @@ -283,6 +283,35 @@ void av_bprint_escape(AVBPrint *dstbuf, const char *src, const char *special_cha av_bprint_chars(dstbuf, '\'', 1); break; + case AV_ESCAPE_MODE_XML: + /* escape XML non-markup character data as per 2.4 by default: */ + /* [^<&]* - ([^<&]* ']]>' [^<&]*) */ + + /* additionally, given one of the AV_ESCAPE_FLAG_XML_* flags, */ + /* escape those specific characters as required. 
*/ + for (; *src; src++) { + switch (*src) { + case '&' : av_bprintf(dstbuf, "%s", "&amp;"); break; + case '<' : av_bprintf(dstbuf, "%s", "&lt;"); break; + case '>' : av_bprintf(dstbuf, "%s", "&gt;"); break; + case '\'': + if (!(flags & AV_ESCAPE_FLAG_XML_SINGLE_QUOTES)) + goto XML_DEFAULT_HANDLING; + + av_bprintf(dstbuf, "%s", "&apos;"); + break; + case '"' : + if (!(flags & AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES)) + goto XML_DEFAULT_HANDLING; + + av_bprintf(dstbuf, "%s", "&quot;"); + break; +XML_DEFAULT_HANDLING: + default: av_bprint_chars(dstbuf, *src, 1); + } + } + break; + /* case AV_ESCAPE_MODE_BACKSLASH or unknown mode */ default: /* \-escape characters */ diff --git a/libavutil/version.h b/libavutil/version.h index b7c5892a37..356c54d633 100644 --- a/libavutil/version.h +++ b/libavutil/version.h @@ -79,7 +79,7 @@ */ #define LIBAVUTIL_VERSION_MAJOR 56 -#define LIBAVUTIL_VERSION_MINOR 66 +#define LIBAVUTIL_VERSION_MINOR 67 #define LIBAVUTIL_VERSION_MICRO 100 #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \ diff --git a/tools/ffescape.c b/tools/ffescape.c index 0530d28c6d..1ed8daa801 100644 --- a/tools/ffescape.c +++ b/tools/ffescape.c @@ -78,8 +78,10 @@ int main(int argc, char **argv) infilename = optarg; break; case 'f': - if (!strcmp(optarg, "whitespace")) escape_flags |= AV_ESCAPE_FLAG_WHITESPACE; - else if (!strcmp(optarg, "strict")) escape_flags |= AV_ESCAPE_FLAG_STRICT; + if (!strcmp(optarg, "whitespace")) escape_flags |= AV_ESCAPE_FLAG_WHITESPACE; + else if (!strcmp(optarg, "strict")) escape_flags |= AV_ESCAPE_FLAG_STRICT; + else if (!strcmp(optarg, "xml_single_quotes")) escape_flags |= AV_ESCAPE_FLAG_XML_SINGLE_QUOTES; + else if (!strcmp(optarg, "xml_double_quotes")) escape_flags |= AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES; else { av_log(NULL, AV_LOG_ERROR, "Invalid value '%s' for option -f, " @@ -104,6 +106,7 @@ int main(int argc, char **argv) if (!strcmp(optarg, "auto")) escape_mode = AV_ESCAPE_MODE_AUTO; else if (!strcmp(optarg, "backslash")) escape_mode = 
AV_ESCAPE_MODE_BACKSLASH; else if (!strcmp(optarg, "quote")) escape_mode = AV_ESCAPE_MODE_QUOTE; + else if (!strcmp(optarg, "xml")) escape_mode = AV_ESCAPE_MODE_XML; else { av_log(NULL, AV_LOG_ERROR, "Invalid value '%s' for option -m, " From patchwork Thu Mar 4 17:48:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= X-Patchwork-Id: 26098 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 25D2044A146 for ; Thu, 4 Mar 2021 19:48:42 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 0B97D68ABFE; Thu, 4 Mar 2021 19:48:42 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lj1-f177.google.com (mail-lj1-f177.google.com [209.85.208.177]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 5D7A268ABDA for ; Thu, 4 Mar 2021 19:48:35 +0200 (EET) Received: by mail-lj1-f177.google.com with SMTP id a17so34430452ljq.2 for ; Thu, 04 Mar 2021 09:48:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=CO757h3pud/O4zoxvoiGCGcSzpz6g83NKGO9D9oiwVo=; b=m6jbKhsbZ+yQMmc3vdM467sgaFrj/aGNSGIjflvY2FCT1iMyplN62V3/8bDSFDL/00 McpOCO8+4AxL7oRApkOWkvRJPI0WPgFAPake0yD31EDKBoaIAXoNyE0KKsL9zc2BZt8J Z1CmS1aQWYIPdZgCSmvCGbCvIEJL1/zmpyZQbNlw9VaWn32x8+GUo1lmTOtbmZqhxsT4 cahLXLLPtdSpkJ9oJ0tnhmQf8IIIFb4S7MsrLItvF0sBIDfGxWyex2UeDoxjdA4r5oU/ bRnze/mCueXIUcwmYxLz5wZSnJ7KYVQhXijunN345iyv4Yl0jLQijFbSryt4AkPn7Ebi VPuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to 
:references:mime-version:content-transfer-encoding; bh=CO757h3pud/O4zoxvoiGCGcSzpz6g83NKGO9D9oiwVo=; b=QaHNfUGxzUbN0/r8Bj3aHi1yV4A+3QSi//kH3qWaKg0tvPGZpReX0B1oIwFrEhlhb6 gkeMj5qhkP70i+qTtmOVEkVPlW47Gqh1L3MVgrFlF9x/qE6pwJCCiOMy7KNrof3O5jLG CZ/LyV6TwXCofHVUo7v4AUifgYUzCCiWhf5zl0xUMZQxAgbg/C6xXDH6RmZxjVhbMAVt 1YrCT4Marja7ZwUw6fj44BCs3VidsRHvGVteO1ye8ePfUAdYJ7LNZc4Q9AAYhcmo1oll twH2VEbX0q7sok4kQzjbMiy4g6lhE3xoabayC2X8aKQ+SRomJOAXnLDrcZqTRJZBaLIY j4Bw== X-Gm-Message-State: AOAM531ZWcZAS+5zAfQ7HUsDuwKAJ/Hdg0qF6+SnfWb3iC3s15jTQPAQ YrUzZbOiRJdJer7kPmeNAABgPN4b1Xk= X-Google-Smtp-Source: ABdhPJzIi6MsD/GevD26msuON07KEDPlUL03SSfoOBgQRDRPlTLzcKZGRZ19I0UpYDo/X8GnNhIdXA== X-Received: by 2002:a05:651c:387:: with SMTP id e7mr2919789ljp.425.1614880114620; Thu, 04 Mar 2021 09:48:34 -0800 (PST) Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. [91.159.194.103]) by smtp.gmail.com with ESMTPSA id n25sm9549lfe.86.2021.03.04.09.48.34 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Mar 2021 09:48:34 -0800 (PST) From: =?utf-8?q?Jan_Ekstr=C3=B6m?= To: ffmpeg-devel@ffmpeg.org Date: Thu, 4 Mar 2021 19:48:27 +0200 Message-Id: <20210304174830.53798-3-jeebjp@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210304174830.53798-1-jeebjp@gmail.com> References: <20210304174830.53798-1-jeebjp@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v7 2/5] ffprobe: switch to av_bprint_escape for XML escaping X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Jan Ekström Additionally update the result of the ffprobe XML writing test. 
Signed-off-by: Jan Ekström --- fftools/ffprobe.c | 32 +++++++++++--------------------- tests/ref/fate/ffprobe_xml | 2 +- 2 files changed, 12 insertions(+), 22 deletions(-) diff --git a/fftools/ffprobe.c b/fftools/ffprobe.c index 740e759958..1eb9d88b5e 100644 --- a/fftools/ffprobe.c +++ b/fftools/ffprobe.c @@ -1672,24 +1672,6 @@ static av_cold int xml_init(WriterContext *wctx) return 0; } -static const char *xml_escape_str(AVBPrint *dst, const char *src, void *log_ctx) -{ - const char *p; - - for (p = src; *p; p++) { - switch (*p) { - case '&' : av_bprintf(dst, "%s", "&amp;"); break; - case '<' : av_bprintf(dst, "%s", "&lt;"); break; - case '>' : av_bprintf(dst, "%s", "&gt;"); break; - case '"' : av_bprintf(dst, "%s", "&quot;"); break; - case '\'': av_bprintf(dst, "%s", "&apos;"); break; - default: av_bprint_chars(dst, *p, 1); - } - } - - return dst->str; -} - #define XML_INDENT() printf("%*c", xml->indent_level * 4, ' ') static void xml_print_section_header(WriterContext *wctx) @@ -1761,14 +1743,22 @@ static void xml_print_str(WriterContext *wctx, const char *key, const char *valu if (section->flags & SECTION_FLAG_HAS_VARIABLE_FIELDS) { XML_INDENT(); + av_bprint_escape(&buf, key, NULL, + AV_ESCAPE_MODE_XML, AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES); printf("<%s key=\"%s\"", - section->element_name, xml_escape_str(&buf, key, wctx)); + section->element_name, buf.str); av_bprint_clear(&buf); - printf(" value=\"%s\"/>\n", xml_escape_str(&buf, value, wctx)); + + av_bprint_escape(&buf, value, NULL, + AV_ESCAPE_MODE_XML, AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES); + printf(" value=\"%s\"/>\n", buf.str); } else { if (wctx->nb_item[wctx->level]) printf(" "); - printf("%s=\"%s\"", key, xml_escape_str(&buf, value, wctx)); + + av_bprint_escape(&buf, value, NULL, + AV_ESCAPE_MODE_XML, AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES); + printf("%s=\"%s\"", key, buf.str); } av_bprint_finalize(&buf, NULL); diff --git a/tests/ref/fate/ffprobe_xml b/tests/ref/fate/ffprobe_xml index 1e99158021..04261ed693 100644 --- 
a/tests/ref/fate/ffprobe_xml +++ b/tests/ref/fate/ffprobe_xml @@ -51,7 +51,7 @@ - + From patchwork Thu Mar 4 17:48:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= X-Patchwork-Id: 26099 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 50C9944A146 for ; Thu, 4 Mar 2021 19:48:43 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 386BC68ABFD; Thu, 4 Mar 2021 19:48:43 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f47.google.com (mail-lf1-f47.google.com [209.85.167.47]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 0425768ABE8 for ; Thu, 4 Mar 2021 19:48:36 +0200 (EET) Received: by mail-lf1-f47.google.com with SMTP id d3so44611881lfg.10 for ; Thu, 04 Mar 2021 09:48:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=CQp8btkAHCDSdYPo7pbPxV6QRlUvLcWsrjHPxewOXnw=; b=uk1HyBjWTk9eyvK07GWPPmfVvcfJlITygXDYZ2UK6LSngGaRNvN9osPDqSpT0686Yz 3nBFcqOAImbL6CMAhP2L/jtoxrg17uk68sQyPOnDZj69T7/aIXuJPHXi4eaHEpuBrTol eePEdKgHY8YG+QWlop8ydSGIdJB7Z44IfpEokaTniUO0qVvRSXqPccCcRUbOM4TDSmQS rgsNwuUlFHW0+IQtQJg6v2hd4Usr5ByPOV2aIcWeUNbpsjZ13PapETH902upIkH2c5XI Wd8FnVmY6NHqH8UewNOgCDR+L2YRo0P+hMElajgwzHTmRDLbwMJlBiPj4/p5AWCDSbFo TCzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=CQp8btkAHCDSdYPo7pbPxV6QRlUvLcWsrjHPxewOXnw=; b=lq4e0eFmo8un4fSh7EaoWrLPVNVrLsKciiMEJfD5QGzCyToZu8N+O+Y20YefrF/ldg 
kQy1BmKlhVqrVInGEkRNxDBU7txNfPuoo5OFdqFsIdRRXDNRt4qgnyVDUvhtbAGpIxQI DA/kqNO/r3QzxKAskBGkgCTWf/O9HkEUxlQsbBWvGnc0m034s2VQ3qOQPqT00M5+V+5g d/V5ktcvQFev5c9vvW9MkKuSCRFOUxsv42NcvP6/w2fg2/W60pkxMJa2Ifaa782AoVnN gZQEm9CwddSfQVNQrBuzp1FGyNxYRkNSyWp1QqJJscxzNFo4z5KlbLQaWP+hKg4AewU2 PcUg== X-Gm-Message-State: AOAM533gh2ynpo0ltsWVcIpXN29NBawm0JHRyO2Hp/ujnCDGOGFSAa3w cJbmQyu4nLPb+IEx6TXxy3jvRk/ioak= X-Google-Smtp-Source: ABdhPJy+MesNfrfdSFmRzXzIdYODu6yDEkCWdKnUsG/GQ8YSyjFbt3vS7K5vMrVLlq5v/ZIKMakVsA== X-Received: by 2002:a19:5213:: with SMTP id m19mr2897257lfb.203.1614880115369; Thu, 04 Mar 2021 09:48:35 -0800 (PST) Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. [91.159.194.103]) by smtp.gmail.com with ESMTPSA id n25sm9549lfe.86.2021.03.04.09.48.34 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Mar 2021 09:48:34 -0800 (PST) From: =?utf-8?q?Jan_Ekstr=C3=B6m?= To: ffmpeg-devel@ffmpeg.org Date: Thu, 4 Mar 2021 19:48:28 +0200 Message-Id: <20210304174830.53798-4-jeebjp@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210304174830.53798-1-jeebjp@gmail.com> References: <20210304174830.53798-1-jeebjp@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v7 3/5] avcodec: enable usage of err_recognition for encoders X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Jan Ekström Enables the usage of such values as AV_EF_EXPLODE in encoders, which can be useful in cases such as subtitle encoders where they have the responsibility to validate the correctness of an incoming ASS dialog line. 
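As a rough illustration of how an encoder might act on this field, the sketch below promotes an "invalid data" failure from a warning to a hard error only when the explode bit is set, and treats every other failure as fatal regardless. It is standalone C with stand-in constants: `EF_EXPLODE`, `ERR_INVALIDDATA`, `classify_failure` and `handle_result` are hypothetical names for this example and are not FFmpeg API. Real code would test `avctx->err_recognition & AV_EF_EXPLODE` and compare against `AVERROR_INVALIDDATA`.

```c
#include <errno.h>

/* Stand-in constants (assumptions for this sketch, not the FFmpeg values'
 * authoritative definitions). */
enum { EF_EXPLODE = 1 << 3 };      /* plays the role of AV_EF_EXPLODE */
enum { ERR_INVALIDDATA = -1000 };  /* plays the role of AVERROR_INVALIDDATA */
enum severity { SEV_WARNING, SEV_ERROR };

/* Decide how serious a failed validation step is. Invalid input is only a
 * warning unless the user asked to "explode" on errors; any other failure
 * (for example out of memory) is always an error. */
static enum severity classify_failure(int ret, int err_recognition)
{
    if (ret != ERR_INVALIDDATA || (err_recognition & EF_EXPLODE))
        return SEV_ERROR;
    return SEV_WARNING;
}

/* Returns 0 when encoding may continue, or the negative error code when
 * the caller should abort. */
static int handle_result(int ret, int err_recognition)
{
    if (ret >= 0)
        return 0;
    return classify_failure(ret, err_recognition) == SEV_ERROR ? ret : 0;
}
```

The design choice here is that the flag changes only how tolerable malformed input is; resource failures stay fatal whether or not the flag is set.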
Signed-off-by: Jan Ekström --- doc/APIchanges | 3 +++ libavcodec/avcodec.h | 2 +- libavcodec/options_table.h | 18 +++++++++--------- libavcodec/version.h | 2 +- 4 files changed, 14 insertions(+), 11 deletions(-) diff --git a/doc/APIchanges b/doc/APIchanges index a003abf7ca..4027d599e7 100644 --- a/doc/APIchanges +++ b/doc/APIchanges @@ -15,6 +15,9 @@ libavutil: 2017-10-21 API changes, most recent first: +2021-03-04 - xxxxxxxxxx - lavc 58.128.101 - avcodec.h + Enable err_recognition to be set for encoders. + 2021-03-03 - xxxxxxxxxx - lavf 58.70.100 - avformat.h Deprecate AVFMT_FLAG_PRIV_OPT. It will do nothing as soon as av_demuxer_open() is removed. diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h index cd6e6d19bc..ecc665677a 100644 --- a/libavcodec/avcodec.h +++ b/libavcodec/avcodec.h @@ -1634,7 +1634,7 @@ typedef struct AVCodecContext { /** * Error recognition; may misdetect some more or less valid parts as errors. - * - encoding: unused + * - encoding: Set by user. * - decoding: Set by user. 
*/ int err_recognition; diff --git a/libavcodec/options_table.h b/libavcodec/options_table.h index ded9de4d67..e12159f734 100644 --- a/libavcodec/options_table.h +++ b/libavcodec/options_table.h @@ -140,15 +140,15 @@ static const AVOption avcodec_options[] = { {"unofficial", "allow unofficial extensions", 0, AV_OPT_TYPE_CONST, {.i64 = FF_COMPLIANCE_UNOFFICIAL }, INT_MIN, INT_MAX, A|V|D|E, "strict"}, {"experimental", "allow non-standardized experimental things", 0, AV_OPT_TYPE_CONST, {.i64 = FF_COMPLIANCE_EXPERIMENTAL }, INT_MIN, INT_MAX, A|V|D|E, "strict"}, {"b_qoffset", "QP offset between P- and B-frames", OFFSET(b_quant_offset), AV_OPT_TYPE_FLOAT, {.dbl = 1.25 }, -FLT_MAX, FLT_MAX, V|E}, -{"err_detect", "set error detection flags", OFFSET(err_recognition), AV_OPT_TYPE_FLAGS, {.i64 = 0 }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"crccheck", "verify embedded CRCs", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_CRCCHECK }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"bitstream", "detect bitstream specification deviations", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_BITSTREAM }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"buffer", "detect improper bitstream length", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_BUFFER }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"explode", "abort decoding on minor error detection", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_EXPLODE }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"ignore_err", "ignore errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_IGNORE_ERR }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"careful", "consider things that violate the spec, are fast to check and have not been seen in the wild as errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_CAREFUL }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"compliant", "consider all spec non compliancies as errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_COMPLIANT | AV_EF_CAREFUL }, INT_MIN, INT_MAX, A|V|D, "err_detect"}, -{"aggressive", "consider things that a sane encoder should not do as an error", 0, AV_OPT_TYPE_CONST, {.i64 = 
AV_EF_AGGRESSIVE | AV_EF_COMPLIANT | AV_EF_CAREFUL}, INT_MIN, INT_MAX, A|V|D, "err_detect"}, +{"err_detect", "set error detection flags", OFFSET(err_recognition), AV_OPT_TYPE_FLAGS, {.i64 = 0 }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"crccheck", "verify embedded CRCs", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_CRCCHECK }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"bitstream", "detect bitstream specification deviations", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_BITSTREAM }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"buffer", "detect improper bitstream length", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_BUFFER }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"explode", "abort decoding on minor error detection", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_EXPLODE }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"ignore_err", "ignore errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_IGNORE_ERR }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"careful", "consider things that violate the spec, are fast to check and have not been seen in the wild as errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_CAREFUL }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"compliant", "consider all spec non compliancies as errors", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_COMPLIANT | AV_EF_CAREFUL }, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, +{"aggressive", "consider things that a sane encoder should not do as an error", 0, AV_OPT_TYPE_CONST, {.i64 = AV_EF_AGGRESSIVE | AV_EF_COMPLIANT | AV_EF_CAREFUL}, INT_MIN, INT_MAX, A|V|S|D|E, "err_detect"}, {"has_b_frames", NULL, OFFSET(has_b_frames), AV_OPT_TYPE_INT, {.i64 = DEFAULT }, 0, INT_MAX}, {"block_align", NULL, OFFSET(block_align), AV_OPT_TYPE_INT, {.i64 = DEFAULT }, 0, INT_MAX}, #if FF_API_PRIVATE_OPT diff --git a/libavcodec/version.h b/libavcodec/version.h index 2b3757fa07..dd15ae341e 100644 --- a/libavcodec/version.h +++ b/libavcodec/version.h @@ -29,7 +29,7 @@ #define LIBAVCODEC_VERSION_MAJOR 58 #define LIBAVCODEC_VERSION_MINOR 128 -#define 
LIBAVCODEC_VERSION_MICRO 100 +#define LIBAVCODEC_VERSION_MICRO 101 #define LIBAVCODEC_VERSION_INT AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \ LIBAVCODEC_VERSION_MINOR, \ From patchwork Thu Mar 4 17:48:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?= X-Patchwork-Id: 26100 Return-Path: X-Original-To: patchwork@ffaux-bg.ffmpeg.org Delivered-To: patchwork@ffaux-bg.ffmpeg.org Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100]) by ffaux.localdomain (Postfix) with ESMTP id 39C3A44A146 for ; Thu, 4 Mar 2021 19:48:44 +0200 (EET) Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 2538E68AC04; Thu, 4 Mar 2021 19:48:44 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-lf1-f51.google.com (mail-lf1-f51.google.com [209.85.167.51]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id D682F68ABEE for ; Thu, 4 Mar 2021 19:48:36 +0200 (EET) Received: by mail-lf1-f51.google.com with SMTP id v9so26842129lfa.1 for ; Thu, 04 Mar 2021 09:48:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=9ZSIJnsjCnl+H4i3kLV7RcnHFL9FsjdfrydZ7kOk2gU=; b=bXBZgM2yagS/h5hjNjIzvzMoFum9Y2zgAL4O15y9uyZEyxPTuYuXyx6zgWZQhECrbQ VeO6Lcf3eo0z6ByNkf3Wfb39pI6ioewa0GEy75N1XPXUbBj3ke0crSw/J9525LTfpD8x adseWHOdN1qLWnq+auz9QuaLogD/YGSngFphTpKajoTD25OuKT1PrDLD2pJtpCG1XyF8 7ekuc+Gzvybzbm4aK6ZJIHXrSD7tjc61TY5DYBn7eWsez3x36w/RRVLmfS++RRToVYQl wyjMIKLrLK73PgnDhjJe0LSGK+ivEaywVauNTuZanMG+rA0CUzynJY5triQiCPCtkJo1 PW7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=9ZSIJnsjCnl+H4i3kLV7RcnHFL9FsjdfrydZ7kOk2gU=; b=bPkz0FcETy5hqUxklNcDQE5oLJRUqUyUDMfiuMIEZ+ZIpTMG1VuNu4JH68BYF0pxya 3q6nqpqZR1nuLj7p3Cj45whCWCo5GqA2NhkaKbBe5J766LAcIY6HKRSAXXz9w38kGqmg jwSNe0JXts72T7mBjPwgQjoUn+F6cBZesRTEAO8eWMmNXipT4nTiWdH/K8QM2LBR40B+ iYHKUFOzWFka3de5DEuO+iVbDHQTsdDyYPq+o/b3rKbCDzZ5PVVT8SBY2jQE3Hgh7Rei i+19svCpRySDTFRG1su94/l3R+5hUnDN0ewcxhb4R61OPC+fmrgwA+KBpkVNCnvuDuUg v4Rg== X-Gm-Message-State: AOAM531v9EOaAt9LqJHoipcPtY9cCYNDEwS+9r9L8mo6STotBbfyb72k QvlPIJxWCp0O5zorzBuicH3rV1qJ5TA= X-Google-Smtp-Source: ABdhPJxxNTVzn46HW7OipkN5F8yNcdqbcaXWDEarF6Yv2dMthNv0vBVBAna3YDlC66Bhv4DJSbjulg== X-Received: by 2002:ac2:50da:: with SMTP id h26mr2813483lfm.33.1614880116099; Thu, 04 Mar 2021 09:48:36 -0800 (PST) Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. [91.159.194.103]) by smtp.gmail.com with ESMTPSA id n25sm9549lfe.86.2021.03.04.09.48.35 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Mar 2021 09:48:35 -0800 (PST) From: =?utf-8?q?Jan_Ekstr=C3=B6m?= To: ffmpeg-devel@ffmpeg.org Date: Thu, 4 Mar 2021 19:48:29 +0200 Message-Id: <20210304174830.53798-5-jeebjp@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210304174830.53798-1-jeebjp@gmail.com> References: <20210304174830.53798-1-jeebjp@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH v7 4/5] avcodec: add TTML encoder X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" From: Jan Ekström Enables encoding of other subtitle formats into TTML paragraphs. 
Signed-off-by: Jan Ekström --- Changelog | 1 + doc/general_contents.texi | 1 + libavcodec/Makefile | 1 + libavcodec/allcodecs.c | 1 + libavcodec/ttmlenc.c | 210 ++++++++++++++++++++++++++++++++++++++ libavcodec/ttmlenc.h | 28 +++++ libavcodec/version.h | 4 +- 7 files changed, 244 insertions(+), 2 deletions(-) create mode 100644 libavcodec/ttmlenc.c create mode 100644 libavcodec/ttmlenc.h diff --git a/Changelog b/Changelog index 9e7f67cc19..43b6abb82b 100644 --- a/Changelog +++ b/Changelog @@ -78,6 +78,7 @@ version : - Simbiosis IMX decoder - Simbiosis IMX demuxer - Digital Pictures SGA demuxer and decoders +- TTML subtitle encoder version 4.3: diff --git a/doc/general_contents.texi b/doc/general_contents.texi index 6acdf441d6..ac02f33c6f 100644 --- a/doc/general_contents.texi +++ b/doc/general_contents.texi @@ -1352,6 +1352,7 @@ performance on systems without hardware floating point support). @item SubViewer v1 @tab @tab X @tab @tab X @item SubViewer @tab @tab X @tab @tab X @item TED Talks captions @tab @tab X @tab @tab X +@item TTML @tab @tab @tab X @tab @item VobSub (IDX+SUB) @tab @tab X @tab @tab X @item VPlayer @tab @tab X @tab @tab X @item WebVTT @tab X @tab X @tab X @tab X diff --git a/libavcodec/Makefile b/libavcodec/Makefile index b7e456b59f..d1b1125a30 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -672,6 +672,7 @@ OBJS-$(CONFIG_TSCC_DECODER) += tscc.o msrledec.o OBJS-$(CONFIG_TSCC2_DECODER) += tscc2.o OBJS-$(CONFIG_TTA_DECODER) += tta.o ttadata.o ttadsp.o OBJS-$(CONFIG_TTA_ENCODER) += ttaenc.o ttaencdsp.o ttadata.o +OBJS-$(CONFIG_TTML_ENCODER) += ttmlenc.o ass_split.o OBJS-$(CONFIG_TWINVQ_DECODER) += twinvqdec.o twinvq.o metasound_data.o OBJS-$(CONFIG_TXD_DECODER) += txd.o OBJS-$(CONFIG_ULTI_DECODER) += ulti.o diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c index a04faead16..2e9a3581de 100644 --- a/libavcodec/allcodecs.c +++ b/libavcodec/allcodecs.c @@ -691,6 +691,7 @@ extern AVCodec ff_subviewer_decoder; extern AVCodec 
ff_subviewer1_decoder; extern AVCodec ff_text_encoder; extern AVCodec ff_text_decoder; +extern AVCodec ff_ttml_encoder; extern AVCodec ff_vplayer_decoder; extern AVCodec ff_webvtt_encoder; extern AVCodec ff_webvtt_decoder; diff --git a/libavcodec/ttmlenc.c b/libavcodec/ttmlenc.c new file mode 100644 index 0000000000..3972b4368c --- /dev/null +++ b/libavcodec/ttmlenc.c @@ -0,0 +1,210 @@ +/* + * TTML subtitle encoder + * Copyright (c) 2020 24i + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/** + * @file + * TTML subtitle encoder + * @see https://www.w3.org/TR/ttml1/ + * @see https://www.w3.org/TR/ttml2/ + * @see https://www.w3.org/TR/ttml-imsc/rec + */ + +#include "avcodec.h" +#include "internal.h" +#include "libavutil/avstring.h" +#include "libavutil/bprint.h" +#include "libavutil/internal.h" +#include "ass_split.h" +#include "ass.h" +#include "ttmlenc.h" + +typedef struct { + AVCodecContext *avctx; + ASSSplitContext *ass_ctx; + AVBPrint buffer; +} TTMLContext; + +static void ttml_text_cb(void *priv, const char *text, int len) +{ + TTMLContext *s = priv; + AVBPrint cur_line = { 0 }; + AVBPrint *buffer = &s->buffer; + + av_bprint_init(&cur_line, len, AV_BPRINT_SIZE_UNLIMITED); + + av_bprint_append_data(&cur_line, text, len); + if 
(!av_bprint_is_complete(&cur_line)) { + av_log(s->avctx, AV_LOG_ERROR, + "Failed to move the current subtitle dialog to AVBPrint!\n"); + av_bprint_finalize(&cur_line, NULL); + return; + } + + + av_bprint_escape(buffer, cur_line.str, NULL, AV_ESCAPE_MODE_XML, + 0); + + av_bprint_finalize(&cur_line, NULL); +} + +static void ttml_new_line_cb(void *priv, int forced) +{ + TTMLContext *s = priv; + + av_bprintf(&s->buffer, "<br/>"); +} + +static const ASSCodesCallbacks ttml_callbacks = { + .text = ttml_text_cb, + .new_line = ttml_new_line_cb, +}; + +static int ttml_encode_frame(AVCodecContext *avctx, uint8_t *buf, + int bufsize, const AVSubtitle *sub) +{ + TTMLContext *s = avctx->priv_data; + ASSDialog *dialog; + int i; + + av_bprint_clear(&s->buffer); + + for (i=0; i<sub->num_rects; i++) { + const char *ass = sub->rects[i]->ass; + + if (sub->rects[i]->type != SUBTITLE_ASS) { + av_log(avctx, AV_LOG_ERROR, "Only SUBTITLE_ASS type supported.\n"); + return AVERROR(EINVAL); + } + +#if FF_API_ASS_TIMING + if (!strncmp(ass, "Dialogue: ", 10)) { + int num; + dialog = ff_ass_split_dialog(s->ass_ctx, ass, 0, &num); + + for (; dialog && num--; dialog++) { + int ret = ff_ass_split_override_codes(&ttml_callbacks, s, + dialog->text); + int log_level = (ret != AVERROR_INVALIDDATA || + avctx->err_recognition & AV_EF_EXPLODE) ? + AV_LOG_ERROR : AV_LOG_WARNING; + + if (ret < 0) { + av_log(avctx, log_level, + "Splitting received ASS dialog failed: %s\n", + av_err2str(ret)); + + if (log_level == AV_LOG_ERROR) + return ret; + } + } + } else { +#endif + dialog = ff_ass_split_dialog2(s->ass_ctx, ass); + if (!dialog) + return AVERROR(ENOMEM); + + { + int ret = ff_ass_split_override_codes(&ttml_callbacks, s, + dialog->text); + int log_level = (ret != AVERROR_INVALIDDATA || + avctx->err_recognition & AV_EF_EXPLODE) ? + AV_LOG_ERROR : AV_LOG_WARNING; + + if (ret < 0) { + av_log(avctx, log_level, + "Splitting received ASS dialog text %s failed: %s\n", + dialog->text, + av_err2str(ret)); + + if (log_level == AV_LOG_ERROR) { + ff_ass_free_dialog(&dialog); + return ret; + } + } + + ff_ass_free_dialog(&dialog); + } +#if FF_API_ASS_TIMING + } +#endif + } + + if (!av_bprint_is_complete(&s->buffer)) + return AVERROR(ENOMEM); + if (!s->buffer.len) + return 0; + + // force null-termination, so in case our destination buffer is + // too small, the return value is larger than bufsize minus null. 
+    if (av_strlcpy(buf, s->buffer.str, bufsize) > bufsize - 1) {
+        av_log(avctx, AV_LOG_ERROR, "Buffer too small for TTML event.\n");
+        return AVERROR_BUFFER_TOO_SMALL;
+    }
+
+    return s->buffer.len;
+}
+
+static av_cold int ttml_encode_close(AVCodecContext *avctx)
+{
+    TTMLContext *s = avctx->priv_data;
+
+    ff_ass_split_free(s->ass_ctx);
+
+    av_bprint_finalize(&s->buffer, NULL);
+
+    return 0;
+}
+
+static av_cold int ttml_encode_init(AVCodecContext *avctx)
+{
+    TTMLContext *s = avctx->priv_data;
+
+    s->avctx = avctx;
+
+    if (!(s->ass_ctx = ff_ass_split(avctx->subtitle_header))) {
+        return AVERROR_INVALIDDATA;
+    }
+
+    if (!(avctx->extradata = av_mallocz(TTMLENC_EXTRADATA_SIGNATURE_SIZE +
+                                        1 + AV_INPUT_BUFFER_PADDING_SIZE))) {
+        return AVERROR(ENOMEM);
+    }
+
+    avctx->extradata_size = TTMLENC_EXTRADATA_SIGNATURE_SIZE;
+    memcpy(avctx->extradata, TTMLENC_EXTRADATA_SIGNATURE,
+           TTMLENC_EXTRADATA_SIGNATURE_SIZE);
+
+    av_bprint_init(&s->buffer, 0, AV_BPRINT_SIZE_UNLIMITED);
+
+    return 0;
+}
+
+AVCodec ff_ttml_encoder = {
+    .name           = "ttml",
+    .long_name      = NULL_IF_CONFIG_SMALL("TTML subtitle"),
+    .type           = AVMEDIA_TYPE_SUBTITLE,
+    .id             = AV_CODEC_ID_TTML,
+    .priv_data_size = sizeof(TTMLContext),
+    .init           = ttml_encode_init,
+    .encode_sub     = ttml_encode_frame,
+    .close          = ttml_encode_close,
+    .capabilities   = FF_CODEC_CAP_INIT_CLEANUP,
+};
diff --git a/libavcodec/ttmlenc.h b/libavcodec/ttmlenc.h
new file mode 100644
index 0000000000..c1dd5ec990
--- /dev/null
+++ b/libavcodec/ttmlenc.h
@@ -0,0 +1,28 @@
+/*
+ * TTML subtitle encoder shared functionality
+ * Copyright (c) 2020 24i
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_TTMLENC_H
+#define AVCODEC_TTMLENC_H
+
+#define TTMLENC_EXTRADATA_SIGNATURE "lavc-ttmlenc"
+#define TTMLENC_EXTRADATA_SIGNATURE_SIZE (sizeof(TTMLENC_EXTRADATA_SIGNATURE) - 1)
+
+#endif /* AVCODEC_TTMLENC_H */
diff --git a/libavcodec/version.h b/libavcodec/version.h
index dd15ae341e..d7ccf9943e 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -28,8 +28,8 @@
 #include "libavutil/version.h"

 #define LIBAVCODEC_VERSION_MAJOR  58
-#define LIBAVCODEC_VERSION_MINOR 128
-#define LIBAVCODEC_VERSION_MICRO 101
+#define LIBAVCODEC_VERSION_MINOR 129
+#define LIBAVCODEC_VERSION_MICRO 100

 #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
                                                LIBAVCODEC_VERSION_MINOR, \

From patchwork Thu Mar 4 17:48:30 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?=
X-Patchwork-Id: 26102
From: =?utf-8?q?Jan_Ekstr=C3=B6m?=
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 4 Mar 2021 19:48:30 +0200
Message-Id: <20210304174830.53798-6-jeebjp@gmail.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20210304174830.53798-1-jeebjp@gmail.com>
References: <20210304174830.53798-1-jeebjp@gmail.com>
Subject: [FFmpeg-devel] [PATCH v7 5/5] avformat: add TTML muxer

From: Jan Ekström

Enables writing TTML documents or encoded TTML paragraphs as such
documents.

Additionally, a test for the combined TTML encoder and muxer
has been added to validate that the components still work.
Signed-off-by: Jan Ekström
---
 Changelog                  |   2 +-
 doc/general_contents.texi  |   2 +-
 libavformat/Makefile       |   1 +
 libavformat/allformats.c   |   1 +
 libavformat/ttmlenc.c      | 174 +++++++++++++++++++++++++++++++++++++
 libavformat/version.h      |   2 +-
 tests/fate/subtitles.mak   |   3 +
 tests/ref/fate/sub-ttmlenc | 122 ++++++++++++++++++++++++++
 8 files changed, 304 insertions(+), 3 deletions(-)
 create mode 100644 libavformat/ttmlenc.c
 create mode 100644 tests/ref/fate/sub-ttmlenc

diff --git a/Changelog b/Changelog
index 43b6abb82b..f0b2995444 100644
--- a/Changelog
+++ b/Changelog
@@ -78,7 +78,7 @@ version <next>:
 - Simbiosis IMX decoder
 - Simbiosis IMX demuxer
 - Digital Pictures SGA demuxer and decoders
-- TTML subtitle encoder
+- TTML subtitle encoder and muxer


 version 4.3:
diff --git a/doc/general_contents.texi b/doc/general_contents.texi
index ac02f33c6f..58c9bcf747 100644
--- a/doc/general_contents.texi
+++ b/doc/general_contents.texi
@@ -1352,7 +1352,7 @@ performance on systems without hardware floating point support).
 @item SubViewer v1 @tab @tab X @tab @tab X
 @item SubViewer @tab @tab X @tab @tab X
 @item TED Talks captions @tab @tab X @tab @tab X
-@item TTML @tab @tab @tab X @tab
+@item TTML @tab X @tab @tab X @tab
 @item VobSub (IDX+SUB) @tab @tab X @tab @tab X
 @item VPlayer @tab @tab X @tab @tab X
 @item WebVTT @tab X @tab X @tab X @tab X
diff --git a/libavformat/Makefile b/libavformat/Makefile
index 48b91ea4d0..0504f47f88 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -546,6 +546,7 @@ OBJS-$(CONFIG_TRUEHD_DEMUXER) += rawdec.o mlpdec.o
 OBJS-$(CONFIG_TRUEHD_MUXER) += rawenc.o
 OBJS-$(CONFIG_TTA_DEMUXER) += tta.o apetag.o img2.o
 OBJS-$(CONFIG_TTA_MUXER) += ttaenc.o apetag.o img2.o
+OBJS-$(CONFIG_TTML_MUXER) += ttmlenc.o
 OBJS-$(CONFIG_TTY_DEMUXER) += tty.o sauce.o
 OBJS-$(CONFIG_TY_DEMUXER) += ty.o
 OBJS-$(CONFIG_TXD_DEMUXER) += txd.o
diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index ade247640c..a38fd1f583 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -445,6 +445,7 @@ extern AVInputFormat ff_truehd_demuxer;
 extern AVOutputFormat ff_truehd_muxer;
 extern AVInputFormat ff_tta_demuxer;
 extern AVOutputFormat ff_tta_muxer;
+extern AVOutputFormat ff_ttml_muxer;
 extern AVInputFormat ff_txd_demuxer;
 extern AVInputFormat ff_tty_demuxer;
 extern AVInputFormat ff_ty_demuxer;
diff --git a/libavformat/ttmlenc.c b/libavformat/ttmlenc.c
new file mode 100644
index 0000000000..940f8bbd4e
--- /dev/null
+++ b/libavformat/ttmlenc.c
@@ -0,0 +1,174 @@
+/*
+ * TTML subtitle muxer
+ * Copyright (c) 2020 24i
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * TTML subtitle muxer
+ * @see https://www.w3.org/TR/ttml1/
+ * @see https://www.w3.org/TR/ttml2/
+ * @see https://www.w3.org/TR/ttml-imsc/rec
+ */
+
+#include "avformat.h"
+#include "internal.h"
+#include "libavcodec/ttmlenc.h"
+#include "libavutil/internal.h"
+
+enum TTMLPacketType {
+    PACKET_TYPE_PARAGRAPH,
+    PACKET_TYPE_DOCUMENT,
+};
+
+typedef struct TTMLMuxContext {
+    enum TTMLPacketType input_type;
+    unsigned int document_written;
+} TTMLMuxContext;
+
+static const char ttml_header_text[] =
+"<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
+"<tt\n"
+"  xmlns=\"http://www.w3.org/ns/ttml\"\n"
+"  xmlns:ttm=\"http://www.w3.org/ns/ttml#metadata\"\n"
+"  xmlns:tts=\"http://www.w3.org/ns/ttml#styling\"\n"
+"  xml:lang=\"%s\">\n"
+"  <body>\n"
+"    <div>\n";
+
+static const char ttml_footer_text[] =
+"    </div>\n"
+"  </body>\n"
+"</tt>\n";
+
+static void ttml_write_time(AVIOContext *pb, const char tag[],
+                            int64_t millisec)
+{
+    int64_t sec, min, hour;
+    sec = millisec / 1000;
+    millisec -= 1000 * sec;
+    min = sec / 60;
+    sec -= 60 * min;
+    hour = min / 60;
+    min -= 60 * hour;
+
+    avio_printf(pb, "%s=\"%02"PRId64":%02"PRId64":%02"PRId64".%03"PRId64"\"",
+                tag, hour, min, sec, millisec);
+}
+
+static int ttml_write_header(AVFormatContext *ctx)
+{
+    TTMLMuxContext *ttml_ctx = ctx->priv_data;
+    ttml_ctx->document_written = 0;
+
+    if (ctx->nb_streams != 1 ||
+        ctx->streams[0]->codecpar->codec_id != AV_CODEC_ID_TTML) {
+        av_log(ctx, AV_LOG_ERROR, "Exactly one TTML stream is required!\n");
+        return AVERROR(EINVAL);
+    }
+
+    {
+        AVStream    *st = ctx->streams[0];
+        AVIOContext *pb = ctx->pb;
+
+        AVDictionaryEntry *lang = av_dict_get(st->metadata, "language", NULL,
+                                              0);
+        const char *printed_lang = (lang && lang->value) ? lang->value : "";
+
+        // Not perfect, but decide whether the packet is a document or not
+        // by the existence of the lavc ttmlenc extradata.
+        ttml_ctx->input_type = (st->codecpar->extradata &&
+                                st->codecpar->extradata_size >= TTMLENC_EXTRADATA_SIGNATURE_SIZE &&
+                                !memcmp(st->codecpar->extradata,
+                                        TTMLENC_EXTRADATA_SIGNATURE,
+                                        TTMLENC_EXTRADATA_SIGNATURE_SIZE)) ?
+                               PACKET_TYPE_PARAGRAPH :
+                               PACKET_TYPE_DOCUMENT;
+
+        avpriv_set_pts_info(st, 64, 1, 1000);
+
+        if (ttml_ctx->input_type == PACKET_TYPE_PARAGRAPH)
+            avio_printf(pb, ttml_header_text, printed_lang);
+    }
+
+    return 0;
+}
+
+static int ttml_write_packet(AVFormatContext *ctx, AVPacket *pkt)
+{
+    TTMLMuxContext *ttml_ctx = ctx->priv_data;
+    AVIOContext    *pb       = ctx->pb;
+
+    switch (ttml_ctx->input_type) {
+    case PACKET_TYPE_PARAGRAPH:
+        // write out a paragraph element with the given contents.
+        avio_printf(pb,     "      <p\n");
+        ttml_write_time(pb, "        begin", pkt->pts);
+        avio_w8(pb, '\n');
+        ttml_write_time(pb, "        end",   pkt->pts + pkt->duration);
+        avio_printf(pb, ">");
+        avio_write(pb, pkt->data, pkt->size);
+        avio_printf(pb, "</p>\n");
+        break;
+    case PACKET_TYPE_DOCUMENT:
+        // dump the given document out as-is.
+        if (ttml_ctx->document_written) {
+            av_log(ctx, AV_LOG_ERROR,
+                   "Attempting to write multiple TTML documents into a "
+                   "single document! The XML specification forbids this "
+                   "as there has to be a single root tag.\n");
+            return AVERROR(EINVAL);
+        }
+        avio_write(pb, pkt->data, pkt->size);
+        ttml_ctx->document_written = 1;
+        break;
+    default:
+        av_log(ctx, AV_LOG_ERROR,
+               "Internal error: invalid TTML input packet type: %d!\n",
+               ttml_ctx->input_type);
+        return AVERROR_BUG;
+    }
+
+    return 0;
+}
+
+static int ttml_write_trailer(AVFormatContext *ctx)
+{
+    TTMLMuxContext *ttml_ctx = ctx->priv_data;
+    AVIOContext    *pb       = ctx->pb;
+
+    if (ttml_ctx->input_type == PACKET_TYPE_PARAGRAPH)
+        avio_printf(pb, ttml_footer_text);
+
+    return 0;
+}
+
+AVOutputFormat ff_ttml_muxer = {
+    .name              = "ttml",
+    .long_name         = NULL_IF_CONFIG_SMALL("TTML subtitle"),
+    .extensions        = "ttml",
+    .mime_type         = "text/ttml",
+    .priv_data_size    = sizeof(TTMLMuxContext),
+    .flags             = AVFMT_GLOBALHEADER | AVFMT_VARIABLE_FPS |
+                         AVFMT_TS_NONSTRICT,
+    .subtitle_codec    = AV_CODEC_ID_TTML,
+    .write_header      = ttml_write_header,
+    .write_packet      = ttml_write_packet,
+    .write_trailer     = ttml_write_trailer,
+};
diff --git a/libavformat/version.h b/libavformat/version.h
index 3fae3d9645..a05676d979 100644
--- a/libavformat/version.h
+++ b/libavformat/version.h
@@ -32,7 +32,7 @@
 // Major bumping may affect Ticket5467, 5421, 5451(compatibility with Chromium)
 // Also please add any ticket numbers that you believe might be affected here
 #define LIBAVFORMAT_VERSION_MAJOR  58
-#define LIBAVFORMAT_VERSION_MINOR  70
+#define LIBAVFORMAT_VERSION_MINOR  71
 #define LIBAVFORMAT_VERSION_MICRO 100

 #define LIBAVFORMAT_VERSION_INT AV_VERSION_INT(LIBAVFORMAT_VERSION_MAJOR, \
diff --git a/tests/fate/subtitles.mak b/tests/fate/subtitles.mak
index 6323d0f93d..ee65afe35b 100644
--- a/tests/fate/subtitles.mak
+++ b/tests/fate/subtitles.mak
@@ -106,6 +106,9 @@ fate-sub-scc: CMD = fmtstdout ass -ss 57 -i $(TARGET_SAMPLES)/sub/witch.scc
 FATE_SUBTITLES-$(call ALLYES, MPEGTS_DEMUXER DVBSUB_DECODER DVBSUB_ENCODER) += fate-sub-dvb
 fate-sub-dvb: CMD = framecrc -i $(TARGET_SAMPLES)/sub/dvbsubtest_filter.ts -map s:0 -c dvbsub

+FATE_SUBTITLES-$(call ALLYES, FILE_PROTOCOL PIPE_PROTOCOL SRT_DEMUXER SUBRIP_DECODER TTML_ENCODER TTML_MUXER) += fate-sub-ttmlenc
+fate-sub-ttmlenc: CMD = fmtstdout ttml -i $(TARGET_SAMPLES)/sub/SubRip_capability_tester.srt
+
 FATE_SUBTITLES-$(call ENCMUX, ASS, ASS) += $(FATE_SUBTITLES_ASS-yes)
 FATE_SUBTITLES += $(FATE_SUBTITLES-yes)

diff --git a/tests/ref/fate/sub-ttmlenc b/tests/ref/fate/sub-ttmlenc
new file mode 100644
index 0000000000..51eab97817
--- /dev/null
+++ b/tests/ref/fate/sub-ttmlenc
@@ -0,0 +1,122 @@
+ + + +
+

Don't show this text it may be used to insert hidden data

+

SubRip subtitles capability tester 1.3o by ale5000
Use VLC 1.1 or higher as reference for most things and MPC Home Cinema for others
This text should be blue
This text should be red
This text should be black
If you see this with the normal font, the player don't (fully) support font face

+

Hidden

+

This text should be small
This text should be normal
This text should be big

+

This should be an E with an accent: È
日本語
This text should be bold, italics and underline
This text should be small and green
This text should be small and red
This text should be big and brown

+

This line should be bold
This line should be italics
This line should be underline
This line should be strikethrough
Both lines
should be underline

+

>
It would be a good thing to
hide invalid html tags that are closed and show the text in them
but show un-closed invalid html tags
Show not opened tags
<

+

and also
hide invalid html tags with parameters that are closed and show the text in them
but show un-closed invalid html tags
This text should be showed underlined without problems also: 2<3,5>1,4<6
This shouldn't be underlined

+

This text should be in the normal position...

+

This text should NOT be in the normal position

+

Implementation is the same of the ASS tag
This text should be at the
top and horizontally centered

+

This text should be at the
middle and horizontally centered

+

This text should be at the
bottom and horizontally centered

+

This text should be at the
top and horizontally at the left

+

This text should be at the
middle and horizontally at the left
(The second position must be ignored)

+

This text should be at the
bottom and horizontally at the left

+

This text should be at the
top and horizontally at the right

+

This text should be at the
middle and horizontally at the right

+

This text should be at the
bottom and horizontally at the right

+

This could be the most difficult thing to implement

+

First text

+

Second, it shouldn't overlap first

+

Third, it should replace second

+

Fourth, it shouldn't overlap first and third

+

Fifth, it should replace third

+

Sixth, it shouldn't be
showed overlapped

+

TEXT 1 (bottom)

+

text 2

+

Hide these tags:
also hide these tags:
but show this: {normal text}

+


\ N is a forced line break
\ h is a hard space
Normal spaces at the start and at the end of the line are trimmed while hard spaces are not trimmed.
The\hline\hwill\hnever\hbreak\hautomatically\hright\hbefore\hor\hafter\ha\hhard\hspace.\h:-D

+


\h\h\h\h\hA (05 hard spaces followed by a letter)
A (Normal spaces followed by a letter)
A (No hard spaces followed by a letter)

+

\h\h\h\h\hA (05 hard spaces followed by a letter)
A (Normal spaces followed by a letter)
A (No hard spaces followed by a letter)
Show this: \TEST and this: \-)

+


A letter followed by 05 hard spaces: A\h\h\h\h\h
A letter followed by normal spaces: A
A letter followed by no hard spaces: A
05 hard spaces between letters: A\h\h\h\h\hA
5 normal spaces between letters: A A

^--Forced line break

+

Both line should be strikethrough,
yes.
Correctly closed tags
should be hidden.

+

It shouldn't be strikethrough,
not opened tag showed as text.
Not opened tag showed as text.

+

Three lines should be strikethrough,
yes.
Not closed tags showed as text

+

Both line should be strikethrough but
the wrong closing tag should be showed

+
+ +
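
Note on the timestamp arithmetic used by ttml_write_time() in the muxer:
the function splits a millisecond timestamp into hours, minutes, seconds
and a millisecond remainder, then prints them as HH:MM:SS.mmm. The sketch
below reproduces only that arithmetic in standalone C so it can be checked
outside FFmpeg. The name format_ttml_clock() and the caller-supplied buffer
are assumptions made for illustration and are not part of the patch, which
writes to an AVIOContext through avio_printf() instead. Negative timestamps
are not handled specially here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): format a millisecond
 * timestamp as HH:MM:SS.mmm into dst, using the same decomposition as
 * ttml_write_time(). Returns what snprintf() returns, i.e. the number
 * of characters the full string needs, excluding the terminating NUL.
 */
int format_ttml_clock(char *dst, size_t dst_size, int64_t millisec)
{
    int64_t sec, min, hour;

    assert(dst != NULL && dst_size > 0);

    sec       = millisec / 1000;   /* whole seconds               */
    millisec -= 1000 * sec;        /* leftover milliseconds       */
    min       = sec / 60;          /* whole minutes               */
    sec      -= 60 * min;          /* leftover seconds            */
    hour      = min / 60;          /* whole hours                 */
    min      -= 60 * hour;         /* leftover minutes            */

    return snprintf(dst, dst_size,
                    "%02" PRId64 ":%02" PRId64 ":%02" PRId64 ".%03" PRId64,
                    hour, min, sec, millisec);
}
```

For example, a timestamp of 3723004 ms (1 h, 2 min, 3 s, 4 ms) should come
out as 01:02:03.004. Because the muxer sets the stream time base to 1/1000
with avpriv_set_pts_info(), pkt->pts and pkt->pts + pkt->duration can be
passed to this kind of routine directly as milliseconds.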