From patchwork Tue Mar 2 09:00:37 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Jan_Ekstr=C3=B6m?=
X-Patchwork-Id: 26052
Return-Path:
X-Original-To: patchwork@ffaux-bg.ffmpeg.org
Delivered-To: patchwork@ffaux-bg.ffmpeg.org
Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org [79.124.17.100])
 by ffaux.localdomain (Postfix) with ESMTP id CB50D44AFB3
 for ; Tue, 2 Mar 2021 11:06:46 +0200 (EET)
Received: from [127.0.1.1] (localhost [127.0.0.1])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id B527968AB94;
 Tue, 2 Mar 2021 11:06:46 +0200 (EET)
X-Original-To: ffmpeg-devel@ffmpeg.org
Delivered-To: ffmpeg-devel@ffmpeg.org
Received: from mail-lf1-f43.google.com (mail-lf1-f43.google.com [209.85.167.43])
 by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 1A3BD680397
 for ; Tue, 2 Mar 2021 11:06:40 +0200 (EET)
Received: by mail-lf1-f43.google.com with SMTP id d3so30093335lfg.10
 for ; Tue, 02 Mar 2021 01:06:40 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:subject:date:message-id:in-reply-to:references:mime-version
 :content-transfer-encoding;
 bh=b/faBFZf2ymJthz3qtsDEoa2Q1AQIyZcoY/PDk0PSSM=;
 b=pshvjHkx7T/LlHLEkZ3slJlVBemdCYQUNYEk/oqBhYIqCz70PKrlr1CCc0KKiD57As
 aJjvSt1ooAhGO6avK4mP4mGQYJpq19peWVn+CVvz5Wg5r4ZUs3Tt9cfgPj4btUUgZDaq
 TXoYivVEaAsbOdwiWpNGXqb2PKORbigiG7k1bV4kaCnOf9vsgQtxHLw8P1aeqB2ZqImW
 gF5Rn6EZJcNH3uvMCkhxzHNNEbkl8P/yZfTNg/46zyJDX5WcwjJ18IEBXMkrN+gj9/xk
 I/iVbLzP4rwqLXbj5OJgBdqYolOIgD0WShoRkB5/fGyOP3xn/Y+m2Pc4xFtG/NoKU1bD
 IDZQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net;
 s=20161025;
 h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=b/faBFZf2ymJthz3qtsDEoa2Q1AQIyZcoY/PDk0PSSM=;
 b=K0/Fp1v+22yBzqEpEWaxsQxByNvgWfokoa9ZAOldXq04rdD+DE868U9VN3LxYdoMco
 /K+sxEgBhdE6IEG5dClEpooHv+AgpF90MoSgyKi9p+cK4ipsjaKkU9cN3X7AUWd4vsqi
 wH6nYuQjTA0tMsB3tWhuLZdtiG4PTjN+ak2kajcxqBBW3hpdMWZ1KskhmSVx+suuSC5T
 Dve4tFtln7JEV/Qs0kDCRupaJOteJFUk0coE2HJkwedG22nOGp32dKlrbl7/nBvLxKQK
 y+CavtN84eSHMvlSH4TWmWRWg1aj7YIrjbQ6+Kde9jGO1djKtxE5evp/xihr4l9ouTpE
 J40g==
X-Gm-Message-State: AOAM530Gy+/Y7lOU6zWsvgNw/MRtbOBtHCjIaeyUn62Go+3Bx8e1Fd7T
 FcBcXtII6SR5ZouVy3JFKOSe/MUKIHA=
X-Google-Smtp-Source: ABdhPJxGA19XhVf+DKFyM5H2XzqSN6RgXN1bjGnsbfvkCTzHsH83Cj9BmejxtuI+ViyaO07TGLT8Ew==
X-Received: by 2002:a2e:9047:: with SMTP id n7mr5136670ljg.291.1614675644506;
 Tue, 02 Mar 2021 01:00:44 -0800 (PST)
Received: from localhost.localdomain (91-159-194-103.elisa-laajakaista.fi. [91.159.194.103])
 by smtp.gmail.com with ESMTPSA id u9sm1791626ljj.0.2021.03.02.01.00.43
 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 02 Mar 2021 01:00:44 -0800 (PST)
From: =?utf-8?q?Jan_Ekstr=C3=B6m?=
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 2 Mar 2021 11:00:37 +0200
Message-Id: <20210302090040.10484-2-jeebjp@gmail.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20210302090040.10484-1-jeebjp@gmail.com>
References: <20210302090040.10484-1-jeebjp@gmail.com>
MIME-Version: 1.0
Subject: [FFmpeg-devel] [PATCH v6 1/4] avutil/{avstring, bprint}: add XML
 escaping from ffprobe to avutil
X-BeenThere: ffmpeg-devel@ffmpeg.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: FFmpeg development discussions and patches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: FFmpeg development discussions and patches
Errors-To: ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

From: Stefano Sabatini

Base escaping only escapes values required for base character data
according to part 2.4 of XML, and if additional flags are added
single and double quotes can additionally be escaped in order
to handle single and double quoted attributes.

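For illustration only, and not part of this patch: a minimal standalone
sketch of the escaping rules described above. The function name
xml_escape(), the fixed-size output buffer, and the XML_SINGLE_QUOTES /
XML_DOUBLE_QUOTES macros are hypothetical stand-ins chosen so that the
example compiles without libavutil; the patch itself applies the same
rules inside av_bprint_escape() on an AVBPrint buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the AV_ESCAPE_FLAG_XML_* flags (assumption). */
#define XML_SINGLE_QUOTES (1 << 2)
#define XML_DOUBLE_QUOTES (1 << 3)

/*
 * Escape src as XML character data into dst, which holds dst_size bytes
 * including the terminating NUL. '&', '<' and '>' are always escaped;
 * quotes are escaped only when the matching flag is set.
 * Returns the number of bytes written (excluding the NUL), or -1 if the
 * result does not fit.
 */
static int xml_escape(char *dst, size_t dst_size, const char *src, int flags)
{
    size_t pos = 0;

    assert(dst && src);

    for (; *src; src++) {
        char one[2] = { *src, '\0' };   /* default: copy the byte unchanged */
        const char *rep = one;
        size_t len;

        switch (*src) {
        case '&':  rep = "&amp;"; break;
        case '<':  rep = "&lt;";  break;
        case '>':  rep = "&gt;";  break;
        case '\'': if (flags & XML_SINGLE_QUOTES) rep = "&apos;"; break;
        case '"':  if (flags & XML_DOUBLE_QUOTES) rep = "&quot;"; break;
        }

        len = strlen(rep);
        if (pos + len >= dst_size)      /* keep one byte for the NUL */
            return -1;
        memcpy(dst + pos, rep, len);
        pos += len;
    }

    if (pos >= dst_size)                /* only reachable when dst_size == 0 */
        return -1;
    dst[pos] = '\0';
    return (int)pos;
}
```

With flags set to 0, quotes pass through unchanged, which is enough for
element content. When the result is placed inside a single- or
double-quoted attribute value, the corresponding flag has to be set so
that the quote character cannot terminate the attribute early.
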
Co-Authored-By: Jan Ekström
Signed-off-by: Jan Ekström
---
 libavutil/avstring.h | 14 ++++++++++++++
 libavutil/bprint.c   | 29 +++++++++++++++++++++++++++++
 libavutil/version.h  |  2 +-
 tools/ffescape.c     |  7 +++++--
 4 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/libavutil/avstring.h b/libavutil/avstring.h
index ee225585b3..fae446c302 100644
--- a/libavutil/avstring.h
+++ b/libavutil/avstring.h
@@ -324,6 +324,7 @@ enum AVEscapeMode {
     AV_ESCAPE_MODE_AUTO,      ///< Use auto-selected escaping mode.
     AV_ESCAPE_MODE_BACKSLASH, ///< Use backslash escaping.
     AV_ESCAPE_MODE_QUOTE,     ///< Use single-quote escaping.
+    AV_ESCAPE_MODE_XML,       ///< Use XML non-markup character data escaping.
 };
 
 /**
@@ -343,6 +344,19 @@ enum AVEscapeMode {
  */
 #define AV_ESCAPE_FLAG_STRICT (1 << 1)
 
+/**
+ * Within AV_ESCAPE_MODE_XML, additionally escape single quotes for single
+ * quoted attributes.
+ */
+#define AV_ESCAPE_FLAG_XML_SINGLE_QUOTES (1 << 2)
+
+/**
+ * Within AV_ESCAPE_MODE_XML, additionally escape double quotes for double
+ * quoted attributes.
+ */
+#define AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES (1 << 3)
+
+
 /**
  * Escape string in src, and put the escaped string in an allocated
  * string in *dst, which must be freed with av_free().
diff --git a/libavutil/bprint.c b/libavutil/bprint.c
index 2f059c5ba6..e12fb263fe 100644
--- a/libavutil/bprint.c
+++ b/libavutil/bprint.c
@@ -283,6 +283,35 @@ void av_bprint_escape(AVBPrint *dstbuf, const char *src, const char *special_cha
         av_bprint_chars(dstbuf, '\'', 1);
         break;
 
+    case AV_ESCAPE_MODE_XML:
+        /* escape XML non-markup character data as per 2.4 by default: */
+        /* [^<&]* - ([^<&]* ']]>' [^<&]*) */
+
+        /* additionally, given one of the AV_ESCAPE_FLAG_XML_* flags, */
+        /* escape those specific characters as required. */
+        for (; *src; src++) {
+            switch (*src) {
+            case '&' : av_bprintf(dstbuf, "%s", "&amp;");  break;
+            case '<' : av_bprintf(dstbuf, "%s", "&lt;");   break;
+            case '>' : av_bprintf(dstbuf, "%s", "&gt;");   break;
+            case '\'':
+                if (!(flags & AV_ESCAPE_FLAG_XML_SINGLE_QUOTES))
+                    goto XML_DEFAULT_HANDLING;
+
+                av_bprintf(dstbuf, "%s", "&apos;");
+                break;
+            case '"' :
+                if (!(flags & AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES))
+                    goto XML_DEFAULT_HANDLING;
+
+                av_bprintf(dstbuf, "%s", "&quot;");
+                break;
+XML_DEFAULT_HANDLING:
+            default: av_bprint_chars(dstbuf, *src, 1);
+            }
+        }
+        break;
+
     /* case AV_ESCAPE_MODE_BACKSLASH or unknown mode */
     default:
         /* \-escape characters */
diff --git a/libavutil/version.h b/libavutil/version.h
index b7c5892a37..356c54d633 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -79,7 +79,7 @@
  */
 
 #define LIBAVUTIL_VERSION_MAJOR  56
-#define LIBAVUTIL_VERSION_MINOR  66
+#define LIBAVUTIL_VERSION_MINOR  67
 #define LIBAVUTIL_VERSION_MICRO 100
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
diff --git a/tools/ffescape.c b/tools/ffescape.c
index 0530d28c6d..1ed8daa801 100644
--- a/tools/ffescape.c
+++ b/tools/ffescape.c
@@ -78,8 +78,10 @@ int main(int argc, char **argv)
             infilename = optarg;
             break;
         case 'f':
-            if      (!strcmp(optarg, "whitespace")) escape_flags |= AV_ESCAPE_FLAG_WHITESPACE;
-            else if (!strcmp(optarg, "strict"))     escape_flags |= AV_ESCAPE_FLAG_STRICT;
+            if      (!strcmp(optarg, "whitespace"))        escape_flags |= AV_ESCAPE_FLAG_WHITESPACE;
+            else if (!strcmp(optarg, "strict"))            escape_flags |= AV_ESCAPE_FLAG_STRICT;
+            else if (!strcmp(optarg, "xml_single_quotes")) escape_flags |= AV_ESCAPE_FLAG_XML_SINGLE_QUOTES;
+            else if (!strcmp(optarg, "xml_double_quotes")) escape_flags |= AV_ESCAPE_FLAG_XML_DOUBLE_QUOTES;
             else {
                 av_log(NULL, AV_LOG_ERROR,
                        "Invalid value '%s' for option -f, "
@@ -104,6 +106,7 @@ int main(int argc, char **argv)
             if      (!strcmp(optarg, "auto"))      escape_mode = AV_ESCAPE_MODE_AUTO;
             else if (!strcmp(optarg, "backslash")) escape_mode = AV_ESCAPE_MODE_BACKSLASH;
             else if (!strcmp(optarg, "quote"))     escape_mode = AV_ESCAPE_MODE_QUOTE;
+            else if (!strcmp(optarg, "xml"))       escape_mode = AV_ESCAPE_MODE_XML;
             else {
                 av_log(NULL, AV_LOG_ERROR,
                        "Invalid value '%s' for option -m, "