From patchwork Mon Feb 19 21:42:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oneric
X-Patchwork-Id: 46369
From: Oneric
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 19 Feb 2024 22:42:25 +0100
Message-Id: <20240219214227.19814-3-oneric@oneric.de>
In-Reply-To: <20240219214227.19814-1-oneric@oneric.de>
References: <20240219214227.19814-1-oneric@oneric.de>
Subject: [FFmpeg-devel] [PATCH v4 2/4] avcodec/{ass, webvttdec}: fix handling of backslashes

A backslash cannot be escaped by another backslash in any ASS renderer,
but unless it is followed by specific characters it is simply printed
as-is. Insert a word-joiner character (U+2060) after each backslash to
break up active sequences without changing the visual output.
---
 libavcodec/ass.c       | 9 ++++++++-
 libavcodec/webvttdec.c | 2 +-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/libavcodec/ass.c b/libavcodec/ass.c
index 5058dc8337..a68d3568b4 100644
--- a/libavcodec/ass.c
+++ b/libavcodec/ass.c
@@ -183,9 +183,16 @@ void ff_ass_bprint_text_event(AVBPrint *buf, const char *p, int size,
 
         /* standard ASS escaping so random characters don't get mis-interpreted
          * as ASS */
-        } else if (!keep_ass_markup && strchr("{}\\", *p)) {
+        } else if (!keep_ass_markup && strchr("{}", *p)) {
             av_bprintf(buf, "\\%c", *p);
 
+        /* append word-joiner U+2060 as UTF-8 to break up sequences like \N */
+        } else if (!keep_ass_markup && *p == '\\') {
+            if (p_end - p <= 3 || strncmp(p + 1, "\xe2\x81\xa0", 3))
+                av_bprintf(buf, "\\\xe2\x81\xa0");
+            else
+                av_bprintf(buf, "\\");
+
         /* some packets might end abruptly (no \0 at the end, like for example
          * in some cases of demuxing from a classic video container), some
          * might be terminated with \n or \r\n which we have to remove (for
diff --git a/libavcodec/webvttdec.c b/libavcodec/webvttdec.c
index 990d150f16..6e55bc5499 100644
--- a/libavcodec/webvttdec.c
+++ b/libavcodec/webvttdec.c
@@ -37,7 +37,7 @@ static const struct {
     {"<i>", "{\\i1}"}, {"</i>", "{\\i0}"},
     {"<b>", "{\\b1}"}, {"</b>", "{\\b0}"},
     {"<u>", "{\\u1}"}, {"</u>", "{\\u0}"},
-    {"{", "\\{"}, {"}", "\\}"}, // escape to avoid ASS markup conflicts
+    {"{", "\\{"}, {"}", "\\}"}, {"\\", "\\\xe2\x81\xa0"}, // escape to avoid ASS markup conflicts
     {"&gt;", ">"}, {"&lt;", "<"},
     {"&lrm;", "\xe2\x80\x8e"}, {"&rlm;", "\xe2\x80\x8f"},
     {"&amp;", "&"}, {"&nbsp;", "\\h"},
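
For readers who want to try the escaping rule outside of FFmpeg, the following is a minimal standalone sketch of the same idea. It is not the code from this patch: `escape_ass_text` and `WORD_JOINER` are hypothetical names, and it writes into a plain caller-supplied buffer instead of an AVBPrint. The backslash condition mirrors the one in the ass.c hunk: a word joiner is appended unless one already follows the backslash.

```c
#include <stddef.h>
#include <string.h>

#define WORD_JOINER "\xe2\x81\xa0" /* U+2060 encoded as UTF-8 (3 bytes) */

/* Hypothetical helper, not part of FFmpeg. Escapes `size` bytes of `in`
 * into `out` (capacity `cap`, including the terminating NUL).
 * Returns the number of bytes written without the NUL, or -1 if `out`
 * is too small. */
static int escape_ass_text(char *out, size_t cap, const char *in, size_t size)
{
    const char *p = in, *p_end = in + size;
    size_t n = 0;

    if (cap == 0)
        return -1;

    for (; p < p_end && *p; p++) {
        char tmp[4];
        size_t len;

        if (*p == '{' || *p == '}') {
            /* braces can be escaped with a leading backslash */
            tmp[0] = '\\';
            tmp[1] = *p;
            len = 2;
        } else if (*p == '\\') {
            /* a backslash cannot be escaped; break up sequences such as
             * \N by appending a word joiner unless one is already there */
            tmp[0] = '\\';
            len = 1;
            if (p_end - p <= 3 || strncmp(p + 1, WORD_JOINER, 3)) {
                memcpy(tmp + 1, WORD_JOINER, 3);
                len = 4;
            }
        } else {
            tmp[0] = *p;
            len = 1;
        }

        if (n + len >= cap)
            return -1; /* keep room for the terminating NUL */
        memcpy(out + n, tmp, len);
        n += len;
    }
    out[n] = '\0';
    return (int)n;
}
```

With this sketch, the 4-byte input `a\Nb` becomes 7 bytes (the backslash is followed by the 3-byte word joiner), while input that already has a word joiner after the backslash passes through unchanged, so applying the function twice does not stack joiners.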