From patchwork Wed Mar 28 03:07:05 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philip Langdale
X-Patchwork-Id: 8204
From: Philip Langdale
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 27 Mar 2018 20:07:05 -0700
Message-Id: <20180328030705.22815-1-philipl@overt.org>
Subject: [FFmpeg-devel] [PATCH] movtextenc: fix handling of utf-8 subtitles
List-Id: FFmpeg development discussions and patches

See the earlier fix for movtextdec for details. The equivalent bug is
present on the encoder side as well.

We need to track the text length in 'characters' (which seems to really
mean codepoints) to ensure that styles are applied across the correct
ranges.

Signed-off-by: Philip Langdale
---
 libavcodec/movtextenc.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/libavcodec/movtextenc.c b/libavcodec/movtextenc.c
index d795e317c3..fd0743f752 100644
--- a/libavcodec/movtextenc.c
+++ b/libavcodec/movtextenc.c
@@ -304,11 +304,33 @@ static void mov_text_color_cb(void *priv, unsigned int color, unsigned int color
      */
 }
 
+static uint16_t utf8_strlen(const char *text, int len)
+{
+    uint16_t i = 0, ret = 0;
+    while (i < len) {
+        char c = text[i];
+        if (c >= 0)
+            i += 1;
+        else if ((c & 0xE0) == 0xC0)
+            i += 2;
+        else if ((c & 0xF0) == 0xE0)
+            i += 3;
+        else if ((c & 0xF8) == 0xF0)
+            i += 4;
+        else
+            return 0;
+        ret++;
+    }
+    return ret;
+}
+
 static void mov_text_text_cb(void *priv, const char *text, int len)
 {
+    uint16_t utf8_len = utf8_strlen(text, len);
     MovTextContext *s = priv;
     av_bprint_append_data(&s->buffer, text, len);
-    s->text_pos += len;
+    // If it's not utf-8, just use the byte length
+    s->text_pos += utf8_len ? utf8_len : len;
 }
 
 static void mov_text_new_line_cb(void *priv, int forced)
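
For anyone who wants to try the counting logic outside of libavcodec, here is
a minimal standalone sketch of the same idea. It is not part of the patch:
count_codepoints is a hypothetical name, and the sketch makes two assumptions
that go beyond what the patch does. It reads each byte as unsigned char, since
the `c >= 0` test in the patch relies on plain char being signed, which the C
standard leaves to the implementation. It also checks that each sequence fits
within len and is followed by continuation bytes. Like the patch, it returns 0
for input it cannot count, so a caller can fall back to the byte length.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch. Counts the UTF-8 codepoints in
 * text[0..len). Returns 0 for an invalid lead byte, a truncated sequence, or a
 * bad continuation byte. It does not reject overlong encodings or surrogates. */
static size_t count_codepoints(const char *text, size_t len)
{
    size_t i = 0, count = 0;

    while (i < len) {
        /* Read as unsigned so the result does not depend on char signedness. */
        unsigned char c = (unsigned char)text[i];
        size_t seq, k;

        if (c < 0x80)
            seq = 1;                 /* 0xxxxxxx: ASCII */
        else if ((c & 0xE0) == 0xC0)
            seq = 2;                 /* 110xxxxx */
        else if ((c & 0xF0) == 0xE0)
            seq = 3;                 /* 1110xxxx */
        else if ((c & 0xF8) == 0xF0)
            seq = 4;                 /* 11110xxx */
        else
            return 0;                /* stray continuation or invalid lead byte */

        if (seq > len - i)
            return 0;                /* sequence runs past the end of the buffer */

        for (k = 1; k < seq; k++)
            if (((unsigned char)text[i + k] & 0xC0) != 0x80)
                return 0;            /* expected a 10xxxxxx continuation byte */

        i += seq;
        count++;
    }
    return count;
}
```

With this sketch, the 10-byte string "h\xC3\xA9llo \xE2\x82\xAC" counts as 7
codepoints, which is the kind of difference between byte length and
'character' length that the commit message describes.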