From patchwork Thu Dec 15 22:43:15 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Erik_Br=C3=A5then_Solem?= X-Patchwork-Id: 1818 Delivered-To: ffmpegpatchwork@gmail.com Received: by 10.103.65.86 with SMTP id o83csp1069910vsa; Thu, 15 Dec 2016 17:20:41 -0800 (PST) X-Received: by 10.28.127.9 with SMTP id a9mr857562wmd.95.1481851241135; Thu, 15 Dec 2016 17:20:41 -0800 (PST) Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id p127si1004226wmp.101.2016.12.15.17.20.39; Thu, 15 Dec 2016 17:20:41 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@hotmail.com; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE dis=NONE) header.from=hotmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 93E6E689A0D; Fri, 16 Dec 2016 03:20:31 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from BAY004-OMC3S28.hotmail.com (bay004-omc3s28.hotmail.com [65.54.190.166]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id ABB79689948 for ; Fri, 16 Dec 2016 00:43:12 +0200 (EET) Received: from EUR02-AM5-obe.outbound.protection.outlook.com ([65.54.190.187]) by BAY004-OMC3S28.hotmail.com over TLS secured channel with Microsoft SMTPSVC(7.5.7601.23008); Thu, 15 Dec 2016 14:43:17 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=CiiDmjUR62xzMjvTs1VxNa8qsKLyVp4VeSrkiGxka6c=; 
b=ga8rk8nqEJZHxfDyP1Va118FJ/M1F2PgrWM8liMZcMcXyJ1hKm/CHZhU8zUMG/OyaUplPwMQQiFm0o2gfZAUhM2b+FxQlgB2OgYT5PUpHMZVrT68xKwhyxZNgM5WJn9Nm5EMHk90YMDIJG4i3WIpmOkedta4af3bLoFtfDJHyxSmjpb2mgR7HeUoOrBTd+kPO21+wSYztKyemL4nc41QQTf6CsijgQwdHctQ7ssPT7j+kTuKphkEucIZuYlQ0v6AF5q9xVCVIECMcpFR5jnBEh24Bh/XJpE7FacwWfBK4eFqsK58HrlKK8UabgPmoWXSe7XEobKl50rpyab6CDvTSQ== Received: from AM5EUR02FT033.eop-EUR02.prod.protection.outlook.com (10.152.8.53) by AM5EUR02HT181.eop-EUR02.prod.protection.outlook.com (10.152.9.246) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.761.6; Thu, 15 Dec 2016 22:43:15 +0000 Received: from VI1PR01MB1327.eurprd01.prod.exchangelabs.com (10.152.8.59) by AM5EUR02FT033.mail.protection.outlook.com (10.152.8.86) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.761.6 via Frontend Transport; Thu, 15 Dec 2016 22:43:15 +0000 Received: from VI1PR01MB1327.eurprd01.prod.exchangelabs.com ([10.162.119.17]) by VI1PR01MB1327.eurprd01.prod.exchangelabs.com ([10.162.119.17]) with mapi id 15.01.0771.014; Thu, 15 Dec 2016 22:43:15 +0000 From: =?iso-8859-1?Q?Erik_Br=E5then_Solem?= To: "ffmpeg-devel@ffmpeg.org" Thread-Topic: [PATCH 1/1] Fixing 3GPP Timed Text (TTXT / tx3g / mov_text) encoding for UTF-8 (ticket 6021) Thread-Index: AQHSVySf3hRW/iKzHUeZ7DUxAB47Fg== Date: Thu, 15 Dec 2016 22:43:15 +0000 Message-ID: Accept-Language: nb-NO, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-slblob-mailprops: 
KEQq9915z5t2Iu3yqPIoCs9xEj4oRXuFEdHqS4+Fju53/JuA2HhfNrTx2eMypug9PI5+6bgD7W37CGxItAXcJKQPuCfbYnnK6EiNxEzVLsl732h1osGo/L1f1us8mw2B6trsvPuoGNAMC7WwY89ImaDG+rkk5ril7TcMyc5ftFdQA3FLS7U7lgIPtv5tbKDs/kSWko0/eQ5+y3IEk5TyQ7gRLZQSVKLq25AnVS2tX1CGyXT+9BxrMsTUwEEATKSeEpoBTChOLkoC5pb7K+KKzwWUT5RIcVT2USfqSAzvPTYl2PhQtOs1aLQ8ElYPhTHAQeh+UCkfX/tf9IsOzLeDO3hmlIZVQiyuccuCzTVy3k3haawGR087i1uErqrtFEFDzCRR4fTknLTmPRX5qIuZ8YGO4GKcMZO5/NEZonwNytmecT4zwhDBeY2qESDCTioHHdNKnEkfeIlurUDGHorfmLFZR7crB2/lx3h4n1CWhEDlxBs1T/UAn1VxwiQqHni3W4LRb732Hn4AjZKkRAqU/FBegXnCA4Y3JjERyvS/lSfsO6tr32KQz7THINY+xtzHuhwxCo+PTNati68tswaIGJsgG9Q0cG+epN8EssmyeNqtnVq3Shf8hePdyojB08IBNC+M+TtPh8QSuBl56F37dQ5C5YJzTYy3z+GKgbu5KS+5TmYvvGHwYcW9sgofA5qVX8g7L7hszhXi/UuACkb8dDVasBVRBwg/7IlItEKaf3U+4eWPT+tDQZW1xZr6Zl/6zY3t7SYTGF4G57cyMnxuUPy/isDDsIcboxUuVZ8mcefv2McK192MAselKMyE/AppKBZDcL6U02vgx1Qjx1F/bxzSkHwx6Y1Pp91ZU8+wHXgajwG1CPHCWlrkPNzZS+Rg8AoAKBWGV688seZ6pZDheSViVQ7yH0hVlyXTycScdRXo13uMIsuayA== authentication-results: ffmpeg.org; dkim=none (message not signed) header.d=none; ffmpeg.org; dmarc=none action=none header.from=hotmail.com; x-incomingtopheadermarker: OriginalChecksum:31E6A63DF45AEE11DE8AC12A51AB3421598BC7436B93F7FA170C1DA28EB7679B; UpperCasedChecksum:DC4B6E51092585FE0E129D786E9D249D1A48E2105484AD90A3BAAF5B8B133669; SizeAsReceived:8291; Count:36 x-ms-exchange-messagesentrepresentingtype: 1 x-incomingheadercount: 36 x-eopattributedmessage: 0 x-microsoft-exchange-diagnostics: 1; AM5EUR02HT181; 5:7PqVKDBMYAWs8cXZNEHApB5CHk2dBRFWFx8OvCI9N0lMon2nXW3HsZSCB6e7HFWVbie5AxicJ1E6qeLF6rmpU8K36hjMY3TJxGOEsNSQgXRcOeYqrgfaUS7ordkON1421WlNFVOD790vCPFZ2ufNbg==; 24:YZlIha6ch89SciC0Mx00Lb//+2WdLDIQqBQbToh2HkRuR532hq+HR0TtVlERiUXNROn/trpj5u+t8L7K0C/RX7IfT77TRE6VuNG79Gtrm0I=; 
7:DFgLohP0qL48E0RY3R+t60g9NlV82pRZXbEz0I62P9MYkS343cfCqRiF/bY8NCf1S0zQzMsX7OOOm2D7xL+i1kZLOybNpWUHIK3WES3cLcqiCbUC0bLUKLtos9BHSQp38DN5dAsaBwm4+5HpglZQXTJUD34NVMQcm8zEbNXRN6fDyxzQXDZBlU9RmBVqm9dYXDYvTdk1PZGxMThjEB8rO7A5f2hDqZchMA+OQYpR2Hfh62QfAh/rOYqYHsMkRQ9N12Shyxxtr6WPFzdWanrnAv4DKpy1ndNQ0EX9u7zRPb5Jq90ifICAbniDClGEJ/dD7ylrOVQ7x/eZIPEr8RTDvtT/Jd2hXWXwj666ujwY3a5d1DGWco1YJuCr79pzfnfspI+ACPCYwn1ztd6gh1EHXtN4xcZWlfltfqQOIHYVQFBbFxqrooRHgG9+McnF0ZTXFIwXyDAzmbTm+dyGJcimRQ== x-forefront-antispam-report: EFV:NLI; SFV:NSPM; SFS:(10019020)(98900003); DIR:OUT; SFP:1102; SCL:1; SRVR:AM5EUR02HT181; H:VI1PR01MB1327.eurprd01.prod.exchangelabs.com; FPR:; SPF:None; LANG:en; x-ms-office365-filtering-correlation-id: 04d25fea-a943-4236-544e-08d4253bc0fb x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(22001)(1601124038)(5061506293)(5061507293)(1603103113)(1601125047); SRVR:AM5EUR02HT181; x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(432015012)(82015046); SRVR:AM5EUR02HT181; BCL:0; PCL:0; RULEID:; SRVR:AM5EUR02HT181; x-forefront-prvs: 0157DEB61B spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-originalarrivaltime: 15 Dec 2016 22:43:15.1900 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM5EUR02HT181 X-OriginalArrivalTime: 15 Dec 2016 22:43:17.0346 (UTC) FILETIME=[A0AF7C20:01D25724] X-Mailman-Approved-At: Fri, 16 Dec 2016 03:20:29 +0200 Subject: [FFmpeg-devel] [PATCH 1/1] Fixing 3GPP Timed Text (TTXT / tx3g / mov_text) encoding for UTF-8 (ticket 6021) X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Cc: =?iso-8859-1?Q?Erik_Br=E5then_Solem?= Errors-To: 
ffmpeg-devel-bounces@ffmpeg.org
Sender: "ffmpeg-devel"

According to the format specification (3GPP TS 26.245, section 5.2) "storage
lengths are specified as byte-counts, whereas highlighting is specified using
character offsets." This patch replaces byte counting with character counting
for highlighting.

See the following page for a link to the specification:
https://gpac.wp.mines-telecom.fr/mp4box/ttxt-format-documentation/
---
 libavcodec/movtextenc.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/libavcodec/movtextenc.c b/libavcodec/movtextenc.c
index 20e01e2..3ae015a 100644
--- a/libavcodec/movtextenc.c
+++ b/libavcodec/movtextenc.c
@@ -70,6 +70,7 @@ typedef struct {
     uint8_t style_fontsize;
     uint32_t style_color;
     uint16_t text_pos;
+    uint16_t text_pos_chars;
 } MovTextContext;
 
 typedef struct {
@@ -216,10 +217,10 @@ static void mov_text_style_cb(void *priv, const char style, int close)
             }
 
             s->style_attributes_temp->style_flag = 0;
-            s->style_attributes_temp->style_start = AV_RB16(&s->text_pos);
+            s->style_attributes_temp->style_start = AV_RB16(&s->text_pos_chars);
         } else {
             if (s->style_attributes_temp->style_flag) { //break the style record here and start a new one
-                s->style_attributes_temp->style_end = AV_RB16(&s->text_pos);
+                s->style_attributes_temp->style_end = AV_RB16(&s->text_pos_chars);
                 av_dynarray_add(&s->style_attributes, &s->count, s->style_attributes_temp);
                 s->style_attributes_temp = av_malloc(sizeof(*s->style_attributes_temp));
                 if (!s->style_attributes_temp) {
@@ -230,10 +231,10 @@ static void mov_text_style_cb(void *priv, const char style, int close)
                 }
 
                 s->style_attributes_temp->style_flag = s->style_attributes[s->count - 1]->style_flag;
-                s->style_attributes_temp->style_start = AV_RB16(&s->text_pos);
+                s->style_attributes_temp->style_start = AV_RB16(&s->text_pos_chars);
             } else {
                 s->style_attributes_temp->style_flag = 0;
-                s->style_attributes_temp->style_start = AV_RB16(&s->text_pos);
+                s->style_attributes_temp->style_start = AV_RB16(&s->text_pos_chars);
             }
         }
         switch (style){
@@ -248,7 +249,7 @@ static void mov_text_style_cb(void *priv, const char style, int close)
             break;
         }
     } else {
-        s->style_attributes_temp->style_end = AV_RB16(&s->text_pos);
+        s->style_attributes_temp->style_end = AV_RB16(&s->text_pos_chars);
         av_dynarray_add(&s->style_attributes, &s->count, s->style_attributes_temp);
 
         s->style_attributes_temp = av_malloc(sizeof(*s->style_attributes_temp));
@@ -273,7 +274,7 @@ static void mov_text_style_cb(void *priv, const char style, int close)
             break;
         }
         if (s->style_attributes_temp->style_flag) { //start of new style record
-            s->style_attributes_temp->style_start = AV_RB16(&s->text_pos);
+            s->style_attributes_temp->style_start = AV_RB16(&s->text_pos_chars);
         }
     }
     s->box_flags |= STYL_BOX;
@@ -284,11 +285,11 @@ static void mov_text_color_cb(void *priv, unsigned int color, unsigned int color
     MovTextContext *s = priv;
     if (color_id == 2) { //secondary color changes
         if (s->box_flags & HLIT_BOX) { //close tag
-            s->hlit.end = AV_RB16(&s->text_pos);
+            s->hlit.end = AV_RB16(&s->text_pos_chars);
         } else {
             s->box_flags |= HCLR_BOX;
             s->box_flags |= HLIT_BOX;
-            s->hlit.start = AV_RB16(&s->text_pos);
+            s->hlit.start = AV_RB16(&s->text_pos_chars);
             s->hclr.color = color | (0xFF << 24); //set alpha value to FF
         }
     }
@@ -302,7 +303,10 @@ static void mov_text_text_cb(void *priv, const char *text, int len)
 {
     MovTextContext *s = priv;
     av_bprint_append_data(&s->buffer, text, len);
-    s->text_pos += len;
+    s->text_pos += len; // length of text in bytes
+    for (int i = 0; i < len; i++) // length of text in UTF-8 characters
+        if ((text[i] & 0xC0) != 0x80)
+            s->text_pos_chars++;
 }
 
 static void mov_text_new_line_cb(void *priv, int forced)
@@ -310,6 +314,7 @@ static void mov_text_new_line_cb(void *priv, int forced)
     MovTextContext *s = priv;
     av_bprint_append_data(&s->buffer, "\n", 1);
     s->text_pos += 1;
+    s->text_pos_chars += 1;
 }
 
 static const ASSCodesCallbacks mov_text_callbacks = {
@@ -328,6 +333,7 @@ static int mov_text_encode_frame(AVCodecContext *avctx, unsigned char *buf,
     size_t j;
 
     s->text_pos = 0;
+    s->text_pos_chars = 0;
     s->count = 0;
     s->box_flags = 0;
     s->style_entries = 0;
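
For reference, the character counting added to mov_text_text_cb can be
sketched in isolation as below. This is only an illustration, not part of the
patch: the helper name utf8_char_count is hypothetical, and the sketch assumes
the input is valid UTF-8. It relies on the fact that UTF-8 continuation bytes
always have the bit pattern 10xxxxxx, so counting every byte that is not a
continuation byte gives the number of characters (code points).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: count the UTF-8 characters
 * (code points) in the first len bytes of text. Assumes valid UTF-8. */
static size_t utf8_char_count(const char *text, size_t len)
{
    size_t chars = 0;

    for (size_t i = 0; i < len; i++) {
        /* Continuation bytes look like 10xxxxxx; every other byte starts
         * a new character. The cast keeps the mask on an unsigned byte. */
        if (((uint8_t)text[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

As an example, the UTF-8 string "Bråthen" occupies 8 bytes but 7 characters,
so a style record that starts right after it would need offset 7, whereas a
byte count would give 8.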