From patchwork Sun Dec 18 20:34:25 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Erik Bråthen Solem
X-Patchwork-Id: 1859
From: Erik Bråthen Solem
To: "ffmpeg-devel@ffmpeg.org"
Cc: Erik Bråthen Solem
Date: Sun, 18 Dec 2016 20:34:25 +0000
Subject: [FFmpeg-devel] [PATCH 1/1] Fixing 3GPP Timed Text (TTXT / tx3g /
 mov_text) encoding for UTF-8 (ticket 6021)

According to the format specification (3GPP TS 26.245, section 5.2)
"storage lengths are specified as byte-counts, whereas highlighting is
specified using character offsets." This patch replaces byte counting
with character counting for highlighting.
See the following page for a link to the specification:
https://gpac.wp.mines-telecom.fr/mp4box/ttxt-format-documentation/

---
 libavcodec/movtextenc.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/libavcodec/movtextenc.c b/libavcodec/movtextenc.c
index 20e01e2..3ae015a 100644
--- a/libavcodec/movtextenc.c
+++ b/libavcodec/movtextenc.c
@@ -70,6 +70,7 @@ typedef struct {
     uint8_t style_fontsize;
     uint32_t style_color;
     uint16_t text_pos;
+    uint16_t text_pos_chars;
 } MovTextContext;
 
 typedef struct {
@@ -216,10 +217,10 @@ static void mov_text_style_cb(void *priv, const char style, int close)
             }
 
             s->style_attributes_temp->style_flag = 0;
-            s->style_attributes_temp->style_start = AV_RB16(&s->text_pos);
+            s->style_attributes_temp->style_start = AV_RB16(&s->text_pos_chars);
         } else {
             if (s->style_attributes_temp->style_flag) { //break the style record here and start a new one
-                s->style_attributes_temp->style_end = AV_RB16(&s->text_pos);
+                s->style_attributes_temp->style_end = AV_RB16(&s->text_pos_chars);
                 av_dynarray_add(&s->style_attributes, &s->count, s->style_attributes_temp);
                 s->style_attributes_temp = av_malloc(sizeof(*s->style_attributes_temp));
                 if (!s->style_attributes_temp) {
@@ -230,10 +231,10 @@ static void mov_text_style_cb(void *priv, const char style, int close)
                 }
 
                 s->style_attributes_temp->style_flag = s->style_attributes[s->count - 1]->style_flag;
-                s->style_attributes_temp->style_start = AV_RB16(&s->text_pos);
+                s->style_attributes_temp->style_start = AV_RB16(&s->text_pos_chars);
             } else {
                 s->style_attributes_temp->style_flag = 0;
-                s->style_attributes_temp->style_start = AV_RB16(&s->text_pos);
+                s->style_attributes_temp->style_start = AV_RB16(&s->text_pos_chars);
             }
         }
         switch (style){
@@ -248,7 +249,7 @@ static void mov_text_style_cb(void *priv, const char style, int close)
             break;
         }
     } else {
-        s->style_attributes_temp->style_end = AV_RB16(&s->text_pos);
+        s->style_attributes_temp->style_end = AV_RB16(&s->text_pos_chars);
         av_dynarray_add(&s->style_attributes, &s->count, s->style_attributes_temp);
 
         s->style_attributes_temp = av_malloc(sizeof(*s->style_attributes_temp));
@@ -273,7 +274,7 @@ static void mov_text_style_cb(void *priv, const char style, int close)
             break;
         }
         if (s->style_attributes_temp->style_flag) { //start of new style record
-            s->style_attributes_temp->style_start = AV_RB16(&s->text_pos);
+            s->style_attributes_temp->style_start = AV_RB16(&s->text_pos_chars);
         }
     }
     s->box_flags |= STYL_BOX;
@@ -284,11 +285,11 @@ static void mov_text_color_cb(void *priv, unsigned int color, unsigned int color
     MovTextContext *s = priv;
     if (color_id == 2) { //secondary color changes
         if (s->box_flags & HLIT_BOX) { //close tag
-            s->hlit.end = AV_RB16(&s->text_pos);
+            s->hlit.end = AV_RB16(&s->text_pos_chars);
         } else {
             s->box_flags |= HCLR_BOX;
             s->box_flags |= HLIT_BOX;
-            s->hlit.start = AV_RB16(&s->text_pos);
+            s->hlit.start = AV_RB16(&s->text_pos_chars);
             s->hclr.color = color | (0xFF << 24); //set alpha value to FF
         }
     }
@@ -302,7 +303,10 @@ static void mov_text_text_cb(void *priv, const char *text, int len)
 {
     MovTextContext *s = priv;
     av_bprint_append_data(&s->buffer, text, len);
-    s->text_pos += len;
+    s->text_pos += len; // length of text in bytes
+    for (int i = 0; i < len; i++) // length of text in UTF-8 characters
+        if ((text[i] & 0xC0) != 0x80)
+            s->text_pos_chars++;
 }
 
 static void mov_text_new_line_cb(void *priv, int forced)
@@ -310,6 +314,7 @@ static void mov_text_new_line_cb(void *priv, int forced)
     MovTextContext *s = priv;
     av_bprint_append_data(&s->buffer, "\n", 1);
     s->text_pos += 1;
+    s->text_pos_chars += 1;
 }
 
 static const ASSCodesCallbacks mov_text_callbacks = {
@@ -328,6 +333,7 @@ static int mov_text_encode_frame(AVCodecContext *avctx, unsigned char *buf,
     size_t j;
 
     s->text_pos = 0;
+    s->text_pos_chars = 0;
     s->count = 0;
     s->box_flags = 0;
     s->style_entries = 0;
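
For readers who want to see the counting rule from mov_text_text_cb() in isolation, here is a minimal standalone sketch. The function name count_utf8_chars is hypothetical and is not part of FFmpeg; the sketch assumes the input is valid UTF-8 and counts code points by skipping continuation bytes (those of the form 10xxxxxx), which is the same test the patch applies.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of FFmpeg: count the UTF-8 code points
 * in the first len bytes of text. Every byte that is NOT a continuation
 * byte (bit pattern 10xxxxxx) starts a new code point.
 * Assumes text holds valid UTF-8; malformed input is not detected.
 */
size_t count_utf8_chars(const char *text, size_t len)
{
    size_t chars = 0;
    size_t i;

    for (i = 0; i < len; i++)
        if (((unsigned char)text[i] & 0xC0) != 0x80)
            chars++;

    return chars;
}
```

With this rule, the 8-byte string "Bråthen" (where "å" is encoded as the two bytes C3 A5) counts as 7 characters, so a highlight that starts after "å" would be at character offset 3 rather than byte offset 4. Note that this counts code points, which may differ from user-perceived characters when combining marks are present.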