From patchwork Sun Jan 16 23:03:43 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Rheinhardt
X-Patchwork-Id: 33614
From: Andreas Rheinhardt
To: ffmpeg-devel@ffmpeg.org
Date: Mon, 17 Jan 2022 00:03:43 +0100
X-Mailer: git-send-email 2.32.0
X-Microsoft-Original-Message-ID: <20220116230405.194506-2-andreas.rheinhardt@outlook.com>
Subject: [FFmpeg-devel] [PATCH 03/25] avformat/matroskaenc: Add API to write Masters with minimal length field
List-Id: FFmpeg development discussions and patches
Reply-To: FFmpeg development discussions and patches
Cc: Andreas Rheinhardt

This muxer currently uses two ways to ensure that no bytes are wasted
by writing unnecessarily long EBML length fields for Master elements
and the (Simple)Block element (all other elements are fine, as they
either already have the right length or getting the actual length is
easy and necessary anyway): Either use an upper bound that is good
enough in case one is available, or write the data into a dynamic
buffer first to get the length. The former approach is impossible in
lots of cases, whereas the latter incurs allocations and memcpying.
It is therefore unfeasible to use the latter to e.g.

This patch adds a third alternative to complement the former two:
It consists of an EbmlWriter to which one can add EBML elements that
are written later by calling ebml_writer_write(). That function first
traverses the added elements recursively and calculates the length of
each element; a second pass then writes all the elements directly
(without any seeks).

This new API also performs checks for overlong elements; this is in
contrast to put_ebml_string(), which simply performs a size_t->int
conversion even for strings originating from the user.

The new API is designed to have very low overhead: It uses a stack
array and performs no allocations. This comes at a price: Right now,
it can only be used in contexts in which there is a compile-time upper
bound for the number of elements. It is also incompatible with storing
the offset of an element in order to update this field later.
Furthermore, it puts the onus of memory management (i.e. ensuring that
pointers stay valid) on the user.

These restrictions might be overcome in the future.

Signed-off-by: Andreas Rheinhardt
---
 libavformat/matroskaenc.c | 248 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 248 insertions(+)

diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
index 81194fd28d..4ec2074d2c 100644
--- a/libavformat/matroskaenc.c
+++ b/libavformat/matroskaenc.c
@@ -61,6 +61,12 @@
  * Info, Tracks, Chapters, Attachments, Tags (potentially twice) and Cues */
 #define MAX_SEEKHEAD_ENTRIES 7
 
+/* Largest known-length EBML length */
+#define MAX_EBML_LENGTH ((1ULL << 56) - 2)
+/* The dynamic buffer API we rely upon has a limit of INT_MAX;
+ * and so has avio_write(). */
+#define MAX_SUPPORTED_EBML_LENGTH FFMIN(MAX_EBML_LENGTH, INT_MAX)
+
 #define MODE_MATROSKAv2 0x01
 #define MODE_WEBM       0x02
 
@@ -85,6 +91,48 @@ typedef struct ebml_stored_master {
     int64_t pos;
 } ebml_stored_master;
 
+typedef enum EbmlType {
+    EBML_UINT,
+    EBML_SINT,
+    EBML_FLOAT,
+    EBML_UID,
+    EBML_STR,
+    EBML_UTF8 = EBML_STR,
+    EBML_BIN,
+    EBML_MASTER,
+} EbmlType;
+
+typedef struct EbmlMaster {
+    int nb_elements;       ///< -1 if not finished
+    int containing_master; ///< -1 if no parent exists
+} EbmlMaster;
+
+typedef struct EbmlElement {
+    uint32_t id;
+    EbmlType type;
+    unsigned length_size;
+    uint64_t size; ///< excluding id and length field
+    union {
+        uint64_t uint;
+        int64_t sint;
+        double f;
+        const char *str;
+        const uint8_t *bin;
+        EbmlMaster master;
+    } priv;
+} EbmlElement;
+
+typedef struct EbmlWriter {
+    unsigned nb_elements;
+    int current_master_element;
+    EbmlElement *elements;
+} EbmlWriter;
+
+#define EBML_WRITER(max_nb_elems) \
+    EbmlElement elements[max_nb_elems]; \
+    EbmlWriter writer = (EbmlWriter){ .elements = elements, \
+                                      .current_master_element = -1 }
+
 typedef struct mkv_seekhead_entry {
     uint32_t elementid;
     uint64_t segmentpos;
@@ -362,6 +410,206 @@ static void end_ebml_master(AVIOContext *pb, ebml_master master)
     avio_seek(pb, pos, SEEK_SET);
 }
 
+static EbmlElement *ebml_writer_add(EbmlWriter *writer,
+                                    uint32_t id, EbmlType type)
+{
+    writer->elements[writer->nb_elements].id   = id;
+    writer->elements[writer->nb_elements].type = type;
+    return &writer->elements[writer->nb_elements++];
+}
+
+static void ebml_writer_open_master(EbmlWriter *writer, uint32_t id)
+{
+    EbmlMaster *master = &ebml_writer_add(writer, id, EBML_MASTER)->priv.master;
+
+    master->containing_master = writer->current_master_element;
+    master->nb_elements = -1;
+
+    writer->current_master_element = writer->nb_elements - 1;
+}
+
+static void ebml_writer_add_string(EbmlWriter *writer, uint32_t id,
+                                   const char *str)
+{
+    EbmlElement *elem = ebml_writer_add(writer, id, EBML_STR);
+
+    elem->priv.str = str;
+}
+
+static void ebml_writer_add_bin(EbmlWriter *writer, uint32_t id,
+                                const uint8_t *data, size_t size)
+{
+    EbmlElement *elem = ebml_writer_add(writer, id, EBML_BIN);
+
+#if SIZE_MAX > UINT64_MAX
+    size = FFMIN(size, UINT64_MAX);
+#endif
+    elem->size     = size;
+    elem->priv.bin = data;
+}
+
+static void ebml_writer_add_float(EbmlWriter *writer, uint32_t id,
+                                  double val)
+{
+    EbmlElement *elem = ebml_writer_add(writer, id, EBML_FLOAT);
+
+    elem->priv.f = val;
+}
+
+static void ebml_writer_add_uid(EbmlWriter *writer, uint32_t id,
+                                uint64_t val)
+{
+    EbmlElement *elem = ebml_writer_add(writer, id, EBML_UID);
+    elem->priv.uint = val;
+}
+
+static int ebml_writer_str_len(EbmlElement *elem)
+{
+    size_t len = strlen(elem->priv.str);
+#if SIZE_MAX > UINT64_MAX
+    len = FFMIN(len, UINT64_MAX);
+#endif
+    elem->size = len;
+    return 0;
+}
+
+static av_const int uint_size(uint64_t val)
+{
+    int bytes = 0;
+    do {
+        bytes++;
+    } while (val >>= 8);
+    return bytes;
+}
+
+static int ebml_writer_uint_len(EbmlElement *elem)
+{
+    elem->size = uint_size(elem->priv.uint);
+    return 0;
+}
+
+static av_const int sint_size(int64_t val)
+{
+    uint64_t tmp = 2 * (uint64_t)(val < 0 ? val^-1 : val);
+    return uint_size(tmp);
+}
+
+static int ebml_writer_sint_len(EbmlElement *elem)
+{
+    elem->size = sint_size(elem->priv.sint);
+    return 0;
+}
+
+static int ebml_writer_elem_len(EbmlWriter *writer, EbmlElement *elem,
+                                int remaining_elems);
+
+static int ebml_writer_master_len(EbmlWriter *writer, EbmlElement *elem,
+                                  int remaining_elems)
+{
+    int nb_elems = elem->priv.master.nb_elements >= 0 ? elem->priv.master.nb_elements : remaining_elems - 1;
+    EbmlElement *const master = elem;
+    uint64_t total_size = 0;
+
+    master->priv.master.nb_elements = nb_elems;
+    for (; elem++, nb_elems > 0;) {
+        int ret = ebml_writer_elem_len(writer, elem, nb_elems);
+        if (ret < 0)
+            return ret;
+        av_assert2(ret < nb_elems);
+        /* No overflow is possible here, as both total_size and elem->size
+         * are bounded by MAX_SUPPORTED_EBML_LENGTH. */
+        total_size += ebml_id_size(elem->id) + elem->length_size + elem->size;
+        if (total_size > MAX_SUPPORTED_EBML_LENGTH)
+            return AVERROR(ERANGE);
+        nb_elems--;      /* consume elem */
+        elem += ret, nb_elems -= ret; /* and elem's children */
+    }
+    master->size = total_size;
+
+    return master->priv.master.nb_elements;
+}
+
+static int ebml_writer_elem_len(EbmlWriter *writer, EbmlElement *elem,
+                                int remaining_elems)
+{
+    int ret = 0;
+
+    switch (elem->type) {
+    case EBML_FLOAT:
+    case EBML_UID:
+        elem->size = 8;
+        break;
+    case EBML_STR:
+        ret = ebml_writer_str_len(elem);
+        break;
+    case EBML_UINT:
+        ret = ebml_writer_uint_len(elem);
+        break;
+    case EBML_SINT:
+        ret = ebml_writer_sint_len(elem);
+        break;
+    case EBML_MASTER:
+        ret = ebml_writer_master_len(writer, elem, remaining_elems);
+        break;
+    }
+    if (ret < 0)
+        return ret;
+    if (elem->size > MAX_SUPPORTED_EBML_LENGTH)
+        return AVERROR(ERANGE);
+    elem->length_size = ebml_length_size(elem->size);
+    return ret; /* number of elements consumed excluding elem itself */
+}
+
+static int ebml_writer_elem_write(const EbmlElement *elem, AVIOContext *pb)
+{
+    put_ebml_id(pb, elem->id);
+    put_ebml_num(pb, elem->size, elem->length_size);
+    switch (elem->type) {
+    case EBML_UID:
+    case EBML_FLOAT: {
+        uint64_t val = elem->type == EBML_UID ? elem->priv.uint
+                                              : av_double2int(elem->priv.f);
+        avio_wb64(pb, val);
+        break;
+    }
+    case EBML_UINT:
+    case EBML_SINT: {
+        uint64_t val = elem->type == EBML_UINT ? elem->priv.uint
+                                               : elem->priv.sint;
+        for (int i = elem->size; --i >= 0; )
+            avio_w8(pb, (uint8_t)(val >> i * 8));
+        break;
+    }
+    case EBML_STR:
+    case EBML_BIN: {
+        const uint8_t *data = elem->type == EBML_BIN ? elem->priv.bin
+                                                     : (const uint8_t*)elem->priv.str;
+        avio_write(pb, data, elem->size);
+        break;
+    }
+    case EBML_MASTER: {
+        int nb_elems = elem->priv.master.nb_elements;
+
+        elem++;
+        for (int i = 0; i < nb_elems; i++)
+            i += ebml_writer_elem_write(elem + i, pb);
+
+        return nb_elems;
+    }
+    }
+    return 0;
+}
+
+static int ebml_writer_write(EbmlWriter *writer, AVIOContext *pb)
+{
+    int ret = ebml_writer_elem_len(writer, writer->elements,
+                                   writer->nb_elements);
+    if (ret < 0)
+        return ret;
+    ebml_writer_elem_write(writer->elements, pb);
+    return 0;
+}
+
 static void mkv_add_seekhead_entry(MatroskaMuxContext *mkv, uint32_t elementid,
                                    uint64_t filepos)
 {
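
The two passes described in the commit message (first compute every
element's length, then write everything front to back without seeking)
can be sketched in isolation. What follows is a minimal standalone
illustration under stated assumptions, not the code from this patch:
all demo_* names are hypothetical, only unsigned integers and Masters
are handled, each Master is assumed to know how many descendants follow
it, sizes are assumed to stay within 2^56 - 2, and output goes to a
caller-provided memory buffer rather than an AVIOContext.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All demo_* names are hypothetical stand-ins, not FFmpeg's helpers. */
typedef enum DemoType { DEMO_UINT, DEMO_MASTER } DemoType;

typedef struct DemoElem {
    uint32_t id;       /* EBML ID, marker bits included (e.g. 0xA0) */
    DemoType type;
    int      nb_desc;  /* DEMO_MASTER: number of descendants that follow */
    uint64_t value;    /* DEMO_UINT: payload */
    uint64_t size;     /* payload size in bytes, set by pass 1 */
    int      len_size; /* bytes used by the length field, set by pass 1 */
} DemoElem;

/* Bytes needed to store val without leading zero bytes (at least 1). */
static int demo_uint_size(uint64_t val)
{
    int bytes = 0;
    do {
        bytes++;
    } while (val >>= 8);
    return bytes;
}

/* Shortest length field for size; the all-ones value is reserved. */
static int demo_length_size(uint64_t size)
{
    int bytes = 1;
    assert(size <= (UINT64_C(1) << 56) - 2);
    while ((size + 1) >> (7 * bytes))
        bytes++;
    return bytes;
}

/* Pass 1: fill in size and len_size; returns the descendants consumed. */
static int demo_elem_len(DemoElem *elem)
{
    int consumed = 0;

    if (elem->type == DEMO_UINT) {
        elem->size = demo_uint_size(elem->value);
    } else {
        uint64_t total = 0;
        while (consumed < elem->nb_desc) {
            DemoElem *child = elem + 1 + consumed;
            consumed += 1 + demo_elem_len(child);
            /* An ID carries its own marker bits, so its byte count is
             * simply the number of significant bytes. */
            total += demo_uint_size(child->id) + child->len_size + child->size;
        }
        elem->size = total;
    }
    elem->len_size = demo_length_size(elem->size);
    return consumed;
}

/* Write the low `bytes` bytes of val in big-endian order. */
static uint8_t *demo_put_be(uint8_t *out, uint64_t val, int bytes)
{
    while (--bytes >= 0)
        *out++ = (uint8_t)(val >> (8 * bytes));
    return out;
}

/* Runs both passes over a tree stored in pre-order; returns bytes written.
 * The caller must provide a sufficiently large buffer. */
static size_t demo_write(DemoElem *elems, int nb_elems, uint8_t *out)
{
    uint8_t *p = out;
    int i = 0;

    while (i < nb_elems)                    /* pass 1: lengths */
        i += 1 + demo_elem_len(&elems[i]);

    for (i = 0; i < nb_elems; i++) {        /* pass 2: sequential output */
        const DemoElem *e = &elems[i];
        p = demo_put_be(p, e->id, demo_uint_size(e->id));
        p = demo_put_be(p, e->size | (UINT64_C(1) << (7 * e->len_size)),
                        e->len_size);
        if (e->type == DEMO_UINT)
            p = demo_put_be(p, e->value, (int)e->size);
    }
    return (size_t)(p - out);
}
```

Because the elements sit in pre-order, a Master's payload is just the
elements that follow it, so the second pass in this sketch is a flat
loop; the patch instead recurses in ebml_writer_elem_write(). For a
Master with ID 0xA0 containing a single unsigned integer with ID 0x9B
and value 40, demo_write() would emit the five bytes A0 83 9B 81 28.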