From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 14 Dec 2023 17:14:26 -0300
Message-ID: <20231214201433.4608-2-jamrial@gmail.com>
In-Reply-To: <20231214201433.4608-1-jamrial@gmail.com>
References: <20231214201433.4608-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 45145

Signed-off-by: James Almer --- libavutil/Makefile | 2 + libavutil/iamf.c | 563 ++++++++++++++++++++++++++++++++++++++++ libavutil/iamf.h | 620 +++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 1185 insertions(+) create mode 100644 libavutil/iamf.c create mode 100644 libavutil/iamf.h diff --git a/libavutil/Makefile b/libavutil/Makefile index 4711f8cde8..62cc1a1831 100644 --- a/libavutil/Makefile +++ b/libavutil/Makefile @@ -51,6 +51,7 @@ HEADERS = adler32.h \ hwcontext_videotoolbox.h \ hwcontext_vdpau.h \ hwcontext_vulkan.h \ + iamf.h \ imgutils.h \ intfloat.h \ intreadwrite.h \ @@ -140,6 +141,7 @@ OBJS = adler32.o \ hdr_dynamic_vivid_metadata.o \ hmac.o \ hwcontext.o \ + iamf.o \ imgutils.o \ integer.o \ intmath.o \ diff --git a/libavutil/iamf.c b/libavutil/iamf.c new file mode 100644 index 0000000000..62b6051049 --- /dev/null +++ b/libavutil/iamf.c @@ -0,0 +1,563 @@ +/* + * Immersive Audio Model and Formats helper functions and defines + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include +#include + +#include "avassert.h" +#include "error.h" +#include "iamf.h" +#include "log.h" +#include "mem.h" +#include "opt.h" + +#define IAMF_ADD_FUNC_TEMPLATE(parent_type, parent_name, child_type, child_name, suffix) \ +child_type *av_iamf_ ## parent_name ## _add_ ## child_name(parent_type *parent_name) \ +{ \ + child_type **child_name ## suffix, *child_name; \ + \ + if (parent_name->nb_## child_name ## suffix == UINT_MAX) \ + return NULL; \ + \ + child_name ## suffix = av_realloc_array(parent_name->child_name ## suffix, \ + parent_name->nb_## child_name ## suffix + 1, \ + sizeof(*parent_name->child_name ## suffix)); \ + if (!child_name ## suffix) \ + return NULL; \ + \ + parent_name->child_name ## suffix = child_name ## suffix; \ + \ + child_name = parent_name->child_name ## suffix[parent_name->nb_## child_name ## suffix] \ + = av_mallocz(sizeof(*child_name)); \ + if (!child_name) \ + return NULL; \ + \ + child_name->av_class = &child_name ## _class; \ + av_opt_set_defaults(child_name); \ + parent_name->nb_## child_name ## suffix++; \ + \ + return child_name; \ +} + +#define FLAGS AV_OPT_FLAG_ENCODING_PARAM + +// +// Param Definition +// +#define OFFSET(x) offsetof(AVIAMFMixGain, x) +static const AVOption 
mix_gain_options[] = { + { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS }, + { "animation_type", "set animation_type", OFFSET(animation_type), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 2, FLAGS }, + { "start_point_value", "set start_point_value", OFFSET(start_point_value), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS }, + { "end_point_value", "set end_point_value", OFFSET(end_point_value), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS }, + { "control_point_value", "set control_point_value", OFFSET(control_point_value), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS }, + { "control_point_relative_time", "set control_point_relative_time", OFFSET(control_point_relative_time), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, 0.0, 1.0, FLAGS }, + { NULL }, +}; + +static const AVClass mix_gain_class = { + .class_name = "AVIAMFMixGain", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = mix_gain_options, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFDemixingInfo, x) +static const AVOption demixing_info_options[] = { + { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS }, + { "dmixp_mode", "set dmixp_mode", OFFSET(dmixp_mode), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 6, FLAGS }, + { NULL }, +}; + +static const AVClass demixing_info_class = { + .class_name = "AVIAMFDemixingInfo", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = demixing_info_options, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFReconGain, x) +static const AVOption recon_gain_options[] = { + { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS }, + { NULL }, +}; + +static const AVClass recon_gain_class = { + .class_name = "AVIAMFReconGain", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + 
.option = recon_gain_options, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFParamDefinition, x) +static const AVOption param_definition_options[] = { + { "parameter_id", "set parameter_id", OFFSET(parameter_id), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS }, + { "parameter_rate", "set parameter_rate", OFFSET(parameter_rate), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS }, + { "duration", "set duration", OFFSET(duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS }, + { "constant_subblock_duration", "set constant_subblock_duration", OFFSET(constant_subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS }, + { NULL }, +}; + +static const AVClass *param_definition_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + switch(i) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + ret = &mix_gain_class; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + ret = &demixing_info_class; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + ret = &recon_gain_class; + break; + default: + break; + } + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass param_definition_class = { + .class_name = "AVIAMFParamDefinition", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = param_definition_options, + .child_class_iterate = param_definition_child_iterate, +}; + +const AVClass *av_iamf_param_definition_get_class(void) +{ + return &param_definition_class; +} + +AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType type, + unsigned int nb_subblocks, size_t *out_size) +{ + + struct MixGainStruct { + AVIAMFParamDefinition p; + AVIAMFMixGain m; + }; + struct DemixStruct { + AVIAMFParamDefinition p; + AVIAMFDemixingInfo d; + }; + struct ReconGainStruct { + AVIAMFParamDefinition p; + AVIAMFReconGain r; + }; + size_t subblocks_offset, subblock_size; + size_t size; + AVIAMFParamDefinition *par; + + switch 
(type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + subblocks_offset = offsetof(struct MixGainStruct, m); + subblock_size = sizeof(AVIAMFMixGain); + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + subblocks_offset = offsetof(struct DemixStruct, d); + subblock_size = sizeof(AVIAMFDemixingInfo); + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + subblocks_offset = offsetof(struct ReconGainStruct, r); + subblock_size = sizeof(AVIAMFReconGain); + break; + default: + return NULL; + } + + size = subblocks_offset; + if (nb_subblocks > (SIZE_MAX - size) / subblock_size) + return NULL; + size += subblock_size * nb_subblocks; + + par = av_mallocz(size); + if (!par) + return NULL; + + par->av_class = &param_definition_class; + av_opt_set_defaults(par); + + par->type = type; + par->nb_subblocks = nb_subblocks; + par->subblock_size = subblock_size; + par->subblocks_offset = subblocks_offset; + + for (int i = 0; i < nb_subblocks; i++) { + void *subblock = av_iamf_param_definition_get_subblock(par, i); + + switch (type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + ((AVIAMFMixGain *)subblock)->av_class = &mix_gain_class; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + ((AVIAMFDemixingInfo *)subblock)->av_class = &demixing_info_class; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + ((AVIAMFReconGain *)subblock)->av_class = &recon_gain_class; + break; + default: + av_assert0(0); + } + + av_opt_set_defaults(subblock); + } + + if (out_size) + *out_size = size; + + return par; +} + +// +// Audio Element +// +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFLayer, x) +static const AVOption layer_options[] = { + { "ch_layout", "set ch_layout", OFFSET(ch_layout), AV_OPT_TYPE_CHLAYOUT, {.str = NULL }, 0, 0, FLAGS }, + { "flags", "set flags", OFFSET(flags), AV_OPT_TYPE_FLAGS, + {.i64 = 0 }, 0, AV_IAMF_LAYER_FLAG_RECON_GAIN, FLAGS, "flags" }, + {"recon_gain", "Recon gain is present", 0, AV_OPT_TYPE_CONST, + {.i64 = AV_IAMF_LAYER_FLAG_RECON_GAIN }, 
INT_MIN, INT_MAX, FLAGS, "flags"}, + { "output_gain_flags", "set output_gain_flags", OFFSET(output_gain_flags), AV_OPT_TYPE_FLAGS, + {.i64 = 0 }, 0, (1 << 6) - 1, FLAGS, "output_gain_flags" }, + {"FL", "Left channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 5 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"FR", "Right channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 4 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"BL", "Left surround channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 3 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"BR", "Right surround channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 2 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"TFL", "Left top front channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 1 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"TFR", "Right top front channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 0 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + { "output_gain", "set output_gain", OFFSET(output_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "ambisonics_mode", "set ambisonics_mode", OFFSET(ambisonics_mode), AV_OPT_TYPE_INT, + { .i64 = AV_IAMF_AMBISONICS_MODE_MONO }, + AV_IAMF_AMBISONICS_MODE_MONO, AV_IAMF_AMBISONICS_MODE_PROJECTION, FLAGS, "ambisonics_mode" }, + { "mono", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_AMBISONICS_MODE_MONO }, .unit = "ambisonics_mode" }, + { "projection", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_AMBISONICS_MODE_PROJECTION }, .unit = "ambisonics_mode" }, + { NULL }, +}; + +static const AVClass layer_class = { + .class_name = "AVIAMFLayer", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = layer_options, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFAudioElement, x) +static const AVOption audio_element_options[] = { + { "audio_element_type", "set audio_element_type", OFFSET(audio_element_type), AV_OPT_TYPE_INT, + {.i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL }, + AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, 
AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, FLAGS, "audio_element_type" }, + { "channel", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL }, .unit = "audio_element_type" }, + { "scene", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE }, .unit = "audio_element_type" }, + { "default_w", "set default_w", OFFSET(default_w), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 10, FLAGS }, + { NULL }, +}; + +static const AVClass *audio_element_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + if (i) + ret = &layer_class; + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass audio_element_class = { + .class_name = "AVIAMFAudioElement", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = audio_element_options, + .child_class_iterate = audio_element_child_iterate, +}; + +const AVClass *av_iamf_audio_element_get_class(void) +{ + return &audio_element_class; +} + +AVIAMFAudioElement *av_iamf_audio_element_alloc(void) +{ + AVIAMFAudioElement *audio_element = av_mallocz(sizeof(*audio_element)); + + if (audio_element) { + audio_element->av_class = &audio_element_class; + av_opt_set_defaults(audio_element); + } + + return audio_element; +} + +IAMF_ADD_FUNC_TEMPLATE(AVIAMFAudioElement, audio_element, AVIAMFLayer, layer, s) + +void av_iamf_audio_element_free(AVIAMFAudioElement **paudio_element) +{ + AVIAMFAudioElement *audio_element = *paudio_element; + + if (!audio_element) + return; + + for (int i = 0; i < audio_element->nb_layers; i++) { + AVIAMFLayer *layer = audio_element->layers[i]; + av_opt_free(layer); + av_free(layer->demixing_matrix); + av_free(layer); + } + av_free(audio_element->layers); + + av_free(audio_element->demixing_info); + av_free(audio_element->recon_gain_info); + av_freep(paudio_element); +} + +// +// Mix Presentation +// +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFSubmixElement, x) +static const AVOption 
submix_element_options[] = { + { "headphones_rendering_mode", "Headphones rendering mode", OFFSET(headphones_rendering_mode), AV_OPT_TYPE_INT, + { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO }, + AV_IAMF_HEADPHONES_MODE_STEREO, AV_IAMF_HEADPHONES_MODE_BINAURAL, FLAGS, "headphones_rendering_mode" }, + { "stereo", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO }, .unit = "headphones_rendering_mode" }, + { "binaural", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_HEADPHONES_MODE_BINAURAL }, .unit = "headphones_rendering_mode" }, + { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "annotations", "Annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, { .str = NULL }, 0, 0, FLAGS }, + { NULL }, +}; + +static void *submix_element_child_next(void *obj, void *prev) +{ + AVIAMFSubmixElement *submix_element = obj; + if (!prev) + return submix_element->element_mix_config; + + return NULL; +} + +static const AVClass *submix_element_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + if (i) + ret = &param_definition_class; + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass element_class = { + .class_name = "AVIAMFSubmixElement", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = submix_element_options, + .child_next = submix_element_child_next, + .child_class_iterate = submix_element_child_iterate, +}; + +IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixElement, element, s) + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFSubmixLayout, x) +static const AVOption submix_layout_options[] = { + { "layout_type", "Layout type", OFFSET(layout_type), AV_OPT_TYPE_INT, + { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS }, + AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL, FLAGS, "layout_type" }, + { "loudspeakers", NULL, 0, AV_OPT_TYPE_CONST, + 
{ .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS }, .unit = "layout_type" }, + { "binaural", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL }, .unit = "layout_type" }, + { "sound_system", "Sound System", OFFSET(sound_system), AV_OPT_TYPE_CHLAYOUT, { .str = NULL }, 0, 0, FLAGS }, + { "integrated_loudness", "Integrated loudness", OFFSET(integrated_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "digital_peak", "Digital peak", OFFSET(digital_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "true_peak", "True peak", OFFSET(true_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "dialog_anchored_loudness", "Anchored loudness (Dialog)", OFFSET(dialogue_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "album_anchored_loudness", "Anchored loudness (Album)", OFFSET(album_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { NULL }, +}; + +static const AVClass layout_class = { + .class_name = "AVIAMFSubmixLayout", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = submix_layout_options, +}; + +IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixLayout, layout, s) + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFSubmix, x) +static const AVOption submix_presentation_options[] = { + { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { NULL }, +}; + +static void *submix_presentation_child_next(void *obj, void *prev) +{ + AVIAMFSubmix *sub_mix = obj; + if (!prev) + return sub_mix->output_mix_config; + + return NULL; +} + +static const AVClass *submix_presentation_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + switch(i) { + case 0: + ret = &element_class; + break; + case 1: + ret = &layout_class; + break; + case 2: + ret = &param_definition_class; + 
break; + default: + break; + } + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass submix_class = { + .class_name = "AVIAMFSubmix", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = submix_presentation_options, + .child_next = submix_presentation_child_next, + .child_class_iterate = submix_presentation_child_iterate, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFMixPresentation, x) +static const AVOption mix_presentation_options[] = { + { "annotations", "set annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, {.str = NULL }, 0, 0, FLAGS }, + { NULL }, +}; + +#undef OFFSET +#undef FLAGS + +static const AVClass *mix_presentation_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + if (i) + ret = &submix_class; + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass mix_presentation_class = { + .class_name = "AVIAMFMixPresentation", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = mix_presentation_options, + .child_class_iterate = mix_presentation_child_iterate, +}; + +const AVClass *av_iamf_mix_presentation_get_class(void) +{ + return &mix_presentation_class; +} + +AVIAMFMixPresentation *av_iamf_mix_presentation_alloc(void) +{ + AVIAMFMixPresentation *mix_presentation = av_mallocz(sizeof(*mix_presentation)); + + if (mix_presentation) { + mix_presentation->av_class = &mix_presentation_class; + av_opt_set_defaults(mix_presentation); + } + + return mix_presentation; +} + +IAMF_ADD_FUNC_TEMPLATE(AVIAMFMixPresentation, mix_presentation, AVIAMFSubmix, submix, es) + +void av_iamf_mix_presentation_free(AVIAMFMixPresentation **pmix_presentation) +{ + AVIAMFMixPresentation *mix_presentation = *pmix_presentation; + + if (!mix_presentation) + return; + + for (int i = 0; i < mix_presentation->nb_submixes; i++) { + AVIAMFSubmix *sub_mix = mix_presentation->submixes[i]; + for (int j = 0; j < 
sub_mix->nb_elements; j++) { + AVIAMFSubmixElement *submix_element = sub_mix->elements[j]; + av_opt_free(submix_element); + av_free(submix_element->element_mix_config); + av_free(submix_element); + } + av_free(sub_mix->elements); + for (int j = 0; j < sub_mix->nb_layouts; j++) { + AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[j]; + av_opt_free(submix_layout); + av_free(submix_layout); + } + av_free(sub_mix->layouts); + av_free(sub_mix->output_mix_config); + av_free(sub_mix); + } + av_opt_free(mix_presentation); + av_free(mix_presentation->submixes); + + av_freep(pmix_presentation); +} diff --git a/libavutil/iamf.h b/libavutil/iamf.h new file mode 100644 index 0000000000..7038b71a27 --- /dev/null +++ b/libavutil/iamf.h @@ -0,0 +1,620 @@ +/* + * Immersive Audio Model and Formats helper functions and defines + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_IAMF_H +#define AVUTIL_IAMF_H + +/** + * @file + * Immersive Audio Model and Formats API header + * @see Immersive Audio Model and Formats + */ + +#include +#include + +#include "attributes.h" +#include "avassert.h" +#include "channel_layout.h" +#include "dict.h" +#include "rational.h" + +/** + * @defgroup lavf_iamf_params Parameter Definition + * @{ + * Parameters as defined in section 3.6.1 and 3.8 of IAMF. + * @} + * @defgroup lavf_iamf_audio Audio Element + * @{ + * Audio Elements as defined in section 3.6 of IAMF. + * @} + * @defgroup lavf_iamf_mix Mix Presentation + * @{ + * Mix Presentations as defined in section 3.7 of IAMF. + * @} + * + * @} + * @addtogroup lavf_iamf_params + * @{ + */ +enum AVIAMFAnimationType { + AV_IAMF_ANIMATION_TYPE_STEP, + AV_IAMF_ANIMATION_TYPE_LINEAR, + AV_IAMF_ANIMATION_TYPE_BEZIER, +}; + +/** + * Mix Gain Parameter Data as defined in section 3.8.1 of IAMF. + */ +typedef struct AVIAMFMixGain { + const AVClass *av_class; + + /** + * Duration for the given subblock. It must not be 0. + */ + unsigned int subblock_duration; + /** + * The type of animation applied to the parameter values. + */ + enum AVIAMFAnimationType animation_type; + /** + * Parameter value that is applied at the start of the subblock. + * Applies to all defined Animation Types. + * + * Valid range of values is -128.0 to 128.0 + */ + AVRational start_point_value; + /** + * Parameter value that is applied at the end of the subblock. + * Applies only to AV_IAMF_ANIMATION_TYPE_LINEAR and + * AV_IAMF_ANIMATION_TYPE_BEZIER Animation Types. + * + * Valid range of values is -128.0 to 128.0 + */ + AVRational end_point_value; + /** + * Parameter value of the middle control point of a quadratic Bezier + * curve, i.e., its y-axis value. 
+ * Applies only to AV_IAMF_ANIMATION_TYPE_BEZIER Animation Type. + * + * Valid range of values is -128.0 to 128.0 + */ + AVRational control_point_value; + /** + * Parameter value of the time of the middle control point of a + * quadratic Bezier curve, i.e., its x-axis value. + * Applies only to AV_IAMF_ANIMATION_TYPE_BEZIER Animation Type. + * + * Valid range of values is 0.0 to 1.0 + */ + AVRational control_point_relative_time; +} AVIAMFMixGain; + +/** + * Demixing Info Parameter Data as defined in section 3.8.2 of IAMF. + */ +typedef struct AVIAMFDemixingInfo { + const AVClass *av_class; + + /** + * Duration for the given subblock. It must not be 0. + */ + unsigned int subblock_duration; + /** + * Pre-defined combination of demixing parameters. + */ + unsigned int dmixp_mode; +} AVIAMFDemixingInfo; + +/** + * Recon Gain Info Parameter Data as defined in section 3.8.3 of IAMF. + */ +typedef struct AVIAMFReconGain { + const AVClass *av_class; + + /** + * Duration for the given subblock. It must not be 0. + */ + unsigned int subblock_duration; + + /** + * Array of gain values to be applied to each channel for each layer + * defined in the Audio Element referencing the parent Parameter Definition. + * Values for layers where the AV_IAMF_LAYER_FLAG_RECON_GAIN flag is not set + * are undefined. + * + * Channel order is: FL, C, FR, SL, SR, TFL, TFR, BL, BR, TBL, TBR, LFE + */ + uint8_t recon_gain[6][12]; +} AVIAMFReconGain; + +enum AVIAMFParamDefinitionType { + /** + * Subblocks are of struct type AVIAMFMixGain + */ + AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, + /** + * Subblocks are of struct type AVIAMFDemixingInfo + */ + AV_IAMF_PARAMETER_DEFINITION_DEMIXING, + /** + * Subblocks are of struct type AVIAMFReconGain + */ + AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN, +}; + +/** + * Parameters as defined in section 3.6.1 of IAMF. + * + * The struct is allocated by av_iamf_param_definition_alloc() along with an + * array of subblocks, its type depending on the value of type. 
+ * This array is placed subblocks_offset bytes after the start of this struct. + */ +typedef struct AVIAMFParamDefinition { + const AVClass *av_class; + + /** + * Offset in bytes from the start of this struct, at which the subblocks + * array is located. + */ + size_t subblocks_offset; + /** + * Size in bytes of each element in the subblocks array. + */ + size_t subblock_size; + /** + * Number of subblocks in the array. + * + * Must be 0 if @ref constant_subblock_duration is not 0. + */ + unsigned int nb_subblocks; + + /** + * Parameters type. Determines the type of the subblock elements. + */ + enum AVIAMFParamDefinitionType type; + + /** + * Identifier for the parameter substream. + */ + unsigned int parameter_id; + /** + * Sample rate for the parameter substream. It must not be 0. + */ + unsigned int parameter_rate; + + /** + * The duration of all subblocks in this parameter definition. + * + * May be 0, in which case all duration values should be specified in + * another parameter definition referencing the same parameter_id. + */ + unsigned int duration; + /** + * The duration of every subblock in the case where all subblocks, with + * the optional exception of the last subblock, have equal durations. + * + * Must be 0 if subblocks have different durations. + */ + unsigned int constant_subblock_duration; +} AVIAMFParamDefinition; + +const AVClass *av_iamf_param_definition_get_class(void); + +/** + * Allocates memory for AVIAMFParamDefinition, plus an array of {@code nb_subblocks} + * amount of subblocks of the given type and initializes the variables. Can be + * freed with a normal av_free() call. + * + * @param size if non-NULL, the size in bytes of the resulting data array is written here. + */ +AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType type, + unsigned int nb_subblocks, size_t *size); + +/** + * Get the subblock at the specified {@code idx}. Must be between 0 and nb_subblocks - 1. 
+ * + * The @ref AVIAMFParamDefinition.type "param definition type" defines + * the struct type of the returned pointer. + */ +static av_always_inline void* +av_iamf_param_definition_get_subblock(const AVIAMFParamDefinition *par, unsigned int idx) +{ + av_assert0(idx < par->nb_subblocks); + return (void *)((uint8_t *)par + par->subblocks_offset + idx * par->subblock_size); +} + +/** + * @} + * @addtogroup lavf_iamf_audio + * @{ + */ + +enum AVIAMFAmbisonicsMode { + AV_IAMF_AMBISONICS_MODE_MONO, + AV_IAMF_AMBISONICS_MODE_PROJECTION, +}; + +/** + * Recon gain information for the layer is present in AVIAMFReconGain + */ +#define AV_IAMF_LAYER_FLAG_RECON_GAIN (1 << 0) + +/** + * A layer defining a Channel Layout in the Audio Element. + * + * When @ref AVIAMFAudioElement.audio_element_type "the parent's Audio Element type" + * is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, this corresponds to a Scalable Channel + * Layout layer as defined in section 3.6.2 of IAMF. + * For AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, it is an Ambisonics channel + * layout as defined in section 3.6.3 of IAMF. + */ +typedef struct AVIAMFLayer { + const AVClass *av_class; + + AVChannelLayout ch_layout; + + /** + * A bitmask which may contain a combination of AV_IAMF_LAYER_FLAG_* flags. + */ + unsigned int flags; + /** + * Output gain channel flags as defined in section 3.6.2 of IAMF. + * + * This field is defined only if @ref AVIAMFAudioElement.audio_element_type + * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, + * must be 0 otherwise. + */ + unsigned int output_gain_flags; + /** + * Output gain as defined in section 3.6.2 of IAMF. + * + * Must be 0 if @ref output_gain_flags is 0. + */ + AVRational output_gain; + /** + * Ambisonics mode as defined in section 3.6.3 of IAMF. + * + * This field is defined only if @ref AVIAMFAudioElement.audio_element_type + * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE. 
+ * + * If AV_IAMF_AMBISONICS_MODE_MONO, channel_mapping is defined implicitly + * (Ambisonic Order) or explicitly (Custom Order with ambi channels) in + * @ref ch_layout. + * If AV_IAMF_AMBISONICS_MODE_PROJECTION, @ref demixing_matrix must be set. + */ + enum AVIAMFAmbisonicsMode ambisonics_mode; + + /** + * Demixing matrix as defined in section 3.6.3 of IAMF. + * + * The length of the array is ch_layout.nb_channels multiplied by the sum of + * the amount of streams in the group plus the amount of streams in the group + * that are stereo. + * + * May be set only if @ref ambisonics_mode == AV_IAMF_AMBISONICS_MODE_PROJECTION, + * must be NULL otherwise. + */ + AVRational *demixing_matrix; +} AVIAMFLayer; + + +enum AVIAMFAudioElementType { + AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, + AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, +}; + +typedef struct AVIAMFAudioElement { + const AVClass *av_class; + + AVIAMFLayer **layers; + /** + * Number of layers, or channel groups, in the Audio Element. + * There may be 6 layers at most, and for @ref audio_element_type + * AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, there may be exactly 1. + * + * Set by av_iamf_audio_element_add_layer(), must not be + * modified by any other code. + */ + unsigned int nb_layers; + + /** + * Demixing information used to reconstruct a scalable channel audio + * representation. + * The @ref AVIAMFParamDefinition.type "type" must be + * AV_IAMF_PARAMETER_DEFINITION_DEMIXING. + */ + AVIAMFParamDefinition *demixing_info; + /** + * Recon gain information used to reconstruct a scalable channel audio + * representation. + * The @ref AVIAMFParamDefinition.type "type" must be + * AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN. + */ + AVIAMFParamDefinition *recon_gain_info; + + /** + * Audio element type as defined in section 3.6 of IAMF. + */ + enum AVIAMFAudioElementType audio_element_type; + + /** + * Default weight value as defined in section 3.6 of IAMF. 
+ */ + unsigned int default_w; +} AVIAMFAudioElement; + +const AVClass *av_iamf_audio_element_get_class(void); + +/** + * Allocates an AVIAMFAudioElement, and initializes its fields with default values. + * No layers are allocated. Must be freed with av_iamf_audio_element_free(). + * + * @see av_iamf_audio_element_add_layer() + */ +AVIAMFAudioElement *av_iamf_audio_element_alloc(void); + +/** + * Allocate a layer and add it to a given AVIAMFAudioElement. + * It is freed by av_iamf_audio_element_free() alongside the rest of the parent + * AVIAMFAudioElement. + * + * @return a pointer to the allocated layer. + */ +AVIAMFLayer *av_iamf_audio_element_add_layer(AVIAMFAudioElement *audio_element); + +void av_iamf_audio_element_free(AVIAMFAudioElement **audio_element); + +/** + * @} + * @addtogroup lavf_iamf_mix + * @{ + */ + +enum AVIAMFHeadphonesMode { + /** + * The referenced Audio Element shall be rendered to stereo loudspeakers. + */ + AV_IAMF_HEADPHONES_MODE_STEREO, + /** + * The referenced Audio Element shall be rendered with a binaural renderer. + */ + AV_IAMF_HEADPHONES_MODE_BINAURAL, +}; + +typedef struct AVIAMFSubmixElement { + const AVClass *av_class; + + /** + * The id of the Audio Element this submix element references. + */ + unsigned int audio_element_id; + + /** + * Information required for applying any processing to the + * referenced and rendered Audio Element before being summed with other + * processed Audio Elements. + * The @ref AVIAMFParamDefinition.type "type" must be + * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN. + */ + AVIAMFParamDefinition *element_mix_config; + + /** + * Default mix gain value to apply when there are no AVIAMFParamDefinition + * with @ref element_mix_config "element_mix_config's" + * @ref AVIAMFParamDefinition.parameter_id "parameter_id" available for a + * given audio frame. 
+ */ + AVRational default_mix_gain; + + /** + * A value that indicates whether the referenced channel-based Audio Element + * shall be rendered to stereo loudspeakers or spatialized with a binaural + * renderer when played back on headphones. + * If the Audio Element is not of @ref AVIAMFAudioElement.audio_element_type + * "type" AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, then this field is undefined. + */ + enum AVIAMFHeadphonesMode headphones_rendering_mode; + + /** + * A dictionary of strings describing the submix in different languages. + * Must have the same amount of entries as + * @ref AVIAMFMixPresentation.annotations "the mix's annotations", stored + * in the same order, and with the same key strings. + * + * @ref AVDictionaryEntry.key "key" is a string conforming to BCP-47 that + * specifies the language for the string stored in + * @ref AVDictionaryEntry.value "value". + */ + AVDictionary *annotations; +} AVIAMFSubmixElement; + +enum AVIAMFSubmixLayoutType { + /** + * The layout follows the loudspeaker sound system convention of ITU-2051-3. + */ + AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS = 2, + /** + * The layout is binaural. + */ + AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL = 3, +}; + +typedef struct AVIAMFSubmixLayout { + const AVClass *av_class; + + enum AVIAMFSubmixLayoutType layout_type; + + /** + * Channel layout matching one of Sound Systems A to J of ITU-2051-3, plus + * 7.1.2ch and 3.1.2ch + * If layout_type is not AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, this field + * is undefined. + */ + AVChannelLayout sound_system; + /** + * The program integrated loudness information, as defined in + * ITU-1770-4. + */ + AVRational integrated_loudness; + /** + * The digital (sampled) peak value of the audio signal, as defined + * in ITU-1770-4. + */ + AVRational digital_peak; + /** + * The true peak of the audio signal, as defined in ITU-1770-4. + */ + AVRational true_peak; + /** + * The Dialogue loudness information, as defined in ITU-1770-4. 
+ */ + AVRational dialogue_anchored_loudness; + /** + * The Album loudness information, as defined in ITU-1770-4. + */ + AVRational album_anchored_loudness; +} AVIAMFSubmixLayout; + +typedef struct AVIAMFSubmix { + const AVClass *av_class; + + /** + * Array of submix elements. + * + * Set by av_iamf_submix_add_element(), must not be modified by any + * other code. + */ + AVIAMFSubmixElement **elements; + /** + * Number of elements in the submix. + * + * Set by av_iamf_submix_add_element(), must not be modified by any + * other code. + */ + unsigned int nb_elements; + + /** + * Array of submix layouts. + * + * Set by av_iamf_submix_add_layout(), must not be modified by any + * other code. + */ + AVIAMFSubmixLayout **layouts; + /** + * Number of layouts in the submix. + * + * Set by av_iamf_submix_add_layout(), must not be modified by any + * other code. + */ + unsigned int nb_layouts; + + /** + * Information required for post-processing the mixed audio signal to + * generate the audio signal for playback. + * The @ref AVIAMFParamDefinition.type "type" must be + * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN. + */ + AVIAMFParamDefinition *output_mix_config; + + /** + * Default mix gain value to apply when there are no AVIAMFParamDefinition + * with @ref output_mix_config "output_mix_config's" + * @ref AVIAMFParamDefinition.parameter_id "parameter_id" available for a + * given audio frame. + */ + AVRational default_mix_gain; +} AVIAMFSubmix; + +typedef struct AVIAMFMixPresentation { + const AVClass *av_class; + + /** + * Array of submixes. + * + * Set by av_iamf_mix_presentation_add_submix(), must not be modified + * by any other code. + */ + AVIAMFSubmix **submixes; + /** + * Number of submixes in the presentation. + * + * Set by av_iamf_mix_presentation_add_submix(), must not be modified + * by any other code. + */ + unsigned int nb_submixes; + + /** + * A dictionary of strings describing the mix in different languages. 
+ * Must have the same amount of entries as every + * @ref AVIAMFSubmixElement.annotations "Submix element annotations", + * stored in the same order, and with the same key strings. + * + * @ref AVDictionaryEntry.key "key" is a string conforming to BCP-47 + * that specifies the language for the string stored in + * @ref AVDictionaryEntry.value "value". + */ + AVDictionary *annotations; +} AVIAMFMixPresentation; + +const AVClass *av_iamf_mix_presentation_get_class(void); + +/** + * Allocates an AVIAMFMixPresentation, and initializes its fields with default + * values. No submixes are allocated. + * Must be freed with av_iamf_mix_presentation_free(). + * + * @see av_iamf_mix_presentation_add_submix() + */ +AVIAMFMixPresentation *av_iamf_mix_presentation_alloc(void); + +/** + * Allocate a submix and add it to a given AVIAMFMixPresentation. + * It is freed by av_iamf_mix_presentation_free() alongside the rest of the + * parent AVIAMFMixPresentation. + * + * @return a pointer to the allocated submix. + */ +AVIAMFSubmix *av_iamf_mix_presentation_add_submix(AVIAMFMixPresentation *mix_presentation); + +/** + * Allocate a submix element and add it to a given AVIAMFSubmix. + * It is freed by av_iamf_mix_presentation_free() alongside the rest of the + * parent AVIAMFSubmix. + * + * @return a pointer to the allocated submix element. + */ +AVIAMFSubmixElement *av_iamf_submix_add_element(AVIAMFSubmix *submix); + +/** + * Allocate a submix layout and add it to a given AVIAMFSubmix. + * It is freed by av_iamf_mix_presentation_free() alongside the rest of the + * parent AVIAMFSubmix. + * + * @return a pointer to the allocated submix layout. 
+ */ +AVIAMFSubmixLayout *av_iamf_submix_add_layout(AVIAMFSubmix *submix); + +void av_iamf_mix_presentation_free(AVIAMFMixPresentation **mix_presentation); +/** + * @} + */ + +#endif /* AVUTIL_IAMF_H */ 
From patchwork Thu Dec 14 20:14:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 45146 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Thu, 14 Dec 2023 17:14:27 -0300 Message-ID: <20231214201433.4608-3-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214201433.4608-1-jamrial@gmail.com> References: <20231214201433.4608-1-jamrial@gmail.com> Subject: [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup 
Signed-off-by: James Almer --- doc/fftools-common-opts.texi | 17 +++- libavformat/avformat.c | 91 +++++++++++++++++++-- libavformat/avformat.h | 153 +++++++++++++++++++++++++++++++++++ libavformat/dump.c | 147 +++++++++++++++++++++++++++------ libavformat/internal.h | 33 ++++++++ libavformat/options.c | 139 +++++++++++++++++++++++++++++++ 6 files changed, 546 insertions(+), 34 deletions(-) diff --git a/doc/fftools-common-opts.texi b/doc/fftools-common-opts.texi index d9145704d6..f459bfdc1d 100644 --- a/doc/fftools-common-opts.texi +++ b/doc/fftools-common-opts.texi @@ -37,9 +37,9 @@ Matches the stream with this index. E.g. @code{-threads:1 4} would set the thread count for the second stream to 4. If @var{stream_index} is used as an additional stream specifier (see below), then it selects stream number @var{stream_index} from the matching streams. Stream numbering is based on the -order of the streams as detected by libavformat except when a program ID is -also specified. In this case it is based on the ordering of the streams in the -program. 
+order of the streams as detected by libavformat except when a stream group +specifier or program ID is also specified. In this case it is based on the +ordering of the streams in the group or program. @item @var{stream_type}[:@var{additional_stream_specifier}] @var{stream_type} is one of following: 'v' or 'V' for video, 'a' for audio, 's' for subtitle, 'd' for data, and 't' for attachments. 'v' matches all video @@ -48,6 +48,17 @@ thumbnails or cover arts. If @var{additional_stream_specifier} is used, then it matches streams which both have this type and match the @var{additional_stream_specifier}. Otherwise, it matches all streams of the specified type. +@item g:@var{group_specifier}[:@var{additional_stream_specifier}] +Matches streams which are in the group with the specifier @var{group_specifier}. +If @var{additional_stream_specifier} is used, then it matches streams which both +are part of the group and match the @var{additional_stream_specifier}. +@var{group_specifier} may be one of the following: +@table @option +@item @var{group_index} +Match the stream with this group index. +@item #@var{group_id} or i:@var{group_id} +Match the stream with this group id. +@end table @item p:@var{program_id}[:@var{additional_stream_specifier}] Matches streams which are in the program with the id @var{program_id}. 
If @var{additional_stream_specifier} is used, then it matches streams which both diff --git a/libavformat/avformat.c b/libavformat/avformat.c index 5b8bb7879e..7e747c43d5 100644 --- a/libavformat/avformat.c +++ b/libavformat/avformat.c @@ -24,6 +24,7 @@ #include "libavutil/avstring.h" #include "libavutil/channel_layout.h" #include "libavutil/frame.h" +#include "libavutil/iamf.h" #include "libavutil/intreadwrite.h" #include "libavutil/mem.h" #include "libavutil/opt.h" @@ -80,6 +81,32 @@ FF_ENABLE_DEPRECATION_WARNINGS av_freep(pst); } +void ff_free_stream_group(AVStreamGroup **pstg) +{ + AVStreamGroup *stg = *pstg; + + if (!stg) + return; + + av_freep(&stg->streams); + av_dict_free(&stg->metadata); + av_freep(&stg->priv_data); + switch (stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: { + av_iamf_audio_element_free(&stg->params.iamf_audio_element); + break; + } + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: { + av_iamf_mix_presentation_free(&stg->params.iamf_mix_presentation); + break; + } + default: + break; + } + + av_freep(pstg); +} + void ff_remove_stream(AVFormatContext *s, AVStream *st) { av_assert0(s->nb_streams>0); @@ -88,6 +115,14 @@ void ff_remove_stream(AVFormatContext *s, AVStream *st) ff_free_stream(&s->streams[ --s->nb_streams ]); } +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg) +{ + av_assert0(s->nb_stream_groups > 0); + av_assert0(s->stream_groups[ s->nb_stream_groups - 1 ] == stg); + + ff_free_stream_group(&s->stream_groups[ --s->nb_stream_groups ]); +} + /* XXX: suppress the packet queue */ void ff_flush_packet_queue(AVFormatContext *s) { @@ -118,6 +153,9 @@ void avformat_free_context(AVFormatContext *s) for (unsigned i = 0; i < s->nb_streams; i++) ff_free_stream(&s->streams[i]); + for (unsigned i = 0; i < s->nb_stream_groups; i++) + ff_free_stream_group(&s->stream_groups[i]); + s->nb_stream_groups = 0; s->nb_streams = 0; for (unsigned i = 0; i < s->nb_programs; i++) { @@ -139,6 +177,7 @@ void 
avformat_free_context(AVFormatContext *s) av_packet_free(&si->pkt); av_packet_free(&si->parse_pkt); av_freep(&s->streams); + av_freep(&s->stream_groups); ff_flush_packet_queue(s); av_freep(&s->url); av_free(s); @@ -464,7 +503,7 @@ int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type, */ static int match_stream_specifier(const AVFormatContext *s, const AVStream *st, const char *spec, const char **indexptr, - const AVProgram **p) + const AVStreamGroup **g, const AVProgram **p) { int match = 1; /* Stores if the specifier matches so far. */ while (*spec) { @@ -493,6 +532,46 @@ static int match_stream_specifier(const AVFormatContext *s, const AVStream *st, match = 0; if (nopic && (st->disposition & AV_DISPOSITION_ATTACHED_PIC)) match = 0; + } else if (*spec == 'g' && *(spec + 1) == ':') { + int64_t group_idx = -1, group_id = -1; + int found = 0; + char *endptr; + spec += 2; + if (*spec == '#' || (*spec == 'i' && *(spec + 1) == ':')) { + spec += 1 + (*spec == 'i'); + group_id = strtol(spec, &endptr, 0); + if (spec == endptr || (*endptr && *endptr++ != ':')) + return AVERROR(EINVAL); + spec = endptr; + } else { + group_idx = strtol(spec, &endptr, 0); + /* Disallow empty id and make sure that if we are not at the end, then another specifier must follow. 
*/ + if (spec == endptr || (*endptr && *endptr++ != ':')) + return AVERROR(EINVAL); + spec = endptr; + } + if (match) { + if (group_id > 0) { + for (unsigned i = 0; i < s->nb_stream_groups; i++) { + if (group_id == s->stream_groups[i]->id) { + group_idx = i; + break; + } + } + } + if (group_idx < 0 || group_idx >= s->nb_stream_groups) + return AVERROR(EINVAL); + for (unsigned j = 0; j < s->stream_groups[group_idx]->nb_streams; j++) { + if (st->index == s->stream_groups[group_idx]->streams[j]->index) { + found = 1; + if (g) + *g = s->stream_groups[group_idx]; + break; + } + } + } + if (!found) + match = 0; } else if (*spec == 'p' && *(spec + 1) == ':') { int prog_id; int found = 0; @@ -591,10 +670,11 @@ int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st, int ret, index; char *endptr; const char *indexptr = NULL; + const AVStreamGroup *g = NULL; const AVProgram *p = NULL; int nb_streams; - ret = match_stream_specifier(s, st, spec, &indexptr, &p); + ret = match_stream_specifier(s, st, spec, &indexptr, &g, &p); if (ret < 0) goto error; @@ -612,10 +692,11 @@ int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st, return (index == st->index); /* If we requested a matching stream index, we have to ensure st is that. */ - nb_streams = p ? p->nb_stream_indexes : s->nb_streams; + nb_streams = g ? g->nb_streams : (p ? p->nb_stream_indexes : s->nb_streams); for (int i = 0; i < nb_streams && index >= 0; i++) { - const AVStream *candidate = s->streams[p ? p->stream_index[i] : i]; - ret = match_stream_specifier(s, candidate, spec, NULL, NULL); + unsigned idx = g ? g->streams[i]->index : (p ? 
p->stream_index[i] : i); + const AVStream *candidate = s->streams[idx]; + ret = match_stream_specifier(s, candidate, spec, NULL, NULL, NULL); if (ret < 0) goto error; if (ret > 0 && index-- == 0 && st == candidate) diff --git a/libavformat/avformat.h b/libavformat/avformat.h index 9e7eca007e..5d0fe82250 100644 --- a/libavformat/avformat.h +++ b/libavformat/avformat.h @@ -1018,6 +1018,83 @@ typedef struct AVStream { int pts_wrap_bits; } AVStream; +enum AVStreamGroupParamsType { + AV_STREAM_GROUP_PARAMS_NONE, + AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT, + AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION, +}; + +struct AVIAMFAudioElement; +struct AVIAMFMixPresentation; + +typedef struct AVStreamGroup { + /** + * A class for @ref avoptions. Set by avformat_stream_group_create(). + */ + const AVClass *av_class; + + void *priv_data; + + /** + * Group index in AVFormatContext. + */ + unsigned int index; + + /** + * Group type-specific group ID. + * + * decoding: set by libavformat + * encoding: may be set by the user + */ + int64_t id; + + /** + * Group type + * + * decoding: set by libavformat on group creation + * encoding: set by avformat_stream_group_create() + */ + enum AVStreamGroupParamsType type; + + /** + * Group type-specific parameters + */ + union { + struct AVIAMFAudioElement *iamf_audio_element; + struct AVIAMFMixPresentation *iamf_mix_presentation; + } params; + + /** + * Metadata that applies to the whole group. + * + * - demuxing: set by libavformat on group creation + * - muxing: may be set by the caller before avformat_write_header() + * + * Freed by libavformat in avformat_free_context(). + */ + AVDictionary *metadata; + + /** + * Number of elements in AVStreamGroup.streams. + * + * Set by avformat_stream_group_add_stream(), must not be modified by any other code. + */ + unsigned int nb_streams; + + /** + * A list of streams in the group. New entries are created with + * avformat_stream_group_add_stream(). 
+ * + * - demuxing: entries are created by libavformat on group creation. + * If AVFMTCTX_NOHEADER is set in ctx_flags, then new entries may also + * appear in av_read_frame(). + * - muxing: entries are created by the user before avformat_write_header(). + * + * Freed by libavformat in avformat_free_context(). + */ + AVStream **streams; +} AVStreamGroup; + struct AVCodecParserContext *av_stream_get_parser(const AVStream *s); #if FF_API_GET_END_PTS @@ -1726,6 +1803,26 @@ typedef struct AVFormatContext { * @return 0 on success, a negative AVERROR code on failure */ int (*io_close2)(struct AVFormatContext *s, AVIOContext *pb); + + /** + * Number of elements in AVFormatContext.stream_groups. + * + * Set by avformat_stream_group_create(), must not be modified by any other code. + */ + unsigned int nb_stream_groups; + + /** + * A list of all stream groups in the file. New groups are created with + * avformat_stream_group_create(), and filled with avformat_stream_group_add_stream(). + * + * - demuxing: groups may be created by libavformat in avformat_open_input(). + * If AVFMTCTX_NOHEADER is set in ctx_flags, then new groups may also + * appear in av_read_frame(). + * - muxing: groups may be created by the user before avformat_write_header(). + * + * Freed by libavformat in avformat_free_context(). + */ + AVStreamGroup **stream_groups; } AVFormatContext; /** @@ -1844,6 +1941,37 @@ const AVClass *avformat_get_class(void); */ const AVClass *av_stream_get_class(void); +/** + * Get the AVClass for AVStreamGroup. It can be used in combination with + * AV_OPT_SEARCH_FAKE_OBJ for examining options. + * + * @see av_opt_find(). + */ +const AVClass *av_stream_group_get_class(void); + +/** + * Add a new empty stream group to a media file. + * + * When demuxing, it may be called by the demuxer in read_header(). If the + * flag AVFMTCTX_NOHEADER is set in s.ctx_flags, then it may also + * be called in read_packet(). 
+ * + * When muxing, may be called by the user before avformat_write_header(). + * + * User is required to call avformat_free_context() to clean up the allocation + * by avformat_stream_group_create(). + * + * New streams can be added to the group with avformat_stream_group_add_stream(). + * + * @param s media file handle + * + * @return newly created group or NULL on error. + * @see avformat_new_stream, avformat_stream_group_add_stream. + */ +AVStreamGroup *avformat_stream_group_create(AVFormatContext *s, + enum AVStreamGroupParamsType type, + AVDictionary **options); + /** * Add a new stream to a media file. * @@ -1863,6 +1991,31 @@ const AVClass *av_stream_get_class(void); */ AVStream *avformat_new_stream(AVFormatContext *s, const struct AVCodec *c); +/** + * Add an already allocated stream to a stream group. + * + * When demuxing, it may be called by the demuxer in read_header(). If the + * flag AVFMTCTX_NOHEADER is set in s.ctx_flags, then it may also + * be called in read_packet(). + * + * When muxing, may be called by the user before avformat_write_header() after + * having allocated a new group with avformat_stream_group_create() and stream with + * avformat_new_stream(). + * + * User is required to call avformat_free_context() to clean up the allocation + * by avformat_stream_group_add_stream(). + * + * @param stg stream group belonging to a media file. + * @param st stream in the media file to add to the group. + * + * @retval 0 success + * @retval AVERROR(EEXIST) the stream was already in the group + * @retval "another negative error code" legitimate errors + * + * @see avformat_new_stream, avformat_stream_group_create. + */ +int avformat_stream_group_add_stream(AVStreamGroup *stg, AVStream *st); + #if FF_API_AVSTREAM_SIDE_DATA /** * Wrap an existing array as stream side data. 
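The return-value contract documented in the avformat.h hunk for avformat_stream_group_add_stream() (0 on success, AVERROR(EEXIST) when the stream is already in the group) can be sketched in isolation. What follows is a minimal, self-contained model under stated assumptions, not the libavformat implementation: DemoStream, DemoGroup, demo_group_add_stream() and demo_group_free() are hypothetical stand-ins, and plain negative errno values stand in for AVERROR() codes.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for AVStream / AVStreamGroup; not libavformat types. */
typedef struct DemoStream {
    int index;
} DemoStream;

typedef struct DemoGroup {
    DemoStream **streams;     /* borrowed pointers; the group does not own the streams */
    unsigned int nb_streams;
} DemoGroup;

/*
 * Append st to the group.
 * Returns 0 on success, -EEXIST if a stream with the same index is already
 * in the group, -ENOMEM if the array cannot be grown.
 */
static int demo_group_add_stream(DemoGroup *g, DemoStream *st)
{
    DemoStream **tmp;

    assert(g && st);

    /* Reject duplicates before touching the array. */
    for (unsigned int i = 0; i < g->nb_streams; i++)
        if (g->streams[i]->index == st->index)
            return -EEXIST;

    tmp = realloc(g->streams, (g->nb_streams + 1) * sizeof(*tmp));
    if (!tmp)
        return -ENOMEM;

    g->streams = tmp;
    g->streams[g->nb_streams++] = st;
    return 0;
}

/* Release only the pointer array; the streams themselves belong to the caller. */
static void demo_group_free(DemoGroup *g)
{
    free(g->streams);
    g->streams    = NULL;
    g->nb_streams = 0;
}
```

In this model, adding the same stream twice leaves nb_streams unchanged and the second call returns -EEXIST, so a caller can treat that code as non-fatal while still propagating other negative errors.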
diff --git a/libavformat/dump.c b/libavformat/dump.c index c0868a1bb3..cc179f284f 100644 --- a/libavformat/dump.c +++ b/libavformat/dump.c @@ -24,6 +24,7 @@ #include "libavutil/channel_layout.h" #include "libavutil/display.h" +#include "libavutil/iamf.h" #include "libavutil/intreadwrite.h" #include "libavutil/log.h" #include "libavutil/mastering_display_metadata.h" @@ -134,28 +135,36 @@ static void print_fps(double d, const char *postfix) av_log(NULL, AV_LOG_INFO, "%1.0fk %s", d / 1000, postfix); } -static void dump_metadata(void *ctx, const AVDictionary *m, const char *indent) +static void dump_dictionary(void *ctx, const AVDictionary *m, + const char *name, const char *indent) { - if (m && !(av_dict_count(m) == 1 && av_dict_get(m, "language", NULL, 0))) { - const AVDictionaryEntry *tag = NULL; - - av_log(ctx, AV_LOG_INFO, "%sMetadata:\n", indent); - while ((tag = av_dict_iterate(m, tag))) - if (strcmp("language", tag->key)) { - const char *p = tag->value; - av_log(ctx, AV_LOG_INFO, - "%s %-16s: ", indent, tag->key); - while (*p) { - size_t len = strcspn(p, "\x8\xa\xb\xc\xd"); - av_log(ctx, AV_LOG_INFO, "%.*s", (int)(FFMIN(255, len)), p); - p += len; - if (*p == 0xd) av_log(ctx, AV_LOG_INFO, " "); - if (*p == 0xa) av_log(ctx, AV_LOG_INFO, "\n%s %-16s: ", indent, ""); - if (*p) p++; - } - av_log(ctx, AV_LOG_INFO, "\n"); + const AVDictionaryEntry *tag = NULL; + + if (!m) + return; + + av_log(ctx, AV_LOG_INFO, "%s%s:\n", indent, name); + while ((tag = av_dict_iterate(m, tag))) + if (strcmp("language", tag->key)) { + const char *p = tag->value; + av_log(ctx, AV_LOG_INFO, + "%s %-16s: ", indent, tag->key); + while (*p) { + size_t len = strcspn(p, "\x8\xa\xb\xc\xd"); + av_log(ctx, AV_LOG_INFO, "%.*s", (int)(FFMIN(255, len)), p); + p += len; + if (*p == 0xd) av_log(ctx, AV_LOG_INFO, " "); + if (*p == 0xa) av_log(ctx, AV_LOG_INFO, "\n%s %-16s: ", indent, ""); + if (*p) p++; } - } + av_log(ctx, AV_LOG_INFO, "\n"); + } +} + +static void dump_metadata(void *ctx, const 
AVDictionary *m, const char *indent) +{ + if (m && !(av_dict_count(m) == 1 && av_dict_get(m, "language", NULL, 0))) + dump_dictionary(ctx, m, "Metadata", indent); } /* param change side data*/ @@ -509,7 +518,7 @@ static void dump_sidedata(void *ctx, const AVStream *st, const char *indent) /* "user interface" functions */ static void dump_stream_format(const AVFormatContext *ic, int i, - int index, int is_output) + int group_index, int index, int is_output) { char buf[256]; int flags = (is_output ? ic->oformat->flags : ic->iformat->flags); @@ -517,6 +526,8 @@ static void dump_stream_format(const AVFormatContext *ic, int i, const FFStream *const sti = cffstream(st); const AVDictionaryEntry *lang = av_dict_get(st->metadata, "language", NULL, 0); const char *separator = ic->dump_separator; + const char *group_indent = group_index >= 0 ? " " : ""; + const char *extra_indent = group_index >= 0 ? " " : " "; AVCodecContext *avctx; int ret; @@ -543,7 +554,8 @@ static void dump_stream_format(const AVFormatContext *ic, int i, avcodec_string(buf, sizeof(buf), avctx, is_output); avcodec_free_context(&avctx); - av_log(NULL, AV_LOG_INFO, " Stream #%d:%d", index, i); + av_log(NULL, AV_LOG_INFO, "%s Stream #%d", group_indent, index); + av_log(NULL, AV_LOG_INFO, ":%d", i); /* the pid is an important information, so we display it */ /* XXX: add a generic system */ @@ -621,9 +633,89 @@ static void dump_stream_format(const AVFormatContext *ic, int i, av_log(NULL, AV_LOG_INFO, " (non-diegetic)"); av_log(NULL, AV_LOG_INFO, "\n"); - dump_metadata(NULL, st->metadata, " "); + dump_metadata(NULL, st->metadata, extra_indent); + + dump_sidedata(NULL, st, extra_indent); +} + +static void dump_stream_group(const AVFormatContext *ic, uint8_t *printed, + int i, int index, int is_output) +{ + const AVStreamGroup *stg = ic->stream_groups[i]; + int flags = (is_output ? 
ic->oformat->flags : ic->iformat->flags); + char buf[512]; + int ret; - dump_sidedata(NULL, st, " "); + av_log(NULL, AV_LOG_INFO, " Stream group #%d:%d", index, i); + if (flags & AVFMT_SHOW_IDS) + av_log(NULL, AV_LOG_INFO, "[0x%"PRIx64"]", stg->id); + av_log(NULL, AV_LOG_INFO, ":"); + + switch (stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: { + const AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element; + av_log(NULL, AV_LOG_INFO, " IAMF Audio Element\n"); + dump_metadata(NULL, stg->metadata, " "); + for (int j = 0; j < audio_element->nb_layers; j++) { + const AVIAMFLayer *layer = audio_element->layers[j]; + int channel_count = layer->ch_layout.nb_channels; + av_log(NULL, AV_LOG_INFO, " Layer %d:", j); + ret = av_channel_layout_describe(&layer->ch_layout, buf, sizeof(buf)); + if (ret >= 0) + av_log(NULL, AV_LOG_INFO, " %s", buf); + av_log(NULL, AV_LOG_INFO, "\n"); + for (int k = 0; channel_count > 0 && k < stg->nb_streams; k++) { + AVStream *st = stg->streams[k]; + dump_stream_format(ic, st->index, i, index, is_output); + printed[st->index] = 1; + channel_count -= st->codecpar->ch_layout.nb_channels; + } + } + break; + } + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: { + const AVIAMFMixPresentation *mix_presentation = stg->params.iamf_mix_presentation; + av_log(NULL, AV_LOG_INFO, " IAMF Mix Presentation\n"); + dump_metadata(NULL, stg->metadata, " "); + dump_dictionary(NULL, mix_presentation->annotations, "Annotations", " "); + for (int j = 0; j < mix_presentation->nb_submixes; j++) { + AVIAMFSubmix *sub_mix = mix_presentation->submixes[j]; + av_log(NULL, AV_LOG_INFO, " Submix %d:\n", j); + for (int k = 0; k < sub_mix->nb_elements; k++) { + const AVIAMFSubmixElement *submix_element = sub_mix->elements[k]; + const AVStreamGroup *audio_element = NULL; + for (int l = 0; l < ic->nb_stream_groups; l++) + if (ic->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT && + ic->stream_groups[l]->id == 
submix_element->audio_element_id) { + audio_element = ic->stream_groups[l]; + break; + } + if (audio_element) { + av_log(NULL, AV_LOG_INFO, " IAMF Audio Element #%d:%d", + index, audio_element->index); + if (flags & AVFMT_SHOW_IDS) + av_log(NULL, AV_LOG_INFO, "[0x%"PRIx64"]", audio_element->id); + av_log(NULL, AV_LOG_INFO, "\n"); + dump_dictionary(NULL, submix_element->annotations, "Annotations", " "); + } + } + for (int k = 0; k < sub_mix->nb_layouts; k++) { + const AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[k]; + av_log(NULL, AV_LOG_INFO, " Layout #%d:", k); + if (submix_layout->layout_type == 2) { + ret = av_channel_layout_describe(&submix_layout->sound_system, buf, sizeof(buf)); + if (ret >= 0) + av_log(NULL, AV_LOG_INFO, " %s", buf); + } else if (submix_layout->layout_type == 3) + av_log(NULL, AV_LOG_INFO, " Binaural"); + av_log(NULL, AV_LOG_INFO, "\n"); + } + } + break; + } + default: + break; + } } void av_dump_format(AVFormatContext *ic, int index, @@ -699,7 +791,7 @@ void av_dump_format(AVFormatContext *ic, int index, dump_metadata(NULL, program->metadata, " "); for (k = 0; k < program->nb_stream_indexes; k++) { dump_stream_format(ic, program->stream_index[k], - index, is_output); + -1, index, is_output); printed[program->stream_index[k]] = 1; } total += program->nb_stream_indexes; @@ -708,9 +800,12 @@ void av_dump_format(AVFormatContext *ic, int index, av_log(NULL, AV_LOG_INFO, " No Program\n"); } + for (i = 0; i < ic->nb_stream_groups; i++) + dump_stream_group(ic, printed, i, index, is_output); + for (i = 0; i < ic->nb_streams; i++) if (!printed[i]) - dump_stream_format(ic, i, index, is_output); + dump_stream_format(ic, i, -1, index, is_output); av_free(printed); } diff --git a/libavformat/internal.h b/libavformat/internal.h index 7702986c9c..c6181683ef 100644 --- a/libavformat/internal.h +++ b/libavformat/internal.h @@ -202,6 +202,7 @@ typedef struct FFStream { */ AVStream pub; + AVFormatContext *fmtctx; /** * Set to 1 if the codec allows 
reordering, so pts can be different * from dts. @@ -427,6 +428,26 @@ static av_always_inline const FFStream *cffstream(const AVStream *st) return (const FFStream*)st; } +typedef struct FFStreamGroup { + /** + * The public context. + */ + AVStreamGroup pub; + + AVFormatContext *fmtctx; +} FFStreamGroup; + + +static av_always_inline FFStreamGroup *ffstreamgroup(AVStreamGroup *stg) +{ + return (FFStreamGroup*)stg; +} + +static av_always_inline const FFStreamGroup *cffstreamgroup(const AVStreamGroup *stg) +{ + return (const FFStreamGroup*)stg; +} + #ifdef __GNUC__ #define dynarray_add(tab, nb_ptr, elem)\ do {\ @@ -608,6 +629,18 @@ void ff_free_stream(AVStream **st); */ void ff_remove_stream(AVFormatContext *s, AVStream *st); +/** + * Frees a stream group without modifying the corresponding AVFormatContext. + * Must only be called if the latter doesn't matter or if the stream group + * is not yet attached to an AVFormatContext. + */ +void ff_free_stream_group(AVStreamGroup **pstg); +/** + * Remove a stream group from its AVFormatContext and free it. + * The group must be the last stream group of the AVFormatContext. 
+ */ +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg); + unsigned int ff_codec_get_tag(const AVCodecTag *tags, enum AVCodecID id); enum AVCodecID ff_codec_get_id(const AVCodecTag *tags, unsigned int tag); diff --git a/libavformat/options.c b/libavformat/options.c index 1d8c52246b..bf6113ca95 100644 --- a/libavformat/options.c +++ b/libavformat/options.c @@ -26,6 +26,7 @@ #include "libavcodec/codec_par.h" #include "libavutil/avassert.h" +#include "libavutil/iamf.h" #include "libavutil/internal.h" #include "libavutil/intmath.h" #include "libavutil/opt.h" @@ -271,6 +272,7 @@ AVStream *avformat_new_stream(AVFormatContext *s, const AVCodec *c) if (!st->codecpar) goto fail; + sti->fmtctx = s; sti->avctx = avcodec_alloc_context3(NULL); if (!sti->avctx) goto fail; @@ -325,6 +327,143 @@ fail: return NULL; } +static void *stream_group_child_next(void *obj, void *prev) +{ + AVStreamGroup *stg = obj; + if (!prev) { + switch(stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + return stg->params.iamf_audio_element; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + return stg->params.iamf_mix_presentation; + default: + break; + } + } + return NULL; +} + +static const AVClass *stream_group_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + switch(i) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + ret = av_iamf_audio_element_get_class(); + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + ret = av_iamf_mix_presentation_get_class(); + break; + default: + break; + } + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVOption stream_group_options[] = { + {"id", "Set group id", offsetof(AVStreamGroup, id), AV_OPT_TYPE_INT64, {.i64 = 0}, 0, INT64_MAX, AV_OPT_FLAG_ENCODING_PARAM }, + { NULL } +}; + +static const AVClass stream_group_class = { + .class_name = "AVStreamGroup", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = 
stream_group_options, + .child_next = stream_group_child_next, + .child_class_iterate = stream_group_child_iterate, +}; + +const AVClass *av_stream_group_get_class(void) +{ + return &stream_group_class; +} + +AVStreamGroup *avformat_stream_group_create(AVFormatContext *s, + enum AVStreamGroupParamsType type, + AVDictionary **options) +{ + AVStreamGroup **stream_groups; + AVStreamGroup *stg; + FFStreamGroup *stgi; + + stream_groups = av_realloc_array(s->stream_groups, s->nb_stream_groups + 1, + sizeof(*stream_groups)); + if (!stream_groups) + return NULL; + s->stream_groups = stream_groups; + + stgi = av_mallocz(sizeof(*stgi)); + if (!stgi) + return NULL; + stg = &stgi->pub; + + stg->av_class = &stream_group_class; + av_opt_set_defaults(stg); + stg->type = type; + switch (type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + stg->params.iamf_audio_element = av_iamf_audio_element_alloc(); + if (!stg->params.iamf_audio_element) + goto fail; + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + stg->params.iamf_mix_presentation = av_iamf_mix_presentation_alloc(); + if (!stg->params.iamf_mix_presentation) + goto fail; + break; + default: + goto fail; + } + + if (options) { + if (av_opt_set_dict2(stg, options, AV_OPT_SEARCH_CHILDREN)) + goto fail; + } + + stgi->fmtctx = s; + stg->index = s->nb_stream_groups; + + s->stream_groups[s->nb_stream_groups++] = stg; + + return stg; +fail: + ff_free_stream_group(&stg); + return NULL; +} + +static int stream_group_add_stream(AVStreamGroup *stg, AVStream *st) +{ + AVStream **streams = av_realloc_array(stg->streams, stg->nb_streams + 1, + sizeof(*stg->streams)); + if (!streams) + return AVERROR(ENOMEM); + + stg->streams = streams; + stg->streams[stg->nb_streams++] = st; + + return 0; +} + +int avformat_stream_group_add_stream(AVStreamGroup *stg, AVStream *st) +{ + const FFStreamGroup *stgi = cffstreamgroup(stg); + const FFStream *sti = cffstream(st); + + if (stgi->fmtctx != sti->fmtctx) + return AVERROR(EINVAL); + + 
for (int i = 0; i < stg->nb_streams; i++) + if (stg->streams[i]->index == st->index) + return AVERROR(EEXIST); + + return stream_group_add_stream(stg, st); +} + static int option_is_disposition(const AVOption *opt) { return opt->type == AV_OPT_TYPE_CONST && 
From patchwork Thu Dec 14 20:14:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 45147 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Thu, 14 Dec 2023 17:14:28 -0300 Message-ID: <20231214201433.4608-4-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214201433.4608-1-jamrial@gmail.com> References: <20231214201433.4608-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups 
Starting with IAMF support. Signed-off-by: James Almer --- doc/ffmpeg.texi | 200 ++++++++++++++++++++++ fftools/ffmpeg.h | 2 + fftools/ffmpeg_mux_init.c | 341 ++++++++++++++++++++++++++++++++++++++ fftools/ffmpeg_opt.c | 2 + 4 files changed, 545 insertions(+) diff --git a/doc/ffmpeg.texi b/doc/ffmpeg.texi index c503963941..1fadb20686 100644 --- a/doc/ffmpeg.texi +++ b/doc/ffmpeg.texi @@ -623,6 +623,206 @@ Not all muxers support embedded thumbnails, and those who do, only support a few Creates a program with the specified @var{title}, @var{program_num} and adds the specified @var{stream}(s) to it. +@item -stream_group type=@var{type}:st=@var{stream}[:st=@var{stream}][:stg=@var{stream_group}][:id=@var{stream_group_id}...] (@emph{output}) + +Creates a stream group of the specified @var{type}, @var{stream_group_id} and adds the specified +@var{stream}(s) and/or previously defined @var{stream_group}(s) to it. 
+ +@var{type} can be one of the following: +@table @option + +@item iamf_audio_element +Groups @var{stream}s that belong to the same IAMF Audio Element + +For this group @var{type}, the following options are available +@table @option +@item audio_element_type +The Audio Element type. The following values are supported: + +@table @option +@item channel +Scalable channel audio representation +@item scene +Ambisonics representation +@end table + +@item demixing +Demixing information used to reconstruct a scalable channel audio representation. +This option must be separated from the rest with a ',', and takes the following +key=value options + +@table @option +@item parameter_id +An identifier that parameter blocks in frames may refer to +@item dmixp_mode +A pre-defined combination of demixing parameters +@end table + +@item recon_gain +Recon gain information used to reconstruct a scalable channel audio representation. +This option must be separated from the rest with a ',', and takes the following +key=value options + +@table @option +@item parameter_id +An identifier that parameter blocks in frames may refer to +@end table + +@item layer +A layer defining a Channel Layout in the Audio Element. +This option must be separated from the rest with a ','. Several ',' separated entries +can be defined, and at least one must be set. + +It takes the following ":"-separated key=value options + +@table @option +@item ch_layout +The layer's channel layout +@item flags +The following flags are available: + +@table @option +@item recon_gain +Whether to signal if recon_gain is present as metadata in parameter blocks within frames +@end table + +@item output_gain +@item output_gain_flags +Which channels output_gain applies to. The following flags are available: + +@table @option +@item FL +@item FR +@item BL +@item BR +@item TFL +@item TFR +@end table + +@item ambisonics_mode +The ambisonics mode. This has no effect if audio_element_type is set to channel. 
+ +The following values are supported: + +@table @option +@item mono +Each ambisonics channel is coded as an individual mono stream in the group +@end table + +@end table + +@item default_w +Default weight value + +@end table + +@item iamf_mix_presentation +Groups @var{stream}s that belong to all IAMF Audio Elements that the same +IAMF Mix Presentation references + +For this group @var{type}, the following options are available + +@table @option +@item submix +A sub-mix within the Mix Presentation. +This option must be separated from the rest with a ','. Several ',' separated entries +can be defined, and at least one must be set. + +It takes the following ":"-separated key=value options + +@table @option +@item parameter_id +An identifier that parameter blocks in frames may refer to, for post-processing the mixed +audio signal to generate the audio signal for playback +@item parameter_rate +The sample rate in which duration fields in parameter blocks in frames that refer to this +@var{parameter_id} are expressed +@item default_mix_gain +Default mix gain value to apply when there are no parameter blocks sharing the same +@var{parameter_id} for a given frame + +@item element +References an Audio Element used in this Mix Presentation to generate the final output +audio signal for playback. +This option must be separated from the rest with a '|'. Several '|' separated entries +can be defined, and at least one must be set. 
+ +It takes the following ":"-separated key=value options: + +@table @option +@item stg +The @var{stream_group_id} for an Audio Element which this sub-mix refers to +@item parameter_id +An identifier that parameter blocks in frames may refer to, for applying any processing to +the referenced and rendered Audio Element before being summed with other processed Audio +Elements +@item parameter_rate +The sample rate in which duration fields in parameter blocks in frames that refer to this +@var{parameter_id} are expressed +@item default_mix_gain +Default mix gain value to apply when there are no parameter blocks sharing the same +@var{parameter_id} for a given frame +@item annotations +A key=value string describing the sub-mix element where "key" is a string conforming to +BCP-47 that specifies the language for the "value" string. "key" must be the same as the +one in the mix's @var{annotations} +@item headphones_rendering_mode +Indicates whether the input channel-based Audio Element is rendered to stereo loudspeakers +or spatialized with a binaural renderer when played back on headphones. +This has no effect if the referenced Audio Element's @var{audio_element_type} is set to +channel. + +The following values are supported: + +@table @option +@item stereo +@item binaural +@end table + +@end table + +@item layout +Specifies the layouts for this sub-mix on which the loudness information was measured. +This option must be separated from the rest with a '|'. Several '|' separated entries +can be defined, and at least one must be set. + +It takes the following ":"-separated key=value options: + +@table @option +@item layout_type + +@table @option +@item loudspeakers +The layout follows the loudspeaker sound system convention of ITU-2051-3. +@item binaural +The layout is binaural. +@end table + +@item sound_system +Channel layout matching one of Sound Systems A to J of ITU-2051-3, plus 7.1.2 and 3.1.2. +This has no effect if @var{layout_type} is set to binaural. 
+@item integrated_loudness +The program integrated loudness information, as defined in ITU-1770-4. +@item digital_peak +The digital (sampled) peak value of the audio signal, as defined in ITU-1770-4. +@item true_peak +The true peak of the audio signal, as defined in ITU-1770-4. +@item dialog_anchored_loudness +The Dialogue loudness information, as defined in ITU-1770-4. +@item album_anchored_loudness +The Album loudness information, as defined in ITU-1770-4. +@end table + +@end table + +@item annotations +A key=value string describing the mix where "key" is a string conforming to BCP-47 +that specifies the language for the "value" string. "key" must be the same as the ones in +all sub-mix elements' @var{annotations} +@end table + +@end table + @item -target @var{type} (@emph{output}) Specify target file type (@code{vcd}, @code{svcd}, @code{dvd}, @code{dv}, @code{dv50}). @var{type} may be prefixed with @code{pal-}, @code{ntsc-} or diff --git a/fftools/ffmpeg.h b/fftools/ffmpeg.h index affa80856a..1169f723d1 100644 --- a/fftools/ffmpeg.h +++ b/fftools/ffmpeg.h @@ -281,6 +281,8 @@ typedef struct OptionsContext { int nb_disposition; SpecifierOpt *program; int nb_program; + SpecifierOpt *stream_groups; + int nb_stream_groups; SpecifierOpt *time_bases; int nb_time_bases; SpecifierOpt *enc_time_bases; diff --git a/fftools/ffmpeg_mux_init.c b/fftools/ffmpeg_mux_init.c index f527a083db..0f03ee092e 100644 --- a/fftools/ffmpeg_mux_init.c +++ b/fftools/ffmpeg_mux_init.c @@ -40,6 +40,7 @@ #include "libavutil/dict.h" #include "libavutil/display.h" #include "libavutil/getenv_utf8.h" +#include "libavutil/iamf.h" #include "libavutil/intreadwrite.h" #include "libavutil/log.h" #include "libavutil/mem.h" @@ -2008,6 +2009,342 @@ static int setup_sync_queues(Muxer *mux, AVFormatContext *oc, int64_t buf_size_u return 0; } +static int of_parse_iamf_audio_element_layers(Muxer *mux, AVStreamGroup *stg, char *ptr) +{ + AVIAMFAudioElement *audio_element = 
stg->params.iamf_audio_element; + AVDictionary *dict = NULL; + const char *token; + int ret = 0; + + audio_element->demixing_info = + av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_DEMIXING, 1, NULL); + audio_element->recon_gain_info = + av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN, 1, NULL); + + if (!audio_element->demixing_info || + !audio_element->recon_gain_info) + return AVERROR(ENOMEM); + + /* process manually set layers and parameters */ + token = av_strtok(NULL, ",", &ptr); + while (token) { + const AVDictionaryEntry *e; + int demixing = 0, recon_gain = 0; + int layer = 0; + + if (av_strstart(token, "layer=", &token)) + layer = 1; + else if (av_strstart(token, "demixing=", &token)) + demixing = 1; + else if (av_strstart(token, "recon_gain=", &token)) + recon_gain = 1; + + av_dict_free(&dict); + ret = av_dict_parse_string(&dict, token, "=", ":", 0); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error parsing audio element specification %s\n", token); + goto fail; + } + + if (layer) { + AVIAMFLayer *audio_layer = av_iamf_audio_element_add_layer(audio_element); + if (!audio_layer) { + av_log(mux, AV_LOG_ERROR, "Error adding layer to stream group %d\n", stg->index); + ret = AVERROR(ENOMEM); + goto fail; + } + av_opt_set_dict(audio_layer, &dict); + } else if (demixing || recon_gain) { + AVIAMFParamDefinition *param = demixing ? 
audio_element->demixing_info + : audio_element->recon_gain_info; + void *subblock = av_iamf_param_definition_get_subblock(param, 0); + + av_opt_set_dict(param, &dict); + av_opt_set_dict(subblock, &dict); + } + + // make sure that no entries are left in the dict + e = NULL; + if (e = av_dict_iterate(dict, e)) { + av_log(mux, AV_LOG_FATAL, "Unknown layer key %s.\n", e->key); + ret = AVERROR(EINVAL); + goto fail; + } + token = av_strtok(NULL, ",", &ptr); + } + +fail: + av_dict_free(&dict); + if (!ret && !audio_element->nb_layers) { + av_log(mux, AV_LOG_ERROR, "No layer in audio element specification\n"); + ret = AVERROR(EINVAL); + } + + return ret; +} + +static int of_parse_iamf_submixes(Muxer *mux, AVStreamGroup *stg, char *ptr) +{ + AVFormatContext *oc = mux->fc; + AVIAMFMixPresentation *mix = stg->params.iamf_mix_presentation; + AVDictionary *dict = NULL; + const char *token; + char *submix_str = NULL; + int ret = 0; + + /* process manually set submixes */ + token = av_strtok(NULL, ",", &ptr); + while (token) { + AVIAMFSubmix *submix = NULL; + const char *subtoken; + char *subptr = NULL; + + if (!av_strstart(token, "submix=", &token)) { + av_log(mux, AV_LOG_ERROR, "No submix in mix presentation specification \"%s\"\n", token); + goto fail; + } + + submix_str = av_strdup(token); + if (!submix_str) + goto fail; + + submix = av_iamf_mix_presentation_add_submix(mix); + if (!submix) { + av_log(mux, AV_LOG_ERROR, "Error adding submix to stream group %d\n", stg->index); + ret = AVERROR(ENOMEM); + goto fail; + } + submix->output_mix_config = + av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, 0, NULL); + if (!submix->output_mix_config) { + ret = AVERROR(ENOMEM); + goto fail; + } + + subptr = NULL; + subtoken = av_strtok(submix_str, "|", &subptr); + while (subtoken) { + const AVDictionaryEntry *e; + int element = 0, layout = 0; + + if (av_strstart(subtoken, "element=", &subtoken)) + element = 1; + else if (av_strstart(subtoken, "layout=", &subtoken)) + 
layout = 1; + + av_dict_free(&dict); + ret = av_dict_parse_string(&dict, subtoken, "=", ":", 0); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error parsing submix specification \"%s\"\n", subtoken); + goto fail; + } + + if (element) { + AVIAMFSubmixElement *submix_element; + int idx = -1; + + if (e = av_dict_get(dict, "stg", NULL, 0)) + idx = strtol(e->value, NULL, 0); + av_dict_set(&dict, "stg", NULL, 0); + if (idx < 0 || idx >= oc->nb_stream_groups) { + av_log(mux, AV_LOG_ERROR, "Invalid or missing stream group index in " + "submix element specification \"%s\"\n", subtoken); + ret = AVERROR(EINVAL); + goto fail; + } + submix_element = av_iamf_submix_add_element(submix); + if (!submix_element) { + av_log(mux, AV_LOG_ERROR, "Error adding element to submix\n"); + ret = AVERROR(ENOMEM); + goto fail; + } + + submix_element->audio_element_id = oc->stream_groups[idx]->id; + + submix_element->element_mix_config = + av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, 0, NULL); + if (!submix_element->element_mix_config) + ret = AVERROR(ENOMEM); + av_opt_set_dict2(submix_element, &dict, AV_OPT_SEARCH_CHILDREN); + } else if (layout) { + AVIAMFSubmixLayout *submix_layout = av_iamf_submix_add_layout(submix); + if (!submix_layout) { + av_log(mux, AV_LOG_ERROR, "Error adding layout to submix\n"); + ret = AVERROR(ENOMEM); + goto fail; + } + av_opt_set_dict(submix_layout, &dict); + } else + av_opt_set_dict2(submix, &dict, AV_OPT_SEARCH_CHILDREN); + + if (ret < 0) { + goto fail; + } + + // make sure that no entries are left in the dict + e = NULL; + while (e = av_dict_iterate(dict, e)) { + av_log(mux, AV_LOG_FATAL, "Unknown submix key %s.\n", e->key); + ret = AVERROR(EINVAL); + goto fail; + } + subtoken = av_strtok(NULL, "|", &subptr); + } + av_freep(&submix_str); + + if (!submix->nb_elements) { + av_log(mux, AV_LOG_ERROR, "No audio elements in submix specification \"%s\"\n", token); + ret = AVERROR(EINVAL); + } + token = av_strtok(NULL, ",", &ptr); + } + +fail: 
+    av_dict_free(&dict);
+    av_free(submix_str);
+
+    return ret;
+}
+
+static int of_parse_group_token(Muxer *mux, const char *token, char *ptr)
+{
+    AVFormatContext *oc = mux->fc;
+    AVStreamGroup *stg;
+    AVDictionary *dict = NULL, *tmp = NULL;
+    const AVDictionaryEntry *e;
+    const AVOption opts[] = {
+        { "type", "Set group type", offsetof(AVStreamGroup, type), AV_OPT_TYPE_INT,
+            { .i64 = 0 }, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM, "type" },
+        { "iamf_audio_element", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT }, .unit = "type" },
+        { "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" },
+        { NULL },
+    };
+    const AVClass class = {
+        .class_name = "StreamGroupType",
+        .item_name = av_default_item_name,
+        .option = opts,
+        .version = LIBAVUTIL_VERSION_INT,
+    };
+    const AVClass *pclass = &class;
+    int type, ret;
+
+    ret = av_dict_parse_string(&dict, token, "=", ":", AV_DICT_MULTIKEY);
+    if (ret < 0) {
+        av_log(mux, AV_LOG_ERROR, "Error parsing group specification %s\n", token);
+        return ret;
+    }
+
+    // "type" is not a user settable AVOption in AVStreamGroup, so handle it here
+    e = av_dict_get(dict, "type", NULL, 0);
+    if (!e) {
+        av_log(mux, AV_LOG_ERROR, "No type specified for Stream Group in \"%s\"\n", token);
+        ret = AVERROR(EINVAL);
+        goto end;
+    }
+
+    ret = av_opt_eval_int(&pclass, opts, e->value, &type);
+    if (!ret && type == AV_STREAM_GROUP_PARAMS_NONE)
+        ret = AVERROR(EINVAL);
+    if (ret < 0) {
+        av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value);
+        goto end;
+    }
+
+    av_dict_copy(&tmp, dict, 0);
+    stg = avformat_stream_group_create(oc, type, &tmp);
+    if (!stg) {
+        ret = AVERROR(ENOMEM);
+        goto end;
+    }
+
+    e = NULL;
+    while (e = av_dict_get(dict, "st", e, 0)) {
+        unsigned int idx = strtol(e->value, NULL, 0);
+        if (idx >= oc->nb_streams) {
+            av_log(mux, AV_LOG_ERROR, "Invalid stream index %u\n", idx);
+            ret = AVERROR(EINVAL);
+            goto end;
+        }
+        ret = avformat_stream_group_add_stream(stg, oc->streams[idx]);
+        if (ret < 0)
+            goto end;
+    }
+    while (e = av_dict_get(dict, "stg", e, 0)) {
+        unsigned int idx = strtol(e->value, NULL, 0);
+        if (idx >= oc->nb_stream_groups || idx == stg->index) {
+            av_log(mux, AV_LOG_ERROR, "Invalid stream group index %u\n", idx);
+            ret = AVERROR(EINVAL);
+            goto end;
+        }
+        for (int i = 0; i < oc->stream_groups[idx]->nb_streams; i++) {
+            ret = avformat_stream_group_add_stream(stg, oc->stream_groups[idx]->streams[i]);
+            if (ret < 0)
+                goto end;
+        }
+    }
+
+    switch(type) {
+    case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT:
+        ret = of_parse_iamf_audio_element_layers(mux, stg, ptr);
+        break;
+    case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION:
+        ret = of_parse_iamf_submixes(mux, stg, ptr);
+        break;
+    default:
+        av_log(mux, AV_LOG_FATAL, "Unknown group type %d.\n", type);
+        ret = AVERROR(EINVAL);
+        break;
+    }
+
+    if (ret < 0)
+        goto end;
+
+    // make sure that nothing but "st" and "stg" entries are left in the dict
+    e = NULL;
+    av_dict_set(&tmp, "type", NULL, 0);
+    while (e = av_dict_iterate(tmp, e)) {
+        if (!strcmp(e->key, "st") || !strcmp(e->key, "stg"))
+            continue;
+
+        av_log(mux, AV_LOG_FATAL, "Unknown group key %s.\n", e->key);
+        ret = AVERROR(EINVAL);
+        goto end;
+    }
+
+    ret = 0;
+end:
+    av_dict_free(&dict);
+    av_dict_free(&tmp);
+
+    return ret;
+}
+
+static int of_add_groups(Muxer *mux, const OptionsContext *o)
+{
+    /* process manually set groups */
+    for (int i = 0; i < o->nb_stream_groups; i++) {
+        const char *token;
+        char *str, *ptr = NULL;
+        int ret = 0;
+
+        str = av_strdup(o->stream_groups[i].u.str);
+        if (!str)
+            return AVERROR(ENOMEM);
+
+        token = av_strtok(str, ",", &ptr);
+        if (token)
+            ret = of_parse_group_token(mux, token, ptr);
+
+        av_free(str);
+        if (ret < 0)
+            return ret;
+    }
+
+    return 0;
+}
+
 static int of_add_programs(Muxer *mux, const OptionsContext *o)
 {
     AVFormatContext *oc = mux->fc;
@@ -2793,6 +3130,10 @@ int of_open(const OptionsContext *o, const char *filename, Scheduler *sch)
     if (err < 0)
         return err;

+    err = of_add_groups(mux, o);
+    if (err < 0)
+        return err;
+
     err = of_add_programs(mux, o);
     if (err < 0)
         return err;
diff --git a/fftools/ffmpeg_opt.c b/fftools/ffmpeg_opt.c
index 6177a96a4e..915f8e3ea0 100644
--- a/fftools/ffmpeg_opt.c
+++ b/fftools/ffmpeg_opt.c
@@ -1493,6 +1493,8 @@ const OptionDef options[] = {
         "add metadata", "string=string" },
     { "program", HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(program) },
         "add program with specified streams", "title=string:st=number..." },
+    { "stream_group", HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(stream_groups) },
+        "add stream group with specified streams and group type-specific arguments", "id=number:st=number..." },
     { "dframes", HAS_ARG | OPT_PERFILE | OPT_EXPERT | OPT_OUTPUT, { .func_arg = opt_data_frames },
         "set the number of data frames to output", "number" },

From patchwork Thu Dec 14 20:14:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 45148
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 14 Dec 2023 17:14:29 -0300
Message-ID: <20231214201433.4608-5-jamrial@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231214201433.4608-1-jamrial@gmail.com>
References: <20231214201433.4608-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 4/8] avcodec/packet: add IAMF Parameters side data types

Signed-off-by: James Almer
---
 libavcodec/avpacket.c | 3 +++
 libavcodec/packet.h | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/libavcodec/avpacket.c b/libavcodec/avpacket.c
index e29725c2d2..0f8c9b77ae 100644
--- a/libavcodec/avpacket.c
+++ b/libavcodec/avpacket.c
@@ -301,6 +301,9 @@ const char *av_packet_side_data_name(enum AVPacketSideDataType type)
     case AV_PKT_DATA_DOVI_CONF: return "DOVI configuration record";
     case AV_PKT_DATA_S12M_TIMECODE: return "SMPTE ST 12-1:2014 timecode";
     case AV_PKT_DATA_DYNAMIC_HDR10_PLUS: return "HDR10+ Dynamic Metadata (SMPTE 2094-40)";
+    case AV_PKT_DATA_IAMF_MIX_GAIN_PARAM: return "IAMF Mix Gain Parameter Data";
+    case AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM: return "IAMF Demixing Info Parameter Data";
+    case AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM: return "IAMF Recon Gain Info Parameter Data";
     }
     return NULL;
 }
diff --git a/libavcodec/packet.h b/libavcodec/packet.h
index b19409b719..2c57d262c6 100644
--- a/libavcodec/packet.h
+++ b/libavcodec/packet.h
@@ -299,6 +299,30 @@ enum AVPacketSideDataType {
      */
     AV_PKT_DATA_DYNAMIC_HDR10_PLUS,

+    /**
+     * IAMF Mix Gain Parameter Data associated with the audio frame. This metadata
+     * is in the form of the AVIAMFParamDefinition struct and contains information
+     * defined in sections 3.6.1 and 3.8.1 of the Immersive Audio Model and
+     * Formats standard.
+     */
+    AV_PKT_DATA_IAMF_MIX_GAIN_PARAM,
+
+    /**
+     * IAMF Demixing Info Parameter Data associated with the audio frame. This
+     * metadata is in the form of the AVIAMFParamDefinition struct and contains
+     * information defined in sections 3.6.1 and 3.8.2 of the Immersive Audio Model
+     * and Formats standard.
+     */
+    AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM,
+
+    /**
+     * IAMF Recon Gain Info Parameter Data associated with the audio frame. This
+     * metadata is in the form of the AVIAMFParamDefinition struct and contains
+     * information defined in sections 3.6.1 and 3.8.3 of the Immersive Audio Model
+     * and Formats standard.
+     */
+    AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM,
+
     /**
      * The number of side data types.
      * This is not part of the public API/ABI in the sense that it may

From patchwork Thu Dec 14 20:14:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 45149
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 14 Dec 2023 17:14:30 -0300
Message-ID: <20231214201433.4608-6-jamrial@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231214201433.4608-1-jamrial@gmail.com>
References: <20231214201433.4608-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 5/8] avcodec/get_bits: add get_leb()

Signed-off-by: James Almer
---
 libavcodec/bitstream.h | 2 ++
 libavcodec/bitstream_template.h | 23 +++++++++++++++++++++++
 libavcodec/get_bits.h | 24 ++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/libavcodec/bitstream.h b/libavcodec/bitstream.h
index 35b7873b9c..17f8a5da83 100644
--- a/libavcodec/bitstream.h
+++ b/libavcodec/bitstream.h
@@ -103,6 +103,7 @@
 # define bits_apply_sign bits_apply_sign_le
 # define bits_read_vlc bits_read_vlc_le
 # define bits_read_vlc_multi bits_read_vlc_multi_le
+# define bits_read_leb bits_read_leb_le

 #elif defined(BITS_DEFAULT_BE)

@@ -132,6 +133,7 @@
 # define bits_apply_sign bits_apply_sign_be
 # define bits_read_vlc bits_read_vlc_be
 # define bits_read_vlc_multi bits_read_vlc_multi_be
+# define bits_read_leb bits_read_leb_be

 #endif

diff --git a/libavcodec/bitstream_template.h b/libavcodec/bitstream_template.h
index 4f3d07275f..4c7101632f 100644
--- a/libavcodec/bitstream_template.h
+++ b/libavcodec/bitstream_template.h
@@ -562,6 +562,29 @@ static inline int BS_FUNC(read_vlc_multi)(BSCTX *bc, uint8_t dst[8],
     return ret;
 }

+/**
+ * Read an unsigned integer coded as a variable number of up to eight
+ * little-endian bytes, where the MSB in a byte signals another byte
+ * must be read.
+ * Values > UINT_MAX are truncated, but all coded bits are read.
+ */
+static inline unsigned BS_FUNC(read_leb)(BSCTX *bc) {
+    int more, i = 0;
+    unsigned leb = 0;
+
+    do {
+        int byte = BS_FUNC(read)(bc, 8);
+        unsigned bits = byte & 0x7f;
+        more = byte & 0x80;
+        if (i <= 4)
+            leb |= bits << (i * 7);
+        if (++i == 8)
+            break;
+    } while (more);
+
+    return leb;
+}
+
 #undef BSCTX
 #undef BS_FUNC
 #undef BS_JOIN3
diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h
index cfcf97c021..9e19d2a439 100644
--- a/libavcodec/get_bits.h
+++ b/libavcodec/get_bits.h
@@ -94,6 +94,7 @@ typedef BitstreamContext GetBitContext;
 #define align_get_bits bits_align
 #define get_vlc2 bits_read_vlc
 #define get_vlc_multi bits_read_vlc_multi
+#define get_leb bits_read_leb

 #define init_get_bits8_le(s, buffer, byte_size) bits_init8_le((BitstreamContextLE*)s, buffer, byte_size)
 #define get_bits_le(s, n) bits_read_le((BitstreamContextLE*)s, n)
@@ -710,6 +711,29 @@ static inline int skip_1stop_8data_bits(GetBitContext *gb)
     return 0;
 }

+/**
+ * Read an unsigned integer coded as a variable number of up to eight
+ * little-endian bytes, where the MSB in a byte signals another byte
+ * must be read.
+ * All coded bits are read, but values > UINT_MAX are truncated.
+ */
+static inline unsigned get_leb(GetBitContext *s) {
+    int more, i = 0;
+    unsigned leb = 0;
+
+    do {
+        int byte = get_bits(s, 8);
+        unsigned bits = byte & 0x7f;
+        more = byte & 0x80;
+        if (i <= 4)
+            leb |= bits << (i * 7);
+        if (++i == 8)
+            break;
+    } while (more);
+
+    return leb;
+}
+
 #endif // CACHED_BITSTREAM_READER

 #endif /* AVCODEC_GET_BITS_H */

From patchwork Thu Dec 14 20:14:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 45150
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 14 Dec 2023 17:14:31 -0300
Message-ID: <20231214201433.4608-7-jamrial@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231214201433.4608-1-jamrial@gmail.com>
References: <20231214201433.4608-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 6/8] avformat/aviobuf: add ffio_read_leb() and ffio_write_leb()

Signed-off-by: James Almer
---
 libavformat/avio_internal.h | 10 ++++++++++
 libavformat/aviobuf.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/libavformat/avio_internal.h b/libavformat/avio_internal.h
index bd58499b64..f2e4ff30cb 100644
--- a/libavformat/avio_internal.h
+++ b/libavformat/avio_internal.h
@@ -146,6 +146,16 @@ int ffio_rewind_with_probe_data(AVIOContext *s, unsigned char **buf, int buf_siz

 uint64_t ffio_read_varlen(AVIOContext *bc);

+/**
+ * Read an unsigned integer coded as a variable number of up to eight
+ * little-endian bytes, where the MSB in a byte signals another byte
+ * must be read.
+ * All coded bytes are read, but values > UINT_MAX are truncated.
+ */
+unsigned int ffio_read_leb(AVIOContext *s);
+
+void ffio_write_leb(AVIOContext *s, unsigned val);
+
 /**
  * Read size bytes from AVIOContext into buf.
  * Check that exactly size bytes have been read.
diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
index 2899c75521..5a329ce465 100644
--- a/libavformat/aviobuf.c
+++ b/libavformat/aviobuf.c
@@ -971,6 +971,39 @@ uint64_t ffio_read_varlen(AVIOContext *bc){
     return val;
 }

+unsigned int ffio_read_leb(AVIOContext *s) {
+    int more, i = 0;
+    unsigned leb = 0;
+
+    do {
+        int byte = avio_r8(s);
+        unsigned bits = byte & 0x7f;
+        more = byte & 0x80;
+        if (i <= 4)
+            leb |= bits << (i * 7);
+        if (++i == 8)
+            break;
+    } while (more);
+
+    return leb;
+}
+
+void ffio_write_leb(AVIOContext *s, unsigned val)
+{
+    int len;
+    uint8_t byte;
+
+    len = (av_log2(val) + 7) / 7;
+
+    for (int i = 0; i < len; i++) {
+        byte = val >> (7 * i) & 0x7f;
+        if (i < len - 1)
+            byte |= 0x80;
+
+        avio_w8(s, byte);
+    }
+}
+
 int ffio_fdopen(AVIOContext **s, URLContext *h)
 {
     uint8_t *buffer = NULL;

From patchwork Thu Dec 14 20:14:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 45152
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Thu, 14 Dec 2023 17:14:32 -0300
Message-ID: <20231214201433.4608-8-jamrial@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231214201433.4608-1-jamrial@gmail.com>
References: <20231214201433.4608-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 7/8] avformat: Immersive Audio Model and Formats demuxer

Signed-off-by: James Almer
---
 libavformat/Makefile | 1 +
 libavformat/allformats.c | 1 +
 libavformat/iamf.c |
125 +++++ libavformat/iamf.h | 163 ++++++ libavformat/iamf_parse.c | 1106 ++++++++++++++++++++++++++++++++++++++ libavformat/iamf_parse.h | 38 ++ libavformat/iamfdec.c | 503 +++++++++++++++++ 7 files changed, 1937 insertions(+) create mode 100644 libavformat/iamf.c create mode 100644 libavformat/iamf.h create mode 100644 libavformat/iamf_parse.c create mode 100644 libavformat/iamf_parse.h create mode 100644 libavformat/iamfdec.c diff --git a/libavformat/Makefile b/libavformat/Makefile index 2db83aff81..f23c22792b 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -258,6 +258,7 @@ OBJS-$(CONFIG_EVC_MUXER) += rawenc.o OBJS-$(CONFIG_HLS_DEMUXER) += hls.o hls_sample_encryption.o OBJS-$(CONFIG_HLS_MUXER) += hlsenc.o hlsplaylist.o avc.o OBJS-$(CONFIG_HNM_DEMUXER) += hnm.o +OBJS-$(CONFIG_IAMF_DEMUXER) += iamfdec.o iamf_parse.o iamf.o OBJS-$(CONFIG_ICO_DEMUXER) += icodec.o OBJS-$(CONFIG_ICO_MUXER) += icoenc.o OBJS-$(CONFIG_IDCIN_DEMUXER) += idcin.o diff --git a/libavformat/allformats.c b/libavformat/allformats.c index c8bb4e3866..6e520b78a6 100644 --- a/libavformat/allformats.c +++ b/libavformat/allformats.c @@ -212,6 +212,7 @@ extern const FFOutputFormat ff_hevc_muxer; extern const AVInputFormat ff_hls_demuxer; extern const FFOutputFormat ff_hls_muxer; extern const AVInputFormat ff_hnm_demuxer; +extern const AVInputFormat ff_iamf_demuxer; extern const AVInputFormat ff_ico_demuxer; extern const FFOutputFormat ff_ico_muxer; extern const AVInputFormat ff_idcin_demuxer; diff --git a/libavformat/iamf.c b/libavformat/iamf.c new file mode 100644 index 0000000000..5de70dc082 --- /dev/null +++ b/libavformat/iamf.c @@ -0,0 +1,125 @@ +/* + * Immersive Audio Model and Formats common helpers and structs + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/channel_layout.h" +#include "libavutil/iamf.h" +#include "libavutil/mem.h" +#include "iamf.h" + +const AVChannelLayout ff_iamf_scalable_ch_layouts[10] = { + AV_CHANNEL_LAYOUT_MONO, + AV_CHANNEL_LAYOUT_STEREO, + // "Loudspeaker configuration for Sound System B" + AV_CHANNEL_LAYOUT_5POINT1_BACK, + // "Loudspeaker configuration for Sound System C" + AV_CHANNEL_LAYOUT_5POINT1POINT2_BACK, + // "Loudspeaker configuration for Sound System D" + AV_CHANNEL_LAYOUT_5POINT1POINT4_BACK, + // "Loudspeaker configuration for Sound System I" + AV_CHANNEL_LAYOUT_7POINT1, + // "Loudspeaker configuration for Sound System I" + Ltf + Rtf + AV_CHANNEL_LAYOUT_7POINT1POINT2, + // "Loudspeaker configuration for Sound System J" + AV_CHANNEL_LAYOUT_7POINT1POINT4_BACK, + // Front subset of "Loudspeaker configuration for Sound System J" + AV_CHANNEL_LAYOUT_3POINT1POINT2, + // Binaural + AV_CHANNEL_LAYOUT_STEREO, +}; + +const struct IAMFSoundSystemMap ff_iamf_sound_system_map[13] = { + { SOUND_SYSTEM_A_0_2_0, AV_CHANNEL_LAYOUT_STEREO }, + { SOUND_SYSTEM_B_0_5_0, AV_CHANNEL_LAYOUT_5POINT1_BACK }, + { SOUND_SYSTEM_C_2_5_0, AV_CHANNEL_LAYOUT_5POINT1POINT2_BACK }, + { SOUND_SYSTEM_D_4_5_0, AV_CHANNEL_LAYOUT_5POINT1POINT4_BACK }, + { 
SOUND_SYSTEM_E_4_5_1, + { + .nb_channels = 11, + .order = AV_CHANNEL_ORDER_NATIVE, + .u.mask = AV_CH_LAYOUT_5POINT1POINT4_BACK | AV_CH_BOTTOM_FRONT_CENTER, + }, + }, + { SOUND_SYSTEM_F_3_7_0, AV_CHANNEL_LAYOUT_7POINT2POINT3 }, + { SOUND_SYSTEM_G_4_9_0, AV_CHANNEL_LAYOUT_9POINT1POINT4_BACK }, + { SOUND_SYSTEM_H_9_10_3, AV_CHANNEL_LAYOUT_22POINT2 }, + { SOUND_SYSTEM_I_0_7_0, AV_CHANNEL_LAYOUT_7POINT1 }, + { SOUND_SYSTEM_J_4_7_0, AV_CHANNEL_LAYOUT_7POINT1POINT4_BACK }, + { SOUND_SYSTEM_10_2_7_0, AV_CHANNEL_LAYOUT_7POINT1POINT2 }, + { SOUND_SYSTEM_11_2_3_0, AV_CHANNEL_LAYOUT_3POINT1POINT2 }, + { SOUND_SYSTEM_12_0_1_0, AV_CHANNEL_LAYOUT_MONO }, +}; + +void ff_iamf_free_audio_element(IAMFAudioElement **paudio_element) +{ + IAMFAudioElement *audio_element = *paudio_element; + + if (!audio_element) + return; + + for (int i = 0; i < audio_element->nb_substreams; i++) + avcodec_parameters_free(&audio_element->substreams[i].codecpar); + av_free(audio_element->substreams); + av_free(audio_element->layers); + av_iamf_audio_element_free(&audio_element->element); + av_freep(paudio_element); +} + +void ff_iamf_free_mix_presentation(IAMFMixPresentation **pmix_presentation) +{ + IAMFMixPresentation *mix_presentation = *pmix_presentation; + + if (!mix_presentation) + return; + + for (int i = 0; i < mix_presentation->count_label; i++) + av_free(mix_presentation->language_label[i]); + av_free(mix_presentation->language_label); + av_iamf_mix_presentation_free(&mix_presentation->mix); + av_freep(pmix_presentation); +} + +void ff_iamf_uninit_context(IAMFContext *c) +{ + if (!c) + return; + + for (int i = 0; i < c->nb_codec_configs; i++) { + av_free(c->codec_configs[i]->extradata); + av_free(c->codec_configs[i]); + } + av_freep(&c->codec_configs); + c->nb_codec_configs = 0; + + for (int i = 0; i < c->nb_audio_elements; i++) + ff_iamf_free_audio_element(&c->audio_elements[i]); + av_freep(&c->audio_elements); + c->nb_audio_elements = 0; + + for (int i = 0; i < c->nb_mix_presentations; i++) + 
ff_iamf_free_mix_presentation(&c->mix_presentations[i]); + av_freep(&c->mix_presentations); + c->nb_mix_presentations = 0; + + for (int i = 0; i < c->nb_param_definitions; i++) + av_free(c->param_definitions[i]); + av_freep(&c->param_definitions); + c->nb_param_definitions = 0; +} diff --git a/libavformat/iamf.h b/libavformat/iamf.h new file mode 100644 index 0000000000..ce94cb5bc4 --- /dev/null +++ b/libavformat/iamf.h @@ -0,0 +1,163 @@ +/* + * Immersive Audio Model and Formats common helpers and structs + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFORMAT_IAMF_H +#define AVFORMAT_IAMF_H + +#include + +#include "libavutil/channel_layout.h" +#include "libavutil/iamf.h" +#include "libavcodec/codec_id.h" +#include "libavcodec/codec_par.h" +#include "avformat.h" + +#define MAX_IAMF_OBU_HEADER_SIZE (1 + 8 * 3) + +// OBU types (section 3.2). 
+enum IAMF_OBU_Type { + IAMF_OBU_IA_CODEC_CONFIG = 0, + IAMF_OBU_IA_AUDIO_ELEMENT = 1, + IAMF_OBU_IA_MIX_PRESENTATION = 2, + IAMF_OBU_IA_PARAMETER_BLOCK = 3, + IAMF_OBU_IA_TEMPORAL_DELIMITER = 4, + IAMF_OBU_IA_AUDIO_FRAME = 5, + IAMF_OBU_IA_AUDIO_FRAME_ID0 = 6, + IAMF_OBU_IA_AUDIO_FRAME_ID1 = 7, + IAMF_OBU_IA_AUDIO_FRAME_ID2 = 8, + IAMF_OBU_IA_AUDIO_FRAME_ID3 = 9, + IAMF_OBU_IA_AUDIO_FRAME_ID4 = 10, + IAMF_OBU_IA_AUDIO_FRAME_ID5 = 11, + IAMF_OBU_IA_AUDIO_FRAME_ID6 = 12, + IAMF_OBU_IA_AUDIO_FRAME_ID7 = 13, + IAMF_OBU_IA_AUDIO_FRAME_ID8 = 14, + IAMF_OBU_IA_AUDIO_FRAME_ID9 = 15, + IAMF_OBU_IA_AUDIO_FRAME_ID10 = 16, + IAMF_OBU_IA_AUDIO_FRAME_ID11 = 17, + IAMF_OBU_IA_AUDIO_FRAME_ID12 = 18, + IAMF_OBU_IA_AUDIO_FRAME_ID13 = 19, + IAMF_OBU_IA_AUDIO_FRAME_ID14 = 20, + IAMF_OBU_IA_AUDIO_FRAME_ID15 = 21, + IAMF_OBU_IA_AUDIO_FRAME_ID16 = 22, + IAMF_OBU_IA_AUDIO_FRAME_ID17 = 23, + // 24~30 reserved. + IAMF_OBU_IA_SEQUENCE_HEADER = 31, +}; + +typedef struct IAMFCodecConfig { + unsigned codec_config_id; + enum AVCodecID codec_id; + uint32_t codec_tag; + unsigned nb_samples; + int seek_preroll; + int sample_rate; + int extradata_size; + uint8_t *extradata; +} IAMFCodecConfig; + +typedef struct IAMFLayer { + unsigned int substream_count; + unsigned int coupled_substream_count; +} IAMFLayer; + +typedef struct IAMFSubStream { + unsigned int audio_substream_id; + + // demux + AVCodecParameters *codecpar; +} IAMFSubStream; + +typedef struct IAMFAudioElement { + AVIAMFAudioElement *element; + unsigned int audio_element_id; + + IAMFSubStream *substreams; + unsigned int nb_substreams; + + unsigned int codec_config_id; + + // mux + IAMFLayer *layers; + unsigned int nb_layers; +} IAMFAudioElement; + +typedef struct IAMFMixPresentation { + AVIAMFMixPresentation *mix; + unsigned int mix_presentation_id; + + // demux + unsigned int count_label; + char **language_label; +} IAMFMixPresentation; + +typedef struct IAMFParamDefinition { + const IAMFAudioElement *audio_element; + 
AVIAMFParamDefinition *param; + int mode; + size_t param_size; +} IAMFParamDefinition; + +typedef struct IAMFContext { + IAMFCodecConfig **codec_configs; + int nb_codec_configs; + IAMFAudioElement **audio_elements; + int nb_audio_elements; + IAMFMixPresentation **mix_presentations; + int nb_mix_presentations; + IAMFParamDefinition **param_definitions; + int nb_param_definitions; +} IAMFContext; + +enum IAMF_Anchor_Element { + IAMF_ANCHOR_ELEMENT_UNKNWONW, + IAMF_ANCHOR_ELEMENT_DIALOGUE, + IAMF_ANCHOR_ELEMENT_ALBUM, +}; + +enum IAMF_Sound_System { + SOUND_SYSTEM_A_0_2_0 = 0, // "Loudspeaker configuration for Sound System A" + SOUND_SYSTEM_B_0_5_0 = 1, // "Loudspeaker configuration for Sound System B" + SOUND_SYSTEM_C_2_5_0 = 2, // "Loudspeaker configuration for Sound System C" + SOUND_SYSTEM_D_4_5_0 = 3, // "Loudspeaker configuration for Sound System D" + SOUND_SYSTEM_E_4_5_1 = 4, // "Loudspeaker configuration for Sound System E" + SOUND_SYSTEM_F_3_7_0 = 5, // "Loudspeaker configuration for Sound System F" + SOUND_SYSTEM_G_4_9_0 = 6, // "Loudspeaker configuration for Sound System G" + SOUND_SYSTEM_H_9_10_3 = 7, // "Loudspeaker configuration for Sound System H" + SOUND_SYSTEM_I_0_7_0 = 8, // "Loudspeaker configuration for Sound System I" + SOUND_SYSTEM_J_4_7_0 = 9, // "Loudspeaker configuration for Sound System J" + SOUND_SYSTEM_10_2_7_0 = 10, // "Loudspeaker configuration for Sound System I" + Ltf + Rtf + SOUND_SYSTEM_11_2_3_0 = 11, // Front subset of "Loudspeaker configuration for Sound System J" + SOUND_SYSTEM_12_0_1_0 = 12, // Mono +}; + +struct IAMFSoundSystemMap { + enum IAMF_Sound_System id; + AVChannelLayout layout; +}; + +extern const AVChannelLayout ff_iamf_scalable_ch_layouts[10]; +extern const struct IAMFSoundSystemMap ff_iamf_sound_system_map[13]; + +void ff_iamf_free_audio_element(IAMFAudioElement **paudio_element); +void ff_iamf_free_mix_presentation(IAMFMixPresentation **pmix_presentation); +void ff_iamf_uninit_context(IAMFContext *c); + +#endif /* 
AVFORMAT_IAMF_H */ diff --git a/libavformat/iamf_parse.c b/libavformat/iamf_parse.c new file mode 100644 index 0000000000..60305743f9 --- /dev/null +++ b/libavformat/iamf_parse.c @@ -0,0 +1,1106 @@ +/* + * Immersive Audio Model and Formats parsing + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/avassert.h" +#include "libavutil/common.h" +#include "libavutil/iamf.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/log.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/flac.h" +#include "libavcodec/mpeg4audio.h" +#include "libavcodec/put_bits.h" +#include "avio_internal.h" +#include "iamf_parse.h" +#include "isom.h" + +static int opus_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + int left = len - avio_tell(pb); + + if (left < 11) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left + 8); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + AV_WB32(codec_config->extradata, MKBETAG('O','p','u','s')); + AV_WB32(codec_config->extradata + 4, MKBETAG('H','e','a','d')); + codec_config->extradata_size = avio_read(pb, codec_config->extradata + 8, left); + if (codec_config->extradata_size < left) + return 
AVERROR_INVALIDDATA; + + codec_config->extradata_size += 8; + codec_config->sample_rate = 48000; + + return 0; +} + +static int aac_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len, void *logctx) +{ + MPEG4AudioConfig cfg = { 0 }; + int object_type_id, codec_id, stream_type; + int ret, tag, left; + + tag = avio_r8(pb); + if (tag != MP4DecConfigDescrTag) + return AVERROR_INVALIDDATA; + + object_type_id = avio_r8(pb); + if (object_type_id != 0x40) + return AVERROR_INVALIDDATA; + + stream_type = avio_r8(pb); + if (((stream_type >> 2) != 5) || ((stream_type >> 1) & 1)) + return AVERROR_INVALIDDATA; + + avio_skip(pb, 3); // buffer size db + avio_skip(pb, 4); // rc_max_rate + avio_skip(pb, 4); // avg bitrate + + codec_id = ff_codec_get_id(ff_mp4_obj_type, object_type_id); + if (codec_id && codec_id != codec_config->codec_id) + return AVERROR_INVALIDDATA; + + tag = avio_r8(pb); + if (tag != MP4DecSpecificDescrTag) + return AVERROR_INVALIDDATA; + + left = len - avio_tell(pb); + if (left <= 0) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + codec_config->extradata_size = avio_read(pb, codec_config->extradata, left); + if (codec_config->extradata_size < left) + return AVERROR_INVALIDDATA; + + ret = avpriv_mpeg4audio_get_config2(&cfg, codec_config->extradata, + codec_config->extradata_size, 1, logctx); + if (ret < 0) + return ret; + + codec_config->sample_rate = cfg.sample_rate; + + return 0; +} + +static int flac_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + int left; + + avio_skip(pb, 4); // METADATA_BLOCK_HEADER + + left = len - avio_tell(pb); + if (left < FLAC_STREAMINFO_SIZE) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + codec_config->extradata_size = avio_read(pb, codec_config->extradata, left); + if (codec_config->extradata_size < 
left) + return AVERROR_INVALIDDATA; + + codec_config->sample_rate = AV_RB24(codec_config->extradata + 10) >> 4; + + return 0; +} + +static int ipcm_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + static const enum AVCodecID sample_fmt[2][3] = { + { AV_CODEC_ID_PCM_S16BE, AV_CODEC_ID_PCM_S24BE, AV_CODEC_ID_PCM_S32BE }, + { AV_CODEC_ID_PCM_S16LE, AV_CODEC_ID_PCM_S24LE, AV_CODEC_ID_PCM_S32LE }, + }; + int sample_format = avio_r8(pb); // 0 = BE, 1 = LE + int sample_size = (avio_r8(pb) / 8 - 2); // 16, 24, 32 + if (sample_format > 1 || sample_size < 0 || sample_size > 2) + return AVERROR_INVALIDDATA; + + codec_config->codec_id = sample_fmt[sample_format][sample_size]; + codec_config->sample_rate = avio_rb32(pb); + + if (len - avio_tell(pb)) + return AVERROR_INVALIDDATA; + + return 0; +} + +static int codec_config_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + IAMFCodecConfig **tmp, *codec_config = NULL; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + enum AVCodecID avcodec_id; + unsigned codec_config_id, nb_samples, codec_id; + int16_t seek_preroll; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + codec_config_id = ffio_read_leb(pbc); + codec_id = avio_rb32(pbc); + nb_samples = ffio_read_leb(pbc); + seek_preroll = avio_rb16(pbc); + + switch(codec_id) { + case MKBETAG('O','p','u','s'): + avcodec_id = AV_CODEC_ID_OPUS; + break; + case MKBETAG('m','p','4','a'): + avcodec_id = AV_CODEC_ID_AAC; + break; + case MKBETAG('f','L','a','C'): + avcodec_id = AV_CODEC_ID_FLAC; + break; + default: + avcodec_id = AV_CODEC_ID_NONE; + break; + } + + for (int i = 0; i < c->nb_codec_configs; i++) + if (c->codec_configs[i]->codec_config_id == codec_config_id) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + + tmp = 
av_realloc_array(c->codec_configs, c->nb_codec_configs + 1, sizeof(*c->codec_configs)); + if (!tmp) { + ret = AVERROR(ENOMEM); + goto fail; + } + c->codec_configs = tmp; + + codec_config = av_mallocz(sizeof(*codec_config)); + if (!codec_config) { + ret = AVERROR(ENOMEM); + goto fail; + } + + codec_config->codec_config_id = codec_config_id; + codec_config->codec_id = avcodec_id; + codec_config->nb_samples = nb_samples; + codec_config->seek_preroll = seek_preroll; + + switch(codec_id) { + case MKBETAG('O','p','u','s'): + ret = opus_decoder_config(codec_config, pbc, len); + break; + case MKBETAG('m','p','4','a'): + ret = aac_decoder_config(codec_config, pbc, len, s); + break; + case MKBETAG('f','L','a','C'): + ret = flac_decoder_config(codec_config, pbc, len); + break; + case MKBETAG('i','p','c','m'): + ret = ipcm_decoder_config(codec_config, pbc, len); + break; + default: + break; + } + if (ret < 0) + goto fail; + + c->codec_configs[c->nb_codec_configs++] = codec_config; + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in codec_config_obu. 
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + if (ret < 0) { + if (codec_config) + av_free(codec_config->extradata); + av_free(codec_config); + } + return ret; +} + +static int update_extradata(AVCodecParameters *codecpar) +{ + GetBitContext gb; + PutBitContext pb; + int ret; + + switch(codecpar->codec_id) { + case AV_CODEC_ID_OPUS: + AV_WB8(codecpar->extradata + 9, codecpar->ch_layout.nb_channels); + break; + case AV_CODEC_ID_AAC: { + uint8_t buf[5]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codecpar->extradata, codecpar->extradata_size); + if (ret < 0) + return ret; + + ret = get_bits(&gb, 5); + put_bits(&pb, 5, ret); + if (ret == AOT_ESCAPE) // violates section 3.11.2, but better check for it + put_bits(&pb, 6, get_bits(&gb, 6)); + ret = get_bits(&gb, 4); + put_bits(&pb, 4, ret); + if (ret == 0x0f) + put_bits(&pb, 24, get_bits(&gb, 24)); + + skip_bits(&gb, 4); + put_bits(&pb, 4, codecpar->ch_layout.nb_channels); // set channel config + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codecpar->extradata, buf, sizeof(buf)); + break; + } + case AV_CODEC_ID_FLAC: { + uint8_t buf[13]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codecpar->extradata, codecpar->extradata_size); + if (ret < 0) + return ret; + + put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize + put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize + put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate + skip_bits(&gb, 3); + put_bits(&pb, 3, codecpar->ch_layout.nb_channels - 1); + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codecpar->extradata, buf, sizeof(buf)); + break; + } + } + + return 0; +} + +static int scalable_channel_layout_config(void *s, AVIOContext *pb, + IAMFAudioElement *audio_element, + const IAMFCodecConfig *codec_config) +{ + int nb_layers, k = 0; + + nb_layers = avio_r8(pb) >> 
5; // get_bits(&gb, 3); + // skip_bits(&gb, 5); //reserved + + if (nb_layers > 6) + return AVERROR_INVALIDDATA; + + for (int i = 0; i < nb_layers; i++) { + AVIAMFLayer *layer; + int loudspeaker_layout, output_gain_is_present_flag; + int substream_count, coupled_substream_count; + int ret, byte = avio_r8(pb); + + layer = av_iamf_audio_element_add_layer(audio_element->element); + if (!layer) + return AVERROR(ENOMEM); + + loudspeaker_layout = byte >> 4; // get_bits(&gb, 4); + output_gain_is_present_flag = (byte >> 3) & 1; //get_bits1(&gb); + if ((byte >> 2) & 1) + layer->flags |= AV_IAMF_LAYER_FLAG_RECON_GAIN; + substream_count = avio_r8(pb); + coupled_substream_count = avio_r8(pb); + + if (output_gain_is_present_flag) { + layer->output_gain_flags = avio_r8(pb) >> 2; // get_bits(&gb, 6); + layer->output_gain = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + } + + if (loudspeaker_layout < 10) + av_channel_layout_copy(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[loudspeaker_layout]); + else + layer->ch_layout = (AVChannelLayout){ .order = AV_CHANNEL_ORDER_UNSPEC, + .nb_channels = substream_count + + coupled_substream_count }; + + for (int j = 0; j < substream_count; j++) { + IAMFSubStream *substream = &audio_element->substreams[k++]; + + substream->codecpar->ch_layout = coupled_substream_count-- > 0 ? 
(AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO : + (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + + } + + return 0; +} + +static int ambisonics_config(void *s, AVIOContext *pb, + IAMFAudioElement *audio_element, + const IAMFCodecConfig *codec_config) +{ + AVIAMFLayer *layer; + unsigned ambisonics_mode; + int output_channel_count, substream_count, order; + int ret; + + ambisonics_mode = ffio_read_leb(pb); + if (ambisonics_mode > 1) + return 0; + + output_channel_count = avio_r8(pb); // C + substream_count = avio_r8(pb); // N + if (audio_element->nb_substreams != substream_count) + return AVERROR_INVALIDDATA; + + order = floor(sqrt(output_channel_count - 1)); + /* incomplete order - some harmonics are missing */ + if ((order + 1) * (order + 1) != output_channel_count) + return AVERROR_INVALIDDATA; + + layer = av_iamf_audio_element_add_layer(audio_element->element); + if (!layer) + return AVERROR(ENOMEM); + + layer->ambisonics_mode = ambisonics_mode; + if (ambisonics_mode == 0) { + for (int i = 0; i < substream_count; i++) { + IAMFSubStream *substream = &audio_element->substreams[i]; + + substream->codecpar->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + + layer->ch_layout.order = AV_CHANNEL_ORDER_CUSTOM; + layer->ch_layout.nb_channels = output_channel_count; + layer->ch_layout.u.map = av_calloc(output_channel_count, sizeof(*layer->ch_layout.u.map)); + if (!layer->ch_layout.u.map) + return AVERROR(ENOMEM); + + for (int i = 0; i < output_channel_count; i++) + layer->ch_layout.u.map[i].id = avio_r8(pb) + AV_CHAN_AMBISONIC_BASE; + } else { + int coupled_substream_count = avio_r8(pb); // M + int nb_demixing_matrix = substream_count + coupled_substream_count; + int demixing_matrix_size = nb_demixing_matrix * output_channel_count; + + layer->ch_layout = (AVChannelLayout){ .order = AV_CHANNEL_ORDER_AMBISONIC, 
.nb_channels = output_channel_count }; + layer->demixing_matrix = av_malloc_array(demixing_matrix_size, sizeof(*layer->demixing_matrix)); + if (!layer->demixing_matrix) + return AVERROR(ENOMEM); + + for (int i = 0; i < demixing_matrix_size; i++) + layer->demixing_matrix[i] = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + + for (int i = 0; i < substream_count; i++) { + IAMFSubStream *substream = &audio_element->substreams[i]; + + substream->codecpar->ch_layout = coupled_substream_count-- > 0 ? (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO : + (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + } + + return 0; +} + +static int param_parse(void *s, IAMFContext *c, AVIOContext *pb, + unsigned int type, + const IAMFAudioElement *audio_element, + AVIAMFParamDefinition **out_param_definition) +{ + IAMFParamDefinition *param_definition = NULL; + AVIAMFParamDefinition *param; + unsigned int parameter_id, parameter_rate, mode; + unsigned int duration = 0, constant_subblock_duration = 0, nb_subblocks = 0; + size_t param_size; + + parameter_id = ffio_read_leb(pb); + + for (int i = 0; i < c->nb_param_definitions; i++) + if (c->param_definitions[i]->param->parameter_id == parameter_id) { + param_definition = c->param_definitions[i]; + break; + } + + parameter_rate = ffio_read_leb(pb); + mode = avio_r8(pb) >> 7; + + if (mode == 0) { + duration = ffio_read_leb(pb); + constant_subblock_duration = ffio_read_leb(pb); + if (constant_subblock_duration == 0) + nb_subblocks = ffio_read_leb(pb); + else + nb_subblocks = duration / constant_subblock_duration; + } + + param = av_iamf_param_definition_alloc(type, nb_subblocks, &param_size); + if (!param) + return AVERROR(ENOMEM); + + for (int i = 0; i < nb_subblocks; i++) { + void *subblock = av_iamf_param_definition_get_subblock(param, i); + unsigned int subblock_duration = constant_subblock_duration; + + if (constant_subblock_duration == 0) + subblock_duration = 
ffio_read_leb(pb); + + switch (type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + AVIAMFMixGain *mix = subblock; + mix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + AVIAMFDemixingInfo *demix = subblock; + demix->subblock_duration = subblock_duration; + // DemixingInfoParameterData + demix->dmixp_mode = avio_r8(pb) >> 5; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + AVIAMFReconGain *recon = subblock; + recon->subblock_duration = subblock_duration; + break; + } + default: + av_free(param); + return AVERROR_INVALIDDATA; + } + } + + param->parameter_id = parameter_id; + param->parameter_rate = parameter_rate; + param->duration = duration; + param->constant_subblock_duration = constant_subblock_duration; + param->nb_subblocks = nb_subblocks; + + if (param_definition) { + if (param_definition->param_size != param_size || memcmp(param_definition->param, param, param_size)) { + av_log(s, AV_LOG_ERROR, "Inconsistent parameters for parameter_id %u\n", parameter_id); + av_free(param); + return AVERROR_INVALIDDATA; + } + } else { + IAMFParamDefinition **tmp = av_realloc_array(c->param_definitions, c->nb_param_definitions + 1, + sizeof(*c->param_definitions)); + if (!tmp) { + av_free(param); + return AVERROR(ENOMEM); + } + c->param_definitions = tmp; + + param_definition = av_mallocz(sizeof(*param_definition)); + if (!param_definition) { + av_free(param); + return AVERROR(ENOMEM); + } + param_definition->param = param; + param_definition->mode = !mode; + param_definition->param_size = param_size; + param_definition->audio_element = audio_element; + + c->param_definitions[c->nb_param_definitions++] = param_definition; + } + + av_assert0(out_param_definition); + *out_param_definition = param; + + return 0; +} + +static IAMFCodecConfig *get_codec_config(IAMFContext *c, unsigned int codec_config_id) +{ + for (int i = 0; i < c->nb_codec_configs; i++) { + if (c->codec_configs[i]->codec_config_id == 
codec_config_id) + return c->codec_configs[i]; + } + + return NULL; +} + +static int audio_element_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + const IAMFCodecConfig *codec_config; + AVIAMFAudioElement *element; + IAMFAudioElement **tmp, *audio_element = NULL; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + unsigned audio_element_id, codec_config_id, num_parameters; + int audio_element_type, ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + audio_element_id = ffio_read_leb(pbc); + + for (int i = 0; i < c->nb_audio_elements; i++) + if (c->audio_elements[i]->audio_element_id == audio_element_id) { + av_log(s, AV_LOG_ERROR, "Duplicate audio_element_id %d\n", audio_element_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + audio_element_type = avio_r8(pbc) >> 5; + codec_config_id = ffio_read_leb(pbc); + + codec_config = get_codec_config(c, codec_config_id); + if (!codec_config) { + av_log(s, AV_LOG_ERROR, "Non-existent codec config id %d referenced in an audio element\n", codec_config_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + if (codec_config->codec_id == AV_CODEC_ID_NONE) { + av_log(s, AV_LOG_DEBUG, "Unknown codec id referenced in an audio element. 
Ignoring\n"); + ret = 0; + goto fail; + } + + tmp = av_realloc_array(c->audio_elements, c->nb_audio_elements + 1, sizeof(*c->audio_elements)); + if (!tmp) { + ret = AVERROR(ENOMEM); + goto fail; + } + c->audio_elements = tmp; + + audio_element = av_mallocz(sizeof(*audio_element)); + if (!audio_element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + audio_element->nb_substreams = ffio_read_leb(pbc); + audio_element->codec_config_id = codec_config_id; + audio_element->audio_element_id = audio_element_id; + audio_element->substreams = av_calloc(audio_element->nb_substreams, sizeof(*audio_element->substreams)); + if (!audio_element->substreams) { + ret = AVERROR(ENOMEM); + goto fail; + } + + element = audio_element->element = av_iamf_audio_element_alloc(); + if (!element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + element->audio_element_type = audio_element_type; + + for (int i = 0; i < audio_element->nb_substreams; i++) { + IAMFSubStream *substream = &audio_element->substreams[i]; + + substream->codecpar = avcodec_parameters_alloc(); + if (!substream->codecpar) { + ret = AVERROR(ENOMEM); + goto fail; + } + + substream->audio_substream_id = ffio_read_leb(pbc); + + substream->codecpar->codec_type = AVMEDIA_TYPE_AUDIO; + substream->codecpar->codec_id = codec_config->codec_id; + substream->codecpar->frame_size = codec_config->nb_samples; + substream->codecpar->sample_rate = codec_config->sample_rate; + substream->codecpar->seek_preroll = codec_config->seek_preroll; + + switch(substream->codecpar->codec_id) { + case AV_CODEC_ID_AAC: + case AV_CODEC_ID_FLAC: + case AV_CODEC_ID_OPUS: + substream->codecpar->extradata = av_malloc(codec_config->extradata_size + AV_INPUT_BUFFER_PADDING_SIZE); + if (!substream->codecpar->extradata) { + ret = AVERROR(ENOMEM); + goto fail; + } + memcpy(substream->codecpar->extradata, codec_config->extradata, codec_config->extradata_size); + memset(substream->codecpar->extradata + codec_config->extradata_size, 0, AV_INPUT_BUFFER_PADDING_SIZE); 
+ substream->codecpar->extradata_size = codec_config->extradata_size; + break; + } + } + + num_parameters = ffio_read_leb(pbc); + if (num_parameters && audio_element_type != 0) { + av_log(s, AV_LOG_ERROR, "Audio Element parameter count %u is invalid" + " for Scene representations\n", num_parameters); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + for (int i = 0; i < num_parameters; i++) { + unsigned type; + + type = ffio_read_leb(pbc); + if (type == AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN) { + ret = AVERROR_INVALIDDATA; + goto fail; + } else if (type == AV_IAMF_PARAMETER_DEFINITION_DEMIXING) { + ret = param_parse(s, c, pbc, type, audio_element, &element->demixing_info); + if (ret < 0) + goto fail; + + element->default_w = avio_r8(pbc) >> 4; + } else if (type == AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { + ret = param_parse(s, c, pbc, type, audio_element, &element->recon_gain_info); + if (ret < 0) + goto fail; + } else { + unsigned param_definition_size = ffio_read_leb(pbc); + avio_skip(pbc, param_definition_size); + } + } + + if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(s, pbc, audio_element, codec_config); + if (ret < 0) + goto fail; + } else if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) { + ret = ambisonics_config(s, pbc, audio_element, codec_config); + if (ret < 0) + goto fail; + } else { + unsigned audio_element_config_size = ffio_read_leb(pbc); + avio_skip(pbc, audio_element_config_size); + } + + c->audio_elements[c->nb_audio_elements++] = audio_element; + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in audio_element_obu. 
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + if (ret < 0) + ff_iamf_free_audio_element(&audio_element); + return ret; +} + +static int label_string(AVIOContext *pb, char **label) +{ + uint8_t buf[128]; + + avio_get_str(pb, sizeof(buf), buf, sizeof(buf)); + + if (pb->error) + return pb->error; + if (pb->eof_reached) + return AVERROR_INVALIDDATA; + *label = av_strdup(buf); + if (!*label) + return AVERROR(ENOMEM); + + return 0; +} + +static int mix_presentation_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + AVIAMFMixPresentation *mix; + IAMFMixPresentation **tmp, *mix_presentation = NULL; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + unsigned mix_presentation_id; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + mix_presentation_id = ffio_read_leb(pbc); + + for (int i = 0; i < c->nb_mix_presentations; i++) + if (c->mix_presentations[i]->mix_presentation_id == mix_presentation_id) { + av_log(s, AV_LOG_ERROR, "Duplicate mix_presentation_id %d\n", mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + tmp = av_realloc_array(c->mix_presentations, c->nb_mix_presentations + 1, sizeof(*c->mix_presentations)); + if (!tmp) { + ret = AVERROR(ENOMEM); + goto fail; + } + c->mix_presentations = tmp; + + mix_presentation = av_mallocz(sizeof(*mix_presentation)); + if (!mix_presentation) { + ret = AVERROR(ENOMEM); + goto fail; + } + + mix_presentation->mix_presentation_id = mix_presentation_id; + mix = mix_presentation->mix = av_iamf_mix_presentation_alloc(); + if (!mix) { + ret = AVERROR(ENOMEM); + goto fail; + } + + mix_presentation->count_label = ffio_read_leb(pbc); + mix_presentation->language_label = av_calloc(mix_presentation->count_label, + sizeof(*mix_presentation->language_label)); + if 
(!mix_presentation->language_label) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int i = 0; i < mix_presentation->count_label; i++) { + ret = label_string(pbc, &mix_presentation->language_label[i]); + if (ret < 0) + goto fail; + } + + for (int i = 0; i < mix_presentation->count_label; i++) { + char *annotation = NULL; + ret = label_string(pbc, &annotation); + if (ret < 0) + goto fail; + ret = av_dict_set(&mix->annotations, mix_presentation->language_label[i], annotation, + AV_DICT_DONT_STRDUP_VAL | AV_DICT_DONT_OVERWRITE); + if (ret < 0) + goto fail; + } + + mix->nb_submixes = ffio_read_leb(pbc); + mix->submixes = av_calloc(mix->nb_submixes, sizeof(*mix->submixes)); + if (!mix->submixes) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int i = 0; i < mix->nb_submixes; i++) { + AVIAMFSubmix *sub_mix; + + sub_mix = mix->submixes[i] = av_mallocz(sizeof(*sub_mix)); + if (!sub_mix) { + ret = AVERROR(ENOMEM); + goto fail; + } + + sub_mix->nb_elements = ffio_read_leb(pbc); + sub_mix->elements = av_calloc(sub_mix->nb_elements, sizeof(*sub_mix->elements)); + if (!sub_mix->elements) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int j = 0; j < sub_mix->nb_elements; j++) { + AVIAMFSubmixElement *submix_element; + IAMFAudioElement *audio_element = NULL; + unsigned int rendering_config_extension_size; + + submix_element = sub_mix->elements[j] = av_mallocz(sizeof(*submix_element)); + if (!submix_element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + submix_element->audio_element_id = ffio_read_leb(pbc); + + for (int k = 0; k < c->nb_audio_elements; k++) + if (c->audio_elements[k]->audio_element_id == submix_element->audio_element_id) { + audio_element = c->audio_elements[k]; + break; + } + + if (!audio_element) { + av_log(s, AV_LOG_ERROR, "Invalid Audio Element with id %u referenced by Mix Parameters %u\n", + submix_element->audio_element_id, mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + for (int k = 0; k < 
mix_presentation->count_label; k++) { + char *annotation = NULL; + ret = label_string(pbc, &annotation); + if (ret < 0) + goto fail; + ret = av_dict_set(&submix_element->annotations, mix_presentation->language_label[k], annotation, + AV_DICT_DONT_STRDUP_VAL | AV_DICT_DONT_OVERWRITE); + if (ret < 0) + goto fail; + } + + submix_element->headphones_rendering_mode = avio_r8(pbc) >> 6; + + rendering_config_extension_size = ffio_read_leb(pbc); + avio_skip(pbc, rendering_config_extension_size); + + ret = param_parse(s, c, pbc, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, + NULL, + &submix_element->element_mix_config); + if (ret < 0) + goto fail; + submix_element->default_mix_gain = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + } + + ret = param_parse(s, c, pbc, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL, &sub_mix->output_mix_config); + if (ret < 0) + goto fail; + sub_mix->default_mix_gain = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + + sub_mix->nb_layouts = ffio_read_leb(pbc); + sub_mix->layouts = av_calloc(sub_mix->nb_layouts, sizeof(*sub_mix->layouts)); + if (!sub_mix->layouts) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int j = 0; j < sub_mix->nb_layouts; j++) { + AVIAMFSubmixLayout *submix_layout; + int info_type; + int byte = avio_r8(pbc); + + submix_layout = sub_mix->layouts[j] = av_mallocz(sizeof(*submix_layout)); + if (!submix_layout) { + ret = AVERROR(ENOMEM); + goto fail; + } + + submix_layout->layout_type = byte >> 6; + if (submix_layout->layout_type < AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS || + submix_layout->layout_type > AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL) { + av_log(s, AV_LOG_ERROR, "Invalid Layout type %u in a submix from Mix Presentation %u\n", + submix_layout->layout_type, mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + if (submix_layout->layout_type == 2) { + int sound_system; + sound_system = (byte >> 2) & 0xF; + av_channel_layout_copy(&submix_layout->sound_system, 
&ff_iamf_sound_system_map[sound_system].layout); + } + + info_type = avio_r8(pbc); + submix_layout->integrated_loudness = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + submix_layout->digital_peak = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + + if (info_type & 1) + submix_layout->true_peak = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + if (info_type & 2) { + unsigned int num_anchored_loudness = avio_r8(pbc); + + for (int k = 0; k < num_anchored_loudness; k++) { + unsigned int anchor_element = avio_r8(pbc); + AVRational anchored_loudness = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + if (anchor_element == IAMF_ANCHOR_ELEMENT_DIALOGUE) + submix_layout->dialogue_anchored_loudness = anchored_loudness; + else if (anchor_element <= IAMF_ANCHOR_ELEMENT_ALBUM) + submix_layout->album_anchored_loudness = anchored_loudness; + else + av_log(s, AV_LOG_DEBUG, "Unknown anchor_element. Ignoring\n"); + } + } + + if (info_type & 0xFC) { + unsigned int info_type_size = ffio_read_leb(pbc); + avio_skip(pbc, info_type_size); + } + } + } + + c->mix_presentations[c->nb_mix_presentations++] = mix_presentation; + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in mix_presentation_obu. 
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + if (ret < 0) + ff_iamf_free_mix_presentation(&mix_presentation); + return ret; +} + +int ff_iamf_parse_obu_header(const uint8_t *buf, int buf_size, + unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type, + unsigned *skip_samples, unsigned *discard_padding) +{ + GetBitContext gb; + int ret, extension_flag, trimming, start; + unsigned skip = 0, discard = 0; + unsigned size; + + ret = init_get_bits8(&gb, buf, FFMIN(buf_size, MAX_IAMF_OBU_HEADER_SIZE)); + if (ret < 0) + return ret; + + *type = get_bits(&gb, 5); + /*redundant =*/ get_bits1(&gb); + trimming = get_bits1(&gb); + extension_flag = get_bits1(&gb); + + *obu_size = get_leb(&gb); + if (*obu_size > INT_MAX) + return AVERROR_INVALIDDATA; + + start = get_bits_count(&gb) / 8; + + if (trimming) { + discard = get_leb(&gb); // num_samples_to_trim_at_end + skip = get_leb(&gb); // num_samples_to_trim_at_start + } + + if (skip_samples) + *skip_samples = skip; + if (discard_padding) + *discard_padding = discard; + + if (extension_flag) { + unsigned int extension_bytes; + extension_bytes = get_leb(&gb); + if (extension_bytes > INT_MAX / 8) + return AVERROR_INVALIDDATA; + skip_bits_long(&gb, extension_bytes * 8); + } + + if (get_bits_left(&gb) < 0) + return AVERROR_INVALIDDATA; + + size = *obu_size + start; + if (size > INT_MAX) + return AVERROR_INVALIDDATA; + + *obu_size -= get_bits_count(&gb) / 8 - start; + *start_pos = size - *obu_size; + + return size; +} + +int ff_iamfdec_read_descriptors(IAMFContext *c, AVIOContext *pb, + int max_size, void *log_ctx) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE + AV_INPUT_BUFFER_PADDING_SIZE]; + int ret; + + while (1) { + unsigned obu_size; + enum IAMF_OBU_Type type; + int start_pos, len, size; + + if ((ret = ffio_ensure_seekback(pb, FFMIN(MAX_IAMF_OBU_HEADER_SIZE, max_size))) < 0) + return ret; + size = avio_read(pb, header, FFMIN(MAX_IAMF_OBU_HEADER_SIZE, max_size)); + if (size < 0) + return size; + + 
len = ff_iamf_parse_obu_header(header, size, &obu_size, &start_pos, &type, NULL, NULL); + if (len < 0 || obu_size > max_size) { + av_log(log_ctx, AV_LOG_ERROR, "Failed to read obu header\n"); + avio_seek(pb, -size, SEEK_CUR); + return len; + } + + if (type >= IAMF_OBU_IA_PARAMETER_BLOCK && type < IAMF_OBU_IA_SEQUENCE_HEADER) { + avio_seek(pb, -size, SEEK_CUR); + break; + } + + avio_seek(pb, -(size - start_pos), SEEK_CUR); + switch (type) { + case IAMF_OBU_IA_CODEC_CONFIG: + ret = codec_config_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_AUDIO_ELEMENT: + ret = audio_element_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_MIX_PRESENTATION: + ret = mix_presentation_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_TEMPORAL_DELIMITER: + break; + default: { + int64_t offset = avio_skip(pb, obu_size); + if (offset < 0) + ret = offset; + break; + } + } + if (ret < 0) { + av_log(log_ctx, AV_LOG_ERROR, "Failed to read obu type %d\n", type); + return ret; + } + max_size -= obu_size + start_pos; + if (max_size < 0) + return AVERROR_INVALIDDATA; + if (!max_size) + break; + } + + return 0; +} diff --git a/libavformat/iamf_parse.h b/libavformat/iamf_parse.h new file mode 100644 index 0000000000..f4f297ecd4 --- /dev/null +++ b/libavformat/iamf_parse.h @@ -0,0 +1,38 @@ +/* + * Immersive Audio Model and Formats parsing + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFORMAT_IAMF_PARSE_H +#define AVFORMAT_IAMF_PARSE_H + +#include + +#include "libavutil/iamf.h" +#include "avio.h" +#include "iamf.h" + +int ff_iamf_parse_obu_header(const uint8_t *buf, int buf_size, + unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type, + unsigned *skip_samples, unsigned *discard_padding); + +int ff_iamfdec_read_descriptors(IAMFContext *c, AVIOContext *pb, + int size, void *log_ctx); + +#endif /* AVFORMAT_IAMF_PARSE_H */ diff --git a/libavformat/iamfdec.c b/libavformat/iamfdec.c new file mode 100644 index 0000000000..0374d0f241 --- /dev/null +++ b/libavformat/iamfdec.c @@ -0,0 +1,503 @@ +/* + * Immersive Audio Model and Formats demuxer + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" + +#include "libavutil/avassert.h" +#include "libavutil/iamf.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/log.h" +#include "libavcodec/mathops.h" +#include "avformat.h" +#include "avio_internal.h" +#include "demux.h" +#include "iamf.h" +#include "iamf_parse.h" +#include "internal.h" + +typedef struct IAMFDemuxContext { + IAMFContext iamf; + + // Packet side data + AVIAMFParamDefinition *mix; + size_t mix_size; + AVIAMFParamDefinition *demix; + size_t demix_size; + AVIAMFParamDefinition *recon; + size_t recon_size; +} IAMFDemuxContext; + +static AVStream *find_stream_by_id(AVFormatContext *s, int id) +{ + for (int i = 0; i < s->nb_streams; i++) + if (s->streams[i]->id == id) + return s->streams[i]; + + av_log(s, AV_LOG_ERROR, "Invalid stream id %d\n", id); + return NULL; +} + +static int audio_frame_obu(AVFormatContext *s, AVPacket *pkt, int len, + enum IAMF_OBU_Type type, + unsigned skip_samples, unsigned discard_padding, + int id_in_bitstream) +{ + const IAMFDemuxContext *const c = s->priv_data; + AVStream *st; + int ret, audio_substream_id; + + if (id_in_bitstream) { + unsigned explicit_audio_substream_id; + int64_t pos = avio_tell(s->pb); + explicit_audio_substream_id = ffio_read_leb(s->pb); + len -= avio_tell(s->pb) - pos; + audio_substream_id = explicit_audio_substream_id; + } else + audio_substream_id = type - IAMF_OBU_IA_AUDIO_FRAME_ID0; + + st = find_stream_by_id(s, audio_substream_id); + if (!st) + return AVERROR_INVALIDDATA; + + ret = av_get_packet(s->pb, pkt, len); + if (ret < 0) + return ret; + if (ret != len) + return AVERROR_INVALIDDATA; + + if (skip_samples || discard_padding) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES, 10); + if (!side_data) + 
return AVERROR(ENOMEM); + AV_WL32(side_data, skip_samples); + AV_WL32(side_data + 4, discard_padding); + } + if (c->mix) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, c->mix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->mix, c->mix_size); + } + if (c->demix) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, c->demix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->demix, c->demix_size); + } + if (c->recon) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, c->recon_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->recon, c->recon_size); + } + + pkt->stream_index = st->index; + return 0; +} + +static const IAMFParamDefinition *get_param_definition(AVFormatContext *s, unsigned int parameter_id) +{ + const IAMFDemuxContext *const c = s->priv_data; + const IAMFContext *const iamf = &c->iamf; + const IAMFParamDefinition *param_definition = NULL; + + for (int i = 0; i < iamf->nb_param_definitions; i++) + if (iamf->param_definitions[i]->param->parameter_id == parameter_id) { + param_definition = iamf->param_definitions[i]; + break; + } + + return param_definition; +} + +static int parameter_block_obu(AVFormatContext *s, int len) +{ + IAMFDemuxContext *const c = s->priv_data; + const IAMFParamDefinition *param_definition; + const AVIAMFParamDefinition *param; + AVIAMFParamDefinition *out_param = NULL; + FFIOContext b; + AVIOContext *pb; + uint8_t *buf; + unsigned int duration, constant_subblock_duration; + unsigned int nb_subblocks; + unsigned int parameter_id; + size_t out_param_size; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(s->pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pb = &b.pub; + + parameter_id = 
ffio_read_leb(pb); + param_definition = get_param_definition(s, parameter_id); + if (!param_definition) { + av_log(s, AV_LOG_VERBOSE, "Nonexistent parameter_id %u referenced in a parameter block. Ignoring\n", + parameter_id); + ret = 0; + goto fail; + } + + param = param_definition->param; + if (!param_definition->mode) { + duration = ffio_read_leb(pb); + constant_subblock_duration = ffio_read_leb(pb); + if (constant_subblock_duration == 0) + nb_subblocks = ffio_read_leb(pb); + else + nb_subblocks = duration / constant_subblock_duration; + } else { + duration = param->duration; + constant_subblock_duration = param->constant_subblock_duration; + nb_subblocks = param->nb_subblocks; + if (!nb_subblocks) + nb_subblocks = duration / constant_subblock_duration; + } + + out_param = av_iamf_param_definition_alloc(param->type, nb_subblocks, &out_param_size); + if (!out_param) { + ret = AVERROR(ENOMEM); + goto fail; + } + + out_param->parameter_id = param->parameter_id; + out_param->type = param->type; + out_param->parameter_rate = param->parameter_rate; + out_param->duration = duration; + out_param->constant_subblock_duration = constant_subblock_duration; + out_param->nb_subblocks = nb_subblocks; + + for (int i = 0; i < nb_subblocks; i++) { + void *subblock = av_iamf_param_definition_get_subblock(out_param, i); + unsigned int subblock_duration = constant_subblock_duration; + + if (!param_definition->mode && !constant_subblock_duration) + subblock_duration = ffio_read_leb(pb); + + switch (param->type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + AVIAMFMixGain *mix = subblock; + + mix->animation_type = ffio_read_leb(pb); + if (mix->animation_type > AV_IAMF_ANIMATION_TYPE_BEZIER) { + ret = 0; + av_free(out_param); + goto fail; + } + + mix->start_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + mix->end_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + if (mix->animation_type 
== AV_IAMF_ANIMATION_TYPE_BEZIER) { + mix->control_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + mix->control_point_relative_time = av_make_q(avio_r8(pb), 1 << 8); + } + mix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + AVIAMFDemixingInfo *demix = subblock; + + demix->dmixp_mode = avio_r8(pb) >> 5; + demix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + AVIAMFReconGain *recon = subblock; + const IAMFAudioElement *audio_element = param_definition->audio_element; + const AVIAMFAudioElement *element = audio_element->element; + + av_assert0(audio_element && element); + for (int i = 0; i < element->nb_layers; i++) { + const AVIAMFLayer *layer = element->layers[i]; + if (layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN) { + unsigned int recon_gain_flags = ffio_read_leb(pb); + unsigned int bitcount = 7 + 5 * !!(recon_gain_flags & 0x80); + recon_gain_flags = (recon_gain_flags & 0x7F) | ((recon_gain_flags & 0xFF00) >> 1); + for (int j = 0; j < bitcount; j++) { + if (recon_gain_flags & (1 << j)) + recon->recon_gain[i][j] = avio_r8(pb); + } + } + } + recon->subblock_duration = subblock_duration; + break; + } + default: + av_assert0(0); + } + } + + len -= avio_tell(pb); + if (len) { + int level = (s->error_recognition & AV_EF_EXPLODE) ? AV_LOG_ERROR : AV_LOG_WARNING; + av_log(s, level, "Underread in parameter_block_obu. 
%d bytes left at the end\n", len); + } + + switch (param->type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + av_free(c->mix); + c->mix = out_param; + c->mix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + av_free(c->demix); + c->demix = out_param; + c->demix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + av_free(c->recon); + c->recon = out_param; + c->recon_size = out_param_size; + break; + default: + av_assert0(0); + } + + ret = 0; +fail: + if (ret < 0) + av_free(out_param); + av_free(buf); + + return ret; +} + +static int iamf_read_packet(AVFormatContext *s, AVPacket *pkt) +{ + IAMFDemuxContext *const c = s->priv_data; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE + AV_INPUT_BUFFER_PADDING_SIZE]; + unsigned obu_size; + int ret; + + while (1) { + enum IAMF_OBU_Type type; + unsigned skip_samples, discard_padding; + int len, size, start_pos; + + if ((ret = ffio_ensure_seekback(s->pb, MAX_IAMF_OBU_HEADER_SIZE)) < 0) + return ret; + size = avio_read(s->pb, header, MAX_IAMF_OBU_HEADER_SIZE); + if (size < 0) + return size; + + len = ff_iamf_parse_obu_header(header, size, &obu_size, &start_pos, &type, + &skip_samples, &discard_padding); + if (len < 0) { + av_log(s, AV_LOG_ERROR, "Failed to read obu\n"); + return len; + } + avio_seek(s->pb, -(size - start_pos), SEEK_CUR); + + if (type >= IAMF_OBU_IA_AUDIO_FRAME && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17) + return audio_frame_obu(s, pkt, obu_size, type, + skip_samples, discard_padding, + type == IAMF_OBU_IA_AUDIO_FRAME); + else if (type == IAMF_OBU_IA_PARAMETER_BLOCK) { + ret = parameter_block_obu(s, obu_size); + if (ret < 0) + return ret; + } else if (type == IAMF_OBU_IA_TEMPORAL_DELIMITER) { + av_freep(&c->mix); + c->mix_size = 0; + av_freep(&c->demix); + c->demix_size = 0; + av_freep(&c->recon); + c->recon_size = 0; + } else { + int64_t offset = avio_skip(s->pb, obu_size); + if (offset < 0) { + ret = offset; + break; + } + } + } + + return ret; +} + 
+//return < 0 if we need more data +static int get_score(const uint8_t *buf, int buf_size, enum IAMF_OBU_Type type, int *seq) +{ + if (type == IAMF_OBU_IA_SEQUENCE_HEADER) { + if (buf_size < 4 || AV_RB32(buf) != MKBETAG('i','a','m','f')) + return 0; + *seq = 1; + return -1; + } + if (type >= IAMF_OBU_IA_CODEC_CONFIG && type <= IAMF_OBU_IA_TEMPORAL_DELIMITER) + return *seq ? -1 : 0; + if (type >= IAMF_OBU_IA_AUDIO_FRAME && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17) + return *seq ? AVPROBE_SCORE_EXTENSION + 1 : 0; + return 0; +} + +static int iamf_probe(const AVProbeData *p) +{ + unsigned obu_size; + enum IAMF_OBU_Type type; + int seq = 0, cnt = 0, start_pos; + int ret; + + while (1) { + int size = ff_iamf_parse_obu_header(p->buf + cnt, p->buf_size - cnt, + &obu_size, &start_pos, &type, + NULL, NULL); + if (size < 0) + return 0; + + ret = get_score(p->buf + cnt + start_pos, + p->buf_size - cnt - start_pos, + type, &seq); + if (ret >= 0) + return ret; + + cnt += FFMIN(size, p->buf_size - cnt); + } + return 0; +} + +static int iamf_read_header(AVFormatContext *s) +{ + IAMFDemuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + int ret; + + ret = ff_iamfdec_read_descriptors(iamf, s->pb, INT_MAX, s); + if (ret < 0) + return ret; + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + IAMFAudioElement *audio_element = iamf->audio_elements[i]; + AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT, NULL); + + if (!stg) + return AVERROR(ENOMEM); + + stg->id = audio_element->audio_element_id; + stg->params.iamf_audio_element = audio_element->element; + + for (int j = 0; j < audio_element->nb_substreams; j++) { + IAMFSubStream *substream = &audio_element->substreams[j]; + AVStream *st = avformat_new_stream(s, NULL); + + if (!st) + return AVERROR(ENOMEM); + + ret = avformat_stream_group_add_stream(stg, st); + if (ret < 0) + return ret; + + ret = avcodec_parameters_copy(st->codecpar, substream->codecpar); + if (ret < 
0) + return ret; + + st->id = substream->audio_substream_id; + avpriv_set_pts_info(st, 64, 1, st->codecpar->sample_rate); + } + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = iamf->mix_presentations[i]; + AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION, NULL); + const AVIAMFMixPresentation *mix = mix_presentation->mix; + + if (!stg) + return AVERROR(ENOMEM); + + stg->id = mix_presentation->mix_presentation_id; + stg->params.iamf_mix_presentation = mix_presentation->mix; + + for (int j = 0; j < mix->nb_submixes; j++) { + AVIAMFSubmix *sub_mix = mix->submixes[j]; + + for (int k = 0; k < sub_mix->nb_elements; k++) { + AVIAMFSubmixElement *submix_element = sub_mix->elements[k]; + AVStreamGroup *audio_element = NULL; + + for (int l = 0; l < s->nb_stream_groups; l++) + if (s->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT && + s->stream_groups[l]->id == submix_element->audio_element_id) { + audio_element = s->stream_groups[l]; + break; + } + av_assert0(audio_element); + + for (int l = 0; l < audio_element->nb_streams; l++) { + ret = avformat_stream_group_add_stream(stg, audio_element->streams[l]); + if (ret < 0 && ret != AVERROR(EEXIST)) + return ret; + } + } + } + } + + return 0; +} + +static int iamf_read_close(AVFormatContext *s) +{ + IAMFDemuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + IAMFAudioElement *audio_element = iamf->audio_elements[i]; + audio_element->element = NULL; + } + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = iamf->mix_presentations[i]; + mix_presentation->mix = NULL; + } + + ff_iamf_uninit_context(&c->iamf); + + av_freep(&c->mix); + c->mix_size = 0; + av_freep(&c->demix); + c->demix_size = 0; + av_freep(&c->recon); + c->recon_size = 0; + + return 0; +} + +const AVInputFormat ff_iamf_demuxer 
= { + .name = "iamf", + .long_name = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model and Formats"), + .priv_data_size = sizeof(IAMFDemuxContext), + .flags_internal = FF_FMT_INIT_CLEANUP, + .read_probe = iamf_probe, + .read_header = iamf_read_header, + .read_packet = iamf_read_packet, + .read_close = iamf_read_close, + .extensions = "iamf", + .flags = AVFMT_GENERIC_INDEX | AVFMT_NO_BYTE_SEEK | AVFMT_NOTIMESTAMPS | AVFMT_SHOW_IDS, +};
From patchwork Thu Dec 14 20:14:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 45151 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Thu, 14 Dec 2023 17:14:33 -0300 Message-ID: <20231214201433.4608-9-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231214201433.4608-1-jamrial@gmail.com> References: <20231214201433.4608-1-jamrial@gmail.com> Subject: [FFmpeg-devel] [PATCH 8/8] avformat: Immersive Audio Model and Formats muxer
Signed-off-by: James Almer --- libavformat/Makefile | 1 + libavformat/allformats.c | 1 + libavformat/iamf_writer.c | 860 ++++++++++++++++++++++++++++++++++++++ libavformat/iamf_writer.h | 51 +++ libavformat/iamfenc.c | 387 +++++++++++++++++ 5 files changed, 1300 insertions(+) create mode 100644 libavformat/iamf_writer.c create mode 100644 libavformat/iamf_writer.h create mode 100644 libavformat/iamfenc.c
diff --git a/libavformat/Makefile b/libavformat/Makefile index f23c22792b..581e378d95 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -259,6 +259,7 @@ OBJS-$(CONFIG_HLS_DEMUXER) += hls.o hls_sample_encryption.o OBJS-$(CONFIG_HLS_MUXER) += hlsenc.o hlsplaylist.o avc.o OBJS-$(CONFIG_HNM_DEMUXER) += hnm.o OBJS-$(CONFIG_IAMF_DEMUXER) += iamfdec.o iamf_parse.o iamf.o +OBJS-$(CONFIG_IAMF_MUXER) += iamfenc.o iamf_writer.o iamf.o OBJS-$(CONFIG_ICO_DEMUXER) += icodec.o OBJS-$(CONFIG_ICO_MUXER) += icoenc.o OBJS-$(CONFIG_IDCIN_DEMUXER) += idcin.o diff --git a/libavformat/allformats.c b/libavformat/allformats.c index 
6e520b78a6..ce6be5f04d 100644 --- a/libavformat/allformats.c +++ b/libavformat/allformats.c @@ -213,6 +213,7 @@ extern const AVInputFormat ff_hls_demuxer; extern const FFOutputFormat ff_hls_muxer; extern const AVInputFormat ff_hnm_demuxer; extern const AVInputFormat ff_iamf_demuxer; +extern const FFOutputFormat ff_iamf_muxer; extern const AVInputFormat ff_ico_demuxer; extern const FFOutputFormat ff_ico_muxer; extern const AVInputFormat ff_idcin_demuxer; diff --git a/libavformat/iamf_writer.c b/libavformat/iamf_writer.c new file mode 100644 index 0000000000..9962845049 --- /dev/null +++ b/libavformat/iamf_writer.c @@ -0,0 +1,860 @@ +/* + * Immersive Audio Model and Formats muxing helpers and structs + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/channel_layout.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/iamf.h" +#include "libavutil/mem.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/flac.h" +#include "libavcodec/mpeg4audio.h" +#include "libavcodec/put_bits.h" +#include "avformat.h" +#include "avio_internal.h" +#include "iamf.h" +#include "iamf_writer.h" + + +static int update_extradata(IAMFCodecConfig *codec_config) +{ + GetBitContext gb; + PutBitContext pb; + int ret; + + switch(codec_config->codec_id) { + case AV_CODEC_ID_OPUS: + if (codec_config->extradata_size < 19) + return AVERROR_INVALIDDATA; + codec_config->extradata_size -= 8; + memmove(codec_config->extradata, codec_config->extradata + 8, codec_config->extradata_size); + AV_WB8(codec_config->extradata + 1, 2); // set channels to stereo + break; + case AV_CODEC_ID_FLAC: { + uint8_t buf[13]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codec_config->extradata, codec_config->extradata_size); + if (ret < 0) + return ret; + + put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize + put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize + put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate + skip_bits(&gb, 3); + put_bits(&pb, 3, 1); // set channels to stereo + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codec_config->extradata, buf, sizeof(buf)); + break; + } + default: + break; + } + + return 0; +} + +static int fill_codec_config(IAMFContext *iamf, const AVStreamGroup *stg, + IAMFCodecConfig *codec_config) +{ + const AVStream *st = stg->streams[0]; + IAMFCodecConfig **tmp; + int j, ret = 0; + + codec_config->codec_id = st->codecpar->codec_id; + codec_config->sample_rate = 
st->codecpar->sample_rate; + codec_config->codec_tag = st->codecpar->codec_tag; + codec_config->nb_samples = st->codecpar->frame_size; + codec_config->seek_preroll = st->codecpar->seek_preroll; + if (st->codecpar->extradata_size) { + codec_config->extradata = av_memdup(st->codecpar->extradata, st->codecpar->extradata_size); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + codec_config->extradata_size = st->codecpar->extradata_size; + ret = update_extradata(codec_config); + if (ret < 0) + goto fail; + } + + for (j = 0; j < iamf->nb_codec_configs; j++) { + if (!memcmp(iamf->codec_configs[j], codec_config, offsetof(IAMFCodecConfig, extradata)) && + (!codec_config->extradata_size || !memcmp(iamf->codec_configs[j]->extradata, + codec_config->extradata, codec_config->extradata_size))) + break; + } + + if (j < iamf->nb_codec_configs) { + av_free(iamf->codec_configs[j]->extradata); + av_free(iamf->codec_configs[j]); + iamf->codec_configs[j] = codec_config; + return j; + } + + tmp = av_realloc_array(iamf->codec_configs, iamf->nb_codec_configs + 1, sizeof(*iamf->codec_configs)); + if (!tmp) { + ret = AVERROR(ENOMEM); + goto fail; + } + + iamf->codec_configs = tmp; + iamf->codec_configs[iamf->nb_codec_configs] = codec_config; + codec_config->codec_config_id = iamf->nb_codec_configs; + + return iamf->nb_codec_configs++; + +fail: + av_freep(&codec_config->extradata); + return ret; +} + +static IAMFParamDefinition *add_param_definition(IAMFContext *iamf, AVIAMFParamDefinition *param, + const IAMFAudioElement *audio_element, void *log_ctx) +{ + IAMFParamDefinition **tmp, *param_definition; + IAMFCodecConfig *codec_config = NULL; + + tmp = av_realloc_array(iamf->param_definitions, iamf->nb_param_definitions + 1, + sizeof(*iamf->param_definitions)); + if (!tmp) + return NULL; + + iamf->param_definitions = tmp; + + param_definition = av_mallocz(sizeof(*param_definition)); + if (!param_definition) + return NULL; + + if (audio_element) + codec_config = 
iamf->codec_configs[audio_element->codec_config_id]; + + if (!param->parameter_rate) { + if (!codec_config) { + av_log(log_ctx, AV_LOG_ERROR, "parameter_rate needed but not set for parameter_id %u\n", + param->parameter_id); + return NULL; + } + param->parameter_rate = codec_config->sample_rate; + } + if (codec_config) { + if (!param->duration) + param->duration = codec_config->nb_samples; + if (!param->constant_subblock_duration) + param->constant_subblock_duration = codec_config->nb_samples; + } + + param_definition->mode = !!param->duration; + param_definition->param = param; + param_definition->audio_element = audio_element; + iamf->param_definitions[iamf->nb_param_definitions++] = param_definition; + + return param_definition; +} + +int ff_iamf_add_audio_element(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx) +{ + const AVIAMFAudioElement *iamf_audio_element; + IAMFAudioElement **tmp, *audio_element; + IAMFCodecConfig *codec_config; + int ret; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + return AVERROR(EINVAL); + + iamf_audio_element = stg->params.iamf_audio_element; + if (iamf_audio_element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) { + const AVIAMFLayer *layer = iamf_audio_element->layers[0]; + if (iamf_audio_element->nb_layers != 1) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid amount of layers for SCENE_BASED audio element. 
Must be 1\n"); + return AVERROR(EINVAL); + } + if (layer->ch_layout.order != AV_CHANNEL_ORDER_CUSTOM && + layer->ch_layout.order != AV_CHANNEL_ORDER_AMBISONIC) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid channel layout for SCENE_BASED audio element\n"); + return AVERROR(EINVAL); + } + if (layer->ambisonics_mode >= AV_IAMF_AMBISONICS_MODE_PROJECTION) { + av_log(log_ctx, AV_LOG_ERROR, "Unsupported ambisonics mode %d\n", layer->ambisonics_mode); + return AVERROR_PATCHWELCOME; + } + for (int i = 0; i < stg->nb_streams; i++) { + if (stg->streams[i]->codecpar->ch_layout.nb_channels > 1) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid amount of channels in a stream for MONO mode ambisonics\n"); + return AVERROR(EINVAL); + } + } + } else + for (int j, i = 0; i < iamf_audio_element->nb_layers; i++) { + const AVIAMFLayer *layer = iamf_audio_element->layers[i]; + for (j = 0; j < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); j++) + if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[j])) + break; + + if (j >= FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts)) { + av_log(log_ctx, AV_LOG_ERROR, "Unsupported channel layout in stream group #%d\n", i); + return AVERROR(EINVAL); + } + } + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + if (stg->id == iamf->audio_elements[i]->audio_element_id) { + av_log(log_ctx, AV_LOG_ERROR, "Duplicated Audio Element id %"PRId64"\n", stg->id); + return AVERROR(EINVAL); + } + } + + codec_config = av_mallocz(sizeof(*codec_config)); + if (!codec_config) + return AVERROR(ENOMEM); + + ret = fill_codec_config(iamf, stg, codec_config); + if (ret < 0) { + av_free(codec_config); + return ret; + } + + audio_element = av_mallocz(sizeof(*audio_element)); + if (!audio_element) + return AVERROR(ENOMEM); + + audio_element->element = stg->params.iamf_audio_element; + audio_element->audio_element_id = stg->id; + audio_element->codec_config_id = ret; + + audio_element->substreams = av_calloc(stg->nb_streams, sizeof(*audio_element->substreams)); 
+ if (!audio_element->substreams) + return AVERROR(ENOMEM); + audio_element->nb_substreams = stg->nb_streams; + + audio_element->layers = av_calloc(iamf_audio_element->nb_layers, sizeof(*audio_element->layers)); + if (!audio_element->layers) + return AVERROR(ENOMEM); + + for (int i = 0, j = 0; i < iamf_audio_element->nb_layers; i++) { + int nb_channels = iamf_audio_element->layers[i]->ch_layout.nb_channels; + + IAMFLayer *layer = &audio_element->layers[i]; + if (!layer) + return AVERROR(ENOMEM); + memset(layer, 0, sizeof(*layer)); + + if (i) + nb_channels -= iamf_audio_element->layers[i - 1]->ch_layout.nb_channels; + for (; nb_channels > 0 && j < stg->nb_streams; j++) { + const AVStream *st = stg->streams[j]; + IAMFSubStream *substream = &audio_element->substreams[j]; + + substream->audio_substream_id = st->id; + layer->substream_count++; + layer->coupled_substream_count += st->codecpar->ch_layout.nb_channels == 2; + nb_channels -= st->codecpar->ch_layout.nb_channels; + } + if (nb_channels) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid channel count across substreams in layer %u from stream group %u\n", + i, stg->index); + return AVERROR(EINVAL); + } + } + + if (iamf_audio_element->demixing_info) { + AVIAMFParamDefinition *param = iamf_audio_element->demixing_info; + IAMFParamDefinition *param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + + if (param->nb_subblocks != 1) { + av_log(log_ctx, AV_LOG_ERROR, "nb_subblocks in demixing_info for stream group %u is not 1\n", stg->index); + return AVERROR(EINVAL); + } + + if (!param_definition) { + param_definition = add_param_definition(iamf, param, audio_element, log_ctx); + if (!param_definition) + return AVERROR(ENOMEM); + } + } + if (iamf_audio_element->recon_gain_info) { + AVIAMFParamDefinition *param = iamf_audio_element->recon_gain_info; + IAMFParamDefinition *param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + + if (param->nb_subblocks != 1) { + av_log(log_ctx, 
AV_LOG_ERROR, "nb_subblocks in recon_gain_info for stream group %u is not 1\n", stg->index); + return AVERROR(EINVAL); + } + + if (!param_definition) { + param_definition = add_param_definition(iamf, param, audio_element, log_ctx); + if (!param_definition) + return AVERROR(ENOMEM); + } + } + + tmp = av_realloc_array(iamf->audio_elements, iamf->nb_audio_elements + 1, sizeof(*iamf->audio_elements)); + if (!tmp) + return AVERROR(ENOMEM); + + iamf->audio_elements = tmp; + iamf->audio_elements[iamf->nb_audio_elements++] = audio_element; + + return 0; +} + +int ff_iamf_add_mix_presentation(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx) +{ + IAMFMixPresentation **tmp, *mix_presentation; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + return AVERROR(EINVAL); + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + if (stg->id == iamf->mix_presentations[i]->mix_presentation_id) { + av_log(log_ctx, AV_LOG_ERROR, "Duplicate Mix Presentation id %"PRId64"\n", stg->id); + return AVERROR(EINVAL); + } + } + + mix_presentation = av_mallocz(sizeof(*mix_presentation)); + if (!mix_presentation) + return AVERROR(ENOMEM); + + mix_presentation->mix = stg->params.iamf_mix_presentation; + mix_presentation->mix_presentation_id = stg->id; + + for (int i = 0; i < mix_presentation->mix->nb_submixes; i++) { + const AVIAMFSubmix *submix = mix_presentation->mix->submixes[i]; + AVIAMFParamDefinition *param = submix->output_mix_config; + IAMFParamDefinition *param_definition; + + if (!param) { + av_log(log_ctx, AV_LOG_ERROR, "output_mix_config is not present in submix %u from " + "Mix Presentation ID %"PRId64"\n", i, stg->id); + return AVERROR(EINVAL); + } + + param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + if (!param_definition) { + param_definition = add_param_definition(iamf, param, NULL, log_ctx); + if (!param_definition) + return AVERROR(ENOMEM); + } + + for (int j = 0; j < submix->nb_elements; j++) { + const 
AVIAMFSubmixElement *element = submix->elements[j]; + param = element->element_mix_config; + + if (!param) { + av_log(log_ctx, AV_LOG_ERROR, "element_mix_config is not present for element %u in submix %u from " + "Mix Presentation ID %"PRId64"\n", j, i, stg->id); + return AVERROR(EINVAL); + } + param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + if (!param_definition) { + param_definition = add_param_definition(iamf, param, NULL, log_ctx); + if (!param_definition) + return AVERROR(ENOMEM); + } + } + } + + tmp = av_realloc_array(iamf->mix_presentations, iamf->nb_mix_presentations + 1, sizeof(*iamf->mix_presentations)); + if (!tmp) + return AVERROR(ENOMEM); + + iamf->mix_presentations = tmp; + iamf->mix_presentations[iamf->nb_mix_presentations++] = mix_presentation; + + return 0; +} + +static int iamf_write_codec_config(const IAMFContext *iamf, + const IAMFCodecConfig *codec_config, + AVIOContext *pb) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + PutBitContext pbc; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, codec_config->codec_config_id); + avio_wl32(dyn_bc, codec_config->codec_tag); + + ffio_write_leb(dyn_bc, codec_config->nb_samples); + avio_wb16(dyn_bc, codec_config->seek_preroll); + + switch(codec_config->codec_id) { + case AV_CODEC_ID_OPUS: + avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size); + break; + case AV_CODEC_ID_AAC: + return AVERROR_PATCHWELCOME; + case AV_CODEC_ID_FLAC: + avio_w8(dyn_bc, 0x80); + avio_wb24(dyn_bc, codec_config->extradata_size); + avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size); + break; + case AV_CODEC_ID_PCM_S16LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 16); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S24LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 24); + avio_wb32(dyn_bc, codec_config->sample_rate); + 
break; + case AV_CODEC_ID_PCM_S32LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 32); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S16BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 16); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S24BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 24); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S32BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 32); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + default: + break; + } + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 5, IAMF_OBU_IA_CODEC_CONFIG); + put_bits(&pbc, 3, 0); + flush_put_bits(&pbc); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(pb, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(pb, dyn_size); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static inline int rescale_rational(AVRational q, int b) +{ + return av_clip_int16(av_rescale(q.num, b, q.den)); +} + +static int scalable_channel_layout_config(const IAMFAudioElement *audio_element, + AVIOContext *dyn_bc) +{ + const AVIAMFAudioElement *element = audio_element->element; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pb; + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 3, element->nb_layers); + put_bits(&pb, 5, 0); + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + for (int i = 0; i < element->nb_layers; i++) { + AVIAMFLayer *layer = element->layers[i]; + int layout; + for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); layout++) { + if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[layout])) + break; + } + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 4, layout); + put_bits(&pb, 1, !!layer->output_gain_flags); + put_bits(&pb, 1, !!(layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN)); + put_bits(&pb, 2, 0); // reserved + put_bits(&pb, 8, 
audio_element->layers[i].substream_count); + put_bits(&pb, 8, audio_element->layers[i].coupled_substream_count); + if (layer->output_gain_flags) { + put_bits(&pb, 6, layer->output_gain_flags); + put_bits(&pb, 2, 0); + put_bits(&pb, 16, rescale_rational(layer->output_gain, 1 << 8)); + } + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + } + + return 0; +} + +static int ambisonics_config(const IAMFAudioElement *audio_element, + AVIOContext *dyn_bc) +{ + const AVIAMFAudioElement *element = audio_element->element; + AVIAMFLayer *layer = element->layers[0]; + + ffio_write_leb(dyn_bc, 0); // ambisonics_mode + ffio_write_leb(dyn_bc, layer->ch_layout.nb_channels); // output_channel_count + ffio_write_leb(dyn_bc, audio_element->nb_substreams); // substream_count + + if (layer->ch_layout.order == AV_CHANNEL_ORDER_AMBISONIC) + for (int i = 0; i < layer->ch_layout.nb_channels; i++) + avio_w8(dyn_bc, i); + else + for (int i = 0; i < layer->ch_layout.nb_channels; i++) + avio_w8(dyn_bc, layer->ch_layout.u.map[i].id); + + return 0; +} + +static int param_definition(const IAMFContext *iamf, + const IAMFParamDefinition *param_def, + AVIOContext *dyn_bc, void *log_ctx) +{ + const AVIAMFParamDefinition *param = param_def->param; + + ffio_write_leb(dyn_bc, param->parameter_id); + ffio_write_leb(dyn_bc, param->parameter_rate); + avio_w8(dyn_bc, param->duration ? 
0 : 1 << 7); + if (param->duration) { + ffio_write_leb(dyn_bc, param->duration); + ffio_write_leb(dyn_bc, param->constant_subblock_duration); + if (param->constant_subblock_duration == 0) { + ffio_write_leb(dyn_bc, param->nb_subblocks); + for (int i = 0; i < param->nb_subblocks; i++) { + const void *subblock = av_iamf_param_definition_get_subblock(param, i); + + switch (param->type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + const AVIAMFMixGain *mix = subblock; + ffio_write_leb(dyn_bc, mix->subblock_duration); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + const AVIAMFDemixingInfo *demix = subblock; + ffio_write_leb(dyn_bc, demix->subblock_duration); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + const AVIAMFReconGain *recon = subblock; + ffio_write_leb(dyn_bc, recon->subblock_duration); + break; + } + } + } + } + } + + return 0; +} + +static int iamf_write_audio_element(const IAMFContext *iamf, + const IAMFAudioElement *audio_element, + AVIOContext *pb, void *log_ctx) +{ + const AVIAMFAudioElement *element = audio_element->element; + const IAMFCodecConfig *codec_config = iamf->codec_configs[audio_element->codec_config_id]; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + PutBitContext pbc; + int param_definition_types = AV_IAMF_PARAMETER_DEFINITION_DEMIXING, dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, audio_element->audio_element_id); + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 3, element->audio_element_type); + put_bits(&pbc, 5, 0); + flush_put_bits(&pbc); + avio_write(dyn_bc, header, put_bytes_count(&pbc, 1)); + + ffio_write_leb(dyn_bc, audio_element->codec_config_id); + ffio_write_leb(dyn_bc, audio_element->nb_substreams); + + for (int i = 0; i < audio_element->nb_substreams; i++) + ffio_write_leb(dyn_bc, audio_element->substreams[i].audio_substream_id); + + if (element->nb_layers == 1) + 
param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_DEMIXING; + if (element->nb_layers > 1) + param_definition_types |= AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN; + if (codec_config->codec_tag == MKTAG('f','L','a','C') || + codec_config->codec_tag == MKTAG('i','p','c','m')) + param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN; + + ffio_write_leb(dyn_bc, av_popcount(param_definition_types)); // num_parameters + + if (param_definition_types & 1) { + const AVIAMFParamDefinition *param = element->demixing_info; + const IAMFParamDefinition *param_def; + const AVIAMFDemixingInfo *demix; + + if (!param) { + av_log(log_ctx, AV_LOG_ERROR, "demixing_info needed but not set in Stream Group #%u\n", + audio_element->audio_element_id); + return AVERROR(EINVAL); + } + + demix = av_iamf_param_definition_get_subblock(param, 0); + ffio_write_leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_DEMIXING); // type + + param_def = ff_iamf_get_param_definition(iamf, param->parameter_id); + ret = param_definition(iamf, param_def, dyn_bc, log_ctx); + if (ret < 0) + return ret; + + avio_w8(dyn_bc, demix->dmixp_mode << 5); // dmixp_mode + avio_w8(dyn_bc, element->default_w << 4); // default_w + } + if (param_definition_types & 2) { + const AVIAMFParamDefinition *param = element->recon_gain_info; + const IAMFParamDefinition *param_def; + + if (!param) { + av_log(log_ctx, AV_LOG_ERROR, "recon_gain_info needed but not set in Stream Group #%u\n", + audio_element->audio_element_id); + return AVERROR(EINVAL); + } + ffio_write_leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN); // type + + param_def = ff_iamf_get_param_definition(iamf, param->parameter_id); + ret = param_definition(iamf, param_def, dyn_bc, log_ctx); + if (ret < 0) + return ret; + } + + if (element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(audio_element, dyn_bc); + if (ret < 0) + return ret; + } else { + ret = ambisonics_config(audio_element, dyn_bc); + if (ret < 0) 
+ return ret; + } + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 5, IAMF_OBU_IA_AUDIO_ELEMENT); + put_bits(&pbc, 3, 0); + flush_put_bits(&pbc); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(pb, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(pb, dyn_size); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static int iamf_write_mixing_presentation(const IAMFContext *iamf, + const IAMFMixPresentation *mix_presentation, + AVIOContext *pb, void *log_ctx) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + const AVIAMFMixPresentation *mix = mix_presentation->mix; + const AVDictionaryEntry *tag = NULL; + PutBitContext pbc; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, mix_presentation->mix_presentation_id); // mix_presentation_id + ffio_write_leb(dyn_bc, av_dict_count(mix->annotations)); // count_label + + while ((tag = av_dict_iterate(mix->annotations, tag))) + avio_put_str(dyn_bc, tag->key); + while ((tag = av_dict_iterate(mix->annotations, tag))) + avio_put_str(dyn_bc, tag->value); + + ffio_write_leb(dyn_bc, mix->nb_submixes); + for (int i = 0; i < mix->nb_submixes; i++) { + const AVIAMFSubmix *sub_mix = mix->submixes[i]; + const IAMFParamDefinition *param_def; + + ffio_write_leb(dyn_bc, sub_mix->nb_elements); + for (int j = 0; j < sub_mix->nb_elements; j++) { + const IAMFAudioElement *audio_element = NULL; + const AVIAMFSubmixElement *submix_element = sub_mix->elements[j]; + + for (int k = 0; k < iamf->nb_audio_elements; k++) + if (iamf->audio_elements[k]->audio_element_id == submix_element->audio_element_id) { + audio_element = iamf->audio_elements[k]; + break; + } + + av_assert0(audio_element); + ffio_write_leb(dyn_bc, submix_element->audio_element_id); + + if (av_dict_count(submix_element->annotations) != av_dict_count(mix->annotations)) { + av_log(log_ctx, AV_LOG_ERROR, 
"Inconsistent amount of labels in submix %d from Mix Presentation id #%u\n", + j, audio_element->audio_element_id); + return AVERROR(EINVAL); + } + while ((tag = av_dict_iterate(submix_element->annotations, tag))) + avio_put_str(dyn_bc, tag->value); + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 2, submix_element->headphones_rendering_mode); + put_bits(&pbc, 6, 0); // reserved + flush_put_bits(&pbc); + avio_write(dyn_bc, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(dyn_bc, 0); // rendering_config_extension_size + + param_def = ff_iamf_get_param_definition(iamf, submix_element->element_mix_config->parameter_id); + ret = param_definition(iamf, param_def, dyn_bc, log_ctx); + if (ret < 0) + return ret; + + avio_wb16(dyn_bc, rescale_rational(submix_element->default_mix_gain, 1 << 8)); + } + + param_def = ff_iamf_get_param_definition(iamf, sub_mix->output_mix_config->parameter_id); + ret = param_definition(iamf, param_def, dyn_bc, log_ctx); + if (ret < 0) + return ret; + avio_wb16(dyn_bc, rescale_rational(sub_mix->default_mix_gain, 1 << 8)); + + ffio_write_leb(dyn_bc, sub_mix->nb_layouts); // nb_layouts + for (int i = 0; i < sub_mix->nb_layouts; i++) { + const AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[i]; + int layout = 0, info_type; + int dialogue = submix_layout->dialogue_anchored_loudness.num && + submix_layout->dialogue_anchored_loudness.den; + int album = submix_layout->album_anchored_loudness.num && + submix_layout->album_anchored_loudness.den; + + if (layout == FF_ARRAY_ELEMS(ff_iamf_sound_system_map)) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid Sound System value in a submix\n"); + return AVERROR(EINVAL); + } + + if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) { + for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_sound_system_map); layout++) { + if (!av_channel_layout_compare(&submix_layout->sound_system, &ff_iamf_sound_system_map[layout].layout)) + break; + } + if (layout == 
FF_ARRAY_ELEMS(ff_iamf_sound_system_map)) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid Sound System value in a submix\n"); + return AVERROR(EINVAL); + } + } + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 2, submix_layout->layout_type); // layout_type + if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) { + put_bits(&pbc, 4, ff_iamf_sound_system_map[layout].id); // sound_system + put_bits(&pbc, 2, 0); // reserved + } else + put_bits(&pbc, 6, 0); // reserved + flush_put_bits(&pbc); + avio_write(dyn_bc, header, put_bytes_count(&pbc, 1)); + + info_type = (submix_layout->true_peak.num && submix_layout->true_peak.den); + info_type |= (dialogue || album) << 1; + avio_w8(dyn_bc, info_type); + avio_wb16(dyn_bc, rescale_rational(submix_layout->integrated_loudness, 1 << 8)); + avio_wb16(dyn_bc, rescale_rational(submix_layout->digital_peak, 1 << 8)); + if (info_type & 1) + avio_wb16(dyn_bc, rescale_rational(submix_layout->true_peak, 1 << 8)); + if (info_type & 2) { + avio_w8(dyn_bc, dialogue + album); // num_anchored_loudness + if (dialogue) { + avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_DIALOGUE); + avio_wb16(dyn_bc, rescale_rational(submix_layout->dialogue_anchored_loudness, 1 << 8)); + } + if (album) { + avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_ALBUM); + avio_wb16(dyn_bc, rescale_rational(submix_layout->album_anchored_loudness, 1 << 8)); + } + } + } + } + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 5, IAMF_OBU_IA_MIX_PRESENTATION); + put_bits(&pbc, 3, 0); + flush_put_bits(&pbc); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(pb, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(pb, dyn_size); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +int ff_iamf_write_descriptors(const IAMFContext *iamf, AVIOContext *pb, void *log_ctx) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pbc; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size; + + int ret = 
avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + // Sequence Header + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 5, IAMF_OBU_IA_SEQUENCE_HEADER); + put_bits(&pbc, 3, 0); + flush_put_bits(&pbc); + + avio_write(dyn_bc, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(dyn_bc, 6); + avio_wb32(dyn_bc, MKBETAG('i','a','m','f')); + avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // primary_profile + avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // additional_profile + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + for (int i = 0; i < iamf->nb_codec_configs; i++) { + ret = iamf_write_codec_config(iamf, iamf->codec_configs[i], pb); + if (ret < 0) + return ret; + } + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + ret = iamf_write_audio_element(iamf, iamf->audio_elements[i], pb, log_ctx); + if (ret < 0) + return ret; + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + ret = iamf_write_mixing_presentation(iamf, iamf->mix_presentations[i], pb, log_ctx); + if (ret < 0) + return ret; + } + + return 0; +} diff --git a/libavformat/iamf_writer.h b/libavformat/iamf_writer.h new file mode 100644 index 0000000000..93354670b8 --- /dev/null +++ b/libavformat/iamf_writer.h @@ -0,0 +1,51 @@ +/* + * Immersive Audio Model and Formats muxing helpers and structs + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFORMAT_IAMF_WRITER_H +#define AVFORMAT_IAMF_WRITER_H + +#include + +#include "libavutil/common.h" +#include "avformat.h" +#include "avio.h" +#include "iamf.h" + +static inline IAMFParamDefinition *ff_iamf_get_param_definition(const IAMFContext *iamf, + unsigned int parameter_id) +{ + IAMFParamDefinition *param_definition = NULL; + + for (int i = 0; i < iamf->nb_param_definitions; i++) + if (iamf->param_definitions[i]->param->parameter_id == parameter_id) { + param_definition = iamf->param_definitions[i]; + break; + } + + return param_definition; +} + +int ff_iamf_add_audio_element(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx); +int ff_iamf_add_mix_presentation(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx); + +int ff_iamf_write_descriptors(const IAMFContext *iamf, AVIOContext *pb, void *log_ctx); + +#endif /* AVFORMAT_IAMF_WRITER_H */ diff --git a/libavformat/iamfenc.c b/libavformat/iamfenc.c new file mode 100644 index 0000000000..0a043ce3a0 --- /dev/null +++ b/libavformat/iamfenc.c @@ -0,0 +1,387 @@ +/* + * IAMF muxer + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/avassert.h" +#include "libavutil/common.h" +#include "libavutil/iamf.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/put_bits.h" +#include "avformat.h" +#include "avio_internal.h" +#include "iamf.h" +#include "iamf_writer.h" +#include "internal.h" +#include "mux.h" + +typedef struct IAMFMuxContext { + IAMFContext iamf; + + int first_stream_id; +} IAMFMuxContext; + +static int iamf_init(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + int nb_audio_elements = 0, nb_mix_presentations = 0; + int ret; + + if (!s->nb_streams) { + av_log(s, AV_LOG_ERROR, "There must be at least one stream\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < s->nb_streams; i++) { + if (s->streams[i]->codecpar->codec_type != AVMEDIA_TYPE_AUDIO || + (s->streams[i]->codecpar->codec_tag != MKTAG('m','p','4','a') && + s->streams[i]->codecpar->codec_tag != MKTAG('O','p','u','s') && + s->streams[i]->codecpar->codec_tag != MKTAG('f','L','a','C') && + s->streams[i]->codecpar->codec_tag != MKTAG('i','p','c','m'))) { + av_log(s, AV_LOG_ERROR, "Unsupported codec id %s\n", + avcodec_get_name(s->streams[i]->codecpar->codec_id)); + return AVERROR(EINVAL); + } + + if (s->streams[i]->codecpar->ch_layout.nb_channels > 2) { + av_log(s, AV_LOG_ERROR, "Unsupported channel layout on stream #%d\n", i); + return AVERROR(EINVAL); + } + + for (int j = 0; j < i; j++) { + if (s->streams[i]->id == s->streams[j]->id) { + av_log(s, AV_LOG_ERROR, "Duplicated stream id %d\n", s->streams[j]->id); + return AVERROR(EINVAL); + } + } + } + + if (!s->nb_stream_groups) { + av_log(s, AV_LOG_ERROR, "There must be at least two stream groups\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < 
s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + nb_audio_elements++; + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + nb_mix_presentations++; + } + if (nb_audio_elements < 1 || nb_audio_elements > 2 || nb_mix_presentations < 1) { + av_log(s, AV_LOG_ERROR, "There must be >= 1 and <= 2 IAMF_AUDIO_ELEMENT and at least " + "one IAMF_MIX_PRESENTATION stream groups\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + continue; + + ret = ff_iamf_add_audio_element(iamf, stg, s); + if (ret < 0) + return ret; + } + + for (int i = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + continue; + + ret = ff_iamf_add_mix_presentation(iamf, stg, s); + if (ret < 0) + return ret; + } + + c->first_stream_id = s->streams[0]->id; + + return 0; +} + +static int iamf_write_header(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + int ret; + + ret = ff_iamf_write_descriptors(iamf, s->pb, s); + if (ret < 0) + return ret; + + c->first_stream_id = s->streams[0]->id; + + return 0; +} + +static inline int rescale_rational(AVRational q, int b) +{ + return av_clip_int16(av_rescale(q.num, b, q.den)); +} + +static int write_parameter_block(AVFormatContext *s, const AVIAMFParamDefinition *param) +{ + const IAMFMuxContext *const c = s->priv_data; + const IAMFContext *const iamf = &c->iamf; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + IAMFParamDefinition *param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size, ret; + + if (param->type > AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { +
av_log(s, AV_LOG_DEBUG, "Ignoring side data with unknown type %u\n", + param->type); + return 0; + } + + if (!param_definition) { + av_log(s, AV_LOG_ERROR, "Non-existent Parameter Definition with ID %u referenced by a packet\n", + param->parameter_id); + return AVERROR(EINVAL); + } + + if (param->type != param_definition->param->type) { + av_log(s, AV_LOG_ERROR, "Inconsistent values for Parameter Definition " + "with ID %u in a packet\n", + param->parameter_id); + return AVERROR(EINVAL); + } + + ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + // Parameter Block + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_PARAMETER_BLOCK); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + + ffio_write_leb(dyn_bc, param->parameter_id); + if (!param_definition->mode) { + ffio_write_leb(dyn_bc, param->duration); + ffio_write_leb(dyn_bc, param->constant_subblock_duration); + if (param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, param->nb_subblocks); + } + + for (int i = 0; i < param->nb_subblocks; i++) { + const void *subblock = av_iamf_param_definition_get_subblock(param, i); + + switch (param->type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + const AVIAMFMixGain *mix = subblock; + if (!param_definition->mode && param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, mix->subblock_duration); + + ffio_write_leb(dyn_bc, mix->animation_type); + + avio_wb16(dyn_bc, rescale_rational(mix->start_point_value, 1 << 8)); + if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + avio_wb16(dyn_bc, rescale_rational(mix->end_point_value, 1 << 8)); + if (mix->animation_type == AV_IAMF_ANIMATION_TYPE_BEZIER) { + avio_wb16(dyn_bc, rescale_rational(mix->control_point_value, 1 << 8)); + avio_w8(dyn_bc, av_clip_uint8(av_rescale(mix->control_point_relative_time.num, 1 << 8, + mix->control_point_relative_time.den))); + } + break; + } + case 
AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + const AVIAMFDemixingInfo *demix = subblock; + if (!param_definition->mode && param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, demix->subblock_duration); + + avio_w8(dyn_bc, demix->dmixp_mode << 5); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + const AVIAMFReconGain *recon = subblock; + const AVIAMFAudioElement *audio_element = param_definition->audio_element->element; + + if (!param_definition->mode && param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, recon->subblock_duration); + + if (!audio_element) { + av_log(s, AV_LOG_ERROR, "Invalid Parameter Definition with ID %u referenced by a packet\n", param->parameter_id); + return AVERROR(EINVAL); + } + + for (int j = 0; j < audio_element->nb_layers; j++) { + const AVIAMFLayer *layer = audio_element->layers[j]; + + if (layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN) { + unsigned int recon_gain_flags = 0; + int k = 0; + + for (; k < 7; k++) + recon_gain_flags |= (1 << k) * !!recon->recon_gain[j][k]; + for (; k < 12; k++) + recon_gain_flags |= (2 << k) * !!recon->recon_gain[j][k]; + if (recon_gain_flags >> 8) + recon_gain_flags |= (1 << k); + + ffio_write_leb(dyn_bc, recon_gain_flags); + for (k = 0; k < 12; k++) { + if (recon->recon_gain[j][k]) + avio_w8(dyn_bc, recon->recon_gain[j][k]); + } + } + } + break; + } + default: + av_assert0(0); + } + } + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + ffio_write_leb(s->pb, dyn_size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static int iamf_write_packet(AVFormatContext *s, AVPacket *pkt) +{ + const IAMFMuxContext *const c = s->priv_data; + AVStream *st = s->streams[pkt->stream_index]; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *side_data, *dyn_buf = NULL; + unsigned int skip_samples = 0, discard_padding = 0; + size_t side_data_size; + int dyn_size, type = st->id <= 17 ? 
st->id + IAMF_OBU_IA_AUDIO_FRAME_ID0 : IAMF_OBU_IA_AUDIO_FRAME; + int ret; + + if (s->nb_stream_groups && st->id == c->first_stream_id) { + AVIAMFParamDefinition *mix = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, NULL); + AVIAMFParamDefinition *demix = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, NULL); + AVIAMFParamDefinition *recon = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, NULL); + + if (mix) { + ret = write_parameter_block(s, mix); + if (ret < 0) + return ret; + } + if (demix) { + ret = write_parameter_block(s, demix); + if (ret < 0) + return ret; + } + if (recon) { + ret = write_parameter_block(s, recon); + if (ret < 0) + return ret; + } + } + side_data = av_packet_get_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES, + &side_data_size); + + if (side_data && side_data_size >= 10) { + skip_samples = AV_RL32(side_data); + discard_padding = AV_RL32(side_data + 4); + } + + ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, type); + put_bits(&pb, 1, 0); // obu_redundant_copy + put_bits(&pb, 1, skip_samples || discard_padding); + put_bits(&pb, 1, 0); // obu_extension_flag + flush_put_bits(&pb); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + + if (skip_samples || discard_padding) { + ffio_write_leb(dyn_bc, discard_padding); + ffio_write_leb(dyn_bc, skip_samples); + } + + if (st->id > 17) + ffio_write_leb(dyn_bc, st->id); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + ffio_write_leb(s->pb, dyn_size + pkt->size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + avio_write(s->pb, pkt->data, pkt->size); + + return 0; +} + +static void iamf_deinit(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + IAMFAudioElement 
*audio_element = iamf->audio_elements[i]; + audio_element->element = NULL; + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = iamf->mix_presentations[i]; + mix_presentation->mix = NULL; + } + + ff_iamf_uninit_context(iamf); + + return; +} + +static const AVCodecTag iamf_codec_tags[] = { + { AV_CODEC_ID_AAC, MKTAG('m','p','4','a') }, + { AV_CODEC_ID_FLAC, MKTAG('f','L','a','C') }, + { AV_CODEC_ID_OPUS, MKTAG('O','p','u','s') }, + { AV_CODEC_ID_PCM_S16LE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S16BE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S24LE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S24BE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S32LE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S32BE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_NONE, MKTAG('i','p','c','m') } +}; + +const FFOutputFormat ff_iamf_muxer = { + .p.name = "iamf", + .p.long_name = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model and Formats"), + .p.extensions = "iamf", + .priv_data_size = sizeof(IAMFMuxContext), + .p.audio_codec = AV_CODEC_ID_OPUS, + .init = iamf_init, + .deinit = iamf_deinit, + .write_header = iamf_write_header, + .write_packet = iamf_write_packet, + .p.codec_tag = (const AVCodecTag* const []){ iamf_codec_tags, NULL }, + .p.flags = AVFMT_GLOBALHEADER | AVFMT_NOTIMESTAMPS, +};