From patchwork Tue Dec 5 22:43:55 2023
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 44934
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 5 Dec 2023 19:43:55 -0300
Message-ID: <20231205224402.14540-2-jamrial@gmail.com>
In-Reply-To: <20231205224402.14540-1-jamrial@gmail.com>
References: <20231205224402.14540-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 1/8] avutil: introduce an Immersive Audio Model and Formats API

Signed-off-by: James Almer
---
 libavutil/Makefile |   2 +
 libavutil/iamf.c   | 564 ++++++++++++++++++++++++++++++++++++++++++++
 libavutil/iamf.h   | 573 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1139 insertions(+)
 create mode 100644 libavutil/iamf.c
 create mode 100644 libavutil/iamf.h

diff --git a/libavutil/Makefile b/libavutil/Makefile
index 4711f8cde8..62cc1a1831 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -51,6 +51,7 @@ HEADERS = adler32.h \
           hwcontext_videotoolbox.h \
           hwcontext_vdpau.h \
           hwcontext_vulkan.h \
+          iamf.h \
           imgutils.h \
           intfloat.h \
           intreadwrite.h \
@@ -140,6 +141,7 @@ OBJS = adler32.o \
        hdr_dynamic_vivid_metadata.o \
        hmac.o \
        hwcontext.o \
+       iamf.o \
        imgutils.o \
        integer.o \
        intmath.o \
diff --git a/libavutil/iamf.c b/libavutil/iamf.c
new file mode 100644
index 0000000000..5f646f2d65
--- /dev/null
+++ b/libavutil/iamf.c
@@ -0,0 +1,564 @@
+/*
+ * Immersive Audio Model and Formats helper functions and defines
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <limits.h>
+#include <stddef.h>
+#include <stdint.h>
+
+#include "avassert.h"
+#include "error.h"
+#include "iamf.h"
+#include "log.h"
+#include "mem.h"
+#include "opt.h"
+
+#define IAMF_ADD_FUNC_TEMPLATE(parent_type, parent_name, child_type, child_name, suffix) \
+child_type *av_iamf_ ## parent_name ## _add_ ## child_name(parent_type *parent_name) \
+{ \
+    child_type **child_name ## suffix, *child_name; \
+ \
+    if (parent_name->nb_## child_name ## suffix == UINT_MAX) \
+        return NULL; \
+ \
+    child_name ## suffix = av_realloc_array(parent_name->child_name ## suffix, \
+                                            parent_name->nb_## child_name ## suffix + 1, \
+                                            sizeof(*parent_name->child_name ## suffix)); \
+    if (!child_name ## suffix) \
+        return NULL; \
+ \
+    parent_name->child_name ## suffix = child_name ## suffix; \
+ \
+    child_name = parent_name->child_name ## suffix[parent_name->nb_## child_name ## suffix] \
+               = av_mallocz(sizeof(*child_name)); \
+    if (!child_name) \
+        return NULL; \
+ \
+    child_name->av_class = &child_name ## _class; \
+    av_opt_set_defaults(child_name); \
+    parent_name->nb_## child_name ## suffix++; \
+ \
+    return child_name; \
+}
+
+#define FLAGS AV_OPT_FLAG_ENCODING_PARAM
+
+//
+// Param Definition
+//
+#define OFFSET(x) offsetof(AVIAMFMixGain, x)
+static const AVOption mix_gain_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { "animation_type", "set animation_type", OFFSET(animation_type), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 2, FLAGS },
+    { "start_point_value", "set start_point_value", OFFSET(start_point_value), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "end_point_value", "set end_point_value", OFFSET(end_point_value), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "control_point_value", "set control_point_value", OFFSET(control_point_value), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "control_point_relative_time", "set control_point_relative_time", OFFSET(control_point_relative_time), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, 0.0, 1.0, FLAGS },
+    { NULL },
+};
+
+static const AVClass mix_gain_class = {
+    .class_name = "AVIAMFMixGain",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = mix_gain_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFDemixingInfo, x)
+static const AVOption demixing_info_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { "dmixp_mode", "set dmixp_mode", OFFSET(dmixp_mode), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 6, FLAGS },
+    { NULL },
+};
+
+static const AVClass demixing_info_class = {
+    .class_name = "AVIAMFDemixingInfo",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = demixing_info_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFReconGain, x)
+static const AVOption recon_gain_options[] = {
+    { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS },
+    { NULL },
+};
+
+static const AVClass recon_gain_class = {
+    .class_name = "AVIAMFReconGain",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = recon_gain_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFParamDefinition, x)
+static const AVOption param_definition_options[] = {
+    { "parameter_id", "set parameter_id", OFFSET(parameter_id), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "parameter_rate", "set parameter_rate", OFFSET(parameter_rate), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "param_definition_mode", "set param_definition_mode", OFFSET(param_definition_mode), AV_OPT_TYPE_INT, {.i64 = 1 }, 0, 1, FLAGS },
+    { "duration", "set duration", OFFSET(duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { "constant_subblock_duration", "set constant_subblock_duration", OFFSET(constant_subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS },
+    { NULL },
+};
+
+static const AVClass *param_definition_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    switch(i) {
+    case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+        ret = &mix_gain_class;
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+        ret = &demixing_info_class;
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+        ret = &recon_gain_class;
+        break;
+    default:
+        break;
+    }
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass param_definition_class = {
+    .class_name = "AVIAMFParamDefinition",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = param_definition_options,
+    .child_class_iterate = param_definition_child_iterate,
+};
+
+const AVClass *av_iamf_param_definition_get_class(void)
+{
+    return &param_definition_class;
+}
+
+AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType type,
+                                                      unsigned int nb_subblocks, size_t *out_size)
+{
+
+    struct MixGainStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFMixGain m;
+    };
+    struct DemixStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFDemixingInfo d;
+    };
+    struct ReconGainStruct {
+        AVIAMFParamDefinition p;
+        AVIAMFReconGain r;
+    };
+    size_t subblocks_offset, subblock_size;
+    size_t size;
+    AVIAMFParamDefinition *par;
+
+    switch (type) {
+    case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+        subblocks_offset = offsetof(struct MixGainStruct, m);
+        subblock_size = sizeof(AVIAMFMixGain);
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+        subblocks_offset = offsetof(struct DemixStruct, d);
+        subblock_size = sizeof(AVIAMFDemixingInfo);
+        break;
+    case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+        subblocks_offset = offsetof(struct ReconGainStruct, r);
+        subblock_size = sizeof(AVIAMFReconGain);
+        break;
+    default:
+        return NULL;
+    }
+
+    size = subblocks_offset;
+    if (nb_subblocks > (SIZE_MAX - size) / subblock_size)
+        return NULL;
+    size += subblock_size * nb_subblocks;
+
+    par = av_mallocz(size);
+    if (!par)
+        return NULL;
+
+    par->av_class = &param_definition_class;
+    av_opt_set_defaults(par);
+
+    par->param_definition_type = type;
+    par->nb_subblocks = nb_subblocks;
+    par->subblock_size = subblock_size;
+    par->subblocks_offset = subblocks_offset;
+
+    for (int i = 0; i < nb_subblocks; i++) {
+        void *subblock = av_iamf_param_definition_get_subblock(par, i);
+
+        switch (type) {
+        case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN:
+            ((AVIAMFMixGain *)subblock)->av_class = &mix_gain_class;
+            break;
+        case AV_IAMF_PARAMETER_DEFINITION_DEMIXING:
+            ((AVIAMFDemixingInfo *)subblock)->av_class = &demixing_info_class;
+            break;
+        case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN:
+            ((AVIAMFReconGain *)subblock)->av_class = &recon_gain_class;
+            break;
+        default:
+            av_assert0(0);
+        }
+
+        av_opt_set_defaults(subblock);
+    }
+
+    if (out_size)
+        *out_size = size;
+
+    return par;
+}
+
+//
+// Audio Element
+//
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFLayer, x)
+static const AVOption layer_options[] = {
+    { "ch_layout", "set ch_layout", OFFSET(ch_layout), AV_OPT_TYPE_CHLAYOUT, {.str = NULL }, 0, 0, FLAGS },
+    { "flags", "set flags", OFFSET(flags), AV_OPT_TYPE_FLAGS,
+        {.i64 = 0 }, 0, AV_IAMF_LAYER_FLAG_RECON_GAIN, FLAGS, "flags" },
+        {"recon_gain", "Recon gain is present", 0, AV_OPT_TYPE_CONST,
+            {.i64 = AV_IAMF_LAYER_FLAG_RECON_GAIN }, INT_MIN, INT_MAX, FLAGS, "flags"},
+    { "output_gain_flags", "set output_gain_flags", OFFSET(output_gain_flags), AV_OPT_TYPE_FLAGS,
+        {.i64 = 0 }, 0, (1 << 6) - 1, FLAGS, "output_gain_flags" },
+        {"FL", "Left channel", 0, AV_OPT_TYPE_CONST,
+            {.i64 = 1 << 5 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+        {"FR", "Right channel", 0, AV_OPT_TYPE_CONST,
+            {.i64 = 1 << 4 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+        {"BL", "Left surround channel", 0, AV_OPT_TYPE_CONST,
+            {.i64 = 1 << 3 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+        {"BR", "Right surround channel", 0, AV_OPT_TYPE_CONST,
+            {.i64 = 1 << 2 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+        {"TFL", "Left top front channel", 0, AV_OPT_TYPE_CONST,
+            {.i64 = 1 << 1 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+        {"TFR", "Right top front channel", 0, AV_OPT_TYPE_CONST,
+            {.i64 = 1 << 0 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"},
+    { "output_gain", "set output_gain", OFFSET(output_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "ambisonics_mode", "set ambisonics_mode", OFFSET(ambisonics_mode), AV_OPT_TYPE_INT,
+        { .i64 = AV_IAMF_AMBISONICS_MODE_MONO },
+        AV_IAMF_AMBISONICS_MODE_MONO, AV_IAMF_AMBISONICS_MODE_PROJECTION, FLAGS, "ambisonics_mode" },
+        { "mono", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_IAMF_AMBISONICS_MODE_MONO }, .unit = "ambisonics_mode" },
+        { "projection", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_IAMF_AMBISONICS_MODE_PROJECTION }, .unit = "ambisonics_mode" },
+    { NULL },
+};
+
+static const AVClass layer_class = {
+    .class_name = "AVIAMFLayer",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = layer_options,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFAudioElement, x)
+static const AVOption audio_element_options[] = {
+    { "audio_element_type", "set audio_element_type", OFFSET(audio_element_type), AV_OPT_TYPE_INT,
+        {.i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL },
+        AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, FLAGS, "audio_element_type" },
+        { "channel", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL }, .unit = "audio_element_type" },
+        { "scene", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE }, .unit = "audio_element_type" },
+    { "default_w", "set default_w", OFFSET(default_w), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 10, FLAGS },
+    { NULL },
+};
+
+static const AVClass *audio_element_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &layer_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass audio_element_class = {
+    .class_name = "AVIAMFAudioElement",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = audio_element_options,
+    .child_class_iterate = audio_element_child_iterate,
+};
+
+const AVClass *av_iamf_audio_element_get_class(void)
+{
+    return &audio_element_class;
+}
+
+AVIAMFAudioElement *av_iamf_audio_element_alloc(void)
+{
+    AVIAMFAudioElement *audio_element = av_mallocz(sizeof(*audio_element));
+
+    if (audio_element) {
+        audio_element->av_class = &audio_element_class;
+        av_opt_set_defaults(audio_element);
+    }
+
+    return audio_element;
+}
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFAudioElement, audio_element, AVIAMFLayer, layer, s)
+
+void av_iamf_audio_element_free(AVIAMFAudioElement **paudio_element)
+{
+    AVIAMFAudioElement *audio_element = *paudio_element;
+
+    if (!audio_element)
+        return;
+
+    for (int i = 0; i < audio_element->nb_layers; i++) {
+        AVIAMFLayer *layer = audio_element->layers[i];
+        av_opt_free(layer);
+        av_free(layer->demixing_matrix);
+        av_free(layer);
+    }
+    av_free(audio_element->layers);
+
+    av_free(audio_element->demixing_info);
+    av_free(audio_element->recon_gain_info);
+    av_freep(paudio_element);
+}
+
+//
+// Mix Presentation
+//
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmixElement, x)
+static const AVOption submix_element_options[] = {
+    { "headphones_rendering_mode", "Headphones rendering mode", OFFSET(headphones_rendering_mode), AV_OPT_TYPE_INT,
+        { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO },
+        AV_IAMF_HEADPHONES_MODE_STEREO, AV_IAMF_HEADPHONES_MODE_BINAURAL, FLAGS, "headphones_rendering_mode" },
+        { "stereo", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO }, .unit = "headphones_rendering_mode" },
+        { "binaural", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_IAMF_HEADPHONES_MODE_BINAURAL }, .unit = "headphones_rendering_mode" },
+    { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "annotations", "Annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, { .str = NULL }, 0, 0, FLAGS },
+    { NULL },
+};
+
+static void *submix_element_child_next(void *obj, void *prev)
+{
+    AVIAMFSubmixElement *submix_element = obj;
+    if (!prev)
+        return submix_element->element_mix_config;
+
+    return NULL;
+}
+
+static const AVClass *submix_element_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &param_definition_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass element_class = {
+    .class_name = "AVIAMFSubmixElement",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = submix_element_options,
+    .child_next = submix_element_child_next,
+    .child_class_iterate = submix_element_child_iterate,
+};
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixElement, element, s)
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmixLayout, x)
+static const AVOption submix_layout_options[] = {
+    { "layout_type", "Layout type", OFFSET(layout_type), AV_OPT_TYPE_INT,
+        { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS },
+        AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL, FLAGS, "layout_type" },
+        { "loudspeakers", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS }, .unit = "layout_type" },
+        { "binaural", NULL, 0, AV_OPT_TYPE_CONST,
+            { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL }, .unit = "layout_type" },
+    { "sound_system", "Sound System", OFFSET(sound_system), AV_OPT_TYPE_CHLAYOUT, { .str = NULL }, 0, 0, FLAGS },
+    { "integrated_loudness", "Integrated loudness", OFFSET(integrated_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "digital_peak", "Digital peak", OFFSET(digital_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "true_peak", "True peak", OFFSET(true_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "dialog_anchored_loudness", "Anchored loudness (Dialog)", OFFSET(dialogue_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { "album_anchored_loudness", "Anchored loudness (Album)", OFFSET(album_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { NULL },
+};
+
+static const AVClass layout_class = {
+    .class_name = "AVIAMFSubmixLayout",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = submix_layout_options,
+};
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixLayout, layout, s)
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFSubmix, x)
+static const AVOption submix_presentation_options[] = {
+    { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS },
+    { NULL },
+};
+
+static void *submix_presentation_child_next(void *obj, void *prev)
+{
+    AVIAMFSubmix *sub_mix = obj;
+    if (!prev)
+        return sub_mix->output_mix_config;
+
+    return NULL;
+}
+
+static const AVClass *submix_presentation_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    switch(i) {
+    case 0:
+        ret = &element_class;
+        break;
+    case 1:
+        ret = &layout_class;
+        break;
+    case 2:
+        ret = &param_definition_class;
+        break;
+    default:
+        break;
+    }
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass submix_class = {
+    .class_name = "AVIAMFSubmix",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = submix_presentation_options,
+    .child_next = submix_presentation_child_next,
+    .child_class_iterate = submix_presentation_child_iterate,
+};
+
+#undef OFFSET
+#define OFFSET(x) offsetof(AVIAMFMixPresentation, x)
+static const AVOption mix_presentation_options[] = {
+    { "annotations", "set annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, {.str = NULL }, 0, 0, FLAGS },
+    { NULL },
+};
+
+#undef OFFSET
+#undef FLAGS
+
+static const AVClass *mix_presentation_child_iterate(void **opaque)
+{
+    uintptr_t i = (uintptr_t)*opaque;
+    const AVClass *ret = NULL;
+
+    if (i)
+        ret = &submix_class;
+
+    if (ret)
+        *opaque = (void*)(i + 1);
+    return ret;
+}
+
+static const AVClass mix_presentation_class = {
+    .class_name = "AVIAMFMixPresentation",
+    .item_name = av_default_item_name,
+    .version = LIBAVUTIL_VERSION_INT,
+    .option = mix_presentation_options,
+    .child_class_iterate = mix_presentation_child_iterate,
+};
+
+const AVClass *av_iamf_mix_presentation_get_class(void)
+{
+    return &mix_presentation_class;
+}
+
+AVIAMFMixPresentation *av_iamf_mix_presentation_alloc(void)
+{
+    AVIAMFMixPresentation *mix_presentation = av_mallocz(sizeof(*mix_presentation));
+
+    if (mix_presentation) {
+        mix_presentation->av_class = &mix_presentation_class;
+        av_opt_set_defaults(mix_presentation);
+    }
+
+    return mix_presentation;
+}
+
+IAMF_ADD_FUNC_TEMPLATE(AVIAMFMixPresentation, mix_presentation, AVIAMFSubmix, submix, es)
+
+void av_iamf_mix_presentation_free(AVIAMFMixPresentation **pmix_presentation)
+{
+    AVIAMFMixPresentation *mix_presentation = *pmix_presentation;
+
+    if (!mix_presentation)
+        return;
+
+    for (int i = 0; i < mix_presentation->nb_submixes; i++) {
+        AVIAMFSubmix *sub_mix = mix_presentation->submixes[i];
+        for (int j = 0; j < sub_mix->nb_elements; j++) {
+            AVIAMFSubmixElement *submix_element = sub_mix->elements[j];
+            av_opt_free(submix_element);
+            av_free(submix_element->element_mix_config);
+            av_free(submix_element);
+        }
+        av_free(sub_mix->elements);
+        for (int j = 0; j < sub_mix->nb_layouts; j++) {
+            AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[j];
+            av_opt_free(submix_layout);
+            av_free(submix_layout);
+        }
+        av_free(sub_mix->layouts);
+        av_free(sub_mix->output_mix_config);
+        av_free(sub_mix);
+    }
+    av_opt_free(mix_presentation);
+    av_free(mix_presentation->submixes);
+
+    av_freep(pmix_presentation);
+}
diff --git a/libavutil/iamf.h b/libavutil/iamf.h
new file mode 100644
index 0000000000..bc0363153d
--- /dev/null
+++ b/libavutil/iamf.h
@@ -0,0 +1,573 @@
+/*
+ * Immersive Audio Model and Formats helper functions and defines
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_IAMF_H
+#define AVUTIL_IAMF_H
+
+/**
+ * @file
+ * Immersive Audio Model and Formats API header
+ * @see Immersive Audio Model and Formats
+ */
+
+#include <stdint.h>
+#include <stddef.h>
+
+#include "attributes.h"
+#include "avassert.h"
+#include "channel_layout.h"
+#include "dict.h"
+#include "rational.h"
+
+/**
+ * @defgroup lavf_iamf_params Parameter Definition
+ * @{
+ * Parameters as defined in sections 3.6.1 and 3.8 of IAMF.
+ * @}
+ * @defgroup lavf_iamf_audio Audio Element
+ * @{
+ * Audio Elements as defined in section 3.6 of IAMF.
+ * @}
+ * @defgroup lavf_iamf_mix Mix Presentation
+ * @{
+ * Mix Presentations as defined in section 3.7 of IAMF.
+ * @}
+ *
+ * @}
+ * @addtogroup lavf_iamf_params
+ * @{
+ */
+enum AVIAMFAnimationType {
+    AV_IAMF_ANIMATION_TYPE_STEP,
+    AV_IAMF_ANIMATION_TYPE_LINEAR,
+    AV_IAMF_ANIMATION_TYPE_BEZIER,
+};
+
+/**
+ * Mix Gain Parameter Data as defined in section 3.8.1 of IAMF.
+ *
+ * Subblocks in AVIAMFParamDefinition use this struct when the value of
+ * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
+ * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
+ */
+typedef struct AVIAMFMixGain {
+    const AVClass *av_class;
+
+    unsigned int subblock_duration;
+    /**
+     * The type of animation applied to the parameter values.
+     */
+    enum AVIAMFAnimationType animation_type;
+    /**
+     * Parameter value that is applied at the start of the subblock.
+     * Applies to all defined Animation Types.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational start_point_value;
+    /**
+     * Parameter value that is applied at the end of the subblock.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_LINEAR and
+     * AV_IAMF_ANIMATION_TYPE_BEZIER Animation Types.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational end_point_value;
+    /**
+     * Parameter value of the middle control point of a quadratic Bezier
+     * curve, i.e., its y-axis value.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_BEZIER Animation Type.
+     *
+     * Valid range of values is -128.0 to 128.0
+     */
+    AVRational control_point_value;
+    /**
+     * Parameter value of the time of the middle control point of a
+     * quadratic Bezier curve, i.e., its x-axis value.
+     * Applies only to AV_IAMF_ANIMATION_TYPE_BEZIER Animation Type.
+     *
+     * Valid range of values is 0.0 to 1.0
+     */
+    AVRational control_point_relative_time;
+} AVIAMFMixGain;
+
+/**
+ * Demixing Info Parameter Data as defined in section 3.8.2 of IAMF.
+ *
+ * Subblocks in AVIAMFParamDefinition use this struct when the value of
+ * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
+ * AV_IAMF_PARAMETER_DEFINITION_DEMIXING.
+ */
+typedef struct AVIAMFDemixingInfo {
+    const AVClass *av_class;
+
+    unsigned int subblock_duration;
+    /**
+     * Pre-defined combination of demixing parameters.
+     */
+    unsigned int dmixp_mode;
+} AVIAMFDemixingInfo;
+
+/**
+ * Recon Gain Info Parameter Data as defined in section 3.8.3 of IAMF.
+ *
+ * Subblocks in AVIAMFParamDefinition use this struct when the value of
+ * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is
+ * AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN.
+ */
+typedef struct AVIAMFReconGain {
+    const AVClass *av_class;
+
+    unsigned int subblock_duration;
+
+    /**
+     * Array of gain values to be applied to each channel for each layer
+     * defined in the Audio Element referencing the parent Parameter Definition.
+     * Values for layers where the AV_IAMF_LAYER_FLAG_RECON_GAIN flag is not set
+     * are undefined.
+     *
+     * Channel order is: FL, C, FR, SL, SR, TFL, TFR, BL, BR, TBL, TBR, LFE
+     */
+    uint8_t recon_gain[6][12];
+} AVIAMFReconGain;
+
+enum AVIAMFParamDefinitionType {
+    AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN,
+    AV_IAMF_PARAMETER_DEFINITION_DEMIXING,
+    AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN,
+};
+
+/**
+ * Parameters as defined in section 3.6.1 of IAMF.
+ */
+typedef struct AVIAMFParamDefinition {
+    const AVClass *av_class;
+
+    size_t subblocks_offset;
+    size_t subblock_size;
+
+    enum AVIAMFParamDefinitionType param_definition_type;
+    unsigned int nb_subblocks;
+
+    unsigned int parameter_id;
+    unsigned int parameter_rate;
+    unsigned int param_definition_mode;
+    unsigned int duration;
+    unsigned int constant_subblock_duration;
+} AVIAMFParamDefinition;
+
+const AVClass *av_iamf_param_definition_get_class(void);
+
+/**
+ * Allocates memory for AVIAMFParamDefinition, plus an array of {@code nb_subblocks}
+ * amount of subblocks of the given type and initializes the variables. Can be
+ * freed with a normal av_free() call.
+ *
+ * @param size if non-NULL, the size in bytes of the resulting data array is written here.
+ */
+AVIAMFParamDefinition *av_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType param_definition_type,
+                                                      unsigned int nb_subblocks, size_t *size);
+
+/**
+ * Get the subblock at the specified {@code idx}. Must be between 0 and nb_subblocks - 1.
+ *
+ * The @ref AVIAMFParamDefinition.param_definition_type "param definition type" defines
+ * the struct type of the returned pointer.
+ */
+static av_always_inline void*
+av_iamf_param_definition_get_subblock(const AVIAMFParamDefinition *par, unsigned int idx)
+{
+    av_assert0(idx < par->nb_subblocks);
+    return (void *)((uint8_t *)par + par->subblocks_offset + idx * par->subblock_size);
+}
+
+/**
+ * @}
+ * @addtogroup lavf_iamf_audio
+ * @{
+ */
+
+enum AVIAMFAmbisonicsMode {
+    AV_IAMF_AMBISONICS_MODE_MONO,
+    AV_IAMF_AMBISONICS_MODE_PROJECTION,
+};
+
+/**
+ * Recon gain information for the layer is present in AVIAMFReconGain
+ */
+#define AV_IAMF_LAYER_FLAG_RECON_GAIN (1 << 0)
+
+/**
+ * A layer defining a Channel Layout in the Audio Element.
+ *
+ * When @ref AVIAMFAudioElement.audio_element_type "the parent's Audio Element type"
+ * is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, this corresponds to a Scalable Channel
+ * Layout layer as defined in section 3.6.2 of IAMF.
+ * For AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, it is an Ambisonics channel
+ * layout as defined in section 3.6.3 of IAMF.
+ */
+typedef struct AVIAMFLayer {
+    const AVClass *av_class;
+
+    AVChannelLayout ch_layout;
+
+    /**
+     * A bitmask which may contain a combination of AV_IAMF_LAYER_FLAG_* flags.
+     */
+    unsigned int flags;
+    /**
+     * Output gain channel flags as defined in section 3.6.2 of IAMF.
+     *
+     * This field is defined only if @ref AVIAMFAudioElement.audio_element_type
+     * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL,
+     * must be 0 otherwise.
+     */
+    unsigned int output_gain_flags;
+    /**
+     * Output gain as defined in section 3.6.2 of IAMF.
+     *
+     * Must be 0 if @ref output_gain_flags is 0.
+     */
+    AVRational output_gain;
+    /**
+     * Ambisonics mode as defined in section 3.6.3 of IAMF.
+     *
+     * This field is defined only if @ref AVIAMFAudioElement.audio_element_type
+     * "the parent's Audio Element type" is AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE.
+     *
+     * If AV_IAMF_AMBISONICS_MODE_MONO, channel_mapping is defined implicitly
+     * (Ambisonic Order) or explicitly (Custom Order with ambi channels) in
+     * @ref ch_layout.
+ * If AV_IAMF_AMBISONICS_MODE_PROJECTION, @ref demixing_matrix must be set. + */ + enum AVIAMFAmbisonicsMode ambisonics_mode; + + /** + * Demixing matrix as defined in section 3.6.3 of IAMF. + * + * May be set only if @ref ambisonics_mode == AV_IAMF_AMBISONICS_MODE_PROJECTION, + * must be NULL otherwise. + */ + AVRational *demixing_matrix; +} AVIAMFLayer; + + +enum AVIAMFAudioElementType { + AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, + AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, +}; + +typedef struct AVIAMFAudioElement { + const AVClass *av_class; + + AVIAMFLayer **layers; + /** + * Number of layers, or channel groups, in the Audio Element. + * For @ref audio_element_type AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, there + * may be exactly 1. + * + * Set by av_iamf_audio_element_add_layer(), must not be + * modified by any other code. + */ + unsigned int nb_layers; + + /** + * Demixing information used to reconstruct a scalable channel audio + * representation. + * The @ref AVIAMFParamDefinition.param_definition_type "type" must be + * AV_IAMF_PARAMETER_DEFINITION_DEMIXING. + */ + AVIAMFParamDefinition *demixing_info; + /** + * Recon gain information used to reconstruct a scalable channel audio + * representation. + * The @ref AVIAMFParamDefinition.param_definition_type "type" must be + * AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN. + */ + AVIAMFParamDefinition *recon_gain_info; + + /** + * Audio element type as defined in section 3.6 of IAMF. + */ + enum AVIAMFAudioElementType audio_element_type; + + /** + * Default weight value as defined in section 3.6 of IAMF. + */ + unsigned int default_w; +} AVIAMFAudioElement; + +const AVClass *av_iamf_audio_element_get_class(void); + +/** + * Allocates an AVIAMFAudioElement, and initializes its fields with default values. + * No layers are allocated. Must be freed with av_iamf_audio_element_free(). 
+ * + * @see av_iamf_audio_element_add_layer() + */ +AVIAMFAudioElement *av_iamf_audio_element_alloc(void); + +/** + * Allocate a layer and add it to a given AVIAMFAudioElement. + * It is freed by av_iamf_audio_element_free() alongside the rest of the parent + * AVIAMFAudioElement. + * + * @return a pointer to the allocated layer. + */ +AVIAMFLayer *av_iamf_audio_element_add_layer(AVIAMFAudioElement *audio_element); + +void av_iamf_audio_element_free(AVIAMFAudioElement **audio_element); + +/** + * @} + * @addtogroup lavf_iamf_mix + * @{ + */ + +enum AVIAMFHeadphonesMode { + /** + * The referenced Audio Element shall be rendered to stereo loudspeakers. + */ + AV_IAMF_HEADPHONES_MODE_STEREO, + /** + * The referenced Audio Element shall be rendered with a binaural renderer. + */ + AV_IAMF_HEADPHONES_MODE_BINAURAL, +}; + +typedef struct AVIAMFSubmixElement { + const AVClass *av_class; + + /** + * The id of the Audio Element this submix element references. + */ + unsigned int audio_element_id; + + /** + * Information required for applying any processing to the + * referenced and rendered Audio Element before being summed with other + * processed Audio Elements. + * The @ref AVIAMFParamDefinition.param_definition_type "type" must be + * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN. + */ + AVIAMFParamDefinition *element_mix_config; + + /** + * Default mix gain value to apply when there are no AVIAMFParamDefinition + * with @ref element_mix_config "element_mix_config's" + * @ref AVIAMFParamDefinition.parameter_id "parameter_id" available for a + * given audio frame. + */ + AVRational default_mix_gain; + + /** + * A value that indicates whether the referenced channel-based Audio Element + * shall be rendered to stereo loudspeakers or spatialized with a binaural + * renderer when played back on headphones. + * If the Audio Element is not of @ref AVIAMFAudioElement.audio_element_type + * "type" AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, then this field is undefined. 
+ */ + enum AVIAMFHeadphonesMode headphones_rendering_mode; + + /** + * A dictionary of strings describing the submix in different languages. + * Must have the same amount of entries as + * @ref AVIAMFMixPresentation.annotations "the mix's annotations", stored + * in the same order, and with the same key strings. + * + * @ref AVDictionaryEntry.key "key" is a string conforming to BCP-47 that + * specifies the language for the string stored in + * @ref AVDictionaryEntry.value "value". + */ + AVDictionary *annotations; +} AVIAMFSubmixElement; + +enum AVIAMFSubmixLayoutType { + /** + * The layout follows the loudspeaker sound system convention of ITU-2051-3. + */ + AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS = 2, + /** + * The layout is binaural. + */ + AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL = 3, +}; + +typedef struct AVIAMFSubmixLayout { + const AVClass *av_class; + + enum AVIAMFSubmixLayoutType layout_type; + + /** + * Channel layout matching one of Sound Systems A to J of ITU-2051-3, plus + * 7.1.2ch and 3.1.2ch + * If layout_type is not AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, this field + * is undefined. + */ + AVChannelLayout sound_system; + /** + * The program integrated loudness information, as defined in + * ITU-1770-4. + */ + AVRational integrated_loudness; + /** + * The digital (sampled) peak value of the audio signal, as defined + * in ITU-1770-4. + */ + AVRational digital_peak; + /** + * The true peak of the audio signal, as defined in ITU-1770-4. + */ + AVRational true_peak; + /** + * The Dialogue loudness information, as defined in ITU-1770-4. + */ + AVRational dialogue_anchored_loudness; + /** + * The Album loudness information, as defined in ITU-1770-4. + */ + AVRational album_anchored_loudness; +} AVIAMFSubmixLayout; + +typedef struct AVIAMFSubmix { + const AVClass *av_class; + + /** + * Array of submix elements. + * + * Set by av_iamf_submix_add_element(), must not be modified by any + * other code. 
+ */ + AVIAMFSubmixElement **elements; + /** + * Number of elements in the submix. + * + * Set by av_iamf_submix_add_element(), must not be modified by any + * other code. + */ + unsigned int nb_elements; + + /** + * Array of submix layouts. + * + * Set by av_iamf_submix_add_layout(), must not be modified by any + * other code. + */ + AVIAMFSubmixLayout **layouts; + /** + * Number of layouts in the submix. + * + * Set by av_iamf_submix_add_layout(), must not be modified by any + * other code. + */ + unsigned int nb_layouts; + + /** + * Information required for post-processing the mixed audio signal to + * generate the audio signal for playback. + * The @ref AVIAMFParamDefinition.param_definition_type "type" must be + * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN. + */ + AVIAMFParamDefinition *output_mix_config; + + /** + * Default mix gain value to apply when there are no AVIAMFParamDefinition + * with @ref output_mix_config "output_mix_config's" + * @ref AVIAMFParamDefinition.parameter_id "parameter_id" available for a + * given audio frame. + */ + AVRational default_mix_gain; +} AVIAMFSubmix; + +typedef struct AVIAMFMixPresentation { + const AVClass *av_class; + + /** + * Array of submixes. + * + * Set by av_iamf_mix_presentation_add_submix(), must not be modified + * by any other code. + */ + AVIAMFSubmix **submixes; + /** + * Number of submixes in the presentation. + * + * Set by av_iamf_mix_presentation_add_submix(), must not be modified + * by any other code. + */ + unsigned int nb_submixes; + + /** + * A dictionary of strings describing the mix in different languages. + * Must have the same amount of entries as every + * @ref AVIAMFSubmixElement.annotations "Submix element annotations", + * stored in the same order, and with the same key strings. + * + * @ref AVDictionaryEntry.key "key" is a string conforming to BCP-47 + * that specifies the language for the string stored in + * @ref AVDictionaryEntry.value "value". 
+ */ + AVDictionary *annotations; +} AVIAMFMixPresentation; + +const AVClass *av_iamf_mix_presentation_get_class(void); + +/** + * Allocates an AVIAMFMixPresentation, and initializes its fields with default + * values. No submixes are allocated. + * Must be freed with av_iamf_mix_presentation_free(). + * + * @see av_iamf_mix_presentation_add_submix() + */ +AVIAMFMixPresentation *av_iamf_mix_presentation_alloc(void); + +/** + * Allocate a submix and add it to a given AVIAMFMixPresentation. + * It is freed by av_iamf_mix_presentation_free() alongside the rest of the + * parent AVIAMFMixPresentation. + * + * @return a pointer to the allocated submix. + */ +AVIAMFSubmix *av_iamf_mix_presentation_add_submix(AVIAMFMixPresentation *mix_presentation); + +/** + * Allocate a submix element and add it to a given AVIAMFSubmix. + * It is freed by av_iamf_mix_presentation_free() alongside the rest of the + * parent AVIAMFSubmix. + * + * @return a pointer to the allocated submix element. + */ +AVIAMFSubmixElement *av_iamf_submix_add_element(AVIAMFSubmix *submix); + +/** + * Allocate a submix layout and add it to a given AVIAMFSubmix. + * It is freed by av_iamf_mix_presentation_free() alongside the rest of the + * parent AVIAMFSubmix. + * + * @return a pointer to the allocated submix layout. 
+ */ +AVIAMFSubmixLayout *av_iamf_submix_add_layout(AVIAMFSubmix *submix); + +void av_iamf_mix_presentation_free(AVIAMFMixPresentation **mix_presentation); +/** + * @} + */ + +#endif /* AVUTIL_IAMF_H */
From patchwork Tue Dec 5 22:43:56 2023 X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44935 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 5 Dec 2023 19:43:56 -0300 Message-ID: <20231205224402.14540-3-jamrial@gmail.com> In-Reply-To: <20231205224402.14540-1-jamrial@gmail.com> References: <20231205224402.14540-1-jamrial@gmail.com> Subject: [FFmpeg-devel] [PATCH 2/8] avformat: introduce AVStreamGroup
Signed-off-by: James Almer --- doc/fftools-common-opts.texi | 17 +++- libavformat/avformat.c | 185 ++++++++++++++++++++++++++++++++++- libavformat/avformat.h | 169 ++++++++++++++++++++++++++++++++ libavformat/dump.c | 147 +++++++++++++++++++++++----- libavformat/internal.h | 33 +++++++ libavformat/options.c | 139 ++++++++++++++++++++++++++ 6 files changed, 656 insertions(+), 34 deletions(-) diff --git a/doc/fftools-common-opts.texi b/doc/fftools-common-opts.texi index d9145704d6..f459bfdc1d 100644 --- a/doc/fftools-common-opts.texi +++ b/doc/fftools-common-opts.texi @@ -37,9 +37,9 @@ Matches the stream with this index. E.g. @code{-threads:1 4} would set the thread count for the second stream to 4. If @var{stream_index} is used as an additional stream specifier (see below), then it selects stream number @var{stream_index} from the matching streams. Stream numbering is based on the -order of the streams as detected by libavformat except when a program ID is -also specified. 
In this case it is based on the ordering of the streams in the -program. +order of the streams as detected by libavformat except when a stream group +specifier or program ID is also specified. In this case it is based on the +ordering of the streams in the group or program. @item @var{stream_type}[:@var{additional_stream_specifier}] @var{stream_type} is one of following: 'v' or 'V' for video, 'a' for audio, 's' for subtitle, 'd' for data, and 't' for attachments. 'v' matches all video @@ -48,6 +48,17 @@ thumbnails or cover arts. If @var{additional_stream_specifier} is used, then it matches streams which both have this type and match the @var{additional_stream_specifier}. Otherwise, it matches all streams of the specified type. +@item g:@var{group_specifier}[:@var{additional_stream_specifier}] +Matches streams which are in the group with the specifier @var{group_specifier}. +If @var{additional_stream_specifier} is used, then it matches streams which both +are part of the group and match the @var{additional_stream_specifier}. +@var{group_specifier} may be one of the following: +@table @option +@item @var{group_index} +Match the streams in the group with this index. +@item #@var{group_id} or i:@var{group_id} +Match the streams in the group with this id. +@end table @item p:@var{program_id}[:@var{additional_stream_specifier}] Matches streams which are in the program with the id @var{program_id}. 
If @var{additional_stream_specifier} is used, then it matches streams which both diff --git a/libavformat/avformat.c b/libavformat/avformat.c index 5b8bb7879e..a02ec965dd 100644 --- a/libavformat/avformat.c +++ b/libavformat/avformat.c @@ -24,6 +24,7 @@ #include "libavutil/avstring.h" #include "libavutil/channel_layout.h" #include "libavutil/frame.h" +#include "libavutil/iamf.h" #include "libavutil/intreadwrite.h" #include "libavutil/mem.h" #include "libavutil/opt.h" @@ -80,6 +81,32 @@ FF_ENABLE_DEPRECATION_WARNINGS av_freep(pst); } +void ff_free_stream_group(AVStreamGroup **pstg) +{ + AVStreamGroup *stg = *pstg; + + if (!stg) + return; + + av_freep(&stg->streams); + av_dict_free(&stg->metadata); + av_freep(&stg->priv_data); + switch (stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: { + av_iamf_audio_element_free(&stg->params.iamf_audio_element); + break; + } + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: { + av_iamf_mix_presentation_free(&stg->params.iamf_mix_presentation); + break; + } + default: + break; + } + + av_freep(pstg); +} + void ff_remove_stream(AVFormatContext *s, AVStream *st) { av_assert0(s->nb_streams>0); @@ -88,6 +115,14 @@ void ff_remove_stream(AVFormatContext *s, AVStream *st) ff_free_stream(&s->streams[ --s->nb_streams ]); } +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg) +{ + av_assert0(s->nb_stream_groups > 0); + av_assert0(s->stream_groups[ s->nb_stream_groups - 1 ] == stg); + + ff_free_stream_group(&s->stream_groups[ --s->nb_stream_groups ]); +} + /* XXX: suppress the packet queue */ void ff_flush_packet_queue(AVFormatContext *s) { @@ -118,6 +153,9 @@ void avformat_free_context(AVFormatContext *s) for (unsigned i = 0; i < s->nb_streams; i++) ff_free_stream(&s->streams[i]); + for (unsigned i = 0; i < s->nb_stream_groups; i++) + ff_free_stream_group(&s->stream_groups[i]); + s->nb_stream_groups = 0; s->nb_streams = 0; for (unsigned i = 0; i < s->nb_programs; i++) { @@ -139,6 +177,7 @@ void 
avformat_free_context(AVFormatContext *s) av_packet_free(&si->pkt); av_packet_free(&si->parse_pkt); av_freep(&s->streams); + av_freep(&s->stream_groups); ff_flush_packet_queue(s); av_freep(&s->url); av_free(s); @@ -464,7 +503,7 @@ int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type, */ static int match_stream_specifier(const AVFormatContext *s, const AVStream *st, const char *spec, const char **indexptr, - const AVProgram **p) + const AVStreamGroup **g, const AVProgram **p) { int match = 1; /* Stores if the specifier matches so far. */ while (*spec) { @@ -493,6 +532,46 @@ static int match_stream_specifier(const AVFormatContext *s, const AVStream *st, match = 0; if (nopic && (st->disposition & AV_DISPOSITION_ATTACHED_PIC)) match = 0; + } else if (*spec == 'g' && *(spec + 1) == ':') { + int64_t group_idx = -1, group_id = -1; + int found = 0; + char *endptr; + spec += 2; + if (*spec == '#' || (*spec == 'i' && *(spec + 1) == ':')) { + spec += 1 + (*spec == 'i'); + group_id = strtol(spec, &endptr, 0); + if (spec == endptr || (*endptr && *endptr++ != ':')) + return AVERROR(EINVAL); + spec = endptr; + } else { + group_idx = strtol(spec, &endptr, 0); + /* Disallow empty id and make sure that if we are not at the end, then another specifier must follow. 
*/ + if (spec == endptr || (*endptr && *endptr++ != ':')) + return AVERROR(EINVAL); + spec = endptr; + } + if (match) { + if (group_id > 0) { + for (unsigned i = 0; i < s->nb_stream_groups; i++) { + if (group_id == s->stream_groups[i]->id) { + group_idx = i; + break; + } + } + } + if (group_idx < 0 || group_idx >= s->nb_stream_groups) + return AVERROR(EINVAL); + for (unsigned j = 0; j < s->stream_groups[group_idx]->nb_streams; j++) { + if (st->index == s->stream_groups[group_idx]->streams[j]->index) { + found = 1; + if (g) + *g = s->stream_groups[group_idx]; + break; + } + } + } + if (!found) + match = 0; } else if (*spec == 'p' && *(spec + 1) == ':') { int prog_id; int found = 0; @@ -591,10 +670,11 @@ int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st, int ret, index; char *endptr; const char *indexptr = NULL; + const AVStreamGroup *g = NULL; const AVProgram *p = NULL; int nb_streams; - ret = match_stream_specifier(s, st, spec, &indexptr, &p); + ret = match_stream_specifier(s, st, spec, &indexptr, &g, &p); if (ret < 0) goto error; @@ -612,10 +692,11 @@ int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st, return (index == st->index); /* If we requested a matching stream index, we have to ensure st is that. */ - nb_streams = p ? p->nb_stream_indexes : s->nb_streams; + nb_streams = g ? g->nb_streams : (p ? p->nb_stream_indexes : s->nb_streams); for (int i = 0; i < nb_streams && index >= 0; i++) { - const AVStream *candidate = s->streams[p ? p->stream_index[i] : i]; - ret = match_stream_specifier(s, candidate, spec, NULL, NULL); + unsigned idx = g ? g->streams[i]->index : (p ? p->stream_index[i] : i); + const AVStream *candidate = s->streams[idx]; + ret = match_stream_specifier(s, candidate, spec, NULL, NULL, NULL); if (ret < 0) goto error; if (ret > 0 && index-- == 0 && st == candidate) @@ -629,6 +710,100 @@ error: return ret; } +/** + * Matches a stream group specifier (but ignores requested index). 
+ * + * @param indexptr set to point to the requested stream group index if there is one + * + * @return <0 on error + * 0 if stg is NOT a matching stream group + * >0 if stg is a matching stream group + */ +static int match_stream_group_specifier(const AVFormatContext *s, const AVStreamGroup *stg, + const char *spec, const char **indexptr) +{ + int match = 1; /* Stores if the specifier matches so far. */ + while (*spec) { + if (*spec <= '9' && *spec >= '0') { /* opt:index */ + if (indexptr) + *indexptr = spec; + return match; + } else if (*spec == 't' && *(spec + 1) == ':') { + int64_t group_type = -1; + int found = 0; + char *endptr; + spec += 2; + group_type = strtol(spec, &endptr, 0); + /* Disallow empty type and make sure that if we are not at the end, then another specifier must follow. */ + if (spec == endptr || (*endptr && *endptr++ != ':')) + return AVERROR(EINVAL); + spec = endptr; + if (match && group_type > 0) { + for (unsigned i = 0; i < s->nb_stream_groups; i++) { + if (group_type == s->stream_groups[i]->type) { + found = 1; + break; + } + } + } + if (!found) + match = 0; + } else if (*spec == '#' || + (*spec == 'i' && *(spec + 1) == ':')) { + int group_id; + char *endptr; + spec += 1 + (*spec == 'i'); + group_id = strtol(spec, &endptr, 0); + if (spec == endptr || *endptr) /* Disallow empty id and make sure we are at the end. */ + return AVERROR(EINVAL); + return match && (group_id == stg->id); + } + } + + return match; +} + +int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg, + const char *spec) +{ + int ret, index; + char *endptr; + const char *indexptr = NULL; + + ret = match_stream_group_specifier(s, stg, spec, &indexptr); + if (ret < 0) + goto error; + + if (!indexptr) + return ret; + + index = strtol(indexptr, &endptr, 0); + if (*endptr) { /* We can't have anything after the requested index. */ + ret = AVERROR(EINVAL); + goto error; + } + + /* This is not really needed but saves us a loop for simple stream group index specifiers. 
*/ + if (spec == indexptr) + return (index == stg->index); + + /* If we requested a matching stream group index, we have to ensure stg is that. */ + for (int i = 0; i < s->nb_stream_groups && index >= 0; i++) { + const AVStreamGroup *candidate = s->stream_groups[i]; + ret = match_stream_group_specifier(s, candidate, spec, NULL); + if (ret < 0) + goto error; + if (ret > 0 && index-- == 0 && stg == candidate) + return 1; + } + return 0; + +error: + if (ret == AVERROR(EINVAL)) + av_log(s, AV_LOG_ERROR, "Invalid stream group specifier: %s.\n", spec); + return ret; +} + AVRational av_guess_sample_aspect_ratio(AVFormatContext *format, AVStream *stream, AVFrame *frame) { AVRational undef = {0, 1}; diff --git a/libavformat/avformat.h b/libavformat/avformat.h index 9e7eca007e..9e428ee843 100644 --- a/libavformat/avformat.h +++ b/libavformat/avformat.h @@ -1018,6 +1018,83 @@ typedef struct AVStream { int pts_wrap_bits; } AVStream; +enum AVStreamGroupParamsType { + AV_STREAM_GROUP_PARAMS_NONE, + AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT, + AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION, +}; + +struct AVIAMFAudioElement; +struct AVIAMFMixPresentation; + +typedef struct AVStreamGroup { + /** + * A class for @ref avoptions. Set by avformat_stream_group_create(). + */ + const AVClass *av_class; + + void *priv_data; + + /** + * Group index in AVFormatContext. + */ + unsigned int index; + + /** + * Group type-specific group ID. + * + * decoding: set by libavformat + * encoding: may be set by the user + */ + int64_t id; + + /** + * Group type + * + * decoding: set by libavformat on group creation + * encoding: set by avformat_stream_group_create() + */ + enum AVStreamGroupParamsType type; + + /** + * Group type-specific parameters + */ + union { + struct AVIAMFAudioElement *iamf_audio_element; + struct AVIAMFMixPresentation *iamf_mix_presentation; + } params; + + /** + * Metadata that applies to the whole group. 
+ * + * - demuxing: set by libavformat on group creation + * - muxing: may be set by the caller before avformat_write_header() + * + * Freed by libavformat in avformat_free_context(). + */ + AVDictionary *metadata; + + /** + * Number of elements in AVStreamGroup.streams. + * + * Set by avformat_stream_group_add_stream(), must not be modified by any other code. + */ + unsigned int nb_streams; + + /** + * A list of streams in the group. New entries are created with + * avformat_stream_group_add_stream(). + * + * - demuxing: entries are created by libavformat on group creation. + * If AVFMTCTX_NOHEADER is set in ctx_flags, then new entries may also + * appear in av_read_frame(). + * - muxing: entries are created by the user before avformat_write_header(). + * + * Freed by libavformat in avformat_free_context(). + */ + AVStream **streams; +} AVStreamGroup; + struct AVCodecParserContext *av_stream_get_parser(const AVStream *s); #if FF_API_GET_END_PTS @@ -1726,6 +1803,26 @@ typedef struct AVFormatContext { * @return 0 on success, a negative AVERROR code on failure */ int (*io_close2)(struct AVFormatContext *s, AVIOContext *pb); + + /** + * Number of elements in AVFormatContext.stream_groups. + * + * Set by avformat_stream_group_create(), must not be modified by any other code. + */ + unsigned int nb_stream_groups; + + /** + * A list of all stream groups in the file. New groups are created with + * avformat_stream_group_create(), and filled with avformat_stream_group_add_stream(). + * + * - demuxing: groups may be created by libavformat in avformat_open_input(). + * If AVFMTCTX_NOHEADER is set in ctx_flags, then new groups may also + * appear in av_read_frame(). + * - muxing: groups may be created by the user before avformat_write_header(). + * + * Freed by libavformat in avformat_free_context(). 
+ */ + AVStreamGroup **stream_groups; } AVFormatContext; /** @@ -1844,6 +1941,37 @@ const AVClass *avformat_get_class(void); */ const AVClass *av_stream_get_class(void); +/** + * Get the AVClass for AVStreamGroup. It can be used in combination with + * AV_OPT_SEARCH_FAKE_OBJ for examining options. + * + * @see av_opt_find(). + */ +const AVClass *av_stream_group_get_class(void); + +/** + * Add a new empty stream group to a media file. + * + * When demuxing, it may be called by the demuxer in read_header(). If the + * flag AVFMTCTX_NOHEADER is set in s.ctx_flags, then it may also + * be called in read_packet(). + * + * When muxing, may be called by the user before avformat_write_header(). + * + * User is required to call avformat_free_context() to clean up the allocation + * by avformat_stream_group_create(). + * + * New streams can be added to the group with avformat_stream_group_add_stream(). + * + * @param s media file handle + * + * @return newly created group or NULL on error. + * @see avformat_new_stream, avformat_stream_group_add_stream. + */ +AVStreamGroup *avformat_stream_group_create(AVFormatContext *s, + enum AVStreamGroupParamsType type, + AVDictionary **options); + /** * Add a new stream to a media file. * @@ -1863,6 +1991,31 @@ const AVClass *av_stream_get_class(void); */ AVStream *avformat_new_stream(AVFormatContext *s, const struct AVCodec *c); +/** + * Add an already allocated stream to a stream group. + * + * When demuxing, it may be called by the demuxer in read_header(). If the + * flag AVFMTCTX_NOHEADER is set in s.ctx_flags, then it may also + * be called in read_packet(). + * + * When muxing, may be called by the user before avformat_write_header() after + * having allocated a new group with avformat_stream_group_create() and stream with + * avformat_new_stream(). + * + * User is required to call avformat_free_context() to clean up the allocation + * by avformat_stream_group_add_stream(). 
+ * + * @param stg stream group belonging to a media file. + * @param st stream in the media file to add to the group. + * + * @retval 0 success + * @retval AVERROR(EEXIST) the stream was already in the group + * @retval "another negative error code" legitimate errors + * + * @see avformat_new_stream, avformat_stream_group_create. + */ +int avformat_stream_group_add_stream(AVStreamGroup *stg, AVStream *st); + #if FF_API_AVSTREAM_SIDE_DATA /** * Wrap an existing array as stream side data. @@ -2819,6 +2972,22 @@ AVRational av_guess_frame_rate(AVFormatContext *ctx, AVStream *stream, int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st, const char *spec); +/** + * Check if the group stg contained in s is matched by the stream group + * specifier spec. + * + * See the "stream group specifiers" chapter in the documentation for the + * syntax of spec. + * + * @return >0 if stg is matched by spec; + * 0 if stg is not matched by spec; + * AVERROR code if spec is invalid + * + * @note A stream group specifier can match several groups in the format. 
+ */ +int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg, + const char *spec); + int avformat_queue_attached_pictures(AVFormatContext *s); enum AVTimebaseSource { diff --git a/libavformat/dump.c b/libavformat/dump.c index c0868a1bb3..cc179f284f 100644 --- a/libavformat/dump.c +++ b/libavformat/dump.c @@ -24,6 +24,7 @@ #include "libavutil/channel_layout.h" #include "libavutil/display.h" +#include "libavutil/iamf.h" #include "libavutil/intreadwrite.h" #include "libavutil/log.h" #include "libavutil/mastering_display_metadata.h" @@ -134,28 +135,36 @@ static void print_fps(double d, const char *postfix) av_log(NULL, AV_LOG_INFO, "%1.0fk %s", d / 1000, postfix); } -static void dump_metadata(void *ctx, const AVDictionary *m, const char *indent) +static void dump_dictionary(void *ctx, const AVDictionary *m, + const char *name, const char *indent) { - if (m && !(av_dict_count(m) == 1 && av_dict_get(m, "language", NULL, 0))) { - const AVDictionaryEntry *tag = NULL; - - av_log(ctx, AV_LOG_INFO, "%sMetadata:\n", indent); - while ((tag = av_dict_iterate(m, tag))) - if (strcmp("language", tag->key)) { - const char *p = tag->value; - av_log(ctx, AV_LOG_INFO, - "%s %-16s: ", indent, tag->key); - while (*p) { - size_t len = strcspn(p, "\x8\xa\xb\xc\xd"); - av_log(ctx, AV_LOG_INFO, "%.*s", (int)(FFMIN(255, len)), p); - p += len; - if (*p == 0xd) av_log(ctx, AV_LOG_INFO, " "); - if (*p == 0xa) av_log(ctx, AV_LOG_INFO, "\n%s %-16s: ", indent, ""); - if (*p) p++; - } - av_log(ctx, AV_LOG_INFO, "\n"); + const AVDictionaryEntry *tag = NULL; + + if (!m) + return; + + av_log(ctx, AV_LOG_INFO, "%s%s:\n", indent, name); + while ((tag = av_dict_iterate(m, tag))) + if (strcmp("language", tag->key)) { + const char *p = tag->value; + av_log(ctx, AV_LOG_INFO, + "%s %-16s: ", indent, tag->key); + while (*p) { + size_t len = strcspn(p, "\x8\xa\xb\xc\xd"); + av_log(ctx, AV_LOG_INFO, "%.*s", (int)(FFMIN(255, len)), p); + p += len; + if (*p == 0xd) av_log(ctx, 
AV_LOG_INFO, " "); + if (*p == 0xa) av_log(ctx, AV_LOG_INFO, "\n%s %-16s: ", indent, ""); + if (*p) p++; } - } + av_log(ctx, AV_LOG_INFO, "\n"); + } +} + +static void dump_metadata(void *ctx, const AVDictionary *m, const char *indent) +{ + if (m && !(av_dict_count(m) == 1 && av_dict_get(m, "language", NULL, 0))) + dump_dictionary(ctx, m, "Metadata", indent); } /* param change side data*/ @@ -509,7 +518,7 @@ static void dump_sidedata(void *ctx, const AVStream *st, const char *indent) /* "user interface" functions */ static void dump_stream_format(const AVFormatContext *ic, int i, - int index, int is_output) + int group_index, int index, int is_output) { char buf[256]; int flags = (is_output ? ic->oformat->flags : ic->iformat->flags); @@ -517,6 +526,8 @@ static void dump_stream_format(const AVFormatContext *ic, int i, const FFStream *const sti = cffstream(st); const AVDictionaryEntry *lang = av_dict_get(st->metadata, "language", NULL, 0); const char *separator = ic->dump_separator; + const char *group_indent = group_index >= 0 ? " " : ""; + const char *extra_indent = group_index >= 0 ? 
" " : " "; AVCodecContext *avctx; int ret; @@ -543,7 +554,8 @@ static void dump_stream_format(const AVFormatContext *ic, int i, avcodec_string(buf, sizeof(buf), avctx, is_output); avcodec_free_context(&avctx); - av_log(NULL, AV_LOG_INFO, " Stream #%d:%d", index, i); + av_log(NULL, AV_LOG_INFO, "%s Stream #%d", group_indent, index); + av_log(NULL, AV_LOG_INFO, ":%d", i); /* the pid is an important information, so we display it */ /* XXX: add a generic system */ @@ -621,9 +633,89 @@ static void dump_stream_format(const AVFormatContext *ic, int i, av_log(NULL, AV_LOG_INFO, " (non-diegetic)"); av_log(NULL, AV_LOG_INFO, "\n"); - dump_metadata(NULL, st->metadata, " "); + dump_metadata(NULL, st->metadata, extra_indent); + + dump_sidedata(NULL, st, extra_indent); +} + +static void dump_stream_group(const AVFormatContext *ic, uint8_t *printed, + int i, int index, int is_output) +{ + const AVStreamGroup *stg = ic->stream_groups[i]; + int flags = (is_output ? ic->oformat->flags : ic->iformat->flags); + char buf[512]; + int ret; - dump_sidedata(NULL, st, " "); + av_log(NULL, AV_LOG_INFO, " Stream group #%d:%d", index, i); + if (flags & AVFMT_SHOW_IDS) + av_log(NULL, AV_LOG_INFO, "[0x%"PRIx64"]", stg->id); + av_log(NULL, AV_LOG_INFO, ":"); + + switch (stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: { + const AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element; + av_log(NULL, AV_LOG_INFO, " IAMF Audio Element\n"); + dump_metadata(NULL, stg->metadata, " "); + for (int j = 0; j < audio_element->nb_layers; j++) { + const AVIAMFLayer *layer = audio_element->layers[j]; + int channel_count = layer->ch_layout.nb_channels; + av_log(NULL, AV_LOG_INFO, " Layer %d:", j); + ret = av_channel_layout_describe(&layer->ch_layout, buf, sizeof(buf)); + if (ret >= 0) + av_log(NULL, AV_LOG_INFO, " %s", buf); + av_log(NULL, AV_LOG_INFO, "\n"); + for (int k = 0; channel_count > 0 && k < stg->nb_streams; k++) { + AVStream *st = stg->streams[k]; + dump_stream_format(ic, 
st->index, i, index, is_output); + printed[st->index] = 1; + channel_count -= st->codecpar->ch_layout.nb_channels; + } + } + break; + } + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: { + const AVIAMFMixPresentation *mix_presentation = stg->params.iamf_mix_presentation; + av_log(NULL, AV_LOG_INFO, " IAMF Mix Presentation\n"); + dump_metadata(NULL, stg->metadata, " "); + dump_dictionary(NULL, mix_presentation->annotations, "Annotations", " "); + for (int j = 0; j < mix_presentation->nb_submixes; j++) { + AVIAMFSubmix *sub_mix = mix_presentation->submixes[j]; + av_log(NULL, AV_LOG_INFO, " Submix %d:\n", j); + for (int k = 0; k < sub_mix->nb_elements; k++) { + const AVIAMFSubmixElement *submix_element = sub_mix->elements[k]; + const AVStreamGroup *audio_element = NULL; + for (int l = 0; l < ic->nb_stream_groups; l++) + if (ic->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT && + ic->stream_groups[l]->id == submix_element->audio_element_id) { + audio_element = ic->stream_groups[l]; + break; + } + if (audio_element) { + av_log(NULL, AV_LOG_INFO, " IAMF Audio Element #%d:%d", + index, audio_element->index); + if (flags & AVFMT_SHOW_IDS) + av_log(NULL, AV_LOG_INFO, "[0x%"PRIx64"]", audio_element->id); + av_log(NULL, AV_LOG_INFO, "\n"); + dump_dictionary(NULL, submix_element->annotations, "Annotations", " "); + } + } + for (int k = 0; k < sub_mix->nb_layouts; k++) { + const AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[k]; + av_log(NULL, AV_LOG_INFO, " Layout #%d:", k); + if (submix_layout->layout_type == 2) { + ret = av_channel_layout_describe(&submix_layout->sound_system, buf, sizeof(buf)); + if (ret >= 0) + av_log(NULL, AV_LOG_INFO, " %s", buf); + } else if (submix_layout->layout_type == 3) + av_log(NULL, AV_LOG_INFO, " Binaural"); + av_log(NULL, AV_LOG_INFO, "\n"); + } + } + break; + } + default: + break; + } } void av_dump_format(AVFormatContext *ic, int index, @@ -699,7 +791,7 @@ void av_dump_format(AVFormatContext *ic, int index, 
 dump_metadata(NULL, program->metadata, " "); for (k = 0; k < program->nb_stream_indexes; k++) { dump_stream_format(ic, program->stream_index[k], - index, is_output); + -1, index, is_output); printed[program->stream_index[k]] = 1; } total += program->nb_stream_indexes; @@ -708,9 +800,12 @@ void av_dump_format(AVFormatContext *ic, int index, av_log(NULL, AV_LOG_INFO, " No Program\n"); } + for (i = 0; i < ic->nb_stream_groups; i++) + dump_stream_group(ic, printed, i, index, is_output); + for (i = 0; i < ic->nb_streams; i++) if (!printed[i]) - dump_stream_format(ic, i, index, is_output); + dump_stream_format(ic, i, -1, index, is_output); av_free(printed); } diff --git a/libavformat/internal.h b/libavformat/internal.h index 7702986c9c..c6181683ef 100644 --- a/libavformat/internal.h +++ b/libavformat/internal.h @@ -202,6 +202,7 @@ typedef struct FFStream { */ AVStream pub; + AVFormatContext *fmtctx; /** * Set to 1 if the codec allows reordering, so pts can be different * from dts. @@ -427,6 +428,26 @@ static av_always_inline const FFStream *cffstream(const AVStream *st) return (const FFStream*)st; } +typedef struct FFStreamGroup { + /** + * The public context. + */ + AVStreamGroup pub; + + AVFormatContext *fmtctx; +} FFStreamGroup; + + +static av_always_inline FFStreamGroup *ffstreamgroup(AVStreamGroup *stg) +{ + return (FFStreamGroup*)stg; +} + +static av_always_inline const FFStreamGroup *cffstreamgroup(const AVStreamGroup *stg) +{ + return (const FFStreamGroup*)stg; +} + #ifdef __GNUC__ #define dynarray_add(tab, nb_ptr, elem)\ do {\ @@ -608,6 +629,18 @@ void ff_free_stream(AVStream **st); */ void ff_remove_stream(AVFormatContext *s, AVStream *st); +/** + * Frees a stream group without modifying the corresponding AVFormatContext. + * Must only be called if the latter doesn't matter or if the stream group + * is not yet attached to an AVFormatContext. + */ +void ff_free_stream_group(AVStreamGroup **pstg); +/** + * Remove a stream group from its AVFormatContext and free it.
+ * The group must be the last stream group of the AVFormatContext. + */ +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg); + unsigned int ff_codec_get_tag(const AVCodecTag *tags, enum AVCodecID id); enum AVCodecID ff_codec_get_id(const AVCodecTag *tags, unsigned int tag); diff --git a/libavformat/options.c b/libavformat/options.c index 1d8c52246b..bf6113ca95 100644 --- a/libavformat/options.c +++ b/libavformat/options.c @@ -26,6 +26,7 @@ #include "libavcodec/codec_par.h" #include "libavutil/avassert.h" +#include "libavutil/iamf.h" #include "libavutil/internal.h" #include "libavutil/intmath.h" #include "libavutil/opt.h" @@ -271,6 +272,7 @@ AVStream *avformat_new_stream(AVFormatContext *s, const AVCodec *c) if (!st->codecpar) goto fail; + sti->fmtctx = s; sti->avctx = avcodec_alloc_context3(NULL); if (!sti->avctx) goto fail; @@ -325,6 +327,143 @@ fail: return NULL; } +static void *stream_group_child_next(void *obj, void *prev) +{ + AVStreamGroup *stg = obj; + if (!prev) { + switch(stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + return stg->params.iamf_audio_element; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + return stg->params.iamf_mix_presentation; + default: + break; + } + } + return NULL; +} + +static const AVClass *stream_group_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + switch(i) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + ret = av_iamf_audio_element_get_class(); + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + ret = av_iamf_mix_presentation_get_class(); + break; + default: + break; + } + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVOption stream_group_options[] = { + {"id", "Set group id", offsetof(AVStreamGroup, id), AV_OPT_TYPE_INT64, {.i64 = 0}, 0, INT64_MAX, AV_OPT_FLAG_ENCODING_PARAM }, + { NULL } +}; + +static const AVClass stream_group_class = { + .class_name = "AVStreamGroup", + .item_name =
av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = stream_group_options, + .child_next = stream_group_child_next, + .child_class_iterate = stream_group_child_iterate, +}; + +const AVClass *av_stream_group_get_class(void) +{ + return &stream_group_class; +} + +AVStreamGroup *avformat_stream_group_create(AVFormatContext *s, + enum AVStreamGroupParamsType type, + AVDictionary **options) +{ + AVStreamGroup **stream_groups; + AVStreamGroup *stg; + FFStreamGroup *stgi; + + stream_groups = av_realloc_array(s->stream_groups, s->nb_stream_groups + 1, + sizeof(*stream_groups)); + if (!stream_groups) + return NULL; + s->stream_groups = stream_groups; + + stgi = av_mallocz(sizeof(*stgi)); + if (!stgi) + return NULL; + stg = &stgi->pub; + + stg->av_class = &stream_group_class; + av_opt_set_defaults(stg); + stg->type = type; + switch (type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + stg->params.iamf_audio_element = av_iamf_audio_element_alloc(); + if (!stg->params.iamf_audio_element) + goto fail; + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + stg->params.iamf_mix_presentation = av_iamf_mix_presentation_alloc(); + if (!stg->params.iamf_mix_presentation) + goto fail; + break; + default: + goto fail; + } + + if (options) { + if (av_opt_set_dict2(stg, options, AV_OPT_SEARCH_CHILDREN)) + goto fail; + } + + stgi->fmtctx = s; + stg->index = s->nb_stream_groups; + + s->stream_groups[s->nb_stream_groups++] = stg; + + return stg; +fail: + ff_free_stream_group(&stg); + return NULL; +} + +static int stream_group_add_stream(AVStreamGroup *stg, AVStream *st) +{ + AVStream **streams = av_realloc_array(stg->streams, stg->nb_streams + 1, + sizeof(*stg->streams)); + if (!streams) + return AVERROR(ENOMEM); + + stg->streams = streams; + stg->streams[stg->nb_streams++] = st; + + return 0; +} + +int avformat_stream_group_add_stream(AVStreamGroup *stg, AVStream *st) +{ + const FFStreamGroup *stgi = cffstreamgroup(stg); + const FFStream *sti = 
cffstream(st); + + if (stgi->fmtctx != sti->fmtctx) + return AVERROR(EINVAL); + + for (int i = 0; i < stg->nb_streams; i++) + if (stg->streams[i]->index == st->index) + return AVERROR(EEXIST); + + return stream_group_add_stream(stg, st); +} + static int option_is_disposition(const AVOption *opt) { return opt->type == AV_OPT_TYPE_CONST &&
From patchwork Tue Dec 5 22:43:57 2023 X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44936 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 5 Dec 2023 19:43:57 -0300 Message-ID: <20231205224402.14540-4-jamrial@gmail.com> In-Reply-To: <20231205224402.14540-1-jamrial@gmail.com> References: <20231205224402.14540-1-jamrial@gmail.com> Subject: [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups Starting with IAMF support. Signed-off-by: James Almer --- fftools/ffmpeg.h | 2 + fftools/ffmpeg_mux_init.c | 335 ++++++++++++++++++++++++++++++++++++++ fftools/ffmpeg_opt.c | 2 + 3 files changed, 339 insertions(+) diff --git a/fftools/ffmpeg.h b/fftools/ffmpeg.h index 41935d39d5..057535adbb 100644 --- a/fftools/ffmpeg.h +++ b/fftools/ffmpeg.h @@ -262,6 +262,8 @@ typedef struct OptionsContext { int nb_disposition; SpecifierOpt *program; int nb_program; + SpecifierOpt *stream_groups; + int nb_stream_groups; SpecifierOpt *time_bases; int nb_time_bases; SpecifierOpt *enc_time_bases; diff --git a/fftools/ffmpeg_mux_init.c b/fftools/ffmpeg_mux_init.c index 63a25a350f..7648f2a2f1 100644 --- a/fftools/ffmpeg_mux_init.c +++ b/fftools/ffmpeg_mux_init.c @@ -39,6 +39,7 @@ #include "libavutil/dict.h" #include "libavutil/display.h" #include "libavutil/getenv_utf8.h" +#include "libavutil/iamf.h" #include "libavutil/intreadwrite.h" #include "libavutil/log.h" #include "libavutil/mem.h" @@ -1943,6 +1944,336 @@ static int setup_sync_queues(Muxer
*mux, AVFormatContext *oc, int64_t buf_size_u return 0; } +static int of_parse_iamf_audio_element_layers(Muxer *mux, AVStreamGroup *stg, char **ptr) +{ + AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element; + AVDictionary *dict = NULL; + const char *token; + int ret = 0; + + audio_element->demixing_info = + av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_DEMIXING, 1, NULL); + audio_element->recon_gain_info = + av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN, 1, NULL); + + if (!audio_element->demixing_info || + !audio_element->recon_gain_info) + return AVERROR(ENOMEM); + + /* process manually set layers and parameters */ + token = av_strtok(NULL, ",", ptr); + while (token) { + const AVDictionaryEntry *e; + int demixing = 0, recon_gain = 0; + int layer = 0; + + if (av_strstart(token, "layer=", &token)) + layer = 1; + else if (av_strstart(token, "demixing=", &token)) + demixing = 1; + else if (av_strstart(token, "recon_gain=", &token)) + recon_gain = 1; + + av_dict_free(&dict); + ret = av_dict_parse_string(&dict, token, "=", ":", 0); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error parsing audio element specification %s\n", token); + goto fail; + } + + if (layer) { + AVIAMFLayer *audio_layer = av_iamf_audio_element_add_layer(audio_element); + if (!audio_layer) { + av_log(mux, AV_LOG_ERROR, "Error adding layer to stream group %d\n", stg->index); + ret = AVERROR(ENOMEM); + goto fail; + } + av_opt_set_dict(audio_layer, &dict); + } else if (demixing || recon_gain) { + AVIAMFParamDefinition *param = demixing ? 
audio_element->demixing_info + : audio_element->recon_gain_info; + void *subblock = av_iamf_param_definition_get_subblock(param, 0); + + av_opt_set_dict(param, &dict); + av_opt_set_dict(subblock, &dict); + + /* Hardcode spec parameters */ + param->param_definition_mode = 0; + param->parameter_rate = stg->streams[0]->codecpar->sample_rate; + param->duration = + param->constant_subblock_duration = stg->streams[0]->codecpar->frame_size; + } + + // make sure that no entries are left in the dict + e = NULL; + if (e = av_dict_iterate(dict, e)) { + av_log(mux, AV_LOG_FATAL, "Unknown layer key %s.\n", e->key); + ret = AVERROR(EINVAL); + goto fail; + } + token = av_strtok(NULL, ",", ptr); + } + +fail: + av_dict_free(&dict); + if (!ret && !audio_element->nb_layers) { + av_log(mux, AV_LOG_ERROR, "No layer in audio element specification\n"); + ret = AVERROR(EINVAL); + } + + return ret; +} + +static int of_parse_iamf_submixes(Muxer *mux, AVStreamGroup *stg, char **ptr) +{ + AVFormatContext *oc = mux->fc; + AVIAMFMixPresentation *mix = stg->params.iamf_mix_presentation; + AVDictionary *dict = NULL; + const char *token; + char *submix_str = NULL; + int ret = 0; + + /* process manually set submixes */ + token = av_strtok(NULL, ",", ptr); + while (token) { + AVIAMFSubmix *submix = NULL; + const char *subtoken; + char *subptr = NULL; + + if (!av_strstart(token, "submix=", &token)) { + av_log(mux, AV_LOG_ERROR, "No submix in mix presentation specification \"%s\"\n", token); + goto fail; + } + + submix_str = av_strdup(token); + if (!submix_str) + goto fail; + + submix = av_iamf_mix_presentation_add_submix(mix); + if (!submix) { + av_log(mux, AV_LOG_ERROR, "Error adding submix to stream group %d\n", stg->index); + ret = AVERROR(ENOMEM); + goto fail; + } + submix->output_mix_config = + av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, 0, NULL); + if (!submix->output_mix_config) { + ret = AVERROR(ENOMEM); + goto fail; + } + + submix->output_mix_config->parameter_rate 
= stg->streams[0]->codecpar->sample_rate; + + subptr = NULL; + subtoken = av_strtok(submix_str, "|", &subptr); + while (subtoken) { + const AVDictionaryEntry *e; + int element = 0, layout = 0; + + if (av_strstart(subtoken, "element=", &subtoken)) + element = 1; + else if (av_strstart(subtoken, "layout=", &subtoken)) + layout = 1; + + av_dict_free(&dict); + ret = av_dict_parse_string(&dict, subtoken, "=", ":", 0); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error parsing submix specification \"%s\"\n", subtoken); + goto fail; + } + + if (element) { + AVIAMFSubmixElement *submix_element; + int idx = -1; + + if (e = av_dict_get(dict, "stg", NULL, 0)) + idx = strtol(e->value, NULL, 0); + av_dict_set(&dict, "stg", NULL, 0); + if (idx < 0 || idx >= oc->nb_stream_groups) { + av_log(mux, AV_LOG_ERROR, "Invalid or missing stream group index in " + "submix element specification \"%s\"\n", subtoken); + ret = AVERROR(EINVAL); + goto fail; + } + submix_element = av_iamf_submix_add_element(submix); + if (!submix_element) { + av_log(mux, AV_LOG_ERROR, "Error adding element to submix\n"); + ret = AVERROR(ENOMEM); + goto fail; + } + + submix_element->audio_element_id = oc->stream_groups[idx]->id; + + submix_element->element_mix_config = + av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, 0, NULL); + if (!submix_element->element_mix_config) + ret = AVERROR(ENOMEM); + av_opt_set_dict2(submix_element, &dict, AV_OPT_SEARCH_CHILDREN); + submix_element->element_mix_config->parameter_rate = stg->streams[0]->codecpar->sample_rate; + } else if (layout) { + AVIAMFSubmixLayout *submix_layout = av_iamf_submix_add_layout(submix); + if (!submix_layout) { + av_log(mux, AV_LOG_ERROR, "Error adding layout to submix\n"); + ret = AVERROR(ENOMEM); + goto fail; + } + av_opt_set_dict(submix_layout, &dict); + } else + av_opt_set_dict2(submix, &dict, AV_OPT_SEARCH_CHILDREN); + + if (ret < 0) { + goto fail; + } + + // make sure that no entries are left in the dict + e = NULL; + while 
(e = av_dict_iterate(dict, e)) { + av_log(mux, AV_LOG_FATAL, "Unknown submix key %s.\n", e->key); + ret = AVERROR(EINVAL); + goto fail; + } + subtoken = av_strtok(NULL, "|", &subptr); + } + av_freep(&submix_str); + + if (!submix->nb_elements) { + av_log(mux, AV_LOG_ERROR, "No audio elements in submix specification \"%s\"\n", token); + ret = AVERROR(EINVAL); + } + token = av_strtok(NULL, ",", ptr); + } + +fail: + av_dict_free(&dict); + av_free(submix_str); + + return ret; +} + +static int of_add_groups(Muxer *mux, const OptionsContext *o) +{ + AVFormatContext *oc = mux->fc; + int ret; + + /* process manually set groups */ + for (int i = 0; i < o->nb_stream_groups; i++) { + AVDictionary *dict = NULL, *tmp = NULL; + const AVDictionaryEntry *e; + AVStreamGroup *stg = NULL; + int type; + const char *token; + char *str, *ptr = NULL; + const AVOption opts[] = { + { "type", "Set group type", offsetof(AVStreamGroup, type), AV_OPT_TYPE_INT, + { .i64 = 0 }, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM, "type" }, + { "iamf_audio_element", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT }, .unit = "type" }, + { "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" }, + { NULL }, + }; + const AVClass class = { + .class_name = "StreamGroupType", + .item_name = av_default_item_name, + .option = opts, + .version = LIBAVUTIL_VERSION_INT, + }; + const AVClass *pclass = &class; + + str = av_strdup(o->stream_groups[i].u.str); + if (!str) + goto end; + + token = av_strtok(str, ",", &ptr); + if (token) { + ret = av_dict_parse_string(&dict, token, "=", ":", AV_DICT_MULTIKEY); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error parsing group specification %s\n", token); + goto end; + } + + // "type" is not a user settable option in AVStreamGroup + e = av_dict_get(dict, "type", NULL, 0); + if (!e) { + av_log(mux, AV_LOG_ERROR, "No type defined for Stream Group %d\n", i); + ret =
AVERROR(EINVAL); + goto end; + } + + ret = av_opt_eval_int(&pclass, opts, e->value, &type); + if (ret < 0 || type == AV_STREAM_GROUP_PARAMS_NONE) { + av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value); + goto end; + } + + av_dict_copy(&tmp, dict, 0); + stg = avformat_stream_group_create(oc, type, &tmp); + if (!stg) { + ret = AVERROR(ENOMEM); + goto end; + } + av_dict_set(&tmp, "type", NULL, 0); + + e = NULL; + while (e = av_dict_get(dict, "st", e, 0)) { + unsigned int idx = strtol(e->value, NULL, 0); + if (idx >= oc->nb_streams) { + av_log(mux, AV_LOG_ERROR, "Invalid stream index %d\n", idx); + ret = AVERROR(EINVAL); + goto end; + } + avformat_stream_group_add_stream(stg, oc->streams[idx]); + } + while (e = av_dict_get(dict, "stg", e, 0)) { + unsigned int idx = strtol(e->value, NULL, 0); + if (idx >= oc->nb_stream_groups || idx == stg->index) { + av_log(mux, AV_LOG_ERROR, "Invalid stream group index %d\n", idx); + ret = AVERROR(EINVAL); + goto end; + } + for (int j = 0; j < oc->stream_groups[idx]->nb_streams; j++) + avformat_stream_group_add_stream(stg, oc->stream_groups[idx]->streams[j]); + } + + switch(type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + ret = of_parse_iamf_audio_element_layers(mux, stg, &ptr); + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + ret = of_parse_iamf_submixes(mux, stg, &ptr); + break; + default: + av_log(mux, AV_LOG_FATAL, "Unknown group type %d.\n", type); + ret = AVERROR(EINVAL); + break; + } + + if (ret < 0) + goto end; + + // make sure that nothing but "st" and "stg" entries are left in the dict + e = NULL; + while (e = av_dict_iterate(tmp, e)) { + if (!strcmp(e->key, "st") || !strcmp(e->key, "stg")) + continue; + + av_log(mux, AV_LOG_FATAL, "Unknown group key %s.\n", e->key); + ret = AVERROR(EINVAL); + goto end; + } + } + +end: + av_dict_free(&dict); + av_dict_free(&tmp); + av_free(str); + if (ret < 0) + return ret; + } + + return 0; +} + static int of_add_programs(Muxer *mux, const 
OptionsContext *o) { AVFormatContext *oc = mux->fc; @@ -2740,6 +3071,10 @@ int of_open(const OptionsContext *o, const char *filename) if (err < 0) return err; + err = of_add_groups(mux, o); + if (err < 0) + return err; + err = of_add_programs(mux, o); if (err < 0) return err; diff --git a/fftools/ffmpeg_opt.c b/fftools/ffmpeg_opt.c index 304471dd03..1144f64f89 100644 --- a/fftools/ffmpeg_opt.c +++ b/fftools/ffmpeg_opt.c @@ -1491,6 +1491,8 @@ const OptionDef options[] = { "add metadata", "string=string" }, { "program", HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(program) }, "add program with specified streams", "title=string:st=number..." }, + { "stream_group", HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(stream_groups) }, + "add stream group with specified streams and group type-specific arguments", "id=number:st=number..." }, { "dframes", HAS_ARG | OPT_PERFILE | OPT_EXPERT | OPT_OUTPUT, { .func_arg = opt_data_frames }, "set the number of data frames to output", "number" },
From patchwork Tue Dec 5 22:43:58 2023 X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44937 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 5 Dec 2023 19:43:58 -0300 Message-ID: <20231205224402.14540-5-jamrial@gmail.com> In-Reply-To: <20231205224402.14540-1-jamrial@gmail.com> References: <20231205224402.14540-1-jamrial@gmail.com> Subject: [FFmpeg-devel] [PATCH 4/8] avcodec/packet: add IAMF Parameters side data types Signed-off-by: James Almer --- libavcodec/avpacket.c | 3 +++ libavcodec/packet.h | 24 ++++++++++++++++++++++++ 2 files
changed, 27 insertions(+) diff --git a/libavcodec/avpacket.c b/libavcodec/avpacket.c index e29725c2d2..0f8c9b77ae 100644 --- a/libavcodec/avpacket.c +++ b/libavcodec/avpacket.c @@ -301,6 +301,9 @@ const char *av_packet_side_data_name(enum AVPacketSideDataType type) case AV_PKT_DATA_DOVI_CONF: return "DOVI configuration record"; case AV_PKT_DATA_S12M_TIMECODE: return "SMPTE ST 12-1:2014 timecode"; case AV_PKT_DATA_DYNAMIC_HDR10_PLUS: return "HDR10+ Dynamic Metadata (SMPTE 2094-40)"; + case AV_PKT_DATA_IAMF_MIX_GAIN_PARAM: return "IAMF Mix Gain Parameter Data"; + case AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM: return "IAMF Demixing Info Parameter Data"; + case AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM: return "IAMF Recon Gain Info Parameter Data"; } return NULL; } diff --git a/libavcodec/packet.h b/libavcodec/packet.h index b19409b719..2c57d262c6 100644 --- a/libavcodec/packet.h +++ b/libavcodec/packet.h @@ -299,6 +299,30 @@ enum AVPacketSideDataType { */ AV_PKT_DATA_DYNAMIC_HDR10_PLUS, + /** + * IAMF Mix Gain Parameter Data associated with the audio frame. This metadata + * is in the form of the AVIAMFParamDefinition struct and contains information + * defined in sections 3.6.1 and 3.8.1 of the Immersive Audio Model and + * Formats standard. + */ + AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, + + /** + * IAMF Demixing Info Parameter Data associated with the audio frame. This + * metadata is in the form of the AVIAMFParamDefinition struct and contains + * information defined in sections 3.6.1 and 3.8.2 of the Immersive Audio Model + * and Formats standard. + */ + AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, + + /** + * IAMF Recon Gain Info Parameter Data associated with the audio frame. This + * metadata is in the form of the AVIAMFParamDefinition struct and contains + * information defined in sections 3.6.1 and 3.8.3 of the Immersive Audio Model + * and Formats standard. + */ + AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, + /** * The number of side data types. 
* This is not part of the public API/ABI in the sense that it may From patchwork Tue Dec 5 22:43:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44938 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:9153:b0:181:818d:5e7f with SMTP id x19csp659050pzc; Tue, 5 Dec 2023 14:44:45 -0800 (PST) X-Google-Smtp-Source: AGHT+IHyu+ZJGrwXmnnqAbMLdEu7Dixk2zmJ4/imv7Quo2t7jBA7o2wXEBi3/6nqTS773qQbvWJw X-Received: by 2002:a05:600c:510d:b0:40b:4ba1:c502 with SMTP id o13-20020a05600c510d00b0040b4ba1c502mr22372wms.37.1701816285059; Tue, 05 Dec 2023 14:44:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701816285; cv=none; d=google.com; s=arc-20160816; b=ubpRGj5ge7iIx+jwtOz5aXEdTm9wmYs+CMoMQmrZeA+XzGwsndDqGE8mV0sH3IemL9 Is0Sq5h7iXHimsUUa7FxAN0Wa/4DcjGgxHRe+JoZrmFjbCCZ8odjl6q8zlKxxvxZ//cv a1UTg/qbrZI/KZiLdJzcasg5IG2TP6jbt86jUjzYLpHdedc1D0zjkTHH20BmZZlFNPBd YtHTDkJ+aSkFGtlPJR6j/v+3oHei1FjNjby/8JO00MjnL5wYvsC12coifGwscvZOEbr1 fEhw/P/S1VoINGzfVaF1O4fH0nttBBVVNXsvQNskJu5nEtn9KoBvEy7R/qnflUG+DV6z pwOw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=sx12p/fxp4uVYjc5O7mFa/tk5nqzo6SAAWWr3sAQxlY=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=0Yhk0FLglhAyDg7LY6avfmQaCYWpwLbeIGmhNnBBSplJLPT2FaG8mgtcUf/UCJ6idC q/SDLsZnQWB0920Pjpi7iIWnhGCCH2+bsACgC8YmfJqw3iGGIcgHg1F1K/UQm+5QchuC LnikVvS/0sRtW3RXlZq5BhCbQW+KVl2tMgVYlG5pRBcbTHz2KuH89A69FQy9W5w/7deL r5Gr1+ErGwUzGKZe/apWfkmQlSARXFn+rWXlVl1hoi6XMS+aFZi85qh26AaPgBThXX9c I21rOKu6ZLAvBTtoE00K0R1zHDmtqReRXOyP1k++yM3KPNVFsAiwHLcX7WPMG39J1upl L96Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com 
header.s=20230601 header.b="WJzR/Fjh"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id x21-20020a170906299500b00a1b60e9becfsi3074644eje.546.2023.12.05.14.44.44; Tue, 05 Dec 2023 14:44:45 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="WJzR/Fjh"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 626D368CF45; Wed, 6 Dec 2023 00:43:59 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pf1-f182.google.com (mail-pf1-f182.google.com [209.85.210.182]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 49A9268CEF5 for ; Wed, 6 Dec 2023 00:43:52 +0200 (EET) Received: by mail-pf1-f182.google.com with SMTP id d2e1a72fcca58-6ce7632b032so911831b3a.1 for ; Tue, 05 Dec 2023 14:43:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1701816230; x=1702421030; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=QxUS7vp0JdGrHnyRk3Y/VWtRAl3/pjEkbskQJyLXrnU=; b=WJzR/FjhBT2WGpPayRlTP+sL0RABONXltDm9MKTBSa5YKKfe5oiQwjlMzZhobEpXCY 
1yQoODBsVbcxeg71YHou1OjqjpSBxJVWk0PK7Mr5e8VxY3d1Wi4mAABgI27+ogqnu6wF 4eRu/nOE5wB3+GOmUHyk0X3FvERui597mXYG26vI0PGZx1SHth8Mg0tBGqpgvDfNv5Qe j1NVNYcxxbvXDu7iPmnNcHw1W15R/ZonzxVs7vg1cW+1FS15XUmozj0PnsSVTquIF/ME 3Bo8vvipYFgTtYs8d1sMLVr9qRqDjMEAd9Xb7qQr8wPufDg6trz/dN+UygdLfsgQCtsY TDXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701816230; x=1702421030; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QxUS7vp0JdGrHnyRk3Y/VWtRAl3/pjEkbskQJyLXrnU=; b=TpIkQ8m4od20c3uLu17lZXPVVu1EwHxa2SoK02oo2lb/uVVKumzNbKuUspc49estI9 kwQg1ME1jxQZGyaTncIoJ2mex3tD2C3nKiiyvbQ1dQPF24+RNDjy1jWjy+vqD9TRRxTT a7z9cMVjFJ891IMntODAHl0tHuwfwRR7DUIQQgn4luZ+hwhMTwjVOkcZOAnLGhOfXoAH XhPReZ9WhBJJ2V5qQuuqL8XJWlmfGtA5MkvX7g0VRRdeGsbzUPOTy/jUNCid0SK7Zq/C lgpN73YX7BC/++6zB8aIvYC7cEUfMLIWFWJejZ3Rdgv7Vy9i0Md/f25eQcpdP5f8Yc2/ ipqw== X-Gm-Message-State: AOJu0Yy5qI7eYag2SXQp7SVKY3iKVpARBLP6qT4d/CRyGgCi5wpx+S4c y0cOxsy6SUGMFvAegZTHXme4x65bLd8= X-Received: by 2002:a05:6a00:150e:b0:6ce:81e9:1e2 with SMTP id q14-20020a056a00150e00b006ce81e901e2mr239552pfu.64.1701816229722; Tue, 05 Dec 2023 14:43:49 -0800 (PST) Received: from localhost.localdomain (host197.190-225-105.telecom.net.ar. 
[190.225.105.197]) by smtp.gmail.com with ESMTPSA id hq25-20020a056a00681900b0064fd4a6b306sm2037688pfb.76.2023.12.05.14.43.48 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Dec 2023 14:43:49 -0800 (PST) From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 5 Dec 2023 19:43:59 -0300 Message-ID: <20231205224402.14540-6-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231205224402.14540-1-jamrial@gmail.com> References: <20231205224402.14540-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 5/8] avcodec/get_bits: add get_leb() X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: U78ULTMhKkJ4 Signed-off-by: James Almer --- libavcodec/bitstream.h | 2 ++ libavcodec/bitstream_template.h | 23 +++++++++++++++++++++++ libavcodec/get_bits.h | 24 ++++++++++++++++++++++++ 3 files changed, 49 insertions(+) diff --git a/libavcodec/bitstream.h b/libavcodec/bitstream.h index 35b7873b9c..17f8a5da83 100644 --- a/libavcodec/bitstream.h +++ b/libavcodec/bitstream.h @@ -103,6 +103,7 @@ # define bits_apply_sign bits_apply_sign_le # define bits_read_vlc bits_read_vlc_le # define bits_read_vlc_multi bits_read_vlc_multi_le +# define bits_read_leb bits_read_leb_le #elif defined(BITS_DEFAULT_BE) @@ -132,6 +133,7 @@ # define bits_apply_sign bits_apply_sign_be # define bits_read_vlc bits_read_vlc_be # define bits_read_vlc_multi bits_read_vlc_multi_be +# define bits_read_leb bits_read_leb_be #endif diff --git a/libavcodec/bitstream_template.h b/libavcodec/bitstream_template.h index 4f3d07275f..4c7101632f 100644 --- a/libavcodec/bitstream_template.h +++ b/libavcodec/bitstream_template.h @@ -562,6 +562,29 @@ static inline int BS_FUNC(read_vlc_multi)(BSCTX 
*bc, uint8_t dst[8], return ret; } +/** + * Read an unsigned integer coded as a variable number of up to eight + * little-endian bytes, where the MSB in a byte signals another byte + * must be read. + * Values > UINT_MAX are truncated, but all coded bits are read. + */ +static inline unsigned BS_FUNC(read_leb)(BSCTX *bc) { + int more, i = 0; + unsigned leb = 0; + + do { + int byte = BS_FUNC(read)(bc, 8); + unsigned bits = byte & 0x7f; + more = byte & 0x80; + if (i <= 4) + leb |= bits << (i * 7); + if (++i == 8) + break; + } while (more); + + return leb; +} + #undef BSCTX #undef BS_FUNC #undef BS_JOIN3 diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h index cfcf97c021..9e19d2a439 100644 --- a/libavcodec/get_bits.h +++ b/libavcodec/get_bits.h @@ -94,6 +94,7 @@ typedef BitstreamContext GetBitContext; #define align_get_bits bits_align #define get_vlc2 bits_read_vlc #define get_vlc_multi bits_read_vlc_multi +#define get_leb bits_read_leb #define init_get_bits8_le(s, buffer, byte_size) bits_init8_le((BitstreamContextLE*)s, buffer, byte_size) #define get_bits_le(s, n) bits_read_le((BitstreamContextLE*)s, n) @@ -710,6 +711,29 @@ static inline int skip_1stop_8data_bits(GetBitContext *gb) return 0; } +/** + * Read an unsigned integer coded as a variable number of up to eight + * little-endian bytes, where the MSB in a byte signals another byte + * must be read. + * All coded bits are read, but values > UINT_MAX are truncated.
+ */ +static inline unsigned get_leb(GetBitContext *s) { + int more, i = 0; + unsigned leb = 0; + + do { + int byte = get_bits(s, 8); + unsigned bits = byte & 0x7f; + more = byte & 0x80; + if (i <= 4) + leb |= bits << (i * 7); + if (++i == 8) + break; + } while (more); + + return leb; +} + #endif // CACHED_BITSTREAM_READER #endif /* AVCODEC_GET_BITS_H */ From patchwork Tue Dec 5 22:44:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44940 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:9153:b0:181:818d:5e7f with SMTP id x19csp659174pzc; Tue, 5 Dec 2023 14:45:04 -0800 (PST) X-Google-Smtp-Source: AGHT+IEANq9dLGpePPtsbKLLyBxAT0UN80ZdktiZSG5VIPbNeL3xnXUgtsNTu7/Sv0eO7k8RW3WK X-Received: by 2002:a17:907:2d0a:b0:a18:ef56:8876 with SMTP id gs10-20020a1709072d0a00b00a18ef568876mr1435321ejc.47.1701816304116; Tue, 05 Dec 2023 14:45:04 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701816304; cv=none; d=google.com; s=arc-20160816; b=qUfPgz04/QJII4tn9WQbZ7AgULAsdkInm27C7lPFjdczJ3edEB4nhay6mMhxA2LW8v PI1HFSW1X+Hne13D1jwW31KWJ9gx3JQuBuqxgKwt1edp5syLPaZs4Qf1tH2WBJAxmosn c97J8sL2cWKJ+6eHDxuV5nr5/5mzkszvegkrVf8rY1/IjD3IvxnrxHpCG//7dHjwqgqZ FYoHp+fMbtbMUk11VNcOF5uhDHOxnZFep95dmJ0pVav9fqK0HblDVWCRd+F/cynr8gbc eW8KDTaUkrcI9cYdoBuYfdru2MtdaR2WWBxcdnrkxDuEw4f0jhow4k/fOVFhEve/VbcD 3iag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=b/pHH4OKHoqsrRSUlrIZPcrS5WucfyFmg9GXD42G98c=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=IbL02j/MUzOuIR+h1OIrOvppbIDaXirXtWRW24Bw35xQardKF1d9DPkPoN+sK/q82z TNhRKb6l1A4KSJO+Jr4RYoSlaVXlYYrc1Ty6TCSGIdCQEgmhwoS5Z1P12Hrp/Ut/sAq5 
EO9+uXtJFRBmXBmgkHbmS5Br7efkYZlP1EHGOvVC6d5lG0o5ynnmRL0TAd1pef4WZNPI I4TRiQQWNni0+fc/lvahQyfMAXRorVcCEVM6DWhqjy20RFCC9hxeHV4Ce2NHJAxYKpHa JkZEt4mpf/owk6/tDisZaAaB7/wmX349jXWIg4zF6Jb4LACtT+qwWRZ+cwLeop/4dHhB B2aw== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=JuVuQYo7; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id r10-20020a170906350a00b009d441527214si6016851eja.1045.2023.12.05.14.45.03; Tue, 05 Dec 2023 14:45:04 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=JuVuQYo7; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 793B268CF53; Wed, 6 Dec 2023 00:44:01 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pf1-f170.google.com (mail-pf1-f170.google.com [209.85.210.170]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C784268CF29 for ; Wed, 6 Dec 2023 00:43:53 +0200 (EET) Received: by mail-pf1-f170.google.com with SMTP id d2e1a72fcca58-6cdcef787ffso6451522b3a.0 for ; Tue, 05 Dec 2023 14:43:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1701816231; x=1702421031; darn=ffmpeg.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=mSE6cPNs1R3AaCBlo94VrNHgs5ToTsW0px4hqwFDXSQ=; b=JuVuQYo7iV4No5fOY3hMEBnvr3b2l5BnVJziLrSbihU8ZRfnnfWPr4G22tHqXfF6oT wTWRdALRZ57Uprz0eykVXGurS+teeHcWuBMjGd0e3UjiVJ/gGwAs2mmiUrfDh38NFjMT nk6Y+DlJwM4+tJEu3HSh1XQC7bAcEa7etZfhlsSU0T5guMX7KbcPQdDCz6JLd4dwhU92 JOknNGXvexK2MWQK6g38LUdv5D7lqskqyNtgxJBBjHYNFRsh4UJgzuzCX+5AOtyXqxxk 6wuu16MmSocKjPNptoL5Eh136eBqmbJ2xNt1Ub7RU5MSfjbIuTPp2L5sRJoJOvCh3FSv sdAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701816231; x=1702421031; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mSE6cPNs1R3AaCBlo94VrNHgs5ToTsW0px4hqwFDXSQ=; b=G615WMHa8aEhNuMe6raljr3P+QgvFcohN15rFS7zxi7ff5dhZ5TuN+xwAkpHvyDlpN /Qnm2GMLWDHmpo1GvYqlvhL1sJR+mzKzyI59osuDAOE41pgCnBdjDBx0LB3JB0c0rXiP 3Q9RuxuKNo7bAQrayiP+1NUBWBL9H2f+OQnzSb/gVjqPp72rA1rLr2vmpzqEyAJsW7QB h9rthVFAKM906DATRmh3nTG6NyjwWuAIuLgDqIYB8YJjVJ5moDIm1vCwAbRWolRGQzeo Uvh792SUbyvbOf+joCbGxWWvWN0CORxSeN1T+uwU+ODaRI+9PBiRtWep+mZmfjAMZt+1 eVqw== X-Gm-Message-State: AOJu0Yx1nebJecuehJcTPDj8hbvOmwFeujd2N8PGCteARGrhcdT3Vbr6 VH7JSSeijwa2QLNEka+fIWCs/Z1saA4= X-Received: by 2002:a05:6a20:3d95:b0:189:bde9:71aa with SMTP id s21-20020a056a203d9500b00189bde971aamr8569321pzi.48.1701816231318; Tue, 05 Dec 2023 14:43:51 -0800 (PST) Received: from localhost.localdomain (host197.190-225-105.telecom.net.ar. 
[190.225.105.197]) by smtp.gmail.com with ESMTPSA id hq25-20020a056a00681900b0064fd4a6b306sm2037688pfb.76.2023.12.05.14.43.50 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Dec 2023 14:43:50 -0800 (PST) From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 5 Dec 2023 19:44:00 -0300 Message-ID: <20231205224402.14540-7-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231205224402.14540-1-jamrial@gmail.com> References: <20231205224402.14540-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 6/8] avformat/aviobuf: add ffio_read_leb() and ffio_write_leb() X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 0QMbJjPWN0PT Signed-off-by: James Almer --- libavformat/avio_internal.h | 10 ++++++++++ libavformat/aviobuf.c | 33 +++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+) diff --git a/libavformat/avio_internal.h b/libavformat/avio_internal.h index bd58499b64..f2e4ff30cb 100644 --- a/libavformat/avio_internal.h +++ b/libavformat/avio_internal.h @@ -146,6 +146,16 @@ int ffio_rewind_with_probe_data(AVIOContext *s, unsigned char **buf, int buf_siz uint64_t ffio_read_varlen(AVIOContext *bc); +/** + * Read an unsigned integer coded as a variable number of up to eight + * little-endian bytes, where the MSB in a byte signals another byte + * must be read. + * All coded bytes are read, but values > UINT_MAX are truncated. + */ +unsigned int ffio_read_leb(AVIOContext *s); + +void ffio_write_leb(AVIOContext *s, unsigned val); + /** * Read size bytes from AVIOContext into buf. * Check that exactly size bytes have been read.
diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c index 2899c75521..5a329ce465 100644 --- a/libavformat/aviobuf.c +++ b/libavformat/aviobuf.c @@ -971,6 +971,39 @@ uint64_t ffio_read_varlen(AVIOContext *bc){ return val; } +unsigned int ffio_read_leb(AVIOContext *s) { + int more, i = 0; + unsigned leb = 0; + + do { + int byte = avio_r8(s); + unsigned bits = byte & 0x7f; + more = byte & 0x80; + if (i <= 4) + leb |= bits << (i * 7); + if (++i == 8) + break; + } while (more); + + return leb; +} + +void ffio_write_leb(AVIOContext *s, unsigned val) +{ + int len; + uint8_t byte; + + len = (av_log2(val) + 7) / 7; + + for (int i = 0; i < len; i++) { + byte = val >> (7 * i) & 0x7f; + if (i < len - 1) + byte |= 0x80; + + avio_w8(s, byte); + } +} + int ffio_fdopen(AVIOContext **s, URLContext *h) { uint8_t *buffer = NULL; From patchwork Tue Dec 5 22:44:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44941 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:9153:b0:181:818d:5e7f with SMTP id x19csp659240pzc; Tue, 5 Dec 2023 14:45:14 -0800 (PST) X-Google-Smtp-Source: AGHT+IGjJmPDPN/TrdbSVcoBfHch+839GQvDTMDqtNHjTtULAvgM8jl6uBkvLig+RjgwP202W5iQ X-Received: by 2002:a17:906:748f:b0:a16:3628:b71d with SMTP id e15-20020a170906748f00b00a163628b71dmr2469297ejl.0.1701816314312; Tue, 05 Dec 2023 14:45:14 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701816314; cv=none; d=google.com; s=arc-20160816; b=tJ9gWXYNWcVExr3YIhQKXW5AdqsRlvZKhWatE6GyPZC2xgjYQ16oRsg/7YLmosi1Ss xvNWW7CJLelfjErZZ10sdgYxIsTI8Ey5/a53m0rxKYTuvfctL9p74/k5mjlnpXemtSEb i7MvR1nQZsGhMJB4Qb4+nMck2DxyCX9fbkr69brP8a0M5t7rECUZ02ZQ6O5udWT3SkbW LfOZ/YsN6Ujl1EZKGWNLxnRRGStquNhsdOOygeNj/rlllQhbfBLJSN0kG6kDhWbUkRk+ /q4C0hbYPAE+gWli8oxUfn/M3ZZGIzNqE9CZaUSae/pwcv1pjmkDyxoJYL4Z0EHYnPAU jj+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=J2iW0Duz8WXoNZmItGAEaXgifYoPsVsddHsCZSRb3Kk=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=RDQHemw0oO4mlIP/hJAl6wPl8zg+k7OsNhwIxruygVtTPCKE684HY+sYIDGL/YpGyb pwKxVxlnJjmOdnArFpiV0uQtbovCzOF6qb590WHqA4Zrq+cr5Ht2+fzLtSqOOGSp98co reo0FxY2EpHF0B5CniD6NIblPex2dAUUeMvVr+SirLLaP08zxFjEAV801oV0QyFeh8pP azrxDpEnChJqF1a7qY4+6lRNPMwGl7veVONV4JT5iNkrtuwyZXf/21p1zDnti8nf/YhV cbIAlkVUfzNf+Ufw5fi8zQaJ3hxs1mNWLo5qeRQoVmXY3jin3xHubL5aT76CkFt0Ke3N quGw== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="Fpa4x6h/"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id h9-20020a170906260900b00a1d8ba21d6fsi51787ejc.16.2023.12.05.14.45.13; Tue, 05 Dec 2023 14:45:14 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="Fpa4x6h/"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8D1CA68CF5D; Wed, 6 Dec 2023 00:44:02 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pf1-f176.google.com (mail-pf1-f176.google.com [209.85.210.176]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id B22A268CEF6 for ; Wed, 6 Dec 2023 00:43:55 +0200 (EET) Received: by mail-pf1-f176.google.com with SMTP id d2e1a72fcca58-6ce52d796d2so2485150b3a.3 for ; Tue, 05 Dec 2023 14:43:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1701816234; x=1702421034; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=NY9e2ju2n6BebtwqcWQVC68qD1bYPUiRGM21b7sejxE=; b=Fpa4x6h/EWpQZMRIFD8jbegrlry7NfKQ4v5XJxkcDNEjkG94zCADBS+DZnO9aBobq8 eYERtofXowQ8C3g1OqgIA4qZC4aHmsRKoGfi9kgFoUwJt/7fIDZuFL+WQWaT9m7WzD5d SlNGog7Ba4tCWHN2nZnu7CTuUYY+Kx1MXEwloFYdQFlHbovw9TIgQcBryvD1oxuXFGNI ruhDjp2Xi8oe+neT9jcaiz0bwGhJ04eoXwmhvyJBhw3kQ/Mgtm8rEdEyfEUS8RUZq7s4 FW4Pj0Vxt/TI/oPOqG8NGPJqy6Y3O8hwzXT8JZl0G6tYsE9Iwjxu48A+VtIxfnQjRv+W 69Tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701816234; x=1702421034; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NY9e2ju2n6BebtwqcWQVC68qD1bYPUiRGM21b7sejxE=; b=KLjfJCSaTDrnwzOBe6dEUPjxvZnsxayLPN/CiIy2hfz2ecylKg2o43/QrmBqdZxtPx Ydn1x5GIdpNecJZ29biX6RIBCwN3EXcFPPsbg6PeieQ31zK8RUXjj0Mk9+2JA7Z1cgRZ 286diWmpUrfEZHrjbKDrC+bEaWB96lkg/aZamSZezHNWYg5IWZFgVWD1azpd5cgxP6dj HhcxGn/UvWwA7XxzFWy3zpOo1zIVpG/SHkrGw9vLZtwBejlWyIhcHxO0s7SaAvnYce3Q iZ7tENgSkxQW0XkZboVN3EOmG2OhFnjlm5NbiJaz4T/bhZ6CzCidAPnLsDCjlCKpZMUt 3hEw== X-Gm-Message-State: AOJu0Yw+iMytV/QBMWc51XyvDszyOQXdK92f+WSpLMJFYbp9HRBJw0y4 5hk1xhKHsC9hTf3YE0rOJvgEfZpVivE= X-Received: by 2002:a05:6a00:194f:b0:6ce:2732:575 with SMTP id s15-20020a056a00194f00b006ce27320575mr2259540pfk.38.1701816232914; Tue, 05 Dec 2023 14:43:52 -0800 (PST) Received: from localhost.localdomain (host197.190-225-105.telecom.net.ar. [190.225.105.197]) by smtp.gmail.com with ESMTPSA id hq25-20020a056a00681900b0064fd4a6b306sm2037688pfb.76.2023.12.05.14.43.51 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Dec 2023 14:43:52 -0800 (PST) From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 5 Dec 2023 19:44:01 -0300 Message-ID: <20231205224402.14540-8-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231205224402.14540-1-jamrial@gmail.com> References: <20231205224402.14540-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 7/8] avformat: Immersive Audio Model and Formats demuxer X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: w+15D3twdiwl Signed-off-by: James Almer --- libavformat/Makefile | 1 + libavformat/allformats.c | 1 + libavformat/iamf.c | 125 +++++ 
libavformat/iamf.h | 162 ++++++ libavformat/iamf_parse.c | 1106 ++++++++++++++++++++++++++++++++++++++ libavformat/iamf_parse.h | 38 ++ libavformat/iamfdec.c | 495 +++++++++++++++++ 7 files changed, 1928 insertions(+) create mode 100644 libavformat/iamf.c create mode 100644 libavformat/iamf.h create mode 100644 libavformat/iamf_parse.c create mode 100644 libavformat/iamf_parse.h create mode 100644 libavformat/iamfdec.c diff --git a/libavformat/Makefile b/libavformat/Makefile index 2db83aff81..f23c22792b 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -258,6 +258,7 @@ OBJS-$(CONFIG_EVC_MUXER) += rawenc.o OBJS-$(CONFIG_HLS_DEMUXER) += hls.o hls_sample_encryption.o OBJS-$(CONFIG_HLS_MUXER) += hlsenc.o hlsplaylist.o avc.o OBJS-$(CONFIG_HNM_DEMUXER) += hnm.o +OBJS-$(CONFIG_IAMF_DEMUXER) += iamfdec.o iamf_parse.o iamf.o OBJS-$(CONFIG_ICO_DEMUXER) += icodec.o OBJS-$(CONFIG_ICO_MUXER) += icoenc.o OBJS-$(CONFIG_IDCIN_DEMUXER) += idcin.o diff --git a/libavformat/allformats.c b/libavformat/allformats.c index c8bb4e3866..6e520b78a6 100644 --- a/libavformat/allformats.c +++ b/libavformat/allformats.c @@ -212,6 +212,7 @@ extern const FFOutputFormat ff_hevc_muxer; extern const AVInputFormat ff_hls_demuxer; extern const FFOutputFormat ff_hls_muxer; extern const AVInputFormat ff_hnm_demuxer; +extern const AVInputFormat ff_iamf_demuxer; extern const AVInputFormat ff_ico_demuxer; extern const FFOutputFormat ff_ico_muxer; extern const AVInputFormat ff_idcin_demuxer; diff --git a/libavformat/iamf.c b/libavformat/iamf.c new file mode 100644 index 0000000000..5de70dc082 --- /dev/null +++ b/libavformat/iamf.c @@ -0,0 +1,125 @@ +/* + * Immersive Audio Model and Formats common helpers and structs + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. 
+ * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/channel_layout.h" +#include "libavutil/iamf.h" +#include "libavutil/mem.h" +#include "iamf.h" + +const AVChannelLayout ff_iamf_scalable_ch_layouts[10] = { + AV_CHANNEL_LAYOUT_MONO, + AV_CHANNEL_LAYOUT_STEREO, + // "Loudspeaker configuration for Sound System B" + AV_CHANNEL_LAYOUT_5POINT1_BACK, + // "Loudspeaker configuration for Sound System C" + AV_CHANNEL_LAYOUT_5POINT1POINT2_BACK, + // "Loudspeaker configuration for Sound System D" + AV_CHANNEL_LAYOUT_5POINT1POINT4_BACK, + // "Loudspeaker configuration for Sound System I" + AV_CHANNEL_LAYOUT_7POINT1, + // "Loudspeaker configuration for Sound System I" + Ltf + Rtf + AV_CHANNEL_LAYOUT_7POINT1POINT2, + // "Loudspeaker configuration for Sound System J" + AV_CHANNEL_LAYOUT_7POINT1POINT4_BACK, + // Front subset of "Loudspeaker configuration for Sound System J" + AV_CHANNEL_LAYOUT_3POINT1POINT2, + // Binaural + AV_CHANNEL_LAYOUT_STEREO, +}; + +const struct IAMFSoundSystemMap ff_iamf_sound_system_map[13] = { + { SOUND_SYSTEM_A_0_2_0, AV_CHANNEL_LAYOUT_STEREO }, + { SOUND_SYSTEM_B_0_5_0, AV_CHANNEL_LAYOUT_5POINT1_BACK }, + { SOUND_SYSTEM_C_2_5_0, AV_CHANNEL_LAYOUT_5POINT1POINT2_BACK }, + { SOUND_SYSTEM_D_4_5_0, AV_CHANNEL_LAYOUT_5POINT1POINT4_BACK }, + { 
SOUND_SYSTEM_E_4_5_1, + { + .nb_channels = 11, + .order = AV_CHANNEL_ORDER_NATIVE, + .u.mask = AV_CH_LAYOUT_5POINT1POINT4_BACK | AV_CH_BOTTOM_FRONT_CENTER, + }, + }, + { SOUND_SYSTEM_F_3_7_0, AV_CHANNEL_LAYOUT_7POINT2POINT3 }, + { SOUND_SYSTEM_G_4_9_0, AV_CHANNEL_LAYOUT_9POINT1POINT4_BACK }, + { SOUND_SYSTEM_H_9_10_3, AV_CHANNEL_LAYOUT_22POINT2 }, + { SOUND_SYSTEM_I_0_7_0, AV_CHANNEL_LAYOUT_7POINT1 }, + { SOUND_SYSTEM_J_4_7_0, AV_CHANNEL_LAYOUT_7POINT1POINT4_BACK }, + { SOUND_SYSTEM_10_2_7_0, AV_CHANNEL_LAYOUT_7POINT1POINT2 }, + { SOUND_SYSTEM_11_2_3_0, AV_CHANNEL_LAYOUT_3POINT1POINT2 }, + { SOUND_SYSTEM_12_0_1_0, AV_CHANNEL_LAYOUT_MONO }, +}; + +void ff_iamf_free_audio_element(IAMFAudioElement **paudio_element) +{ + IAMFAudioElement *audio_element = *paudio_element; + + if (!audio_element) + return; + + for (int i = 0; i < audio_element->nb_substreams; i++) + avcodec_parameters_free(&audio_element->substreams[i].codecpar); + av_free(audio_element->substreams); + av_free(audio_element->layers); + av_iamf_audio_element_free(&audio_element->element); + av_freep(paudio_element); +} + +void ff_iamf_free_mix_presentation(IAMFMixPresentation **pmix_presentation) +{ + IAMFMixPresentation *mix_presentation = *pmix_presentation; + + if (!mix_presentation) + return; + + for (int i = 0; i < mix_presentation->count_label; i++) + av_free(mix_presentation->language_label[i]); + av_free(mix_presentation->language_label); + av_iamf_mix_presentation_free(&mix_presentation->mix); + av_freep(pmix_presentation); +} + +void ff_iamf_uninit_context(IAMFContext *c) +{ + if (!c) + return; + + for (int i = 0; i < c->nb_codec_configs; i++) { + av_free(c->codec_configs[i]->extradata); + av_free(c->codec_configs[i]); + } + av_freep(&c->codec_configs); + c->nb_codec_configs = 0; + + for (int i = 0; i < c->nb_audio_elements; i++) + ff_iamf_free_audio_element(&c->audio_elements[i]); + av_freep(&c->audio_elements); + c->nb_audio_elements = 0; + + for (int i = 0; i < c->nb_mix_presentations; i++) + 
ff_iamf_free_mix_presentation(&c->mix_presentations[i]); + av_freep(&c->mix_presentations); + c->nb_mix_presentations = 0; + + for (int i = 0; i < c->nb_param_definitions; i++) + av_free(c->param_definitions[i]); + av_freep(&c->param_definitions); + c->nb_param_definitions = 0; +} diff --git a/libavformat/iamf.h b/libavformat/iamf.h new file mode 100644 index 0000000000..efcd5dc4e2 --- /dev/null +++ b/libavformat/iamf.h @@ -0,0 +1,162 @@ +/* + * Immersive Audio Model and Formats common helpers and structs + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFORMAT_IAMF_H +#define AVFORMAT_IAMF_H + +#include + +#include "libavutil/channel_layout.h" +#include "libavutil/iamf.h" +#include "libavcodec/codec_id.h" +#include "libavcodec/codec_par.h" +#include "avformat.h" + +#define MAX_IAMF_OBU_HEADER_SIZE (1 + 8 * 3) + +// OBU types (section 3.2). 
+enum IAMF_OBU_Type { + IAMF_OBU_IA_CODEC_CONFIG = 0, + IAMF_OBU_IA_AUDIO_ELEMENT = 1, + IAMF_OBU_IA_MIX_PRESENTATION = 2, + IAMF_OBU_IA_PARAMETER_BLOCK = 3, + IAMF_OBU_IA_TEMPORAL_DELIMITER = 4, + IAMF_OBU_IA_AUDIO_FRAME = 5, + IAMF_OBU_IA_AUDIO_FRAME_ID0 = 6, + IAMF_OBU_IA_AUDIO_FRAME_ID1 = 7, + IAMF_OBU_IA_AUDIO_FRAME_ID2 = 8, + IAMF_OBU_IA_AUDIO_FRAME_ID3 = 9, + IAMF_OBU_IA_AUDIO_FRAME_ID4 = 10, + IAMF_OBU_IA_AUDIO_FRAME_ID5 = 11, + IAMF_OBU_IA_AUDIO_FRAME_ID6 = 12, + IAMF_OBU_IA_AUDIO_FRAME_ID7 = 13, + IAMF_OBU_IA_AUDIO_FRAME_ID8 = 14, + IAMF_OBU_IA_AUDIO_FRAME_ID9 = 15, + IAMF_OBU_IA_AUDIO_FRAME_ID10 = 16, + IAMF_OBU_IA_AUDIO_FRAME_ID11 = 17, + IAMF_OBU_IA_AUDIO_FRAME_ID12 = 18, + IAMF_OBU_IA_AUDIO_FRAME_ID13 = 19, + IAMF_OBU_IA_AUDIO_FRAME_ID14 = 20, + IAMF_OBU_IA_AUDIO_FRAME_ID15 = 21, + IAMF_OBU_IA_AUDIO_FRAME_ID16 = 22, + IAMF_OBU_IA_AUDIO_FRAME_ID17 = 23, + // 24~30 reserved. + IAMF_OBU_IA_SEQUENCE_HEADER = 31, +}; + +typedef struct IAMFCodecConfig { + unsigned codec_config_id; + enum AVCodecID codec_id; + uint32_t codec_tag; + unsigned nb_samples; + int seek_preroll; + int sample_rate; + int extradata_size; + uint8_t *extradata; +} IAMFCodecConfig; + +typedef struct IAMFLayer { + unsigned int substream_count; + unsigned int coupled_substream_count; +} IAMFLayer; + +typedef struct IAMFSubStream { + unsigned int audio_substream_id; + + // demux + AVCodecParameters *codecpar; +} IAMFSubStream; + +typedef struct IAMFAudioElement { + AVIAMFAudioElement *element; + unsigned int audio_element_id; + + IAMFSubStream *substreams; + unsigned int nb_substreams; + + unsigned int codec_config_id; + + // mux + IAMFLayer *layers; + unsigned int nb_layers; +} IAMFAudioElement; + +typedef struct IAMFMixPresentation { + AVIAMFMixPresentation *mix; + unsigned int mix_presentation_id; + + // demux + unsigned int count_label; + char **language_label; +} IAMFMixPresentation; + +typedef struct IAMFParamDefinition { + const AVIAMFAudioElement *audio_element; + 
AVIAMFParamDefinition *param; + size_t param_size; +} IAMFParamDefinition; + +typedef struct IAMFContext { + IAMFCodecConfig **codec_configs; + int nb_codec_configs; + IAMFAudioElement **audio_elements; + int nb_audio_elements; + IAMFMixPresentation **mix_presentations; + int nb_mix_presentations; + IAMFParamDefinition **param_definitions; + int nb_param_definitions; +} IAMFContext; + +enum IAMF_Anchor_Element { + IAMF_ANCHOR_ELEMENT_UNKNWONW, + IAMF_ANCHOR_ELEMENT_DIALOGUE, + IAMF_ANCHOR_ELEMENT_ALBUM, +}; + +enum IAMF_Sound_System { + SOUND_SYSTEM_A_0_2_0 = 0, // "Loudspeaker configuration for Sound System A" + SOUND_SYSTEM_B_0_5_0 = 1, // "Loudspeaker configuration for Sound System B" + SOUND_SYSTEM_C_2_5_0 = 2, // "Loudspeaker configuration for Sound System C" + SOUND_SYSTEM_D_4_5_0 = 3, // "Loudspeaker configuration for Sound System D" + SOUND_SYSTEM_E_4_5_1 = 4, // "Loudspeaker configuration for Sound System E" + SOUND_SYSTEM_F_3_7_0 = 5, // "Loudspeaker configuration for Sound System F" + SOUND_SYSTEM_G_4_9_0 = 6, // "Loudspeaker configuration for Sound System G" + SOUND_SYSTEM_H_9_10_3 = 7, // "Loudspeaker configuration for Sound System H" + SOUND_SYSTEM_I_0_7_0 = 8, // "Loudspeaker configuration for Sound System I" + SOUND_SYSTEM_J_4_7_0 = 9, // "Loudspeaker configuration for Sound System J" + SOUND_SYSTEM_10_2_7_0 = 10, // "Loudspeaker configuration for Sound System I" + Ltf + Rtf + SOUND_SYSTEM_11_2_3_0 = 11, // Front subset of "Loudspeaker configuration for Sound System J" + SOUND_SYSTEM_12_0_1_0 = 12, // Mono +}; + +struct IAMFSoundSystemMap { + enum IAMF_Sound_System id; + AVChannelLayout layout; +}; + +extern const AVChannelLayout ff_iamf_scalable_ch_layouts[10]; +extern const struct IAMFSoundSystemMap ff_iamf_sound_system_map[13]; + +void ff_iamf_free_audio_element(IAMFAudioElement **paudio_element); +void ff_iamf_free_mix_presentation(IAMFMixPresentation **pmix_presentation); +void ff_iamf_uninit_context(IAMFContext *c); + +#endif /* 
AVFORMAT_IAMF_H */ diff --git a/libavformat/iamf_parse.c b/libavformat/iamf_parse.c new file mode 100644 index 0000000000..62abb5fe9f --- /dev/null +++ b/libavformat/iamf_parse.c @@ -0,0 +1,1106 @@ +/* + * Immersive Audio Model and Formats parsing + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/avassert.h" +#include "libavutil/common.h" +#include "libavutil/iamf.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/log.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/flac.h" +#include "libavcodec/mpeg4audio.h" +#include "libavcodec/put_bits.h" +#include "avio_internal.h" +#include "iamf_parse.h" +#include "isom.h" + +static int opus_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + int left = len - avio_tell(pb); + + if (left < 11) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left + 8); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + AV_WB32(codec_config->extradata, MKBETAG('O','p','u','s')); + AV_WB32(codec_config->extradata + 4, MKBETAG('H','e','a','d')); + codec_config->extradata_size = avio_read(pb, codec_config->extradata + 8, left); + if (codec_config->extradata_size < left) + return 
AVERROR_INVALIDDATA; + + codec_config->extradata_size += 8; + codec_config->sample_rate = 48000; + + return 0; +} + +static int aac_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len, void *logctx) +{ + MPEG4AudioConfig cfg = { 0 }; + int object_type_id, codec_id, stream_type; + int ret, tag, left; + + tag = avio_r8(pb); + if (tag != MP4DecConfigDescrTag) + return AVERROR_INVALIDDATA; + + object_type_id = avio_r8(pb); + if (object_type_id != 0x40) + return AVERROR_INVALIDDATA; + + stream_type = avio_r8(pb); + if (((stream_type >> 2) != 5) || ((stream_type >> 1) & 1)) + return AVERROR_INVALIDDATA; + + avio_skip(pb, 3); // buffer size db + avio_skip(pb, 4); // rc_max_rate + avio_skip(pb, 4); // avg bitrate + + codec_id = ff_codec_get_id(ff_mp4_obj_type, object_type_id); + if (codec_id && codec_id != codec_config->codec_id) + return AVERROR_INVALIDDATA; + + tag = avio_r8(pb); + if (tag != MP4DecSpecificDescrTag) + return AVERROR_INVALIDDATA; + + left = len - avio_tell(pb); + if (left <= 0) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + codec_config->extradata_size = avio_read(pb, codec_config->extradata, left); + if (codec_config->extradata_size < left) + return AVERROR_INVALIDDATA; + + ret = avpriv_mpeg4audio_get_config2(&cfg, codec_config->extradata, + codec_config->extradata_size, 1, logctx); + if (ret < 0) + return ret; + + codec_config->sample_rate = cfg.sample_rate; + + return 0; +} + +static int flac_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + int left; + + avio_skip(pb, 4); // METADATA_BLOCK_HEADER + + left = len - avio_tell(pb); + if (left < FLAC_STREAMINFO_SIZE) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + codec_config->extradata_size = avio_read(pb, codec_config->extradata, left); + if (codec_config->extradata_size < 
left) + return AVERROR_INVALIDDATA; + + codec_config->sample_rate = AV_RB24(codec_config->extradata + 10) >> 4; + + return 0; +} + +static int ipcm_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + static const enum AVCodecID sample_fmt[2][3] = { + { AV_CODEC_ID_PCM_S16BE, AV_CODEC_ID_PCM_S24BE, AV_CODEC_ID_PCM_S32BE }, + { AV_CODEC_ID_PCM_S16LE, AV_CODEC_ID_PCM_S24LE, AV_CODEC_ID_PCM_S32LE }, + }; + int sample_format = avio_r8(pb); // 0 = BE, 1 = LE + int sample_size = (avio_r8(pb) / 8 - 2); // 16, 24, 32 + if (sample_format > 1 || sample_size < 0 || sample_size > 2) + return AVERROR_INVALIDDATA; + + codec_config->codec_id = sample_fmt[sample_format][sample_size]; + codec_config->sample_rate = avio_rb32(pb); + + if (len - avio_tell(pb)) + return AVERROR_INVALIDDATA; + + return 0; +} + +static int codec_config_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + IAMFCodecConfig **tmp, *codec_config = NULL; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + enum AVCodecID avcodec_id; + unsigned codec_config_id, nb_samples, codec_id; + int16_t seek_preroll; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + codec_config_id = ffio_read_leb(pbc); + codec_id = avio_rb32(pbc); + nb_samples = ffio_read_leb(pbc); + seek_preroll = avio_rb16(pbc); + + switch(codec_id) { + case MKBETAG('O','p','u','s'): + avcodec_id = AV_CODEC_ID_OPUS; + break; + case MKBETAG('m','p','4','a'): + avcodec_id = AV_CODEC_ID_AAC; + break; + case MKBETAG('f','L','a','C'): + avcodec_id = AV_CODEC_ID_FLAC; + break; + default: + avcodec_id = AV_CODEC_ID_NONE; + break; + } + + for (int i = 0; i < c->nb_codec_configs; i++) + if (c->codec_configs[i]->codec_config_id == codec_config_id) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + + tmp = 
av_realloc_array(c->codec_configs, c->nb_codec_configs + 1, sizeof(*c->codec_configs)); + if (!tmp) { + ret = AVERROR(ENOMEM); + goto fail; + } + c->codec_configs = tmp; + + codec_config = av_mallocz(sizeof(*codec_config)); + if (!codec_config) { + ret = AVERROR(ENOMEM); + goto fail; + } + + codec_config->codec_config_id = codec_config_id; + codec_config->codec_id = avcodec_id; + codec_config->nb_samples = nb_samples; + codec_config->seek_preroll = seek_preroll; + + switch(codec_id) { + case MKBETAG('O','p','u','s'): + ret = opus_decoder_config(codec_config, pbc, len); + break; + case MKBETAG('m','p','4','a'): + ret = aac_decoder_config(codec_config, pbc, len, s); + break; + case MKBETAG('f','L','a','C'): + ret = flac_decoder_config(codec_config, pbc, len); + break; + case MKBETAG('i','p','c','m'): + ret = ipcm_decoder_config(codec_config, pbc, len); + break; + default: + break; + } + if (ret < 0) + goto fail; + + c->codec_configs[c->nb_codec_configs++] = codec_config; + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in codec_config_obu. 
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + if (ret < 0) { + if (codec_config) + av_free(codec_config->extradata); + av_free(codec_config); + } + return ret; +} + +static int update_extradata(AVCodecParameters *codecpar) +{ + GetBitContext gb; + PutBitContext pb; + int ret; + + switch(codecpar->codec_id) { + case AV_CODEC_ID_OPUS: + AV_WB8(codecpar->extradata + 9, codecpar->ch_layout.nb_channels); + break; + case AV_CODEC_ID_AAC: { + uint8_t buf[5]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codecpar->extradata, codecpar->extradata_size); + if (ret < 0) + return ret; + + ret = get_bits(&gb, 5); + put_bits(&pb, 5, ret); + if (ret == AOT_ESCAPE) // violates section 3.11.2, but better check for it + put_bits(&pb, 6, get_bits(&gb, 6)); + ret = get_bits(&gb, 4); + put_bits(&pb, 4, ret); + if (ret == 0x0f) + put_bits(&pb, 24, get_bits(&gb, 24)); + + skip_bits(&gb, 4); + put_bits(&pb, 4, codecpar->ch_layout.nb_channels); // set channel config + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codecpar->extradata, buf, sizeof(buf)); + break; + } + case AV_CODEC_ID_FLAC: { + uint8_t buf[13]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codecpar->extradata, codecpar->extradata_size); + if (ret < 0) + return ret; + + put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize + put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize + put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate + skip_bits(&gb, 3); + put_bits(&pb, 3, codecpar->ch_layout.nb_channels - 1); + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codecpar->extradata, buf, sizeof(buf)); + break; + } + } + + return 0; +} + +static int scalable_channel_layout_config(void *s, AVIOContext *pb, + IAMFAudioElement *audio_element, + const IAMFCodecConfig *codec_config) +{ + int nb_layers, k = 0; + + nb_layers = avio_r8(pb) >> 
5; // get_bits(&gb, 3); + // skip_bits(&gb, 5); //reserved + + if (nb_layers > 6) + return AVERROR_INVALIDDATA; + + for (int i = 0; i < nb_layers; i++) { + AVIAMFLayer *layer; + int loudspeaker_layout, output_gain_is_present_flag; + int substream_count, coupled_substream_count; + int ret, byte = avio_r8(pb); + + layer = av_iamf_audio_element_add_layer(audio_element->element); + if (!layer) + return AVERROR(ENOMEM); + + loudspeaker_layout = byte >> 4; // get_bits(&gb, 4); + output_gain_is_present_flag = (byte >> 3) & 1; //get_bits1(&gb); + if ((byte >> 2) & 1) + layer->flags |= AV_IAMF_LAYER_FLAG_RECON_GAIN; + substream_count = avio_r8(pb); + coupled_substream_count = avio_r8(pb); + + if (output_gain_is_present_flag) { + layer->output_gain_flags = avio_r8(pb) >> 2; // get_bits(&gb, 6); + layer->output_gain = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + } + + if (loudspeaker_layout < 10) + av_channel_layout_copy(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[loudspeaker_layout]); + else + layer->ch_layout = (AVChannelLayout){ .order = AV_CHANNEL_ORDER_UNSPEC, + .nb_channels = substream_count + + coupled_substream_count }; + + for (int j = 0; j < substream_count; j++) { + IAMFSubStream *substream = &audio_element->substreams[k++]; + + substream->codecpar->ch_layout = coupled_substream_count-- > 0 ? 
(AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO : + (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + + } + + return 0; +} + +static int ambisonics_config(void *s, AVIOContext *pb, + IAMFAudioElement *audio_element, + const IAMFCodecConfig *codec_config) +{ + AVIAMFLayer *layer; + unsigned ambisonics_mode; + int output_channel_count, substream_count, order; + int ret; + + ambisonics_mode = ffio_read_leb(pb); + if (ambisonics_mode > 1) + return 0; + + output_channel_count = avio_r8(pb); // C + substream_count = avio_r8(pb); // N + if (audio_element->nb_substreams != substream_count) + return AVERROR_INVALIDDATA; + + order = floor(sqrt(output_channel_count - 1)); + /* incomplete order - some harmonics are missing */ + if ((order + 1) * (order + 1) != output_channel_count) + return AVERROR_INVALIDDATA; + + layer = av_iamf_audio_element_add_layer(audio_element->element); + if (!layer) + return AVERROR(ENOMEM); + + layer->ambisonics_mode = ambisonics_mode; + if (ambisonics_mode == 0) { + for (int i = 0; i < substream_count; i++) { + IAMFSubStream *substream = &audio_element->substreams[i]; + + substream->codecpar->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + + layer->ch_layout.order = AV_CHANNEL_ORDER_CUSTOM; + layer->ch_layout.nb_channels = output_channel_count; + layer->ch_layout.u.map = av_calloc(output_channel_count, sizeof(*layer->ch_layout.u.map)); + if (!layer->ch_layout.u.map) + return AVERROR(ENOMEM); + + for (int i = 0; i < output_channel_count; i++) + layer->ch_layout.u.map[i].id = avio_r8(pb) + AV_CHAN_AMBISONIC_BASE; + } else { + int coupled_substream_count = avio_r8(pb); // M + int nb_demixing_matrix = substream_count + coupled_substream_count; + int demixing_matrix_size = nb_demixing_matrix * output_channel_count; + + layer->ch_layout = (AVChannelLayout){ .order = AV_CHANNEL_ORDER_AMBISONIC, 
.nb_channels = output_channel_count }; + layer->demixing_matrix = av_malloc_array(demixing_matrix_size, sizeof(*layer->demixing_matrix)); + if (!layer->demixing_matrix) + return AVERROR(ENOMEM); + + for (int i = 0; i < demixing_matrix_size; i++) + layer->demixing_matrix[i] = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + + for (int i = 0; i < substream_count; i++) { + IAMFSubStream *substream = &audio_element->substreams[i]; + + substream->codecpar->ch_layout = coupled_substream_count-- > 0 ? (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO : + (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + } + + return 0; +} + +static int param_parse(void *s, IAMFContext *c, AVIOContext *pb, + unsigned int param_definition_type, + const AVIAMFAudioElement *audio_element, + AVIAMFParamDefinition **out_param_definition) +{ + IAMFParamDefinition *param_definition = NULL; + AVIAMFParamDefinition *param; + unsigned int parameter_id, parameter_rate, param_definition_mode; + unsigned int duration = 0, constant_subblock_duration = 0, nb_subblocks = 0; + size_t param_size; + + parameter_id = ffio_read_leb(pb); + + for (int i = 0; i < c->nb_param_definitions; i++) + if (c->param_definitions[i]->param->parameter_id == parameter_id) { + param_definition = c->param_definitions[i]; + break; + } + + parameter_rate = ffio_read_leb(pb); + param_definition_mode = avio_r8(pb) >> 7; + + if (param_definition_mode == 0) { + duration = ffio_read_leb(pb); + constant_subblock_duration = ffio_read_leb(pb); + if (constant_subblock_duration == 0) + nb_subblocks = ffio_read_leb(pb); + else + nb_subblocks = duration / constant_subblock_duration; + } + + param = av_iamf_param_definition_alloc(param_definition_type, nb_subblocks, &param_size); + if (!param) + return AVERROR(ENOMEM); + + for (int i = 0; i < nb_subblocks; i++) { + void *subblock = av_iamf_param_definition_get_subblock(param, i); + unsigned int subblock_duration = 
constant_subblock_duration; + + if (constant_subblock_duration == 0) + subblock_duration = ffio_read_leb(pb); + + switch (param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + AVIAMFMixGain *mix = subblock; + mix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + AVIAMFDemixingInfo *demix = subblock; + demix->subblock_duration = subblock_duration; + // DemixingInfoParameterData + demix->dmixp_mode = avio_r8(pb) >> 5; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + AVIAMFReconGain *recon = subblock; + recon->subblock_duration = subblock_duration; + break; + } + default: + av_free(param); + return AVERROR_INVALIDDATA; + } + } + + param->parameter_id = parameter_id; + param->parameter_rate = parameter_rate; + param->param_definition_mode = param_definition_mode; + param->duration = duration; + param->constant_subblock_duration = constant_subblock_duration; + param->nb_subblocks = nb_subblocks; + + if (param_definition) { + if (param_definition->param_size != param_size || memcmp(param_definition->param, param, param_size)) { + av_log(s, AV_LOG_ERROR, "Inconsistent parameters for parameter_id %u\n", parameter_id); + av_free(param); + return AVERROR_INVALIDDATA; + } + } else { + IAMFParamDefinition **tmp = av_realloc_array(c->param_definitions, c->nb_param_definitions + 1, + sizeof(*c->param_definitions)); + if (!tmp) { + av_free(param); + return AVERROR(ENOMEM); + } + c->param_definitions = tmp; + + param_definition = av_mallocz(sizeof(*param_definition)); + if (!param_definition) { + av_free(param); + return AVERROR(ENOMEM); + } + param_definition->param = param; + param_definition->param_size = param_size; + param_definition->audio_element = audio_element; + + c->param_definitions[c->nb_param_definitions++] = param_definition; + } + + av_assert0(out_param_definition); + *out_param_definition = param; + + return 0; +} + +static IAMFCodecConfig *get_codec_config(IAMFContext *c, 
unsigned int codec_config_id) +{ + for (int i = 0; i < c->nb_codec_configs; i++) { + if (c->codec_configs[i]->codec_config_id == codec_config_id) + return c->codec_configs[i]; + } + + return NULL; +} + +static int audio_element_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + const IAMFCodecConfig *codec_config; + AVIAMFAudioElement *element; + IAMFAudioElement **tmp, *audio_element = NULL; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + unsigned audio_element_id, codec_config_id, num_parameters; + int audio_element_type, ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + audio_element_id = ffio_read_leb(pbc); + + for (int i = 0; i < c->nb_audio_elements; i++) + if (c->audio_elements[i]->audio_element_id == audio_element_id) { + av_log(s, AV_LOG_ERROR, "Duplicate audio_element_id %u\n", audio_element_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + audio_element_type = avio_r8(pbc) >> 5; + codec_config_id = ffio_read_leb(pbc); + + codec_config = get_codec_config(c, codec_config_id); + if (!codec_config) { + av_log(s, AV_LOG_ERROR, "Nonexistent codec config id %u referenced in an audio element\n", codec_config_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + if (codec_config->codec_id == AV_CODEC_ID_NONE) { + av_log(s, AV_LOG_DEBUG, "Unknown codec id referenced in an audio element. 
Ignoring\n"); + ret = 0; + goto fail; + } + + tmp = av_realloc_array(c->audio_elements, c->nb_audio_elements + 1, sizeof(*c->audio_elements)); + if (!tmp) { + ret = AVERROR(ENOMEM); + goto fail; + } + c->audio_elements = tmp; + + audio_element = av_mallocz(sizeof(*audio_element)); + if (!audio_element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + audio_element->nb_substreams = ffio_read_leb(pbc); + audio_element->codec_config_id = codec_config_id; + audio_element->audio_element_id = audio_element_id; + audio_element->substreams = av_calloc(audio_element->nb_substreams, sizeof(*audio_element->substreams)); + if (!audio_element->substreams) { + ret = AVERROR(ENOMEM); + goto fail; + } + + element = audio_element->element = av_iamf_audio_element_alloc(); + if (!element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + element->audio_element_type = audio_element_type; + + for (int i = 0; i < audio_element->nb_substreams; i++) { + IAMFSubStream *substream = &audio_element->substreams[i]; + + substream->codecpar = avcodec_parameters_alloc(); + if (!substream->codecpar) { + ret = AVERROR(ENOMEM); + goto fail; + } + + substream->audio_substream_id = ffio_read_leb(pbc); + + substream->codecpar->codec_type = AVMEDIA_TYPE_AUDIO; + substream->codecpar->codec_id = codec_config->codec_id; + substream->codecpar->frame_size = codec_config->nb_samples; + substream->codecpar->sample_rate = codec_config->sample_rate; + substream->codecpar->seek_preroll = codec_config->seek_preroll; + + switch(substream->codecpar->codec_id) { + case AV_CODEC_ID_AAC: + case AV_CODEC_ID_FLAC: + case AV_CODEC_ID_OPUS: + substream->codecpar->extradata = av_malloc(codec_config->extradata_size + AV_INPUT_BUFFER_PADDING_SIZE); + if (!substream->codecpar->extradata) { + ret = AVERROR(ENOMEM); + goto fail; + } + memcpy(substream->codecpar->extradata, codec_config->extradata, codec_config->extradata_size); + memset(substream->codecpar->extradata + codec_config->extradata_size, 0, AV_INPUT_BUFFER_PADDING_SIZE); 
+ substream->codecpar->extradata_size = codec_config->extradata_size; + break; + } + } + + num_parameters = ffio_read_leb(pbc); + if (num_parameters && audio_element_type != 0) { + av_log(s, AV_LOG_ERROR, "Audio Element parameter count %u is invalid" + " for Scene representations\n", num_parameters); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + for (int i = 0; i < num_parameters; i++) { + unsigned param_definition_type; + + param_definition_type = ffio_read_leb(pbc); + if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN) { + ret = AVERROR_INVALIDDATA; + goto fail; + } else if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_DEMIXING) { + ret = param_parse(s, c, pbc, param_definition_type, element, &element->demixing_info); + if (ret < 0) + goto fail; + + element->default_w = avio_r8(pbc) >> 4; + } else if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { + ret = param_parse(s, c, pbc, param_definition_type, element, &element->recon_gain_info); + if (ret < 0) + goto fail; + } else { + unsigned param_definition_size = ffio_read_leb(pbc); + avio_skip(pbc, param_definition_size); + } + } + + if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(s, pbc, audio_element, codec_config); + if (ret < 0) + goto fail; + } else if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) { + ret = ambisonics_config(s, pbc, audio_element, codec_config); + if (ret < 0) + goto fail; + } else { + unsigned audio_element_config_size = ffio_read_leb(pbc); + avio_skip(pbc, audio_element_config_size); + } + + c->audio_elements[c->nb_audio_elements++] = audio_element; + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in audio_element_obu. 
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + if (ret < 0) + ff_iamf_free_audio_element(&audio_element); + return ret; +} + +static int label_string(AVIOContext *pb, char **label) +{ + uint8_t buf[128]; + + avio_get_str(pb, sizeof(buf), buf, sizeof(buf)); + + if (pb->error) + return pb->error; + if (pb->eof_reached) + return AVERROR_INVALIDDATA; + *label = av_strdup(buf); + if (!*label) + return AVERROR(ENOMEM); + + return 0; +} + +static int mix_presentation_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + AVIAMFMixPresentation *mix; + IAMFMixPresentation **tmp, *mix_presentation = NULL; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + unsigned mix_presentation_id; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + mix_presentation_id = ffio_read_leb(pbc); + + for (int i = 0; i < c->nb_mix_presentations; i++) + if (c->mix_presentations[i]->mix_presentation_id == mix_presentation_id) { + av_log(s, AV_LOG_ERROR, "Duplicate mix_presentation_id %d\n", mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + tmp = av_realloc_array(c->mix_presentations, c->nb_mix_presentations + 1, sizeof(*c->mix_presentations)); + if (!tmp) { + ret = AVERROR(ENOMEM); + goto fail; + } + c->mix_presentations = tmp; + + mix_presentation = av_mallocz(sizeof(*mix_presentation)); + if (!mix_presentation) { + ret = AVERROR(ENOMEM); + goto fail; + } + + mix_presentation->mix_presentation_id = mix_presentation_id; + mix = mix_presentation->mix = av_iamf_mix_presentation_alloc(); + if (!mix) { + ret = AVERROR(ENOMEM); + goto fail; + } + + mix_presentation->count_label = ffio_read_leb(pbc); + mix_presentation->language_label = av_calloc(mix_presentation->count_label, + sizeof(*mix_presentation->language_label)); + if 
(!mix_presentation->language_label) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int i = 0; i < mix_presentation->count_label; i++) { + ret = label_string(pbc, &mix_presentation->language_label[i]); + if (ret < 0) + goto fail; + } + + for (int i = 0; i < mix_presentation->count_label; i++) { + char *annotation = NULL; + ret = label_string(pbc, &annotation); + if (ret < 0) + goto fail; + ret = av_dict_set(&mix->annotations, mix_presentation->language_label[i], annotation, + AV_DICT_DONT_STRDUP_VAL | AV_DICT_DONT_OVERWRITE); + if (ret < 0) + goto fail; + } + + mix->nb_submixes = ffio_read_leb(pbc); + mix->submixes = av_calloc(mix->nb_submixes, sizeof(*mix->submixes)); + if (!mix->submixes) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int i = 0; i < mix->nb_submixes; i++) { + AVIAMFSubmix *sub_mix; + + sub_mix = mix->submixes[i] = av_mallocz(sizeof(*sub_mix)); + if (!sub_mix) { + ret = AVERROR(ENOMEM); + goto fail; + } + + sub_mix->nb_elements = ffio_read_leb(pbc); + sub_mix->elements = av_calloc(sub_mix->nb_elements, sizeof(*sub_mix->elements)); + if (!sub_mix->elements) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int j = 0; j < sub_mix->nb_elements; j++) { + AVIAMFSubmixElement *submix_element; + IAMFAudioElement *audio_element = NULL; + unsigned int rendering_config_extension_size; + + submix_element = sub_mix->elements[j] = av_mallocz(sizeof(*submix_element)); + if (!submix_element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + submix_element->audio_element_id = ffio_read_leb(pbc); + + for (int k = 0; k < c->nb_audio_elements; k++) + if (c->audio_elements[k]->audio_element_id == submix_element->audio_element_id) { + audio_element = c->audio_elements[k]; + break; + } + + if (!audio_element) { + av_log(s, AV_LOG_ERROR, "Invalid Audio Element with id %u referenced by Mix Parameters %u\n", + submix_element->audio_element_id, mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + for (int k = 0; k < 
mix_presentation->count_label; k++) { + char *annotation = NULL; + ret = label_string(pbc, &annotation); + if (ret < 0) + goto fail; + ret = av_dict_set(&submix_element->annotations, mix_presentation->language_label[k], annotation, + AV_DICT_DONT_STRDUP_VAL | AV_DICT_DONT_OVERWRITE); + if (ret < 0) + goto fail; + } + + submix_element->headphones_rendering_mode = avio_r8(pbc) >> 6; + + rendering_config_extension_size = ffio_read_leb(pbc); + avio_skip(pbc, rendering_config_extension_size); + + ret = param_parse(s, c, pbc, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, + audio_element->element, + &submix_element->element_mix_config); + if (ret < 0) + goto fail; + submix_element->default_mix_gain = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + } + + ret = param_parse(s, c, pbc, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL, &sub_mix->output_mix_config); + if (ret < 0) + goto fail; + sub_mix->default_mix_gain = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + + sub_mix->nb_layouts = ffio_read_leb(pbc); + sub_mix->layouts = av_calloc(sub_mix->nb_layouts, sizeof(*sub_mix->layouts)); + if (!sub_mix->layouts) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int j = 0; j < sub_mix->nb_layouts; j++) { + AVIAMFSubmixLayout *submix_layout; + int info_type; + int byte = avio_r8(pbc); + + submix_layout = sub_mix->layouts[j] = av_mallocz(sizeof(*submix_layout)); + if (!submix_layout) { + ret = AVERROR(ENOMEM); + goto fail; + } + + submix_layout->layout_type = byte >> 6; + if (submix_layout->layout_type < AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS || + submix_layout->layout_type > AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL) { + av_log(s, AV_LOG_ERROR, "Invalid Layout type %u in a submix from Mix Presentation %u\n", + submix_layout->layout_type, mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + if (submix_layout->layout_type == 2) { + int sound_system; + sound_system = (byte >> 2) & 0xF; + av_channel_layout_copy(&submix_layout->sound_system, 
&ff_iamf_sound_system_map[sound_system].layout); + } + + info_type = avio_r8(pbc); + submix_layout->integrated_loudness = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + submix_layout->digital_peak = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + + if (info_type & 1) + submix_layout->true_peak = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + if (info_type & 2) { + unsigned int num_anchored_loudness = avio_r8(pbc); + + for (int k = 0; k < num_anchored_loudness; k++) { + unsigned int anchor_element = avio_r8(pbc); + AVRational anchored_loudness = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + if (anchor_element == IAMF_ANCHOR_ELEMENT_DIALOGUE) + submix_layout->dialogue_anchored_loudness = anchored_loudness; + else if (anchor_element <= IAMF_ANCHOR_ELEMENT_ALBUM) + submix_layout->album_anchored_loudness = anchored_loudness; + else + av_log(s, AV_LOG_DEBUG, "Unknown anchor_element. Ignoring\n"); + } + } + + if (info_type & 0xFC) { + unsigned int info_type_size = ffio_read_leb(pbc); + avio_skip(pbc, info_type_size); + } + } + } + + c->mix_presentations[c->nb_mix_presentations++] = mix_presentation; + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in mix_presentation_obu. 
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + if (ret < 0) + ff_iamf_free_mix_presentation(&mix_presentation); + return ret; +} + +int ff_iamf_parse_obu_header(const uint8_t *buf, int buf_size, + unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type, + unsigned *skip_samples, unsigned *discard_padding) +{ + GetBitContext gb; + int ret, extension_flag, trimming, start; + unsigned skip = 0, discard = 0; + unsigned size; + + ret = init_get_bits8(&gb, buf, FFMIN(buf_size, MAX_IAMF_OBU_HEADER_SIZE)); + if (ret < 0) + return ret; + + *type = get_bits(&gb, 5); + /*redundant =*/ get_bits1(&gb); + trimming = get_bits1(&gb); + extension_flag = get_bits1(&gb); + + *obu_size = get_leb(&gb); + if (*obu_size > INT_MAX) + return AVERROR_INVALIDDATA; + + start = get_bits_count(&gb) / 8; + + if (trimming) { + discard = get_leb(&gb); // num_samples_to_trim_at_end + skip = get_leb(&gb); // num_samples_to_trim_at_start + } + + if (skip_samples) + *skip_samples = skip; + if (discard_padding) + *discard_padding = discard; + + if (extension_flag) { + unsigned int extension_bytes; + extension_bytes = get_leb(&gb); + if (extension_bytes > INT_MAX / 8) + return AVERROR_INVALIDDATA; + skip_bits_long(&gb, extension_bytes * 8); + } + + if (get_bits_left(&gb) < 0) + return AVERROR_INVALIDDATA; + + size = *obu_size + start; + if (size > INT_MAX) + return AVERROR_INVALIDDATA; + + *obu_size -= get_bits_count(&gb) / 8 - start; + *start_pos = size - *obu_size; + + return size; +} + +int ff_iamfdec_read_descriptors(IAMFContext *c, AVIOContext *pb, + int max_size, void *log_ctx) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE + AV_INPUT_BUFFER_PADDING_SIZE]; + int ret; + + while (1) { + unsigned obu_size; + enum IAMF_OBU_Type type; + int start_pos, len, size; + + if ((ret = ffio_ensure_seekback(pb, FFMIN(MAX_IAMF_OBU_HEADER_SIZE, max_size))) < 0) + return ret; + size = avio_read(pb, header, FFMIN(MAX_IAMF_OBU_HEADER_SIZE, max_size)); + if (size < 0) + return size; + + 
len = ff_iamf_parse_obu_header(header, size, &obu_size, &start_pos, &type, NULL, NULL); + if (len < 0 || obu_size > max_size) { + av_log(log_ctx, AV_LOG_ERROR, "Failed to read obu header\n"); + avio_seek(pb, -size, SEEK_CUR); + return len; + } + + if (type >= IAMF_OBU_IA_PARAMETER_BLOCK && type < IAMF_OBU_IA_SEQUENCE_HEADER) { + avio_seek(pb, -size, SEEK_CUR); + break; + } + + avio_seek(pb, -(size - start_pos), SEEK_CUR); + switch (type) { + case IAMF_OBU_IA_CODEC_CONFIG: + ret = codec_config_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_AUDIO_ELEMENT: + ret = audio_element_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_MIX_PRESENTATION: + ret = mix_presentation_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_TEMPORAL_DELIMITER: + break; + default: { + int64_t offset = avio_skip(pb, obu_size); + if (offset < 0) + ret = offset; + break; + } + } + if (ret < 0) { + av_log(log_ctx, AV_LOG_ERROR, "Failed to read obu type %d\n", type); + return ret; + } + max_size -= obu_size + start_pos; + if (max_size < 0) + return AVERROR_INVALIDDATA; + if (!max_size) + break; + } + + return 0; +} diff --git a/libavformat/iamf_parse.h b/libavformat/iamf_parse.h new file mode 100644 index 0000000000..f4f297ecd4 --- /dev/null +++ b/libavformat/iamf_parse.h @@ -0,0 +1,38 @@ +/* + * Immersive Audio Model and Formats parsing + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFORMAT_IAMF_PARSE_H +#define AVFORMAT_IAMF_PARSE_H + +#include <stdint.h> + +#include "libavutil/iamf.h" +#include "avio.h" +#include "iamf.h" + +int ff_iamf_parse_obu_header(const uint8_t *buf, int buf_size, + unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type, + unsigned *skip_samples, unsigned *discard_padding); + +int ff_iamfdec_read_descriptors(IAMFContext *c, AVIOContext *pb, + int size, void *log_ctx); + +#endif /* AVFORMAT_IAMF_PARSE_H */ diff --git a/libavformat/iamfdec.c b/libavformat/iamfdec.c new file mode 100644 index 0000000000..97438f2662 --- /dev/null +++ b/libavformat/iamfdec.c @@ -0,0 +1,495 @@ +/* + * Immersive Audio Model and Formats demuxer + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details.
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" + +#include "libavutil/avassert.h" +#include "libavutil/iamf.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/log.h" +#include "libavcodec/mathops.h" +#include "avformat.h" +#include "avio_internal.h" +#include "demux.h" +#include "iamf.h" +#include "iamf_parse.h" +#include "internal.h" + +typedef struct IAMFDemuxContext { + IAMFContext iamf; + + // Packet side data + AVIAMFParamDefinition *mix; + size_t mix_size; + AVIAMFParamDefinition *demix; + size_t demix_size; + AVIAMFParamDefinition *recon; + size_t recon_size; +} IAMFDemuxContext; + +static AVStream *find_stream_by_id(AVFormatContext *s, int id) +{ + for (int i = 0; i < s->nb_streams; i++) + if (s->streams[i]->id == id) + return s->streams[i]; + + av_log(s, AV_LOG_ERROR, "Invalid stream id %d\n", id); + return NULL; +} + +static int audio_frame_obu(AVFormatContext *s, AVPacket *pkt, int len, + enum IAMF_OBU_Type type, + unsigned skip_samples, unsigned discard_padding, + int id_in_bitstream) +{ + const IAMFDemuxContext *const c = s->priv_data; + AVStream *st; + int ret, audio_substream_id; + + if (id_in_bitstream) { + unsigned explicit_audio_substream_id; + int64_t pos = avio_tell(s->pb); + explicit_audio_substream_id = ffio_read_leb(s->pb); + len -= avio_tell(s->pb) - pos; + audio_substream_id = explicit_audio_substream_id; + } else + audio_substream_id = type - IAMF_OBU_IA_AUDIO_FRAME_ID0; + + st = find_stream_by_id(s, audio_substream_id); + if (!st) + return AVERROR_INVALIDDATA; + + ret = av_get_packet(s->pb, pkt, len); + if (ret < 0) + return ret; + if (ret != len) + return AVERROR_INVALIDDATA; + + if (skip_samples || discard_padding) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES, 10); + if (!side_data) + 
return AVERROR(ENOMEM); + AV_WL32(side_data, skip_samples); + AV_WL32(side_data + 4, discard_padding); + } + if (c->mix) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, c->mix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->mix, c->mix_size); + } + if (c->demix) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, c->demix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->demix, c->demix_size); + } + if (c->recon) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, c->recon_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->recon, c->recon_size); + } + + pkt->stream_index = st->index; + return 0; +} + +static const IAMFParamDefinition *get_param_definition(AVFormatContext *s, unsigned int parameter_id) +{ + const IAMFDemuxContext *const c = s->priv_data; + const IAMFContext *const iamf = &c->iamf; + const IAMFParamDefinition *param_definition = NULL; + + for (int i = 0; i < iamf->nb_param_definitions; i++) + if (iamf->param_definitions[i]->param->parameter_id == parameter_id) { + param_definition = iamf->param_definitions[i]; + break; + } + + return param_definition; +} + +static int parameter_block_obu(AVFormatContext *s, int len) +{ + IAMFDemuxContext *const c = s->priv_data; + const IAMFParamDefinition *param_definition; + const AVIAMFParamDefinition *param; + AVIAMFParamDefinition *out_param = NULL; + FFIOContext b; + AVIOContext *pb; + uint8_t *buf; + unsigned int duration, constant_subblock_duration; + unsigned int nb_subblocks; + unsigned int parameter_id; + size_t out_param_size; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(s->pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pb = &b.pub; + + parameter_id = 
ffio_read_leb(pb); + param_definition = get_param_definition(s, parameter_id); + if (!param_definition) { + av_log(s, AV_LOG_VERBOSE, "Nonexistent parameter_id %u referenced in a parameter block. Ignoring\n", + parameter_id); + ret = 0; + goto fail; + } + + param = param_definition->param; + if (param->param_definition_mode) { + duration = ffio_read_leb(pb); + constant_subblock_duration = ffio_read_leb(pb); + if (constant_subblock_duration == 0) + nb_subblocks = ffio_read_leb(pb); + else + nb_subblocks = duration / constant_subblock_duration; + } else { + duration = param->duration; + constant_subblock_duration = param->constant_subblock_duration; + nb_subblocks = param->nb_subblocks; + if (!nb_subblocks) + nb_subblocks = duration / constant_subblock_duration; + } + + out_param = av_iamf_param_definition_alloc(param->param_definition_type, nb_subblocks, &out_param_size); + if (!out_param) { + ret = AVERROR(ENOMEM); + goto fail; + } + + out_param->parameter_id = param->parameter_id; + out_param->param_definition_type = param->param_definition_type; + out_param->parameter_rate = param->parameter_rate; + out_param->param_definition_mode = param->param_definition_mode; + out_param->duration = duration; + out_param->constant_subblock_duration = constant_subblock_duration; + out_param->nb_subblocks = nb_subblocks; + + for (int i = 0; i < nb_subblocks; i++) { + void *subblock = av_iamf_param_definition_get_subblock(out_param, i); + unsigned int subblock_duration = constant_subblock_duration; + + if (param->param_definition_mode && !constant_subblock_duration) + subblock_duration = ffio_read_leb(pb); + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + AVIAMFMixGain *mix = subblock; + + mix->animation_type = ffio_read_leb(pb); + if (mix->animation_type > AV_IAMF_ANIMATION_TYPE_BEZIER) { + ret = 0; + av_free(out_param); + goto fail; + } + + mix->start_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + if
(mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + mix->end_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + if (mix->animation_type == AV_IAMF_ANIMATION_TYPE_BEZIER) { + mix->control_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + mix->control_point_relative_time = av_make_q(avio_r8(pb), 1 << 8); + } + mix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + AVIAMFDemixingInfo *demix = subblock; + + demix->dmixp_mode = avio_r8(pb) >> 5; + demix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + AVIAMFReconGain *recon = subblock; + const AVIAMFAudioElement *audio_element = param_definition->audio_element; + + av_assert0(audio_element); + for (int i = 0; i < audio_element->nb_layers; i++) { + const AVIAMFLayer *layer = audio_element->layers[i]; + if (layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN) { + unsigned int recon_gain_flags = ffio_read_leb(pb); + unsigned int bitcount = 7 + 5 * !!(recon_gain_flags & 0x80); + recon_gain_flags = (recon_gain_flags & 0x7F) | ((recon_gain_flags & 0xFF00) >> 1); + for (int j = 0; j < bitcount; j++) { + if (recon_gain_flags & (1 << j)) + recon->recon_gain[i][j] = avio_r8(pb); + } + } + } + recon->subblock_duration = subblock_duration; + break; + } + default: + av_assert0(0); + } + } + + len -= avio_tell(pb); + if (len) { + int level = (s->error_recognition & AV_EF_EXPLODE) ? AV_LOG_ERROR : AV_LOG_WARNING; + av_log(s, level, "Underread in parameter_block_obu. 
%d bytes left at the end\n", len); + } + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + av_free(c->mix); + c->mix = out_param; + c->mix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + av_free(c->demix); + c->demix = out_param; + c->demix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + av_free(c->recon); + c->recon = out_param; + c->recon_size = out_param_size; + break; + default: + av_assert0(0); + } + + ret = 0; +fail: + if (ret < 0) + av_free(out_param); + av_free(buf); + + return ret; +} + +static int iamf_read_packet(AVFormatContext *s, AVPacket *pkt) +{ + IAMFDemuxContext *const c = s->priv_data; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE + AV_INPUT_BUFFER_PADDING_SIZE]; + unsigned obu_size; + int ret; + + while (1) { + enum IAMF_OBU_Type type; + unsigned skip_samples, discard_padding; + int len, size, start_pos; + + if ((ret = ffio_ensure_seekback(s->pb, MAX_IAMF_OBU_HEADER_SIZE)) < 0) + return ret; + size = avio_read(s->pb, header, MAX_IAMF_OBU_HEADER_SIZE); + if (size < 0) + return size; + + len = ff_iamf_parse_obu_header(header, size, &obu_size, &start_pos, &type, + &skip_samples, &discard_padding); + if (len < 0) { + av_log(s, AV_LOG_ERROR, "Failed to read obu\n"); + return len; + } + avio_seek(s->pb, -(size - start_pos), SEEK_CUR); + + if (type >= IAMF_OBU_IA_AUDIO_FRAME && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17) + return audio_frame_obu(s, pkt, obu_size, type, + skip_samples, discard_padding, + type == IAMF_OBU_IA_AUDIO_FRAME); + else if (type == IAMF_OBU_IA_PARAMETER_BLOCK) { + ret = parameter_block_obu(s, obu_size); + if (ret < 0) + return ret; + } else if (type == IAMF_OBU_IA_TEMPORAL_DELIMITER) { + av_freep(&c->mix); + c->mix_size = 0; + av_freep(&c->demix); + c->demix_size = 0; + av_freep(&c->recon); + c->recon_size = 0; + } else { + int64_t offset = avio_skip(s->pb, obu_size); + if (offset < 0) { + ret = offset; + break; + } + } + } + + return 
ret; +} + +//return < 0 if we need more data +static int get_score(const uint8_t *buf, int buf_size, enum IAMF_OBU_Type type, int *seq) +{ + if (type == IAMF_OBU_IA_SEQUENCE_HEADER) { + if (buf_size < 4 || AV_RB32(buf) != MKBETAG('i','a','m','f')) + return 0; + *seq = 1; + return -1; + } + if (type >= IAMF_OBU_IA_CODEC_CONFIG && type <= IAMF_OBU_IA_TEMPORAL_DELIMITER) + return *seq ? -1 : 0; + if (type >= IAMF_OBU_IA_AUDIO_FRAME && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17) + return *seq ? AVPROBE_SCORE_EXTENSION + 1 : 0; + return 0; +} + +static int iamf_probe(const AVProbeData *p) +{ + unsigned obu_size; + enum IAMF_OBU_Type type; + int seq = 0, cnt = 0, start_pos; + int ret; + + while (1) { + int size = ff_iamf_parse_obu_header(p->buf + cnt, p->buf_size - cnt, + &obu_size, &start_pos, &type, + NULL, NULL); + if (size < 0) + return 0; + + ret = get_score(p->buf + cnt + start_pos, + p->buf_size - cnt - start_pos, + type, &seq); + if (ret >= 0) + return ret; + + cnt += FFMIN(size, p->buf_size - cnt); + } + return 0; +} + +static int iamf_read_header(AVFormatContext *s) +{ + IAMFDemuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + int ret; + + ret = ff_iamfdec_read_descriptors(iamf, s->pb, INT_MAX, s); + if (ret < 0) + return ret; + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + IAMFAudioElement *audio_element = iamf->audio_elements[i]; + AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT, NULL); + + if (!stg) + return AVERROR(ENOMEM); + + stg->id = audio_element->audio_element_id; + stg->params.iamf_audio_element = audio_element->element; + audio_element->element = NULL; + + for (int j = 0; j < audio_element->nb_substreams; j++) { + IAMFSubStream *substream = &audio_element->substreams[j]; + AVStream *st = avformat_new_stream(s, NULL); + + if (!st) + return AVERROR(ENOMEM); + + ret = avformat_stream_group_add_stream(stg, st); + if (ret < 0) + return ret; + + ret = 
avcodec_parameters_copy(st->codecpar, substream->codecpar); + if (ret < 0) + return ret; + + st->id = substream->audio_substream_id; + avpriv_set_pts_info(st, 64, 1, st->codecpar->sample_rate); + } + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = iamf->mix_presentations[i]; + AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION, NULL); + const AVIAMFMixPresentation *mix = mix_presentation->mix; + + if (!stg) + return AVERROR(ENOMEM); + + stg->id = mix_presentation->mix_presentation_id; + stg->params.iamf_mix_presentation = mix_presentation->mix; + mix_presentation->mix = NULL; + + for (int j = 0; j < mix->nb_submixes; j++) { + AVIAMFSubmix *sub_mix = mix->submixes[j]; + + for (int k = 0; k < sub_mix->nb_elements; k++) { + AVIAMFSubmixElement *submix_element = sub_mix->elements[k]; + AVStreamGroup *audio_element = NULL; + + for (int l = 0; l < s->nb_stream_groups; l++) + if (s->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT && + s->stream_groups[l]->id == submix_element->audio_element_id) { + audio_element = s->stream_groups[l]; + break; + } + av_assert0(audio_element); + + for (int l = 0; l < audio_element->nb_streams; l++) { + ret = avformat_stream_group_add_stream(stg, audio_element->streams[l]); + if (ret < 0 && ret != AVERROR(EEXIST)) + return ret; + } + } + } + } + + return 0; +} + +static int iamf_read_close(AVFormatContext *s) +{ + IAMFDemuxContext *const c = s->priv_data; + + ff_iamf_uninit_context(&c->iamf); + + av_freep(&c->mix); + c->mix_size = 0; + av_freep(&c->demix); + c->demix_size = 0; + av_freep(&c->recon); + c->recon_size = 0; + + return 0; +} + +const AVInputFormat ff_iamf_demuxer = { + .name = "iamf", + .long_name = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model and Formats"), + .priv_data_size = sizeof(IAMFDemuxContext), + .flags_internal = FF_FMT_INIT_CLEANUP, + .read_probe = iamf_probe, + .read_header = iamf_read_header, 
+ .read_packet = iamf_read_packet, + .read_close = iamf_read_close, + .extensions = "iamf", + .flags = AVFMT_GENERIC_INDEX | AVFMT_NO_BYTE_SEEK | AVFMT_NOTIMESTAMPS | AVFMT_SHOW_IDS, +};
From patchwork Tue Dec 5 22:44:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44939 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 5 Dec 2023 19:44:02 -0300 Message-ID: <20231205224402.14540-9-jamrial@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231205224402.14540-1-jamrial@gmail.com> References: <20231205224402.14540-1-jamrial@gmail.com> Subject: [FFmpeg-devel] [PATCH 8/8] avformat: Immersive Audio Model and Formats muxer
Signed-off-by: James Almer --- libavformat/Makefile | 1 + libavformat/allformats.c | 1 + libavformat/iamf_writer.c | 823 ++++++++++++++++++++++++++++++++++++++ libavformat/iamf_writer.h | 51 +++ libavformat/iamfenc.c | 388 ++++++++++++++++++ 5 files changed, 1264 insertions(+) create mode 100644 libavformat/iamf_writer.c create mode 100644 libavformat/iamf_writer.h create mode 100644 libavformat/iamfenc.c diff --git a/libavformat/Makefile b/libavformat/Makefile index f23c22792b..581e378d95 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -259,6 +259,7 @@ OBJS-$(CONFIG_HLS_DEMUXER) += hls.o hls_sample_encryption.o OBJS-$(CONFIG_HLS_MUXER) += hlsenc.o hlsplaylist.o avc.o OBJS-$(CONFIG_HNM_DEMUXER) += hnm.o OBJS-$(CONFIG_IAMF_DEMUXER) += iamfdec.o iamf_parse.o iamf.o +OBJS-$(CONFIG_IAMF_MUXER) += iamfenc.o iamf_writer.o iamf.o OBJS-$(CONFIG_ICO_DEMUXER) += icodec.o OBJS-$(CONFIG_ICO_MUXER) += icoenc.o OBJS-$(CONFIG_IDCIN_DEMUXER) += idcin.o diff --git a/libavformat/allformats.c b/libavformat/allformats.c index
6e520b78a6..ce6be5f04d 100644 --- a/libavformat/allformats.c +++ b/libavformat/allformats.c @@ -213,6 +213,7 @@ extern const AVInputFormat ff_hls_demuxer; extern const FFOutputFormat ff_hls_muxer; extern const AVInputFormat ff_hnm_demuxer; extern const AVInputFormat ff_iamf_demuxer; +extern const FFOutputFormat ff_iamf_muxer; extern const AVInputFormat ff_ico_demuxer; extern const FFOutputFormat ff_ico_muxer; extern const AVInputFormat ff_idcin_demuxer; diff --git a/libavformat/iamf_writer.c b/libavformat/iamf_writer.c new file mode 100644 index 0000000000..fc31174b53 --- /dev/null +++ b/libavformat/iamf_writer.c @@ -0,0 +1,823 @@ +/* + * Immersive Audio Model and Formats muxing helpers and structs + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/channel_layout.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/iamf.h" +#include "libavutil/mem.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/flac.h" +#include "libavcodec/mpeg4audio.h" +#include "libavcodec/put_bits.h" +#include "avformat.h" +#include "avio_internal.h" +#include "iamf.h" +#include "iamf_writer.h" + + +static int update_extradata(IAMFCodecConfig *codec_config) +{ + GetBitContext gb; + PutBitContext pb; + int ret; + + switch(codec_config->codec_id) { + case AV_CODEC_ID_OPUS: + if (codec_config->extradata_size < 19) + return AVERROR_INVALIDDATA; + codec_config->extradata_size -= 8; + memmove(codec_config->extradata, codec_config->extradata + 8, codec_config->extradata_size); + AV_WB8(codec_config->extradata + 1, 2); // set channels to stereo + break; + case AV_CODEC_ID_FLAC: { + uint8_t buf[13]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codec_config->extradata, codec_config->extradata_size); + if (ret < 0) + return ret; + + put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize + put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize + put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate + skip_bits(&gb, 3); + put_bits(&pb, 3, 1); // set channels to stereo + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codec_config->extradata, buf, sizeof(buf)); + break; + } + default: + break; + } + + return 0; +} + +static int fill_codec_config(IAMFContext *iamf, const AVStreamGroup *stg, + IAMFCodecConfig *codec_config) +{ + const AVStream *st = stg->streams[0]; + IAMFCodecConfig **tmp; + int j, ret = 0; + + codec_config->codec_id = st->codecpar->codec_id; + codec_config->sample_rate = 
st->codecpar->sample_rate; + codec_config->codec_tag = st->codecpar->codec_tag; + codec_config->nb_samples = st->codecpar->frame_size; + codec_config->seek_preroll = st->codecpar->seek_preroll; + if (st->codecpar->extradata_size) { + codec_config->extradata = av_memdup(st->codecpar->extradata, st->codecpar->extradata_size); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + codec_config->extradata_size = st->codecpar->extradata_size; + ret = update_extradata(codec_config); + if (ret < 0) + goto fail; + } + + for (j = 0; j < iamf->nb_codec_configs; j++) { + if (!memcmp(iamf->codec_configs[j], codec_config, offsetof(IAMFCodecConfig, extradata)) && + (!codec_config->extradata_size || !memcmp(iamf->codec_configs[j]->extradata, + codec_config->extradata, codec_config->extradata_size))) + break; + } + + if (j < iamf->nb_codec_configs) { + av_free(iamf->codec_configs[j]->extradata); + av_free(iamf->codec_configs[j]); + iamf->codec_configs[j] = codec_config; + return j; + } + + tmp = av_realloc_array(iamf->codec_configs, iamf->nb_codec_configs + 1, sizeof(*iamf->codec_configs)); + if (!tmp) { + ret = AVERROR(ENOMEM); + goto fail; + } + + iamf->codec_configs = tmp; + iamf->codec_configs[iamf->nb_codec_configs] = codec_config; + codec_config->codec_config_id = iamf->nb_codec_configs; + + return iamf->nb_codec_configs++; + +fail: + av_freep(&codec_config->extradata); + return ret; +} + +static IAMFParamDefinition *add_param_definition(IAMFContext *iamf, AVIAMFParamDefinition *param) +{ + IAMFParamDefinition **tmp, *param_definition; + + tmp = av_realloc_array(iamf->param_definitions, iamf->nb_param_definitions + 1, + sizeof(*iamf->param_definitions)); + if (!tmp) + return NULL; + + iamf->param_definitions = tmp; + + param_definition = av_mallocz(sizeof(*param_definition)); + if (!param_definition) + return NULL; + + param_definition->param = param; + iamf->param_definitions[iamf->nb_param_definitions++] = param_definition; + + return param_definition; +} + +int 
ff_iamf_add_audio_element(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx) +{ + const AVIAMFAudioElement *iamf_audio_element; + IAMFAudioElement **tmp, *audio_element; + IAMFCodecConfig *codec_config; + int ret; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + return AVERROR(EINVAL); + + iamf_audio_element = stg->params.iamf_audio_element; + if (iamf_audio_element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) { + const AVIAMFLayer *layer = iamf_audio_element->layers[0]; + if (iamf_audio_element->nb_layers != 1) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid amount of layers for SCENE_BASED audio element. Must be 1\n"); + return AVERROR(EINVAL); + } + if (layer->ch_layout.order != AV_CHANNEL_ORDER_CUSTOM && + layer->ch_layout.order != AV_CHANNEL_ORDER_AMBISONIC) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid channel layout for SCENE_BASED audio element\n"); + return AVERROR(EINVAL); + } + if (layer->ambisonics_mode >= AV_IAMF_AMBISONICS_MODE_PROJECTION) { + av_log(log_ctx, AV_LOG_ERROR, "Unsupported ambisonics mode %d\n", layer->ambisonics_mode); + return AVERROR_PATCHWELCOME; + } + for (int i = 0; i < stg->nb_streams; i++) { + if (stg->streams[i]->codecpar->ch_layout.nb_channels > 1) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid amount of channels in a stream for MONO mode ambisonics\n"); + return AVERROR(EINVAL); + } + } + } else + for (int j, i = 0; i < iamf_audio_element->nb_layers; i++) { + const AVIAMFLayer *layer = iamf_audio_element->layers[i]; + for (j = 0; j < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); j++) + if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[j])) + break; + + if (j >= FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts)) { + av_log(log_ctx, AV_LOG_ERROR, "Unsupported channel layout in stream group #%d\n", i); + return AVERROR(EINVAL); + } + } + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + if (stg->id == iamf->audio_elements[i]->audio_element_id) { + av_log(log_ctx, AV_LOG_ERROR, 
"Duplicated Audio Element id %"PRId64"\n", stg->id); + return AVERROR(EINVAL); + } + } + + codec_config = av_mallocz(sizeof(*codec_config)); + if (!codec_config) + return AVERROR(ENOMEM); + + ret = fill_codec_config(iamf, stg, codec_config); + if (ret < 0) { + av_free(codec_config); + return ret; + } + + audio_element = av_mallocz(sizeof(*audio_element)); + if (!audio_element) + return AVERROR(ENOMEM); + + audio_element->element = stg->params.iamf_audio_element; + audio_element->audio_element_id = stg->id; + audio_element->codec_config_id = ret; + + audio_element->substreams = av_calloc(stg->nb_streams, sizeof(*audio_element->substreams)); + if (!audio_element->substreams) + return AVERROR(ENOMEM); + audio_element->nb_substreams = stg->nb_streams; + + audio_element->layers = av_calloc(iamf_audio_element->nb_layers, sizeof(*audio_element->layers)); + if (!audio_element->layers) + return AVERROR(ENOMEM); + + for (int i = 0, j = 0; i < iamf_audio_element->nb_layers; i++) { + int nb_channels = iamf_audio_element->layers[i]->ch_layout.nb_channels; + + IAMFLayer *layer = &audio_element->layers[i]; + if (!layer) + return AVERROR(ENOMEM); + memset(layer, 0, sizeof(*layer)); + + if (i) + nb_channels -= iamf_audio_element->layers[i - 1]->ch_layout.nb_channels; + for (; nb_channels > 0 && j < stg->nb_streams; j++) { + const AVStream *st = stg->streams[j]; + IAMFSubStream *substream = &audio_element->substreams[j]; + + substream->audio_substream_id = st->id; + layer->substream_count++; + layer->coupled_substream_count += st->codecpar->ch_layout.nb_channels == 2; + nb_channels -= st->codecpar->ch_layout.nb_channels; + } + if (nb_channels) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid channel count across substreams in layer %u from stream group %u\n", + i, stg->index); + return AVERROR(EINVAL); + } + } + + if (iamf_audio_element->demixing_info) { + AVIAMFParamDefinition *param = iamf_audio_element->demixing_info; + IAMFParamDefinition *param_definition = 
ff_iamf_get_param_definition(iamf, param->parameter_id); + + if (param->nb_subblocks != 1) { + av_log(log_ctx, AV_LOG_ERROR, "nb_subblocks in demixing_info for stream group %u is not 1\n", stg->index); + return AVERROR(EINVAL); + } + if (!param_definition) { + param_definition = add_param_definition(iamf, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + param_definition->audio_element = iamf_audio_element; + } + if (iamf_audio_element->recon_gain_info) { + AVIAMFParamDefinition *param = iamf_audio_element->recon_gain_info; + IAMFParamDefinition *param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + + if (param->nb_subblocks != 1) { + av_log(log_ctx, AV_LOG_ERROR, "nb_subblocks in recon_gain_info for stream group %u is not 1\n", stg->index); + return AVERROR(EINVAL); + } + + if (!param_definition) { + param_definition = add_param_definition(iamf, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + param_definition->audio_element = iamf_audio_element; + } + + tmp = av_realloc_array(iamf->audio_elements, iamf->nb_audio_elements + 1, sizeof(*iamf->audio_elements)); + if (!tmp) + return AVERROR(ENOMEM); + + iamf->audio_elements = tmp; + iamf->audio_elements[iamf->nb_audio_elements++] = audio_element; + + return 0; +} + +int ff_iamf_add_mix_presentation(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx) +{ + IAMFMixPresentation **tmp, *mix_presentation; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + return AVERROR(EINVAL); + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + if (stg->id == iamf->mix_presentations[i]->mix_presentation_id) { + av_log(log_ctx, AV_LOG_ERROR, "Duplicate Mix Presentation id %"PRId64"\n", stg->id); + return AVERROR(EINVAL); + } + } + + mix_presentation = av_mallocz(sizeof(*mix_presentation)); + if (!mix_presentation) + return AVERROR(ENOMEM); + + mix_presentation->mix = stg->params.iamf_mix_presentation; + mix_presentation->mix_presentation_id = stg->id; 
+ + for (int i = 0; i < mix_presentation->mix->nb_submixes; i++) { + const AVIAMFSubmix *submix = mix_presentation->mix->submixes[i]; + AVIAMFParamDefinition *param = submix->output_mix_config; + IAMFParamDefinition *param_definition; + + if (!param) { + av_log(log_ctx, AV_LOG_ERROR, "output_mix_config is not present in submix %u from " + "Mix Presentation ID %"PRId64"\n", i, stg->id); + return AVERROR(EINVAL); + } + + param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + if (!param_definition) { + param_definition = add_param_definition(iamf, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + + for (int j = 0; j < submix->nb_elements; j++) { + const AVIAMFAudioElement *iamf_audio_element = NULL; + const AVIAMFSubmixElement *element = submix->elements[j]; + param = element->element_mix_config; + + if (!param) { + av_log(log_ctx, AV_LOG_ERROR, "element_mix_config is not present for element %u in submix %u from " + "Mix Presentation ID %"PRId64"\n", j, i, stg->id); + return AVERROR(EINVAL); + } + param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + if (!param_definition) { + param_definition = add_param_definition(iamf, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + for (int k = 0; k < iamf->nb_audio_elements; k++) + if (iamf->audio_elements[k]->audio_element_id == element->audio_element_id) { + iamf_audio_element = iamf->audio_elements[k]->element; + break; + } + param_definition->audio_element = iamf_audio_element; + } + } + + tmp = av_realloc_array(iamf->mix_presentations, iamf->nb_mix_presentations + 1, sizeof(*iamf->mix_presentations)); + if (!tmp) + return AVERROR(ENOMEM); + + iamf->mix_presentations = tmp; + iamf->mix_presentations[iamf->nb_mix_presentations++] = mix_presentation; + + return 0; +} + +static int iamf_write_codec_config(const IAMFContext *iamf, + const IAMFCodecConfig *codec_config, + AVIOContext *pb) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + AVIOContext 
*dyn_bc; + uint8_t *dyn_buf = NULL; + PutBitContext pbc; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, codec_config->codec_config_id); + avio_wl32(dyn_bc, codec_config->codec_tag); + + ffio_write_leb(dyn_bc, codec_config->nb_samples); + avio_wb16(dyn_bc, codec_config->seek_preroll); + + switch(codec_config->codec_id) { + case AV_CODEC_ID_OPUS: + avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size); + break; + case AV_CODEC_ID_AAC: + return AVERROR_PATCHWELCOME; + case AV_CODEC_ID_FLAC: + avio_w8(dyn_bc, 0x80); + avio_wb24(dyn_bc, codec_config->extradata_size); + avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size); + break; + case AV_CODEC_ID_PCM_S16LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 16); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S24LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 24); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S32LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 32); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S16BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 16); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S24BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 24); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S32BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 32); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + default: + break; + } + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 5, IAMF_OBU_IA_CODEC_CONFIG); + put_bits(&pbc, 3, 0); + flush_put_bits(&pbc); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(pb, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(pb, dyn_size); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static inline int rescale_rational(AVRational q, int b) +{ + return 
av_clip_int16(av_rescale(q.num, b, q.den)); +} + +static int scalable_channel_layout_config(const IAMFAudioElement *audio_element, + AVIOContext *dyn_bc) +{ + const AVIAMFAudioElement *element = audio_element->element; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pb; + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 3, element->nb_layers); + put_bits(&pb, 5, 0); + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + for (int i = 0; i < element->nb_layers; i++) { + AVIAMFLayer *layer = element->layers[i]; + int layout; + for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); layout++) { + if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[layout])) + break; + } + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 4, layout); + put_bits(&pb, 1, !!layer->output_gain_flags); + put_bits(&pb, 1, !!(layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN)); + put_bits(&pb, 2, 0); // reserved + put_bits(&pb, 8, audio_element->layers[i].substream_count); + put_bits(&pb, 8, audio_element->layers[i].coupled_substream_count); + if (layer->output_gain_flags) { + put_bits(&pb, 6, layer->output_gain_flags); + put_bits(&pb, 2, 0); + put_bits(&pb, 16, rescale_rational(layer->output_gain, 1 << 8)); + } + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + } + + return 0; +} + +static int ambisonics_config(const IAMFAudioElement *audio_element, + AVIOContext *dyn_bc) +{ + const AVIAMFAudioElement *element = audio_element->element; + AVIAMFLayer *layer = element->layers[0]; + + ffio_write_leb(dyn_bc, 0); // ambisonics_mode + ffio_write_leb(dyn_bc, layer->ch_layout.nb_channels); // output_channel_count + ffio_write_leb(dyn_bc, audio_element->nb_substreams); // substream_count + + if (layer->ch_layout.order == AV_CHANNEL_ORDER_AMBISONIC) + for (int i = 0; i < layer->ch_layout.nb_channels; i++) + avio_w8(dyn_bc, i); + else + for (int i = 0; i < 
layer->ch_layout.nb_channels; i++) + avio_w8(dyn_bc, layer->ch_layout.u.map[i].id); + + return 0; +} + +static int param_definition(const AVIAMFParamDefinition *param, + AVIOContext *dyn_bc) +{ + ffio_write_leb(dyn_bc, param->parameter_id); + ffio_write_leb(dyn_bc, param->parameter_rate); + avio_w8(dyn_bc, !!param->param_definition_mode << 7); + if (!param->param_definition_mode) { + ffio_write_leb(dyn_bc, param->duration); + ffio_write_leb(dyn_bc, param->constant_subblock_duration); + if (param->constant_subblock_duration == 0) { + ffio_write_leb(dyn_bc, param->nb_subblocks); + for (int i = 0; i < param->nb_subblocks; i++) { + const void *subblock = av_iamf_param_definition_get_subblock(param, i); + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + const AVIAMFMixGain *mix = subblock; + ffio_write_leb(dyn_bc, mix->subblock_duration); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + const AVIAMFDemixingInfo *demix = subblock; + ffio_write_leb(dyn_bc, demix->subblock_duration); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + const AVIAMFReconGain *recon = subblock; + ffio_write_leb(dyn_bc, recon->subblock_duration); + break; + } + } + } + } + } + + return 0; +} + +static int iamf_write_audio_element(const IAMFContext *iamf, + const IAMFAudioElement *audio_element, + AVIOContext *pb, void *log_ctx) +{ + const AVIAMFAudioElement *element = audio_element->element; + const IAMFCodecConfig *codec_config = iamf->codec_configs[audio_element->codec_config_id]; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + PutBitContext pbc; + int param_definition_types = AV_IAMF_PARAMETER_DEFINITION_DEMIXING, dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, audio_element->audio_element_id); + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 3, element->audio_element_type); + put_bits(&pbc, 5, 0); 
+ flush_put_bits(&pbc); + avio_write(dyn_bc, header, put_bytes_count(&pbc, 1)); + + ffio_write_leb(dyn_bc, audio_element->codec_config_id); + ffio_write_leb(dyn_bc, audio_element->nb_substreams); + + for (int i = 0; i < audio_element->nb_substreams; i++) + ffio_write_leb(dyn_bc, audio_element->substreams[i].audio_substream_id); + + if (element->nb_layers == 1) + param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_DEMIXING; + if (element->nb_layers > 1) + param_definition_types |= AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN; + if (codec_config->codec_tag == MKTAG('f','L','a','C') || + codec_config->codec_tag == MKTAG('i','p','c','m')) + param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN; + + ffio_write_leb(dyn_bc, av_popcount(param_definition_types)); // num_parameters + + if (param_definition_types & 1) { + const AVIAMFParamDefinition *param = element->demixing_info; + const AVIAMFDemixingInfo *demix; + + if (!param) { + av_log(log_ctx, AV_LOG_ERROR, "demixing_info needed but not set in Stream Group #%u\n", + audio_element->audio_element_id); + return AVERROR(EINVAL); + } + + demix = av_iamf_param_definition_get_subblock(param, 0); + ffio_write_leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_DEMIXING); // param_definition_type + param_definition(param, dyn_bc); + + avio_w8(dyn_bc, demix->dmixp_mode << 5); // dmixp_mode + avio_w8(dyn_bc, element->default_w << 4); // default_w + } + if (param_definition_types & 2) { + const AVIAMFParamDefinition *param = element->recon_gain_info; + + if (!param) { + av_log(log_ctx, AV_LOG_ERROR, "recon_gain_info needed but not set in Stream Group #%u\n", + audio_element->audio_element_id); + return AVERROR(EINVAL); + } + ffio_write_leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN); // param_definition_type + param_definition(param, dyn_bc); + } + + if (element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(audio_element, dyn_bc); + if (ret < 0) + return ret; + } else { 
+ ret = ambisonics_config(audio_element, dyn_bc); + if (ret < 0) + return ret; + } + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 5, IAMF_OBU_IA_AUDIO_ELEMENT); + put_bits(&pbc, 3, 0); + flush_put_bits(&pbc); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(pb, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(pb, dyn_size); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static int iamf_write_mixing_presentation(const IAMFContext *iamf, + const IAMFMixPresentation *mix_presentation, + AVIOContext *pb, void *log_ctx) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + const AVIAMFMixPresentation *mix = mix_presentation->mix; + const AVDictionaryEntry *tag = NULL; + PutBitContext pbc; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, mix_presentation->mix_presentation_id); // mix_presentation_id + ffio_write_leb(dyn_bc, av_dict_count(mix->annotations)); // count_label + + while ((tag = av_dict_iterate(mix->annotations, tag))) + avio_put_str(dyn_bc, tag->key); + while ((tag = av_dict_iterate(mix->annotations, tag))) + avio_put_str(dyn_bc, tag->value); + + ffio_write_leb(dyn_bc, mix->nb_submixes); + for (int i = 0; i < mix->nb_submixes; i++) { + const AVIAMFSubmix *sub_mix = mix->submixes[i]; + + ffio_write_leb(dyn_bc, sub_mix->nb_elements); + for (int j = 0; j < sub_mix->nb_elements; j++) { + const IAMFAudioElement *audio_element = NULL; + const AVIAMFSubmixElement *submix_element = sub_mix->elements[j]; + + for (int k = 0; k < iamf->nb_audio_elements; k++) + if (iamf->audio_elements[k]->audio_element_id == submix_element->audio_element_id) { + audio_element = iamf->audio_elements[k]; + break; + } + + av_assert0(audio_element); + ffio_write_leb(dyn_bc, submix_element->audio_element_id); + + if (av_dict_count(submix_element->annotations) != av_dict_count(mix->annotations)) { + av_log(log_ctx, 
AV_LOG_ERROR, "Inconsistent amount of labels in submix %d from Mix Presentation id #%u\n", + j, audio_element->audio_element_id); + return AVERROR(EINVAL); + } + while ((tag = av_dict_iterate(submix_element->annotations, tag))) + avio_put_str(dyn_bc, tag->value); + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 2, submix_element->headphones_rendering_mode); + put_bits(&pbc, 6, 0); // reserved + flush_put_bits(&pbc); + avio_write(dyn_bc, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(dyn_bc, 0); // rendering_config_extension_size + param_definition(submix_element->element_mix_config, dyn_bc); + avio_wb16(dyn_bc, rescale_rational(submix_element->default_mix_gain, 1 << 8)); + } + param_definition(sub_mix->output_mix_config, dyn_bc); + avio_wb16(dyn_bc, rescale_rational(sub_mix->default_mix_gain, 1 << 8)); + + ffio_write_leb(dyn_bc, sub_mix->nb_layouts); // nb_layouts + for (int i = 0; i < sub_mix->nb_layouts; i++) { + const AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[i]; + int layout, info_type; + int dialogue = submix_layout->dialogue_anchored_loudness.num && + submix_layout->dialogue_anchored_loudness.den; + int album = submix_layout->album_anchored_loudness.num && + submix_layout->album_anchored_loudness.den; + + if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) { + for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_sound_system_map); layout++) { + if (!av_channel_layout_compare(&submix_layout->sound_system, &ff_iamf_sound_system_map[layout].layout)) + break; + } + if (layout == FF_ARRAY_ELEMS(ff_iamf_sound_system_map)) { + av_log(log_ctx, AV_LOG_ERROR, "Invalid Sound System value in a submix\n"); + return AVERROR(EINVAL); + } + } + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 2, submix_layout->layout_type); // layout_type + if 
(submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) { + put_bits(&pbc, 4, ff_iamf_sound_system_map[layout].id); // sound_system + put_bits(&pbc, 2, 0); // reserved + } else + put_bits(&pbc, 6, 0); // reserved + flush_put_bits(&pbc); + avio_write(dyn_bc, header, put_bytes_count(&pbc, 1)); + + info_type = (submix_layout->true_peak.num && submix_layout->true_peak.den); + info_type |= (dialogue || album) << 1; + avio_w8(dyn_bc, info_type); + avio_wb16(dyn_bc, rescale_rational(submix_layout->integrated_loudness, 1 << 8)); + avio_wb16(dyn_bc, rescale_rational(submix_layout->digital_peak, 1 << 8)); + if (info_type & 1) + avio_wb16(dyn_bc, rescale_rational(submix_layout->true_peak, 1 << 8)); + if (info_type & 2) { + avio_w8(dyn_bc, dialogue + album); // num_anchored_loudness + if (dialogue) { + avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_DIALOGUE); + avio_wb16(dyn_bc, rescale_rational(submix_layout->dialogue_anchored_loudness, 1 << 8)); + } + if (album) { + avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_ALBUM); + avio_wb16(dyn_bc, rescale_rational(submix_layout->album_anchored_loudness, 1 << 8)); + } + } + } + } + + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 5, IAMF_OBU_IA_MIX_PRESENTATION); + put_bits(&pbc, 3, 0); + flush_put_bits(&pbc); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(pb, header, put_bytes_count(&pbc, 1)); + ffio_write_leb(pb, dyn_size); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +int ff_iamf_write_descriptors(const IAMFContext *iamf, AVIOContext *pb, void *log_ctx) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pbc; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + // Sequence Header + init_put_bits(&pbc, header, sizeof(header)); + put_bits(&pbc, 5, IAMF_OBU_IA_SEQUENCE_HEADER); + put_bits(&pbc, 3, 0); + flush_put_bits(&pbc); + + avio_write(dyn_bc, header, put_bytes_count(&pbc, 
1)); + ffio_write_leb(dyn_bc, 6); + avio_wb32(dyn_bc, MKBETAG('i','a','m','f')); + avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // primary_profile + avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // additional_profile + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + for (int i = 0; i < iamf->nb_codec_configs; i++) { + ret = iamf_write_codec_config(iamf, iamf->codec_configs[i], pb); + if (ret < 0) + return ret; + } + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + ret = iamf_write_audio_element(iamf, iamf->audio_elements[i], pb, log_ctx); + if (ret < 0) + return ret; + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + ret = iamf_write_mixing_presentation(iamf, iamf->mix_presentations[i], pb, log_ctx); + if (ret < 0) + return ret; + } + + return 0; +} diff --git a/libavformat/iamf_writer.h b/libavformat/iamf_writer.h new file mode 100644 index 0000000000..93354670b8 --- /dev/null +++ b/libavformat/iamf_writer.h @@ -0,0 +1,51 @@ +/* + * Immersive Audio Model and Formats muxing helpers and structs + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFORMAT_IAMF_WRITER_H +#define AVFORMAT_IAMF_WRITER_H + +#include + +#include "libavutil/common.h" +#include "avformat.h" +#include "avio.h" +#include "iamf.h" + +static inline IAMFParamDefinition *ff_iamf_get_param_definition(const IAMFContext *iamf, + unsigned int parameter_id) +{ + IAMFParamDefinition *param_definition = NULL; + + for (int i = 0; i < iamf->nb_param_definitions; i++) + if (iamf->param_definitions[i]->param->parameter_id == parameter_id) { + param_definition = iamf->param_definitions[i]; + break; + } + + return param_definition; +} + +int ff_iamf_add_audio_element(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx); +int ff_iamf_add_mix_presentation(IAMFContext *iamf, const AVStreamGroup *stg, void *log_ctx); + +int ff_iamf_write_descriptors(const IAMFContext *iamf, AVIOContext *pb, void *log_ctx); + +#endif /* AVFORMAT_IAMF_WRITER_H */ diff --git a/libavformat/iamfenc.c b/libavformat/iamfenc.c new file mode 100644 index 0000000000..1dbb8b21d4 --- /dev/null +++ b/libavformat/iamfenc.c @@ -0,0 +1,388 @@ +/* + * IAMF muxer + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/avassert.h" +#include "libavutil/common.h" +#include "libavutil/iamf.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/put_bits.h" +#include "avformat.h" +#include "avio_internal.h" +#include "iamf.h" +#include "iamf_writer.h" +#include "internal.h" +#include "mux.h" + +typedef struct IAMFMuxContext { + IAMFContext iamf; + + int first_stream_id; +} IAMFMuxContext; + +static int iamf_init(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + int nb_audio_elements = 0, nb_mix_presentations = 0; + int ret; + + if (!s->nb_streams) { + av_log(s, AV_LOG_ERROR, "There must be at least one stream\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < s->nb_streams; i++) { + if (s->streams[i]->codecpar->codec_type != AVMEDIA_TYPE_AUDIO || + (s->streams[i]->codecpar->codec_tag != MKTAG('m','p','4','a') && + s->streams[i]->codecpar->codec_tag != MKTAG('O','p','u','s') && + s->streams[i]->codecpar->codec_tag != MKTAG('f','L','a','C') && + s->streams[i]->codecpar->codec_tag != MKTAG('i','p','c','m'))) { + av_log(s, AV_LOG_ERROR, "Unsupported codec id %s\n", + avcodec_get_name(s->streams[i]->codecpar->codec_id)); + return AVERROR(EINVAL); + } + + if (s->streams[i]->codecpar->ch_layout.nb_channels > 2) { + av_log(s, AV_LOG_ERROR, "Unsupported channel layout on stream #%d\n", i); + return AVERROR(EINVAL); + } + + for (int j = 0; j < i; j++) { + if (s->streams[i]->id == s->streams[j]->id) { + av_log(s, AV_LOG_ERROR, "Duplicated stream id %d\n", s->streams[j]->id); + return AVERROR(EINVAL); + } + } + } + + if (!s->nb_stream_groups) { + av_log(s, AV_LOG_ERROR, "There must be at least two stream groups\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < 
s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + nb_audio_elements++; + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + nb_mix_presentations++; + } + if ((nb_audio_elements < 1 || nb_audio_elements > 2) || nb_mix_presentations < 1) { + av_log(s, AV_LOG_ERROR, "There must be >= 1 and <= 2 IAMF_AUDIO_ELEMENT and at least " + "one IAMF_MIX_PRESENTATION stream groups\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + continue; + + ret = ff_iamf_add_audio_element(iamf, stg, s); + if (ret < 0) + return ret; + } + + for (int i = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + continue; + + ret = ff_iamf_add_mix_presentation(iamf, stg, s); + if (ret < 0) + return ret; + } + + c->first_stream_id = s->streams[0]->id; + + return 0; +} + +static int iamf_write_header(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + int ret; + + ret = ff_iamf_write_descriptors(iamf, s->pb, s); + if (ret < 0) + return ret; + + c->first_stream_id = s->streams[0]->id; + + return 0; +} + +static inline int rescale_rational(AVRational q, int b) +{ + return av_clip_int16(av_rescale(q.num, b, q.den)); +} + +static int write_parameter_block(AVFormatContext *s, const AVIAMFParamDefinition *param) +{ + const IAMFMuxContext *const c = s->priv_data; + const IAMFContext *const iamf = &c->iamf; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + IAMFParamDefinition *param_definition = ff_iamf_get_param_definition(iamf, param->parameter_id); + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size, ret; + + if (param->param_definition_type > 
AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { + av_log(s, AV_LOG_DEBUG, "Ignoring side data with unknown param_definition_type %u\n", + param->param_definition_type); + return 0; + } + + if (!param_definition) { + av_log(s, AV_LOG_ERROR, "Non-existent Parameter Definition with ID %u referenced by a packet\n", + param->parameter_id); + return AVERROR(EINVAL); + } + + if (param->param_definition_type != param_definition->param->param_definition_type || + param->param_definition_mode != param_definition->param->param_definition_mode) { + av_log(s, AV_LOG_ERROR, "Inconsistent param_definition_mode or param_definition_type values " + "for Parameter Definition with ID %u in a packet\n", + param->parameter_id); + return AVERROR(EINVAL); + } + + ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + // Parameter Block OBU header + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_PARAMETER_BLOCK); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + + ffio_write_leb(dyn_bc, param->parameter_id); + if (param->param_definition_mode) { + ffio_write_leb(dyn_bc, param->duration); + ffio_write_leb(dyn_bc, param->constant_subblock_duration); + if (param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, param->nb_subblocks); + } + + for (int i = 0; i < param->nb_subblocks; i++) { + const void *subblock = av_iamf_param_definition_get_subblock(param, i); + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + const AVIAMFMixGain *mix = subblock; + if (param->param_definition_mode && param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, mix->subblock_duration); + + ffio_write_leb(dyn_bc, mix->animation_type); + + avio_wb16(dyn_bc, rescale_rational(mix->start_point_value, 1 << 8)); + if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + avio_wb16(dyn_bc, rescale_rational(mix->end_point_value, 1 << 8)); + if (mix->animation_type == 
AV_IAMF_ANIMATION_TYPE_BEZIER) {
+                avio_wb16(dyn_bc, rescale_rational(mix->control_point_value, 1 << 8));
+                avio_w8(dyn_bc, av_clip_uint8(av_rescale(mix->control_point_relative_time.num, 1 << 8,
+                                                         mix->control_point_relative_time.den)));
+            }
+            break;
+        }
+        case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: {
+            const AVIAMFDemixingInfo *demix = subblock;
+            if (param->param_definition_mode && param->constant_subblock_duration == 0)
+                ffio_write_leb(dyn_bc, demix->subblock_duration);
+
+            avio_w8(dyn_bc, demix->dmixp_mode << 5);
+            break;
+        }
+        case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: {
+            const AVIAMFReconGain *recon = subblock;
+            const AVIAMFAudioElement *audio_element = param_definition->audio_element;
+
+            if (param->param_definition_mode && param->constant_subblock_duration == 0)
+                ffio_write_leb(dyn_bc, recon->subblock_duration);
+
+            if (!audio_element) {
+                av_log(s, AV_LOG_ERROR, "Invalid Parameter Definition with ID %u referenced by a packet\n", param->parameter_id);
+                return AVERROR(EINVAL);
+            }
+
+            for (int j = 0; j < audio_element->nb_layers; j++) {
+                const AVIAMFLayer *layer = audio_element->layers[j];
+
+                if (layer->flags & AV_IAMF_LAYER_FLAG_RECON_GAIN) {
+                    unsigned int recon_gain_flags = 0;
+                    int k = 0;
+
+                    for (; k < 7; k++)
+                        recon_gain_flags |= (1 << k) * !!recon->recon_gain[j][k];
+                    for (; k < 12; k++)
+                        recon_gain_flags |= (2 << k) * !!recon->recon_gain[j][k];
+                    if (recon_gain_flags >> 8)
+                        recon_gain_flags |= (1 << k);
+
+                    ffio_write_leb(dyn_bc, recon_gain_flags);
+                    for (k = 0; k < 12; k++) {
+                        if (recon->recon_gain[j][k])
+                            avio_w8(dyn_bc, recon->recon_gain[j][k]);
+                    }
+                }
+            }
+            break;
+        }
+        default:
+            av_assert0(0);
+        }
+    }
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    ffio_write_leb(s->pb, dyn_size);
+    avio_write(s->pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+
+    return 0;
+}
+
+static int iamf_write_packet(AVFormatContext *s, AVPacket *pkt)
+{
+    const IAMFMuxContext *const c = s->priv_data;
+    AVStream *st = s->streams[pkt->stream_index];
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    PutBitContext pb;
+    AVIOContext *dyn_bc;
+    uint8_t *side_data, *dyn_buf = NULL;
+    unsigned int skip_samples = 0, discard_padding = 0;
+    size_t side_data_size;
+    int dyn_size, type = st->id <= 17 ? st->id + IAMF_OBU_IA_AUDIO_FRAME_ID0 : IAMF_OBU_IA_AUDIO_FRAME;
+    int ret;
+
+    if (s->nb_stream_groups && st->id == c->first_stream_id) {
+        AVIAMFParamDefinition *mix =
+            (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, NULL);
+        AVIAMFParamDefinition *demix =
+            (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, NULL);
+        AVIAMFParamDefinition *recon =
+            (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, NULL);
+
+        if (mix) {
+            ret = write_parameter_block(s, mix);
+            if (ret < 0)
+                return ret;
+        }
+        if (demix) {
+            ret = write_parameter_block(s, demix);
+            if (ret < 0)
+                return ret;
+        }
+        if (recon) {
+            ret = write_parameter_block(s, recon);
+            if (ret < 0)
+                return ret;
+        }
+    }
+    side_data = av_packet_get_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES,
+                                        &side_data_size);
+
+    if (side_data && side_data_size >= 10) {
+        skip_samples = AV_RL32(side_data);
+        discard_padding = AV_RL32(side_data + 4);
+    }
+
+    ret = avio_open_dyn_buf(&dyn_bc);
+    if (ret < 0)
+        return ret;
+
+    init_put_bits(&pb, header, sizeof(header));
+    put_bits(&pb, 5, type);
+    put_bits(&pb, 1, 0); // obu_redundant_copy
+    put_bits(&pb, 1, skip_samples || discard_padding);
+    put_bits(&pb, 1, 0); // obu_extension_flag
+    flush_put_bits(&pb);
+    avio_write(s->pb, header, put_bytes_count(&pb, 1));
+
+    if (skip_samples || discard_padding) {
+        ffio_write_leb(dyn_bc, discard_padding);
+        ffio_write_leb(dyn_bc, skip_samples);
+    }
+
+    if (st->id > 17)
+        ffio_write_leb(dyn_bc, st->id);
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    ffio_write_leb(s->pb, dyn_size + pkt->size);
+    avio_write(s->pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+    avio_write(s->pb, pkt->data, pkt->size);
+
+    return 0;
+}
+
+static void iamf_deinit(AVFormatContext *s)
+{
+    IAMFMuxContext *const c = s->priv_data;
+    IAMFContext *const iamf = &c->iamf;
+
+    for (int i = 0; i < iamf->nb_audio_elements; i++) {
+        IAMFAudioElement *audio_element = iamf->audio_elements[i];
+        audio_element->element = NULL;
+    }
+
+    for (int i = 0; i < iamf->nb_mix_presentations; i++) {
+        IAMFMixPresentation *mix_presentation = iamf->mix_presentations[i];
+        mix_presentation->mix = NULL;
+    }
+
+    ff_iamf_uninit_context(iamf);
+
+    return;
+}
+
+static const AVCodecTag iamf_codec_tags[] = {
+    { AV_CODEC_ID_AAC, MKTAG('m','p','4','a') },
+    { AV_CODEC_ID_FLAC, MKTAG('f','L','a','C') },
+    { AV_CODEC_ID_OPUS, MKTAG('O','p','u','s') },
+    { AV_CODEC_ID_PCM_S16LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S16BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S24LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S24BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S32LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S32BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_NONE, MKTAG('i','p','c','m') }
+};
+
+const FFOutputFormat ff_iamf_muxer = {
+    .p.name = "iamf",
+    .p.long_name = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model and Formats"),
+    .p.extensions = "iamf",
+    .priv_data_size = sizeof(IAMFMuxContext),
+    .p.audio_codec = AV_CODEC_ID_OPUS,
+    .init = iamf_init,
+    .deinit = iamf_deinit,
+    .write_header = iamf_write_header,
+    .write_packet = iamf_write_packet,
+    .p.codec_tag = (const AVCodecTag* const []){ iamf_codec_tags, NULL },
+    .p.flags = AVFMT_GLOBALHEADER | AVFMT_NOTIMESTAMPS,
+};