From patchwork Sun Nov 26 01:28:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44802 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Sat, 25 Nov 2023 22:28:58 -0300 Message-ID: <20231126012858.40388-10-jamrial@gmail.com> X-Mailer: git-send-email 2.42.1 In-Reply-To: <20231126012858.40388-1-jamrial@gmail.com> References: <20231126012858.40388-1-jamrial@gmail.com> Subject: [FFmpeg-devel] [PATCH 9/9] avformat: Immersive Audio Model and Formats muxer Signed-off-by: James Almer --- libavformat/Makefile | 1 + libavformat/allformats.c | 1 + libavformat/iamfenc.c | 1091 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 1093 insertions(+) create mode 100644 libavformat/iamfenc.c diff --git a/libavformat/Makefile b/libavformat/Makefile index 752833f5a8..a90ced6dd2 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -259,6 +259,7 @@ OBJS-$(CONFIG_HLS_DEMUXER) += hls.o hls_sample_encryption.o OBJS-$(CONFIG_HLS_MUXER) += hlsenc.o hlsplaylist.o avc.o OBJS-$(CONFIG_HNM_DEMUXER) += hnm.o OBJS-$(CONFIG_IAMF_DEMUXER) += iamfdec.o iamf.o +OBJS-$(CONFIG_IAMF_MUXER) += iamfenc.o iamf.o OBJS-$(CONFIG_ICO_DEMUXER) += icodec.o OBJS-$(CONFIG_ICO_MUXER) += icoenc.o OBJS-$(CONFIG_IDCIN_DEMUXER) += idcin.o diff --git a/libavformat/allformats.c b/libavformat/allformats.c index 63ca44bacd..7529aed4a4 100644 --- a/libavformat/allformats.c +++ b/libavformat/allformats.c @@ -213,6 +213,7 @@ extern const AVInputFormat ff_hls_demuxer; extern const FFOutputFormat ff_hls_muxer; extern
const AVInputFormat ff_hnm_demuxer; extern const AVInputFormat ff_iamf_demuxer; +extern const FFOutputFormat ff_iamf_muxer; extern const AVInputFormat ff_ico_demuxer; extern const FFOutputFormat ff_ico_muxer; extern const AVInputFormat ff_idcin_demuxer; diff --git a/libavformat/iamfenc.c b/libavformat/iamfenc.c new file mode 100644 index 0000000000..a53396a34d --- /dev/null +++ b/libavformat/iamfenc.c @@ -0,0 +1,1091 @@ +/* + * IAMF muxer + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/avassert.h" +#include "libavutil/common.h" +#include "libavutil/iamf.h" +#include "libavutil/internal.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/opt.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/flac.h" +#include "libavcodec/mpeg4audio.h" +#include "libavcodec/put_bits.h" +#include "avformat.h" +#include "avio_internal.h" +#include "iamf.h" +#include "internal.h" +#include "mux.h" + +typedef struct IAMFMuxContext { + IAMFContext iamf; + + int first_stream_id; +} IAMFMuxContext; + +static int update_extradata(IAMFCodecConfig *codec_config) +{ + GetBitContext gb; + PutBitContext pb; + int ret; + + switch(codec_config->codec_id) { + case AV_CODEC_ID_OPUS: + if (codec_config->extradata_size < 19) + return AVERROR_INVALIDDATA; + codec_config->extradata_size -= 8; + memmove(codec_config->extradata, codec_config->extradata + 8, codec_config->extradata_size); + AV_WB8(codec_config->extradata + 1, 2); // set channels to stereo + break; + case AV_CODEC_ID_FLAC: { + uint8_t buf[13]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codec_config->extradata, codec_config->extradata_size); + if (ret < 0) + return ret; + + put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize + put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize + put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate + skip_bits(&gb, 3); + put_bits(&pb, 3, 1); // set channels to stereo + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codec_config->extradata, buf, sizeof(buf)); + break; + } + default: + break; + } + + return 0; +} + +static int fill_codec_config(const AVStreamGroup *stg, IAMFCodecConfig *codec_config) +{ + const 
AVIAMFAudioElement *iamf = stg->params.iamf_audio_element; + const AVStream *st = stg->streams[0]; + int ret; + + av_freep(&codec_config->extradata); + codec_config->extradata_size = 0; + + codec_config->codec_config_id = iamf->codec_config_id; + codec_config->codec_id = st->codecpar->codec_id; + codec_config->sample_rate = st->codecpar->sample_rate; + codec_config->codec_tag = st->codecpar->codec_tag; + codec_config->nb_samples = st->codecpar->frame_size; + codec_config->seek_preroll = st->codecpar->seek_preroll; + if (st->codecpar->extradata_size) { + codec_config->extradata = av_memdup(st->codecpar->extradata, st->codecpar->extradata_size); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + codec_config->extradata_size = st->codecpar->extradata_size; + ret = update_extradata(codec_config); + if (ret < 0) + return ret; + } + + return 0; +} + +static IAMFParamDefinition *get_param_definition(AVFormatContext *s, unsigned int parameter_id) +{ + const IAMFMuxContext *const c = s->priv_data; + const IAMFContext *const iamf = &c->iamf; + IAMFParamDefinition *param_definition = NULL; + + for (int i = 0; i < iamf->nb_param_definitions; i++) + if (iamf->param_definitions[i].param->parameter_id == parameter_id) { + param_definition = &iamf->param_definitions[i]; + break; + } + + return param_definition; +} + +static IAMFParamDefinition *add_param_definition(AVFormatContext *s, AVIAMFParamDefinition *param) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + IAMFParamDefinition *param_definition = av_dynarray2_add_nofree((void **)&iamf->param_definitions, + &iamf->nb_param_definitions, + sizeof(*iamf->param_definitions), NULL); + if (!param_definition) + return NULL; + param_definition->param = param; + param_definition->audio_element = NULL; + + return param_definition; +} + +static int iamf_init(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + int ret; + + if (!s->nb_streams) 
{ + av_log(s, AV_LOG_ERROR, "There must be at least one stream\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < s->nb_streams; i++) { + if (s->streams[i]->codecpar->codec_type != AVMEDIA_TYPE_AUDIO || + (s->streams[i]->codecpar->codec_tag != MKTAG('m','p','4','a') && + s->streams[i]->codecpar->codec_tag != MKTAG('O','p','u','s') && + s->streams[i]->codecpar->codec_tag != MKTAG('f','L','a','C') && + s->streams[i]->codecpar->codec_tag != MKTAG('i','p','c','m'))) { + av_log(s, AV_LOG_ERROR, "Unsupported codec id %s\n", + avcodec_get_name(s->streams[i]->codecpar->codec_id)); + return AVERROR(EINVAL); + } + + if (s->streams[i]->codecpar->ch_layout.nb_channels > 2) { + av_log(s, AV_LOG_ERROR, "Unsupported channel layout on stream #%d\n", i); + return AVERROR(EINVAL); + } + + for (int j = 0; j < i; j++) { + if (s->streams[i]->id == s->streams[j]->id) { + av_log(s, AV_LOG_ERROR, "Duplicated stream id %d\n", s->streams[j]->id); + return AVERROR(EINVAL); + } + } + } + + if (s->nb_stream_groups < 2) { + av_log(s, AV_LOG_ERROR, "There must be at least two stream groups\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + iamf->nb_audio_elements++; + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + iamf->nb_mix_presentations++; + } + if (iamf->nb_audio_elements < 1 || iamf->nb_audio_elements > 2 || iamf->nb_mix_presentations < 1) { + av_log(s, AV_LOG_ERROR, "There must be >= 1 and <= 2 IAMF_AUDIO_ELEMENT and at least " + "one IAMF_MIX_PRESENTATION stream groups\n"); + return AVERROR(EINVAL); + } + + iamf->audio_elements = av_calloc(iamf->nb_audio_elements, sizeof(*iamf->audio_elements)); + iamf->mix_presentations = av_calloc(iamf->nb_mix_presentations, sizeof(*iamf->mix_presentations)); + + if (!iamf->audio_elements || !iamf->mix_presentations) { + iamf->nb_audio_elements = iamf->nb_mix_presentations = 0; 
+ return AVERROR(ENOMEM); + } + + for (int i = 0, idx = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + const AVIAMFAudioElement *iamf_audio_element; + IAMFAudioElement *audio_element; + IAMFCodecConfig *codec_config = NULL; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + continue; + + iamf_audio_element = stg->params.iamf_audio_element; + if (iamf_audio_element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) { + const AVIAMFLayer *layer = iamf_audio_element->layers[0]; + if (iamf_audio_element->num_layers != 1) { + av_log(s, AV_LOG_ERROR, "Invalid amount of layers for SCENE_BASED audio element. Must be 1\n"); + return AVERROR(EINVAL); + } + if (layer->ch_layout.order != AV_CHANNEL_ORDER_CUSTOM && + layer->ch_layout.order != AV_CHANNEL_ORDER_AMBISONIC) { + av_log(s, AV_LOG_ERROR, "Invalid channel layout for SCENE_BASED audio element\n"); + return AVERROR(EINVAL); + } + if (layer->ambisonics_mode >= AV_IAMF_AMBISONICS_MODE_PROJECTION) { + av_log(s, AV_LOG_ERROR, "Unsupported ambisonics mode %d\n", layer->ambisonics_mode); + return AVERROR_PATCHWELCOME; + } + for (int j = 0; j < stg->nb_streams; j++) { + if (stg->streams[j]->codecpar->ch_layout.nb_channels > 1) { + av_log(s, AV_LOG_ERROR, "Invalid amount of channels in a stream for MONO mode ambisonics\n"); + return AVERROR(EINVAL); + } + } + } else + for (int k, j = 0; j < iamf_audio_element->num_layers; j++) { + const AVIAMFLayer *layer = iamf_audio_element->layers[j]; + for (k = 0; k < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); k++) + if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[k])) + break; + + if (k >= FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts)) { + av_log(s, AV_LOG_ERROR, "Unsupported channel layout in stream group #%d\n", i); + return AVERROR(EINVAL); + } + } + + for (int j = 0; j < iamf->nb_codec_configs; j++) { + if (iamf->codec_configs[j].codec_config_id == iamf_audio_element->codec_config_id) { + 
codec_config = &iamf->codec_configs[j]; + break; + } + } + + if (!codec_config) { + codec_config = av_dynarray2_add_nofree((void **)&iamf->codec_configs, &iamf->nb_codec_configs, + sizeof(*iamf->codec_configs), NULL); + if (!codec_config) + return AVERROR(ENOMEM); + memset(codec_config, 0, sizeof(*codec_config)); + + } + + ret = fill_codec_config(stg, codec_config); + if (ret < 0) + return ret; + + for (int j = 0; j < idx; j++) { + if (stg->id == iamf->audio_elements[j].audio_element_id) { + av_log(s, AV_LOG_ERROR, "Duplicated Audio Element id %"PRId64"\n", stg->id); + return AVERROR(EINVAL); + } + } + + audio_element = &iamf->audio_elements[idx++]; + audio_element->element = stg->params.iamf_audio_element; + audio_element->audio_element_id = stg->id; + audio_element->codec_config = codec_config; + + audio_element->substreams = av_calloc(stg->nb_streams, sizeof(*audio_element->substreams)); + if (!audio_element->substreams) + return AVERROR(ENOMEM); + audio_element->nb_substreams = stg->nb_streams; + + for (int j = 0, k = 0; j < iamf_audio_element->num_layers; j++) { + IAMFLayer *layer = av_dynarray2_add_nofree((void **)&audio_element->layers, &audio_element->nb_layers, + sizeof(*audio_element->layers), NULL); + int nb_channels = iamf_audio_element->layers[j]->ch_layout.nb_channels; + + if (!layer) + return AVERROR(ENOMEM); + memset(layer, 0, sizeof(*layer)); + + if (j) + nb_channels -= iamf_audio_element->layers[j - 1]->ch_layout.nb_channels; + for (; nb_channels > 0 && k < stg->nb_streams; k++) { + const AVStream *st = stg->streams[k]; + IAMFSubStream *substream = &audio_element->substreams[k]; + + substream->audio_substream_id = st->id; + layer->substream_count++; + layer->coupled_substream_count += st->codecpar->ch_layout.nb_channels == 2; + nb_channels -= st->codecpar->ch_layout.nb_channels; + } + if (nb_channels) { + av_log(s, AV_LOG_ERROR, "Invalid channel count across substreams in layer %u from stream group %u\n", + j, stg->index); + return 
AVERROR(EINVAL); + } + } + + if (iamf_audio_element->demixing_info) { + AVIAMFParamDefinition *param = iamf_audio_element->demixing_info; + IAMFParamDefinition *param_definition = get_param_definition(s, param->parameter_id); + + if (param->num_subblocks != 1) { + av_log(s, AV_LOG_ERROR, "num_subblocks in demixing_info for stream group %u is not 1\n", stg->index); + return AVERROR(EINVAL); + } + if (!param_definition) { + param_definition = add_param_definition(s, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + param_definition->audio_element = iamf_audio_element; + } + if (iamf_audio_element->recon_gain_info) { + AVIAMFParamDefinition *param = iamf_audio_element->recon_gain_info; + IAMFParamDefinition *param_definition = get_param_definition(s, param->parameter_id); + + if (param->num_subblocks != 1) { + av_log(s, AV_LOG_ERROR, "num_subblocks in recon_gain_info for stream group %u is not 1\n", stg->index); + return AVERROR(EINVAL); + } + + if (!param_definition) { + param_definition = add_param_definition(s, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + param_definition->audio_element = iamf_audio_element; + } + } + + for (int i = 0, idx = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + IAMFMixPresentation *mix_presentation; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + continue; + + for (int j = 0; j < idx; j++) { + if (stg->id == iamf->mix_presentations[j].mix_presentation_id) { + av_log(s, AV_LOG_ERROR, "Duplicate Mix Presentation id %"PRId64"\n", stg->id); + return AVERROR(EINVAL); + } + } + + mix_presentation = &iamf->mix_presentations[idx++]; + mix_presentation->mix = stg->params.iamf_mix_presentation; + mix_presentation->mix_presentation_id = stg->id; + + for (int i = 0; i < mix_presentation->mix->num_submixes; i++) { + const AVIAMFSubmix *submix = mix_presentation->mix->submixes[i]; + AVIAMFParamDefinition *param = submix->output_mix_config; + 
IAMFParamDefinition *param_definition; + + if (!param) { + av_log(s, AV_LOG_ERROR, "output_mix_config is not present in submix %u from Mix Presentation ID %"PRId64"\n", i, stg->id); + return AVERROR(EINVAL); + } + + param_definition = get_param_definition(s, param->parameter_id); + if (!param_definition) { + param_definition = add_param_definition(s, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + + for (int j = 0; j < submix->num_elements; j++) { + const AVIAMFAudioElement *iamf_audio_element = NULL; + const AVIAMFSubmixElement *element = submix->elements[j]; + param = element->element_mix_config; + + if (!param) { + av_log(s, AV_LOG_ERROR, "element_mix_config is not present for element %u in submix %u from Mix Presentation ID %"PRId64"\n", j, i, stg->id); + return AVERROR(EINVAL); + } + param_definition = get_param_definition(s, param->parameter_id); + if (!param_definition) { + param_definition = add_param_definition(s, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + for (int k = 0; k < iamf->nb_audio_elements; k++) + if (iamf->audio_elements[k].audio_element_id == element->audio_element_id) { + iamf_audio_element = iamf->audio_elements[k].element; + break; + } + param_definition->audio_element = iamf_audio_element; + } + } + } + + c->first_stream_id = s->streams[0]->id; + + return 0; +} + +static int iamf_write_codec_config(AVFormatContext *s, const IAMFCodecConfig *codec_config) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + PutBitContext pb; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, codec_config->codec_config_id); + avio_wl32(dyn_bc, codec_config->codec_tag); + + ffio_write_leb(dyn_bc, codec_config->nb_samples); + avio_wb16(dyn_bc, codec_config->seek_preroll); + + switch(codec_config->codec_id) { + case AV_CODEC_ID_OPUS: + avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size); + 
break; + case AV_CODEC_ID_AAC: + return AVERROR_PATCHWELCOME; + case AV_CODEC_ID_FLAC: + avio_w8(dyn_bc, 0x80); + avio_wb24(dyn_bc, codec_config->extradata_size); + avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size); + break; + case AV_CODEC_ID_PCM_S16LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 16); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S24LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 24); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S32LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 32); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S16BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 16); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S24BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 24); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S32BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 32); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + default: + break; + } + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_CODEC_CONFIG); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + ffio_write_leb(s->pb, dyn_size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static inline int rescale_rational(AVRational q, int b) +{ + return av_clip_int16(av_rescale(q.num, b, q.den)); +} + +static int scalable_channel_layout_config(AVFormatContext *s, AVIOContext *dyn_bc, + const IAMFAudioElement *audio_element) +{ + const AVIAMFAudioElement *element = audio_element->element; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pb; + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 3, element->num_layers); + put_bits(&pb, 5, 0); + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + for (int 
i = 0; i < element->num_layers; i++) { + AVIAMFLayer *layer = element->layers[i]; + int layout; + for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); layout++) { + if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[layout])) + break; + } + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 4, layout); + put_bits(&pb, 1, !!layer->output_gain_flags); + put_bits(&pb, 1, layer->recon_gain_is_present); + put_bits(&pb, 2, 0); // reserved + put_bits(&pb, 8, audio_element->layers[i].substream_count); + put_bits(&pb, 8, audio_element->layers[i].coupled_substream_count); + // av_log(s, AV_LOG_WARNING, "k %d, substream_count %d, coupled_substream_count %d\n", k, layer->substream_count, coupled_substream_count); + if (layer->output_gain_flags) { + put_bits(&pb, 6, layer->output_gain_flags); + put_bits(&pb, 2, 0); + put_bits(&pb, 16, rescale_rational(layer->output_gain, 1 << 8)); + } + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + } + + return 0; +} + +static int ambisonics_config(AVFormatContext *s, AVIOContext *dyn_bc, + const IAMFAudioElement *audio_element) +{ + const AVIAMFAudioElement *element = audio_element->element; + AVIAMFLayer *layer = element->layers[0]; + + ffio_write_leb(dyn_bc, 0); // ambisonics_mode + ffio_write_leb(dyn_bc, layer->ch_layout.nb_channels); // output_channel_count + ffio_write_leb(dyn_bc, audio_element->nb_substreams); // substream_count + + if (layer->ch_layout.order == AV_CHANNEL_ORDER_AMBISONIC) + for (int i = 0; i < layer->ch_layout.nb_channels; i++) + avio_w8(dyn_bc, i); + else + for (int i = 0; i < layer->ch_layout.nb_channels; i++) + avio_w8(dyn_bc, layer->ch_layout.u.map[i].id); + + return 0; +} + +static int param_definition(AVFormatContext *s, AVIOContext *dyn_bc, + AVIAMFParamDefinition *param) +{ + ffio_write_leb(dyn_bc, param->parameter_id); + ffio_write_leb(dyn_bc, param->parameter_rate); + avio_w8(dyn_bc, !!param->param_definition_mode << 7); + 
if (!param->param_definition_mode) { + ffio_write_leb(dyn_bc, param->duration); + ffio_write_leb(dyn_bc, param->constant_subblock_duration); + if (param->constant_subblock_duration == 0) { + ffio_write_leb(dyn_bc, param->num_subblocks); + for (int i = 0; i < param->num_subblocks; i++) { + const void *subblock = av_iamf_param_definition_get_subblock(param, i); + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + const AVIAMFMixGainParameterData *mix = subblock; + ffio_write_leb(dyn_bc, mix->subblock_duration); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + const AVIAMFDemixingInfoParameterData *demix = subblock; + ffio_write_leb(dyn_bc, demix->subblock_duration); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + const AVIAMFReconGainParameterData *recon = subblock; + ffio_write_leb(dyn_bc, recon->subblock_duration); + break; + } + } + } + } + } + + return 0; +} + +static int iamf_write_audio_element(AVFormatContext *s, const IAMFAudioElement *audio_element) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + const AVIAMFAudioElement *element = audio_element->element; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + PutBitContext pb; + int param_definition_types = AV_IAMF_PARAMETER_DEFINITION_DEMIXING, dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, audio_element->audio_element_id); + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 3, element->audio_element_type); + put_bits(&pb, 5, 0); + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + + ffio_write_leb(dyn_bc, audio_element->codec_config->codec_config_id); + ffio_write_leb(dyn_bc, audio_element->nb_substreams); + + for (int i = 0; i < audio_element->nb_substreams; i++) + ffio_write_leb(dyn_bc, audio_element->substreams[i].audio_substream_id); + + if (audio_element->nb_layers == 1) + param_definition_types &= 
~AV_IAMF_PARAMETER_DEFINITION_DEMIXING; + if (audio_element->nb_layers > 1) + param_definition_types |= AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN; + if (audio_element->codec_config->codec_tag == MKTAG('f','L','a','C') || + audio_element->codec_config->codec_tag == MKTAG('i','p','c','m')) + param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN; + + ffio_write_leb(dyn_bc, av_popcount(param_definition_types)); // num_parameters + + if (param_definition_types & 1) { + AVIAMFParamDefinition *param = element->demixing_info; + const AVIAMFDemixingInfoParameterData *demix; + + if (!param) { + av_log(s, AV_LOG_ERROR, "demixing_info needed but not set in Stream Group #%u\n", + audio_element->audio_element_id); + return AVERROR(EINVAL); + } + + demix = av_iamf_param_definition_get_subblock(param, 0); + ffio_write_leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_DEMIXING); // param_definition_type + param_definition(s, dyn_bc, param); + + avio_w8(dyn_bc, demix->dmixp_mode << 5); // dmixp_mode + avio_w8(dyn_bc, element->default_w << 4); // default_w + } + if (param_definition_types & 2) { + AVIAMFParamDefinition *param = element->recon_gain_info; + + if (!param) { + av_log(s, AV_LOG_ERROR, "recon_gain_info needed but not set in Stream Group #%u\n", + audio_element->audio_element_id); + return AVERROR(EINVAL); + } + ffio_write_leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN); // param_definition_type + param_definition(s, dyn_bc, param); + } + + if (element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(s, dyn_bc, audio_element); + if (ret < 0) + return ret; + } else { + ret = ambisonics_config(s, dyn_bc, audio_element); + if (ret < 0) + return ret; + } + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_AUDIO_ELEMENT); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + ffio_write_leb(s->pb, 
dyn_size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static int iamf_write_mixing_presentation(AVFormatContext *s, const IAMFMixPresentation *mix_presentation) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + const AVIAMFMixPresentation *mix = mix_presentation->mix; + const AVDictionaryEntry *tag = NULL; + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + ffio_write_leb(dyn_bc, mix_presentation->mix_presentation_id); // mix_presentation_id + ffio_write_leb(dyn_bc, av_dict_count(mix->annotations)); // count_label + + while ((tag = av_dict_iterate(mix->annotations, tag))) + avio_put_str(dyn_bc, tag->key); + while ((tag = av_dict_iterate(mix->annotations, tag))) + avio_put_str(dyn_bc, tag->value); + + ffio_write_leb(dyn_bc, mix->num_submixes); + for (int i = 0; i < mix->num_submixes; i++) { + const AVIAMFSubmix *sub_mix = mix->submixes[i]; + + ffio_write_leb(dyn_bc, sub_mix->num_elements); + for (int j = 0; j < sub_mix->num_elements; j++) { + const IAMFAudioElement *audio_element = NULL; + const AVIAMFSubmixElement *submix_element = sub_mix->elements[j]; + + for (int k = 0; k < iamf->nb_audio_elements; k++) + if (iamf->audio_elements[k].audio_element_id == submix_element->audio_element_id) { + audio_element = &iamf->audio_elements[k]; + break; + } + + av_assert0(audio_element); + ffio_write_leb(dyn_bc, submix_element->audio_element_id); + + if (av_dict_count(submix_element->annotations) != av_dict_count(mix->annotations)) { + av_log(s, AV_LOG_ERROR, "Inconsistent amount of labels in submix %d from Mix Presentation id #%u\n", + j, audio_element->audio_element_id); + return AVERROR(EINVAL); + } + while ((tag = av_dict_iterate(submix_element->annotations, tag))) + avio_put_str(dyn_bc, tag->value); + + init_put_bits(&pb, header, sizeof(header)); + 
put_bits(&pb, 2, submix_element->headphones_rendering_mode); + put_bits(&pb, 6, 0); // reserved + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + ffio_write_leb(dyn_bc, 0); // rendering_config_extension_size + param_definition(s, dyn_bc, submix_element->element_mix_config); + avio_wb16(dyn_bc, rescale_rational(submix_element->default_mix_gain, 1 << 8)); + } + param_definition(s, dyn_bc, sub_mix->output_mix_config); + avio_wb16(dyn_bc, rescale_rational(sub_mix->default_mix_gain, 1 << 8)); + + ffio_write_leb(dyn_bc, sub_mix->num_layouts); // num_layouts + for (int i = 0; i < sub_mix->num_layouts; i++) { + const AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[i]; + int layout, info_type; + int dialogue = submix_layout->dialogue_anchored_loudness.num && + submix_layout->dialogue_anchored_loudness.den; + int album = submix_layout->album_anchored_loudness.num && + submix_layout->album_anchored_loudness.den; + + if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) { + for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_sound_system_map); layout++) { + if (!av_channel_layout_compare(&submix_layout->sound_system, &ff_iamf_sound_system_map[layout].layout)) + break; + } + if (layout == FF_ARRAY_ELEMS(ff_iamf_sound_system_map)) { + av_log(s, AV_LOG_ERROR, "Invalid Sound System value in a submix\n"); + return AVERROR(EINVAL); + } + } + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 2, submix_layout->layout_type); // layout_type + if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) { + put_bits(&pb, 4, ff_iamf_sound_system_map[layout].id); // sound_system + put_bits(&pb, 2, 0); // reserved + } else + put_bits(&pb, 6, 0); // reserved + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + + info_type =
(submix_layout->true_peak.num && submix_layout->true_peak.den); + info_type |= (dialogue || album) << 1; + avio_w8(dyn_bc, info_type); + avio_wb16(dyn_bc, rescale_rational(submix_layout->integrated_loudness, 1 << 8)); + avio_wb16(dyn_bc, rescale_rational(submix_layout->digital_peak, 1 << 8)); + if (info_type & 1) + avio_wb16(dyn_bc, rescale_rational(submix_layout->true_peak, 1 << 8)); + if (info_type & 2) { + avio_w8(dyn_bc, dialogue + album); // num_anchored_loudness + if (dialogue) { + avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_DIALOGUE); + avio_wb16(dyn_bc, rescale_rational(submix_layout->dialogue_anchored_loudness, 1 << 8)); + } + if (album) { + avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_ALBUM); + avio_wb16(dyn_bc, rescale_rational(submix_layout->album_anchored_loudness, 1 << 8)); + } + } + } + } + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_MIX_PRESENTATION); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + ffio_write_leb(s->pb, dyn_size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static int iamf_write_header(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + // Sequence Header + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_SEQUENCE_HEADER); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + ffio_write_leb(dyn_bc, 6); + avio_wb32(dyn_bc, MKBETAG('i','a','m','f')); + avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // primary_profile + avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // additional_profile + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + 
avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + for (int i = 0; i < iamf->nb_codec_configs; i++) { + ret = iamf_write_codec_config(s, &iamf->codec_configs[i]); + if (ret < 0) + return ret; + } + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + ret = iamf_write_audio_element(s, &iamf->audio_elements[i]); + if (ret < 0) + return ret; + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + ret = iamf_write_mixing_presentation(s, &iamf->mix_presentations[i]); + if (ret < 0) + return ret; + } + + return 0; +} + +static int write_parameter_block(AVFormatContext *s, AVIAMFParamDefinition *param) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + IAMFParamDefinition *param_definition = get_param_definition(s, param->parameter_id); + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size, ret; + + if (param->param_definition_type > AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { + av_log(s, AV_LOG_DEBUG, "Ignoring side data with unknown param_definition_type %u\n", + param->param_definition_type); + return 0; + } + + if (!param_definition) { + av_log(s, AV_LOG_ERROR, "Non-existent Parameter Definition with ID %u referenced by a packet\n", + param->parameter_id); + return AVERROR(EINVAL); + } + + if (param->param_definition_type != param_definition->param->param_definition_type || + param->param_definition_mode != param_definition->param->param_definition_mode) { + av_log(s, AV_LOG_ERROR, "Inconsistent param_definition_mode or param_definition_type values " + "for Parameter Definition with ID %u in a packet\n", + param->parameter_id); + return AVERROR(EINVAL); + } + + ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + // Parameter Block + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_PARAMETER_BLOCK); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + + ffio_write_leb(dyn_bc, param->parameter_id); + if (param->param_definition_mode) { +
ffio_write_leb(dyn_bc, param->duration); + ffio_write_leb(dyn_bc, param->constant_subblock_duration); + if (param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, param->num_subblocks); + } + + for (int i = 0; i < param->num_subblocks; i++) { + const void *subblock = av_iamf_param_definition_get_subblock(param, i); + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + const AVIAMFMixGainParameterData *mix = subblock; + if (param->param_definition_mode && param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, mix->subblock_duration); + + ffio_write_leb(dyn_bc, mix->animation_type); + + avio_wb16(dyn_bc, rescale_rational(mix->start_point_value, 1 << 8)); + if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + avio_wb16(dyn_bc, rescale_rational(mix->end_point_value, 1 << 8)); + if (mix->animation_type == AV_IAMF_ANIMATION_TYPE_BEZIER) { + avio_wb16(dyn_bc, rescale_rational(mix->control_point_value, 1 << 8)); + avio_w8(dyn_bc, mix->control_point_relative_time); + } + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + const AVIAMFDemixingInfoParameterData *demix = subblock; + if (param->param_definition_mode && param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, demix->subblock_duration); + + avio_w8(dyn_bc, demix->dmixp_mode << 5); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + const AVIAMFReconGainParameterData *recon = subblock; + const AVIAMFAudioElement *audio_element = param_definition->audio_element; + + if (param->param_definition_mode && param->constant_subblock_duration == 0) + ffio_write_leb(dyn_bc, recon->subblock_duration); + + if (!audio_element) { + av_log(s, AV_LOG_ERROR, "Invalid Parameter Definition with ID %u referenced by a packet\n", param->parameter_id); + return AVERROR(EINVAL); + } + + for (int j = 0; j < audio_element->num_layers; j++) { + const AVIAMFLayer *layer = audio_element->layers[j]; + + if (layer->recon_gain_is_present) { + 
unsigned int recon_gain_flags = 0; + int k = 0; + + for (; k < 7; k++) + recon_gain_flags |= (1 << k) * !!recon->recon_gain[j][k]; + for (; k < 12; k++) + recon_gain_flags |= (2 << k) * !!recon->recon_gain[j][k]; + if (recon_gain_flags >> 8) + recon_gain_flags |= (1 << k); + + ffio_write_leb(dyn_bc, recon_gain_flags); + for (k = 0; k < 12; k++) { + if (recon->recon_gain[j][k]) + avio_w8(dyn_bc, recon->recon_gain[j][k]); + } + } + } + break; + } + default: + av_assert0(0); + } + } + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + ffio_write_leb(s->pb, dyn_size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static int iamf_write_packet(AVFormatContext *s, AVPacket *pkt) +{ + const IAMFMuxContext *const c = s->priv_data; + AVStream *st = s->streams[pkt->stream_index]; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size; + int ret, type = st->id <= 17 ? st->id + IAMF_OBU_IA_AUDIO_FRAME_ID0 : IAMF_OBU_IA_AUDIO_FRAME; + + if (s->nb_stream_groups && st->id == c->first_stream_id) { + AVIAMFParamDefinition *mix = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, NULL); + AVIAMFParamDefinition *demix = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, NULL); + AVIAMFParamDefinition *recon = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, NULL); + + if (mix) { + ret = write_parameter_block(s, mix); + if (ret < 0) + return ret; + } + if (demix) { + ret = write_parameter_block(s, demix); + if (ret < 0) + return ret; + } + if (recon) { + ret = write_parameter_block(s, recon); + if (ret < 0) + return ret; + } + } + + ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, type); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + avio_write(s->pb, header, 
put_bytes_count(&pb, 1)); + + if (st->id > 17) + ffio_write_leb(dyn_bc, st->id); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + ffio_write_leb(s->pb, dyn_size + pkt->size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + avio_write(s->pb, pkt->data, pkt->size); + + return 0; +} + +static void iamf_deinit(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + IAMFAudioElement *audio_element = &iamf->audio_elements[i]; + audio_element->element = NULL; + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = &iamf->mix_presentations[i]; + mix_presentation->mix = NULL; + } + + ff_iamf_uninit_context(iamf); + + return; +} + +static const AVCodecTag iamf_codec_tags[] = { + { AV_CODEC_ID_AAC, MKTAG('m','p','4','a') }, + { AV_CODEC_ID_FLAC, MKTAG('f','L','a','C') }, + { AV_CODEC_ID_OPUS, MKTAG('O','p','u','s') }, + { AV_CODEC_ID_PCM_S16LE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S16BE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S24LE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S24BE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S32LE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_PCM_S32BE, MKTAG('i','p','c','m') }, + { AV_CODEC_ID_NONE, MKTAG('i','p','c','m') } +}; + +const FFOutputFormat ff_iamf_muxer = { + .p.name = "iamf", + .p.long_name = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model and Formats"), + .p.extensions = "iamf", + .priv_data_size = sizeof(IAMFMuxContext), + .p.audio_codec = AV_CODEC_ID_OPUS, + .init = iamf_init, + .deinit = iamf_deinit, + .write_header = iamf_write_header, + .write_packet = iamf_write_packet, + .p.codec_tag = (const AVCodecTag* const []){ iamf_codec_tags, NULL }, + .p.flags = AVFMT_GLOBALHEADER | AVFMT_NOTIMESTAMPS, +};