From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Fri, 15 Dec 2023 18:28:37 -0300
Message-ID: <20231215212837.1395-1-jamrial@gmail.com>
In-Reply-To: <20231214201433.4608-4-jamrial@gmail.com>
References: <20231214201433.4608-4-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 3/8] ffmpeg: add support for muxing AVStreamGroups

Starting with IAMF support.

Signed-off-by: James Almer
---
 doc/ffmpeg.texi           | 200 ++++++++++++++++++++++
 fftools/ffmpeg.h          |   2 +
 fftools/ffmpeg_mux_init.c | 342 ++++++++++++++++++++++++++++++++++++++
 fftools/ffmpeg_opt.c      |   2 +
 4 files changed, 546 insertions(+)

diff --git a/doc/ffmpeg.texi b/doc/ffmpeg.texi
index c503963941..1fadb20686 100644
--- a/doc/ffmpeg.texi
+++ b/doc/ffmpeg.texi
@@ -623,6 +623,206 @@ Not all muxers support embedded thumbnails, and those who do, only support a few
 Creates a program with the specified @var{title}, @var{program_num} and adds the specified
 @var{stream}(s) to it.
 
+@item -stream_group type=@var{type}:st=@var{stream}[:st=@var{stream}][:stg=@var{stream_group}][:id=@var{stream_group_id}...] (@emph{output})
+
+Creates a stream group of the specified @var{type}, @var{stream_group_id} and adds the specified
+@var{stream}(s) and/or previously defined @var{stream_group}(s) to it.
+
+@var{type} can be one of the following:
+@table @option
+
+@item iamf_audio_element
+Groups @var{stream}s that belong to the same IAMF Audio Element
+
+For this group @var{type}, the following options are available
+@table @option
+@item audio_element_type
+The Audio Element type. The following values are supported:
+
+@table @option
+@item channel
+Scalable channel audio representation
+@item scene
+Ambisonics representation
+@end table
+
+@item demixing
+Demixing information used to reconstruct a scalable channel audio representation.
+This option must be separated from the rest with a ',', and takes the following
+key=value options
+
+@table @option
+@item parameter_id
+An identifier that parameter blocks in frames may refer to
+@item dmixp_mode
+A pre-defined combination of demixing parameters
+@end table
+
+@item recon_gain
+Recon gain information used to reconstruct a scalable channel audio representation.
+This option must be separated from the rest with a ',', and takes the following
+key=value options
+
+@table @option
+@item parameter_id
+An identifier that parameter blocks in frames may refer to
+@end table
+
+@item layer
+A layer defining a Channel Layout in the Audio Element.
+This option must be separated from the rest with a ','. Several ',' separated entries
+can be defined, and at least one must be set.
+
+It takes the following ":"-separated key=value options
+
+@table @option
+@item ch_layout
+The layer's channel layout
+@item flags
+The following flags are available:
+
+@table @option
+@item recon_gain
+Whether to signal if recon_gain is present as metadata in parameter blocks within frames
+@end table
+
+@item output_gain
+@item output_gain_flags
+Which channels output_gain applies to. The following flags are available:
+
+@table @option
+@item FL
+@item FR
+@item BL
+@item BR
+@item TFL
+@item TFR
+@end table
+
+@item ambisonics_mode
+The ambisonics mode. This has no effect if audio_element_type is set to channel.
+
+The following values are supported:
+
+@table @option
+@item mono
+Each ambisonics channel is coded as an individual mono stream in the group
+@end table
+
+@end table
+
+@item default_w
+Default weight value
+
+@end table
+
+@item iamf_mix_presentation
+Groups @var{stream}s that belong to all IAMF Audio Elements the same
+IAMF Mix Presentation references
+
+For this group @var{type}, the following options are available
+
+@table @option
+@item submix
+A sub-mix within the Mix Presentation.
+This option must be separated from the rest with a ','. Several ',' separated entries
+can be defined, and at least one must be set.
+
+It takes the following ":"-separated key=value options
+
+@table @option
+@item parameter_id
+An identifier that parameter blocks in frames may refer to, for post-processing the mixed
+audio signal to generate the audio signal for playback
+@item parameter_rate
+The sample rate that duration fields in parameter blocks in frames that refer to this
+@var{parameter_id} are expressed as
+@item default_mix_gain
+Default mix gain value to apply when there are no parameter blocks sharing the same
+@var{parameter_id} for a given frame
+
+@item element
+References an Audio Element used in this Mix Presentation to generate the final output
+audio signal for playback.
+This option must be separated from the rest with a '|'. Several '|' separated entries
+can be defined, and at least one must be set.
+
+It takes the following ":"-separated key=value options:
+
+@table @option
+@item stg
+The @var{stream_group_id} for an Audio Element which this sub-mix refers to
+@item parameter_id
+An identifier that parameter blocks in frames may refer to, for applying any processing to
+the referenced and rendered Audio Element before being summed with other processed Audio
+Elements
+@item parameter_rate
+The sample rate that duration fields in parameter blocks in frames that refer to this
+@var{parameter_id} are expressed as
+@item default_mix_gain
+Default mix gain value to apply when there are no parameter blocks sharing the same
+@var{parameter_id} for a given frame
+@item annotations
+A key=value string describing the sub-mix element where "key" is a string conforming to
+BCP-47 that specifies the language for the "value" string. "key" must be the same as the
+one in the mix's @var{annotations}
+@item headphones_rendering_mode
+Indicates whether the input channel-based Audio Element is rendered to stereo loudspeakers
+or spatialized with a binaural renderer when played back on headphones.
+This has no effect if the referenced Audio Element's @var{audio_element_type} is set to
+channel.
+
+The following values are supported:
+
+@table @option
+@item stereo
+@item binaural
+@end table
+
+@end table
+
+@item layout
+Specifies the layouts for this sub-mix on which the loudness information was measured.
+This option must be separated from the rest with a '|'. Several '|' separated entries
+can be defined, and at least one must be set.
+
+It takes the following ":"-separated key=value options:
+
+@table @option
+@item layout_type
+
+@table @option
+@item loudspeakers
+The layout follows the loudspeaker sound system convention of ITU-2051-3.
+@item binaural
+The layout is binaural.
+@end table
+
+@item sound_system
+Channel layout matching one of Sound Systems A to J of ITU-2051-3, plus 7.1.2 and 3.1.2.
+This has no effect if @var{layout_type} is set to binaural.
+@item integrated_loudness
+The program integrated loudness information, as defined in ITU-1770-4.
+@item digital_peak
+The digital (sampled) peak value of the audio signal, as defined in ITU-1770-4.
+@item true_peak
+The true peak of the audio signal, as defined in ITU-1770-4.
+@item dialog_anchored_loudness
+The Dialogue loudness information, as defined in ITU-1770-4.
+@item album_anchored_loudness
+The Album loudness information, as defined in ITU-1770-4.
+@end table
+
+@end table
+
+@item annotations
+A key=value string describing the mix where "key" is a string conforming to BCP-47
+that specifies the language for the "value" string. "key" must be the same as the ones in
+all sub-mix elements' @var{annotations}
+@end table
+
+@end table
+
 @item -target @var{type} (@emph{output})
 Specify target file type (@code{vcd}, @code{svcd}, @code{dvd}, @code{dv},
 @code{dv50}). @var{type} may be prefixed with @code{pal-}, @code{ntsc-} or
diff --git a/fftools/ffmpeg.h b/fftools/ffmpeg.h
index affa80856a..1169f723d1 100644
--- a/fftools/ffmpeg.h
+++ b/fftools/ffmpeg.h
@@ -281,6 +281,8 @@ typedef struct OptionsContext {
     int nb_disposition;
     SpecifierOpt *program;
     int nb_program;
+    SpecifierOpt *stream_groups;
+    int nb_stream_groups;
     SpecifierOpt *time_bases;
     int nb_time_bases;
     SpecifierOpt *enc_time_bases;
diff --git a/fftools/ffmpeg_mux_init.c b/fftools/ffmpeg_mux_init.c
index f527a083db..2134b28512 100644
--- a/fftools/ffmpeg_mux_init.c
+++ b/fftools/ffmpeg_mux_init.c
@@ -40,6 +40,7 @@
 #include "libavutil/dict.h"
 #include "libavutil/display.h"
 #include "libavutil/getenv_utf8.h"
+#include "libavutil/iamf.h"
 #include "libavutil/intreadwrite.h"
 #include "libavutil/log.h"
 #include "libavutil/mem.h"
@@ -2008,6 +2009,343 @@ static int setup_sync_queues(Muxer *mux, AVFormatContext *oc, int64_t buf_size_u
     return 0;
 }
 
+static int of_parse_iamf_audio_element_layers(Muxer *mux, AVStreamGroup *stg, char *ptr)
+{
+    AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element;
+    AVDictionary *dict = NULL;
+    const char *token;
+    int ret = 0;
+
+    audio_element->demixing_info =
+        av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_DEMIXING, 1, NULL);
+    audio_element->recon_gain_info =
+        av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN, 1, NULL);
+
+    if (!audio_element->demixing_info ||
+        !audio_element->recon_gain_info)
+        return AVERROR(ENOMEM);
+
+    /* process manually set layers and parameters */
+    token = av_strtok(NULL, ",", &ptr);
+    while (token) {
+        const AVDictionaryEntry *e;
+        int demixing = 0, recon_gain = 0;
+        int layer = 0;
+
+        if (av_strstart(token, "layer=", &token))
+            layer = 1;
+        else if (av_strstart(token, "demixing=", &token))
+            demixing = 1;
+        else if (av_strstart(token, "recon_gain=", &token))
+            recon_gain = 1;
+
+        av_dict_free(&dict);
+        ret = av_dict_parse_string(&dict, token, "=", ":", 0);
+        if (ret < 0) {
+            av_log(mux, AV_LOG_ERROR, "Error parsing audio element specification %s\n", token);
+            goto fail;
+        }
+
+        if (layer) {
+            AVIAMFLayer *audio_layer = av_iamf_audio_element_add_layer(audio_element);
+            if (!audio_layer) {
+                av_log(mux, AV_LOG_ERROR, "Error adding layer to stream group %d\n", stg->index);
+                ret = AVERROR(ENOMEM);
+                goto fail;
+            }
+            av_opt_set_dict(audio_layer, &dict);
+        } else if (demixing || recon_gain) {
+            AVIAMFParamDefinition *param = demixing ? audio_element->demixing_info
+                                                    : audio_element->recon_gain_info;
+            void *subblock = av_iamf_param_definition_get_subblock(param, 0);
+
+            av_opt_set_dict(param, &dict);
+            av_opt_set_dict(subblock, &dict);
+        }
+
+        // make sure that no entries are left in the dict
+        e = NULL;
+        if (e = av_dict_iterate(dict, e)) {
+            av_log(mux, AV_LOG_FATAL, "Unknown layer key %s.\n", e->key);
+            ret = AVERROR(EINVAL);
+            goto fail;
+        }
+        token = av_strtok(NULL, ",", &ptr);
+    }
+
+fail:
+    av_dict_free(&dict);
+    if (!ret && !audio_element->nb_layers) {
+        av_log(mux, AV_LOG_ERROR, "No layer in audio element specification\n");
+        ret = AVERROR(EINVAL);
+    }
+
+    return ret;
+}
+
+static int of_parse_iamf_submixes(Muxer *mux, AVStreamGroup *stg, char *ptr)
+{
+    AVFormatContext *oc = mux->fc;
+    AVIAMFMixPresentation *mix = stg->params.iamf_mix_presentation;
+    AVDictionary *dict = NULL;
+    const char *token;
+    char *submix_str = NULL;
+    int ret = 0;
+
+    /* process manually set submixes */
+    token = av_strtok(NULL, ",", &ptr);
+    while (token) {
+        AVIAMFSubmix *submix = NULL;
+        const char *subtoken;
+        char *subptr = NULL;
+
+        if (!av_strstart(token, "submix=", &token)) {
+            av_log(mux, AV_LOG_ERROR, "No submix in mix presentation specification \"%s\"\n", token);
+            goto fail;
+        }
+
+        submix_str = av_strdup(token);
+        if (!submix_str)
+            goto fail;
+
+        submix = av_iamf_mix_presentation_add_submix(mix);
+        if (!submix) {
+            av_log(mux, AV_LOG_ERROR, "Error adding submix to stream group %d\n", stg->index);
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+        submix->output_mix_config =
+            av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, 0, NULL);
+        if (!submix->output_mix_config) {
+            ret = AVERROR(ENOMEM);
+            goto fail;
+        }
+
+        subptr = NULL;
+        subtoken = av_strtok(submix_str, "|", &subptr);
+        while (subtoken) {
+            const AVDictionaryEntry *e;
+            int element = 0, layout = 0;
+
+            if (av_strstart(subtoken, "element=", &subtoken))
+                element = 1;
+            else if (av_strstart(subtoken, "layout=", &subtoken))
+                layout = 1;
+
+            av_dict_free(&dict);
+            ret = av_dict_parse_string(&dict, subtoken, "=", ":", 0);
+            if (ret < 0) {
+                av_log(mux, AV_LOG_ERROR, "Error parsing submix specification \"%s\"\n", subtoken);
+                goto fail;
+            }
+
+            if (element) {
+                AVIAMFSubmixElement *submix_element;
+                int64_t idx = -1;
+
+                if (e = av_dict_get(dict, "stg", NULL, 0))
+                    idx = strtol(e->value, NULL, 0);
+                av_dict_set(&dict, "stg", NULL, 0);
+                if (idx < 0 || idx >= oc->nb_stream_groups - 1 ||
+                    oc->stream_groups[idx]->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) {
+                    av_log(mux, AV_LOG_ERROR, "Invalid or missing stream group index in "
+                                              "submix element specification \"%s\"\n", subtoken);
+                    ret = AVERROR(EINVAL);
+                    goto fail;
+                }
+                submix_element = av_iamf_submix_add_element(submix);
+                if (!submix_element) {
+                    av_log(mux, AV_LOG_ERROR, "Error adding element to submix\n");
+                    ret = AVERROR(ENOMEM);
+                    goto fail;
+                }
+
+                submix_element->audio_element_id = oc->stream_groups[idx]->id;
+
+                submix_element->element_mix_config =
+                    av_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, 0, NULL);
+                if (!submix_element->element_mix_config)
+                    ret = AVERROR(ENOMEM);
+                av_opt_set_dict2(submix_element, &dict, AV_OPT_SEARCH_CHILDREN);
+            } else if (layout) {
+                AVIAMFSubmixLayout *submix_layout = av_iamf_submix_add_layout(submix);
+                if (!submix_layout) {
+                    av_log(mux, AV_LOG_ERROR, "Error adding layout to submix\n");
+                    ret = AVERROR(ENOMEM);
+                    goto fail;
+                }
+                av_opt_set_dict(submix_layout, &dict);
+            } else
+                av_opt_set_dict2(submix, &dict, AV_OPT_SEARCH_CHILDREN);
+
+            if (ret < 0) {
+                goto fail;
+            }
+
+            // make sure that no entries are left in the dict
+            e = NULL;
+            while (e = av_dict_iterate(dict, e)) {
+                av_log(mux, AV_LOG_FATAL, "Unknown submix key %s.\n", e->key);
+                ret = AVERROR(EINVAL);
+                goto fail;
+            }
+            subtoken = av_strtok(NULL, "|", &subptr);
+        }
+        av_freep(&submix_str);
+
+        if (!submix->nb_elements) {
+            av_log(mux, AV_LOG_ERROR, "No audio elements in submix specification \"%s\"\n", token);
+            ret = AVERROR(EINVAL);
+        }
+        token = av_strtok(NULL, ",", &ptr);
+    }
+
+fail:
+    av_dict_free(&dict);
+    av_free(submix_str);
+
+    return ret;
+}
+
+static int of_parse_group_token(Muxer *mux, const char *token, char *ptr)
+{
+    AVFormatContext *oc = mux->fc;
+    AVStreamGroup *stg;
+    AVDictionary *dict = NULL, *tmp = NULL;
+    const AVDictionaryEntry *e;
+    const AVOption opts[] = {
+        { "type", "Set group type", offsetof(AVStreamGroup, type), AV_OPT_TYPE_INT,
+                { .i64 = 0 }, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM, "type" },
+        { "iamf_audio_element", NULL, 0, AV_OPT_TYPE_CONST,
+                { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT }, .unit = "type" },
+        { "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST,
+                { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" },
+        { NULL },
+    };
+    const AVClass class = {
+        .class_name = "StreamGroupType",
+        .item_name = av_default_item_name,
+        .option = opts,
+        .version = LIBAVUTIL_VERSION_INT,
+    };
+    const AVClass *pclass = &class;
+    int type, ret;
+
+    ret = av_dict_parse_string(&dict, token, "=", ":", AV_DICT_MULTIKEY);
+    if (ret < 0) {
+        av_log(mux, AV_LOG_ERROR, "Error parsing group specification %s\n", token);
+        return ret;
+    }
+
+    // "type" is not a user settable AVOption in AVStreamGroup, so handle it here
+    e = av_dict_get(dict, "type", NULL, 0);
+    if (!e) {
+        av_log(mux, AV_LOG_ERROR, "No type specified for Stream Group in \"%s\"\n", token);
+        ret = AVERROR(EINVAL);
+        goto end;
+    }
+
+    ret = av_opt_eval_int(&pclass, opts, e->value, &type);
+    if (!ret && type == AV_STREAM_GROUP_PARAMS_NONE)
+        ret = AVERROR(EINVAL);
+    if (ret < 0) {
+        av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value);
+        goto end;
+    }
+
+    av_dict_copy(&tmp, dict, 0);
+    stg = avformat_stream_group_create(oc, type, &tmp);
+    if (!stg) {
+        ret = AVERROR(ENOMEM);
+        goto end;
+    }
+
+    e = NULL;
+    while (e = av_dict_get(dict, "st", e, 0)) {
+        int64_t idx = strtol(e->value, NULL, 0);
+        if (idx < 0 || idx >= oc->nb_streams) {
+            av_log(mux, AV_LOG_ERROR, "Invalid stream index %"PRId64"\n", idx);
+            ret = AVERROR(EINVAL);
+            goto end;
+        }
+        ret = avformat_stream_group_add_stream(stg, oc->streams[idx]);
+        if (ret < 0)
+            goto end;
+    }
+    while (e = av_dict_get(dict, "stg", e, 0)) {
+        int64_t idx = strtol(e->value, NULL, 0);
+        if (idx < 0 || idx >= oc->nb_stream_groups - 1) {
+            av_log(mux, AV_LOG_ERROR, "Invalid stream group index %"PRId64"\n", idx);
+            ret = AVERROR(EINVAL);
+            goto end;
+        }
+        for (unsigned i = 0; i < oc->stream_groups[idx]->nb_streams; i++) {
+            ret = avformat_stream_group_add_stream(stg, oc->stream_groups[idx]->streams[i]);
+            if (ret < 0)
+                goto end;
+        }
+    }
+
+    switch(type) {
+    case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT:
+        ret = of_parse_iamf_audio_element_layers(mux, stg, ptr);
+        break;
+    case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION:
+        ret = of_parse_iamf_submixes(mux, stg, ptr);
+        break;
+    default:
+        av_log(mux, AV_LOG_FATAL, "Unknown group type %d.\n", type);
+        ret = AVERROR(EINVAL);
+        break;
+    }
+
+    if (ret < 0)
+        goto end;
+
+    // make sure that nothing but "st" and "stg" entries are left in the dict
+    e = NULL;
+    av_dict_set(&tmp, "type", NULL, 0);
+    while (e = av_dict_iterate(tmp, e)) {
+        if (!strcmp(e->key, "st") || !strcmp(e->key, "stg"))
+            continue;
+
+        av_log(mux, AV_LOG_FATAL, "Unknown group key %s.\n", e->key);
+        ret = AVERROR(EINVAL);
+        goto end;
+    }
+
+    ret = 0;
+end:
+    av_dict_free(&dict);
+    av_dict_free(&tmp);
+
+    return ret;
+}
+
+static int of_add_groups(Muxer *mux, const OptionsContext *o)
+{
+    /* process manually set groups */
+    for (int i = 0; i < o->nb_stream_groups; i++) {
+        const char *token;
+        char *str, *ptr = NULL;
+        int ret = 0;
+
+        str = av_strdup(o->stream_groups[i].u.str);
+        if (!str)
+            return AVERROR(ENOMEM);
+
+        token = av_strtok(str, ",", &ptr);
+        if (token)
+            ret = of_parse_group_token(mux, token, ptr);
+
+        av_free(str);
+        if (ret < 0)
+            return ret;
+    }
+
+    return 0;
+}
+
 static int of_add_programs(Muxer *mux, const OptionsContext *o)
 {
     AVFormatContext *oc = mux->fc;
@@ -2793,6 +3131,10 @@ int of_open(const OptionsContext *o, const char *filename, Scheduler *sch)
     if (err < 0)
         return err;
 
+    err = of_add_groups(mux, o);
+    if (err < 0)
+        return err;
+
     err = of_add_programs(mux, o);
     if (err < 0)
         return err;
diff --git a/fftools/ffmpeg_opt.c b/fftools/ffmpeg_opt.c
index 6177a96a4e..915f8e3ea0 100644
--- a/fftools/ffmpeg_opt.c
+++ b/fftools/ffmpeg_opt.c
@@ -1493,6 +1493,8 @@ const OptionDef options[] = {
         "add metadata", "string=string" },
     { "program", HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(program) },
         "add program with specified streams", "title=string:st=number..." },
+    { "stream_group", HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(stream_groups) },
+        "add stream group with specified streams and group type-specific arguments", "id=number:st=number..." },
     { "dframes", HAS_ARG | OPT_PERFILE | OPT_EXPERT | OPT_OUTPUT, { .func_arg = opt_data_frames },
         "set the number of data frames to output", "number" },
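
Note on the option syntax (illustration only, not part of the patch): as documented above, the -stream_group argument is split on ',' first. The leading token carries the type/id/st/stg keys, and each following token carries one type-specific entry such as layer= or demixing=. The sketch below mimics only that first split in plain C. It is an assumption-laden stand-in: it uses POSIX strtok_r instead of av_strtok, and count_layers() and the sample argument in the usage note are hypothetical names made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not an FFmpeg function: copy the leading group
 * specification into group_spec and count the ','-separated "layer="
 * entries that follow it. Returns -1 on bad input. */
static int count_layers(const char *arg, char *group_spec, size_t spec_size)
{
    char buf[256];
    char *save = NULL, *token;
    size_t len = strlen(arg);
    int layers = 0;

    if (len >= sizeof(buf))
        return -1;
    memcpy(buf, arg, len + 1);      /* strtok_r modifies its input */

    /* The first token holds the type/id/st/stg keys. */
    token = strtok_r(buf, ",", &save);
    if (!token)
        return -1;
    snprintf(group_spec, spec_size, "%s", token);

    /* Every remaining token is one type-specific entry. */
    while ((token = strtok_r(NULL, ",", &save))) {
        if (!strncmp(token, "layer=", 6))
            layers++;
    }
    return layers;
}
```

Called with the made-up argument "type=iamf_audio_element:id=1:st=0,demixing=parameter_id=998,layer=ch_layout=stereo,layer=ch_layout=5.1", the helper leaves "type=iamf_audio_element:id=1:st=0" in group_spec and counts two layer entries. The patch itself goes further and hands each token to av_dict_parse_string() with "=" and ":" as separators.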