From patchwork Tue Nov 21 21:14:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44741 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:8c2a:b0:181:818d:5e7f with SMTP id j42csp848785pzh; Tue, 21 Nov 2023 13:15:12 -0800 (PST) X-Google-Smtp-Source: AGHT+IGhi9DgfJ/Un1jGn8bZtMZrAHBxputKn9NlmE6v3SMnbHzF5O7JCAe1L74mZKmkCLh/3sGG X-Received: by 2002:a05:6402:27cd:b0:544:3944:d7cd with SMTP id c13-20020a05640227cd00b005443944d7cdmr381673ede.2.1700601311731; Tue, 21 Nov 2023 13:15:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1700601311; cv=none; d=google.com; s=arc-20160816; b=0r8qBYYozs5VlLrRDxFzbuVDqLiNml7egr41l/QDyDaUlDeow7xfXHKXtqauCtR79o /HATEZCbQR0JU0HFMORs8S06p3BK3kztfNohQWx5TnWBrGuKn0j8+h75ouMtQKhL/B7h ATilCxCkLqr2K8czE6Non5rrKihD5wgWkBkhTjYVhVNLprcNGXMEaNZDbvmqY2Mg8XUU XGr7tx9+4fabIO87Id8xr3FRbTEik7VFUZQ9sQMPpmPdMqZOciR9TIH3MkXBunMDj8aZ 7lb4KdvgXG2ZYKdyVfaUaJ95iYsJCP+NZGMKLX5bzbFL6uF2a9hTNOmLa+a094aOTRkE Ztlg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=3AibQA9fY131xRPyaVAIG9EC+W2U/HLRL+52x+f2qO0=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=cK+bai3j083bn6stz3o27m0bHuC74bX7d4Og6p0QB0maLrQX9fgZH5EH+H7Zq2g3w3 OorDOfKJPGrtVYgtCuEUepAwP5y/DIusrufjWf0xjvbFrduazet88gNzoNjmem+USCOu DM3HnCBnFiXxj8WQz54ffH9Ky6MBdNzyLWYdBkUtDrY6vTYR1hHow7QlM5nuU/htotpT T8ospyBKGHgtbTe2987MCmYS+h3T/f4c3Y3/Vzb0AZuxeJnQsFu4pmh8snUBQXh77Ee0 jVUcEMJNxYz51HqMIEUt7lMJ71p1c9DQ1EGkmIiwkZ9q7Qq9lDEo0SYKsFp7FbV/hY2Y czxQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="k/y39Us9"; spf=pass (google.com: 
domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id u10-20020a50a40a000000b00548587f45a2si5430611edb.518.2023.11.21.13.15.11; Tue, 21 Nov 2023 13:15:11 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="k/y39Us9"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 40EA368CCB0; Tue, 21 Nov 2023 23:14:56 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pf1-f175.google.com (mail-pf1-f175.google.com [209.85.210.175]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C8FE068CC92 for ; Tue, 21 Nov 2023 23:14:49 +0200 (EET) Received: by mail-pf1-f175.google.com with SMTP id d2e1a72fcca58-6c3363a2b93so5734503b3a.3 for ; Tue, 21 Nov 2023 13:14:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1700601288; x=1701206088; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=ZYZnGqPcy1dZUxXraXWBguwraZUwIe6Ewb3CyKeS6FI=; b=k/y39Us9XqhXv+Wj8i+3B2KIPO2QGpCayukmWPQvxmJJMFri0WIX/sLWMjXYBOOXl/ i1LI+T0awOwWDsIpdRGMEePQUxupb8Q8aBCYTMogAHKor+GtFrxq2VbpJ+Iubs5gL9MM 
X1RJg+fNm7tx3x0Qpfg7g3rNFY3IeF+7lNjSuxZKobwsXMazFxRuSnp/bj8JEXmLRw9Z FREX0mpDfS6SlCMTRACU6W8dIPtMRdwKPTWm59PVwMOqwkTn5TPA7sXFHYjCFhBlK6Bf a9/4SEqb9xNj5z3kP9yxQD3kyv+jREUlhCF+afGoMoOtNxYclGv1LMDvGNCOcK2Hnwcq DdhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700601288; x=1701206088; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZYZnGqPcy1dZUxXraXWBguwraZUwIe6Ewb3CyKeS6FI=; b=EQGBi8L9EzKL0SmK6XW6GkJLR0XmzcTGj+R500zwPly1byaKZJM/hBI8g4xyZsB4k3 8DjAOGxSLOvuqPzUse836YiTeO1Mv0s/1RjlKJL9TMCkBxB94pYzU+dru8sjFlXeAemm 9OWa6gR+7YEpfz6FErQCOdY3o99U2/0qMxl0EJmn2WK7oxSi566BrdobO6PZVwbR2ehX Aedsmtmcw8aRufbePI8HLA/wJSsz/gjBs2723Pq/6Nm0aFKGK5t/yiqki0ssf4khIDxl RcttRMR6mWrEUENvEq9cjGoWOK1Oi0jj091FYC/bDaLtagEifjCjkLgOYxaXhql9PAxa rHSQ== X-Gm-Message-State: AOJu0YxQCbFvBY5qXu/q236QZb5Ei0Gnb0vizRwzoxiUQ0uSRtuy1/NP B1zyZTr7W3J3BMA98EFLpO/kXgFYAYo= X-Received: by 2002:a05:6a20:38a2:b0:187:604a:3add with SMTP id n34-20020a056a2038a200b00187604a3addmr279013pzf.24.1700601287370; Tue, 21 Nov 2023 13:14:47 -0800 (PST) Received: from localhost.localdomain (host197.190-225-105.telecom.net.ar. 
[190.225.105.197]) by smtp.gmail.com with ESMTPSA id bn2-20020a056a00324200b0069ea08a2a99sm8412505pfb.211.2023.11.21.13.14.46 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Nov 2023 13:14:47 -0800 (PST) From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 21 Nov 2023 18:14:38 -0300 Message-ID: <20231121211442.8723-2-jamrial@gmail.com> X-Mailer: git-send-email 2.42.1 In-Reply-To: <20231121211442.8723-1-jamrial@gmail.com> References: <20231121211442.8723-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 1/5] avformat: introduce AVStreamGroup X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: 2yot1SbP17Ne Signed-off-by: James Almer --- libavformat/avformat.c | 178 +++++++++++++++++++++++++++++++++++++++-- libavformat/avformat.h | 163 +++++++++++++++++++++++++++++++++++++ libavformat/dump.c | 33 ++++++-- libavformat/internal.h | 33 ++++++++ libavformat/options.c | 90 +++++++++++++++++++++ 5 files changed, 486 insertions(+), 11 deletions(-) diff --git a/libavformat/avformat.c b/libavformat/avformat.c index 5b8bb7879e..4e31924c71 100644 --- a/libavformat/avformat.c +++ b/libavformat/avformat.c @@ -80,6 +80,25 @@ FF_ENABLE_DEPRECATION_WARNINGS av_freep(pst); } +void ff_free_stream_group(AVStreamGroup **pstg) +{ + AVStreamGroup *stg = *pstg; + + if (!stg) + return; + + av_freep(&stg->streams); + av_dict_free(&stg->metadata); + av_freep(&stg->priv_data); + switch (stg->type) { + // Structs in the union are freed here + default: + break; + } + + av_freep(pstg); +} + void ff_remove_stream(AVFormatContext *s, AVStream *st) { av_assert0(s->nb_streams>0); @@ -88,6 +107,14 @@ void ff_remove_stream(AVFormatContext *s, AVStream *st) 
ff_free_stream(&s->streams[ --s->nb_streams ]); } +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg) +{ + av_assert0(s->nb_stream_groups > 0); + av_assert0(s->stream_groups[ s->nb_stream_groups - 1 ] == stg); + + ff_free_stream_group(&s->stream_groups[ --s->nb_stream_groups ]); +} + /* XXX: suppress the packet queue */ void ff_flush_packet_queue(AVFormatContext *s) { @@ -118,6 +145,9 @@ void avformat_free_context(AVFormatContext *s) for (unsigned i = 0; i < s->nb_streams; i++) ff_free_stream(&s->streams[i]); + for (unsigned i = 0; i < s->nb_stream_groups; i++) + ff_free_stream_group(&s->stream_groups[i]); + s->nb_stream_groups = 0; s->nb_streams = 0; for (unsigned i = 0; i < s->nb_programs; i++) { @@ -138,6 +168,7 @@ void avformat_free_context(AVFormatContext *s) av_dict_free(&si->id3v2_meta); av_packet_free(&si->pkt); av_packet_free(&si->parse_pkt); + av_freep(&s->stream_groups); av_freep(&s->streams); ff_flush_packet_queue(s); av_freep(&s->url); @@ -464,7 +495,7 @@ int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type, */ static int match_stream_specifier(const AVFormatContext *s, const AVStream *st, const char *spec, const char **indexptr, - const AVProgram **p) + const AVStreamGroup **g, const AVProgram **p) { int match = 1; /* Stores if the specifier matches so far. 
*/ while (*spec) { @@ -493,6 +524,46 @@ static int match_stream_specifier(const AVFormatContext *s, const AVStream *st, match = 0; if (nopic && (st->disposition & AV_DISPOSITION_ATTACHED_PIC)) match = 0; + } else if (*spec == 'g' && *(spec + 1) == ':') { + int64_t group_idx = -1, group_id = -1; + int found = 0; + char *endptr; + spec += 2; + if (*spec == '#' || (*spec == 'i' && *(spec + 1) == ':')) { + spec += 1 + (*spec == 'i'); + group_id = strtol(spec, &endptr, 0); + if (spec == endptr || (*endptr && *endptr++ != ':')) + return AVERROR(EINVAL); + spec = endptr; + } else { + group_idx = strtol(spec, &endptr, 0); + /* Disallow empty id and make sure that if we are not at the end, then another specifier must follow. */ + if (spec == endptr || (*endptr && *endptr++ != ':')) + return AVERROR(EINVAL); + spec = endptr; + } + if (match) { + if (group_id > 0) { + for (unsigned i = 0; i < s->nb_stream_groups; i++) { + if (group_id == s->stream_groups[i]->id) { + group_idx = i; + break; + } + } + } + if (group_idx < 0 || group_idx >= s->nb_stream_groups) + return AVERROR(EINVAL); + for (unsigned j = 0; j < s->stream_groups[group_idx]->nb_streams; j++) { + if (st->index == s->stream_groups[group_idx]->streams[j]->index) { + found = 1; + if (g) + *g = s->stream_groups[group_idx]; + break; + } + } + } + if (!found) + match = 0; } else if (*spec == 'p' && *(spec + 1) == ':') { int prog_id; int found = 0; @@ -592,9 +663,10 @@ int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st, char *endptr; const char *indexptr = NULL; const AVProgram *p = NULL; + const AVStreamGroup *g = NULL; int nb_streams; - ret = match_stream_specifier(s, st, spec, &indexptr, &p); + ret = match_stream_specifier(s, st, spec, &indexptr, &g, &p); if (ret < 0) goto error; @@ -612,10 +684,11 @@ int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st, return (index == st->index); /* If we requested a matching stream index, we have to ensure st is that. */ - nb_streams = p ? 
p->nb_stream_indexes : s->nb_streams; + nb_streams = g ? g->nb_streams : (p ? p->nb_stream_indexes : s->nb_streams); for (int i = 0; i < nb_streams && index >= 0; i++) { - const AVStream *candidate = s->streams[p ? p->stream_index[i] : i]; - ret = match_stream_specifier(s, candidate, spec, NULL, NULL); + unsigned idx = g ? g->streams[i]->index : (p ? p->stream_index[i] : i); + const AVStream *candidate = s->streams[idx]; + ret = match_stream_specifier(s, candidate, spec, NULL, NULL, NULL); if (ret < 0) goto error; if (ret > 0 && index-- == 0 && st == candidate) @@ -629,6 +702,101 @@ error: return ret; } + +/** + * Matches a stream group specifier (but ignores requested index). + * + * @param indexptr set to point to the requested group index if there is one + * + * @return <0 on error + * 0 if stg is NOT a matching group + * >0 if stg is a matching group + */ +static int match_stream_group_specifier(const AVFormatContext *s, const AVStreamGroup *stg, + const char *spec, const char **indexptr) +{ + int match = 1; /* Stores if the specifier matches so far. */ + while (*spec) { + if (*spec <= '9' && *spec >= '0') { /* opt:index */ + if (indexptr) + *indexptr = spec; + return match; + } else if (*spec == 't' && *(spec + 1) == ':') { + int64_t group_type = -1; + int found = 0; + char *endptr; + spec += 2; + group_type = strtol(spec, &endptr, 0); + /* Disallow empty type and make sure that if we are not at the end, then another specifier must follow. 
*/ + if (spec == endptr || (*endptr && *endptr++ != ':')) + return AVERROR(EINVAL); + spec = endptr; + if (match && group_type > 0) { + for (unsigned i = 0; i < s->nb_stream_groups; i++) { + if (group_type == s->stream_groups[i]->type) { + found = 1; + break; + } + } + } + if (!found) + match = 0; + } else if (*spec == '#' || + (*spec == 'i' && *(spec + 1) == ':')) { + int group_id; + char *endptr; + spec += 1 + (*spec == 'i'); + group_id = strtol(spec, &endptr, 0); + if (spec == endptr || *endptr) /* Disallow empty id and make sure we are at the end. */ + return AVERROR(EINVAL); + return match && (group_id == stg->id); + } + } + + return match; +} + +int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg, + const char *spec) +{ + int ret, index; + char *endptr; + const char *indexptr = NULL; + + ret = match_stream_group_specifier(s, stg, spec, &indexptr); + if (ret < 0) + goto error; + + if (!indexptr) + return ret; + + index = strtol(indexptr, &endptr, 0); + if (*endptr) { /* We can't have anything after the requested index. */ + ret = AVERROR(EINVAL); + goto error; + } + + /* This is not really needed but saves us a loop for simple group index specifiers. */ + if (spec == indexptr) + return (index == stg->index); + + /* If we requested a matching group index, we have to ensure stg is that. 
*/ + for (int i = 0; i < s->nb_stream_groups && index >= 0; i++) { + const AVStreamGroup *candidate = s->stream_groups[i]; + ret = match_stream_group_specifier(s, candidate, spec, NULL); + if (ret < 0) + goto error; + if (ret > 0 && index-- == 0 && stg == candidate) + return 1; + } + return 0; + +error: + if (ret == AVERROR(EINVAL)) + av_log(s, AV_LOG_ERROR, "Invalid stream group specifier: %s.\n", spec); + return ret; +} + AVRational av_guess_sample_aspect_ratio(AVFormatContext *format, AVStream *stream, AVFrame *frame) { AVRational undef = {0, 1}; diff --git a/libavformat/avformat.h b/libavformat/avformat.h index 9e7eca007e..0b4ae096d5 100644 --- a/libavformat/avformat.h +++ b/libavformat/avformat.h @@ -1018,6 +1018,77 @@ typedef struct AVStream { int pts_wrap_bits; } AVStream; +enum AVStreamGroupParamsType { + AV_STREAM_GROUP_PARAMS_NONE, +}; + +typedef struct AVStreamGroup { + /** + * A class for @ref avoptions. Set by avformat_stream_group_create(). + */ + const AVClass *av_class; + + void *priv_data; + + /** + * Group index in AVFormatContext. + */ + unsigned int index; + + /** + * Group type-specific group ID. + * + * decoding: set by libavformat + * encoding: may be set by the user, replaced by libavformat if left unset + */ + int64_t id; + + /** + * Group type + * + * decoding: set by libavformat on group creation + * encoding: set by avformat_stream_group_create() + */ + enum AVStreamGroupParamsType type; + + /** + * Group type-specific parameters + */ + union { + uintptr_t dummy; // Placeholder + } params; + + /** + * Metadata that applies to the whole group. + * + * - demuxing: set by libavformat on group creation + * - muxing: may be set by the caller before avformat_write_header() + * + * Freed by libavformat in avformat_free_context(). + */ + AVDictionary *metadata; + + /** + * Number of elements in AVStreamGroup.streams. + * + * Set by avformat_stream_group_add_stream(), must not be modified by any other code. 
+ */ + unsigned int nb_streams; + + /** + * A list of streams in the group. New entries are created with + * avformat_stream_group_add_stream(). + * + * - demuxing: entries are created by libavformat on group creation. + * If AVFMTCTX_NOHEADER is set in ctx_flags, then new entries may also + * appear in av_read_frame(). + * - muxing: entries are created by the user before avformat_write_header(). + * + * Freed by libavformat in avformat_free_context(). + */ + AVStream **streams; +} AVStreamGroup; + struct AVCodecParserContext *av_stream_get_parser(const AVStream *s); #if FF_API_GET_END_PTS @@ -1726,6 +1797,26 @@ typedef struct AVFormatContext { * @return 0 on success, a negative AVERROR code on failure */ int (*io_close2)(struct AVFormatContext *s, AVIOContext *pb); + + /** + * Number of elements in AVFormatContext.stream_groups. + * + * Set by avformat_stream_group_create(), must not be modified by any other code. + */ + unsigned int nb_stream_groups; + + /** + * A list of all stream groups in the file. New groups are created with + * avformat_stream_group_create(), and filled with avformat_stream_group_add_stream(). + * + * - demuxing: groups may be created by libavformat in avformat_open_input(). + * If AVFMTCTX_NOHEADER is set in ctx_flags, then new groups may also + * appear in av_read_frame(). + * - muxing: groups may be created by the user before avformat_write_header(). + * + * Freed by libavformat in avformat_free_context(). + */ + AVStreamGroup **stream_groups; } AVFormatContext; /** @@ -1844,6 +1935,37 @@ const AVClass *avformat_get_class(void); */ const AVClass *av_stream_get_class(void); +/** + * Get the AVClass for AVStreamGroup. It can be used in combination with + * AV_OPT_SEARCH_FAKE_OBJ for examining options. + * + * @see av_opt_find(). + */ +const AVClass *av_stream_group_get_class(void); + +/** + * Add a new empty stream group to a media file. + * + * When demuxing, it may be called by the demuxer in read_header(). 
If the + * flag AVFMTCTX_NOHEADER is set in s.ctx_flags, then it may also + * be called in read_packet(). + * + * When muxing, may be called by the user before avformat_write_header(). + * + * User is required to call avformat_free_context() to clean up the allocation + * by avformat_stream_group_create(). + * + * New streams can be added to the group with avformat_stream_group_add_stream(). + * + * @param s media file handle + * + * @return newly created group or NULL on error. + * @see avformat_new_stream, avformat_stream_group_add_stream. + */ +AVStreamGroup *avformat_stream_group_create(AVFormatContext *s, + enum AVStreamGroupParamsType type, + AVDictionary **options); + /** * Add a new stream to a media file. * @@ -1863,6 +1985,31 @@ const AVClass *av_stream_get_class(void); */ AVStream *avformat_new_stream(AVFormatContext *s, const struct AVCodec *c); +/** + * Add an already allocated stream to a stream group. + * + * When demuxing, it may be called by the demuxer in read_header(). If the + * flag AVFMTCTX_NOHEADER is set in s.ctx_flags, then it may also + * be called in read_packet(). + * + * When muxing, may be called by the user before avformat_write_header() after + * having allocated a new group with avformat_stream_group_create() and stream with + * avformat_new_stream(). + * + * User is required to call avformat_free_context() to clean up the allocation + * by avformat_stream_group_add_stream(). + * + * @param stg stream group belonging to a media file. + * @param st stream in the media file to add to the group. + * + * @retval 0 success + * @retval AVERROR(EEXIST) the stream was already in the group + * @retval "another negative error code" legitimate errors + * + * @see avformat_new_stream, avformat_stream_group_create. + */ +int avformat_stream_group_add_stream(AVStreamGroup *stg, AVStream *st); + #if FF_API_AVSTREAM_SIDE_DATA /** * Wrap an existing array as stream side data. 
@@ -2819,6 +2966,22 @@ AVRational av_guess_frame_rate(AVFormatContext *ctx, AVStream *stream, int avformat_match_stream_specifier(AVFormatContext *s, AVStream *st, const char *spec); +/** + * Check if the group stg contained in s is matched by the stream group + * specifier spec. + * + * See the "stream group specifiers" chapter in the documentation for the + * syntax of spec. + * + * @return >0 if stg is matched by spec; + * 0 if stg is not matched by spec; + * AVERROR code if spec is invalid + * + * @note A stream group specifier can match several groups in the format. + */ +int avformat_match_stream_group_specifier(AVFormatContext *s, AVStreamGroup *stg, + const char *spec); + int avformat_queue_attached_pictures(AVFormatContext *s); enum AVTimebaseSource { diff --git a/libavformat/dump.c b/libavformat/dump.c index c0868a1bb3..ededeedaa9 100644 --- a/libavformat/dump.c +++ b/libavformat/dump.c @@ -509,7 +509,7 @@ static void dump_sidedata(void *ctx, const AVStream *st, const char *indent) /* "user interface" functions */ static void dump_stream_format(const AVFormatContext *ic, int i, - int index, int is_output) + int group_index, int index, int is_output) { char buf[256]; int flags = (is_output ? ic->oformat->flags : ic->iformat->flags); @@ -517,6 +517,8 @@ static void dump_stream_format(const AVFormatContext *ic, int i, const FFStream *const sti = cffstream(st); const AVDictionaryEntry *lang = av_dict_get(st->metadata, "language", NULL, 0); const char *separator = ic->dump_separator; + const char *group_indent = group_index >= 0 ? " " : ""; + const char *extra_indent = group_index >= 0 ? 
" " : " "; AVCodecContext *avctx; int ret; @@ -543,7 +545,8 @@ static void dump_stream_format(const AVFormatContext *ic, int i, avcodec_string(buf, sizeof(buf), avctx, is_output); avcodec_free_context(&avctx); - av_log(NULL, AV_LOG_INFO, " Stream #%d:%d", index, i); + av_log(NULL, AV_LOG_INFO, "%s Stream #%d", group_indent, index); + av_log(NULL, AV_LOG_INFO, ":%d", i); /* the pid is an important information, so we display it */ /* XXX: add a generic system */ @@ -621,9 +624,24 @@ static void dump_stream_format(const AVFormatContext *ic, int i, av_log(NULL, AV_LOG_INFO, " (non-diegetic)"); av_log(NULL, AV_LOG_INFO, "\n"); - dump_metadata(NULL, st->metadata, " "); + dump_metadata(NULL, st->metadata, extra_indent); - dump_sidedata(NULL, st, " "); + dump_sidedata(NULL, st, extra_indent); +} + +static void dump_stream_group(const AVFormatContext *ic, uint8_t *printed, + int i, int index, int is_output) +{ + const AVStreamGroup *stg = ic->stream_groups[i]; + char buf[512]; + int ret; + + av_log(NULL, AV_LOG_INFO, " Stream group #%d:%d[0x%"PRIx64"]:", index, i, stg->id); + + switch (stg->type) { + default: + break; + } } void av_dump_format(AVFormatContext *ic, int index, @@ -699,7 +717,7 @@ void av_dump_format(AVFormatContext *ic, int index, dump_metadata(NULL, program->metadata, " "); for (k = 0; k < program->nb_stream_indexes; k++) { dump_stream_format(ic, program->stream_index[k], - index, is_output); + -1, index, is_output); printed[program->stream_index[k]] = 1; } total += program->nb_stream_indexes; @@ -708,9 +726,12 @@ void av_dump_format(AVFormatContext *ic, int index, av_log(NULL, AV_LOG_INFO, " No Program\n"); } + for (i = 0; i < ic->nb_stream_groups; i++) + dump_stream_group(ic, printed, i, index, is_output); + for (i = 0; i < ic->nb_streams; i++) if (!printed[i]) - dump_stream_format(ic, i, index, is_output); + dump_stream_format(ic, i, -1, index, is_output); av_free(printed); } diff --git a/libavformat/internal.h b/libavformat/internal.h index 
7702986c9c..c6181683ef 100644 --- a/libavformat/internal.h +++ b/libavformat/internal.h @@ -202,6 +202,7 @@ typedef struct FFStream { */ AVStream pub; + AVFormatContext *fmtctx; /** * Set to 1 if the codec allows reordering, so pts can be different * from dts. @@ -427,6 +428,26 @@ static av_always_inline const FFStream *cffstream(const AVStream *st) return (const FFStream*)st; } +typedef struct FFStreamGroup { + /** + * The public context. + */ + AVStreamGroup pub; + + AVFormatContext *fmtctx; +} FFStreamGroup; + + +static av_always_inline FFStreamGroup *ffstreamgroup(AVStreamGroup *stg) +{ + return (FFStreamGroup*)stg; +} + +static av_always_inline const FFStreamGroup *cffstreamgroup(const AVStreamGroup *stg) +{ + return (const FFStreamGroup*)stg; +} + #ifdef __GNUC__ #define dynarray_add(tab, nb_ptr, elem)\ do {\ @@ -608,6 +629,18 @@ void ff_free_stream(AVStream **st); */ void ff_remove_stream(AVFormatContext *s, AVStream *st); +/** + * Frees a stream group without modifying the corresponding AVFormatContext. + * Must only be called if the latter doesn't matter or if the stream + * group is not yet attached to an AVFormatContext. + */ +void ff_free_stream_group(AVStreamGroup **pstg); +/** + * Remove a stream group from its AVFormatContext and free it. + * The group must be the last stream group of the AVFormatContext. 
+ */ +void ff_remove_stream_group(AVFormatContext *s, AVStreamGroup *stg); + unsigned int ff_codec_get_tag(const AVCodecTag *tags, enum AVCodecID id); enum AVCodecID ff_codec_get_id(const AVCodecTag *tags, unsigned int tag); diff --git a/libavformat/options.c b/libavformat/options.c index 1d8c52246b..9ddc28842c 100644 --- a/libavformat/options.c +++ b/libavformat/options.c @@ -271,6 +271,7 @@ AVStream *avformat_new_stream(AVFormatContext *s, const AVCodec *c) if (!st->codecpar) goto fail; + sti->fmtctx = s; sti->avctx = avcodec_alloc_context3(NULL); if (!sti->avctx) goto fail; @@ -325,6 +326,95 @@ fail: return NULL; } +static const AVOption stream_group_options[] = { + {"id", "Set group id", offsetof(AVStreamGroup, id), AV_OPT_TYPE_INT64, {.i64 = 0}, 0, INT64_MAX, AV_OPT_FLAG_ENCODING_PARAM }, + { NULL } +}; + +static const AVClass stream_group_class = { + .class_name = "AVStreamGroup", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = stream_group_options, +}; + +const AVClass *av_stream_group_get_class(void) +{ + return &stream_group_class; +} + +AVStreamGroup *avformat_stream_group_create(AVFormatContext *s, + enum AVStreamGroupParamsType type, + AVDictionary **options) +{ + AVStreamGroup **stream_groups; + AVStreamGroup *stg; + FFStreamGroup *stgi; + + stream_groups = av_realloc_array(s->stream_groups, s->nb_stream_groups + 1, + sizeof(*stream_groups)); + if (!stream_groups) + return NULL; + s->stream_groups = stream_groups; + + stgi = av_mallocz(sizeof(*stgi)); + if (!stgi) + return NULL; + stg = &stgi->pub; + + stg->av_class = &stream_group_class; + av_opt_set_defaults(stg); + stg->type = type; + switch (type) { + // Structs in the union are allocated here + default: + goto fail; + } + + if (options) { + if (av_opt_set_dict2(stg, options, AV_OPT_SEARCH_CHILDREN)) + goto fail; + } + + stgi->fmtctx = s; + stg->index = s->nb_stream_groups; + + s->stream_groups[s->nb_stream_groups++] = stg; + + return stg; +fail: + 
ff_free_stream_group(&stg); + return NULL; +} + +static int stream_group_add_stream(AVStreamGroup *stg, AVStream *st) +{ + AVStream **streams = av_realloc_array(stg->streams, stg->nb_streams + 1, + sizeof(*stg->streams)); + if (!streams) + return AVERROR(ENOMEM); + + stg->streams = streams; + stg->streams[stg->nb_streams++] = st; + + return 0; +} + +int avformat_stream_group_add_stream(AVStreamGroup *stg, AVStream *st) +{ + const FFStreamGroup *stgi = cffstreamgroup(stg); + const FFStream *sti = cffstream(st); + + if (stgi->fmtctx != sti->fmtctx) + return AVERROR(EINVAL); + + for (int i = 0; i < stg->nb_streams; i++) + if (stg->streams[i]->index == st->index) + return AVERROR(EEXIST); + + return stream_group_add_stream(stg, st); +} + static int option_is_disposition(const AVOption *opt) { return opt->type == AV_OPT_TYPE_CONST && From patchwork Tue Nov 21 21:14:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44742 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:8c2a:b0:181:818d:5e7f with SMTP id j42csp848891pzh; Tue, 21 Nov 2023 13:15:21 -0800 (PST) X-Google-Smtp-Source: AGHT+IHzAfIESjI3ZXwt0ot1FlJAhgu8EwJyEstBiczgUe6RaaWvN6ExK77/EwvDkbJz/58pqpFK X-Received: by 2002:aa7:c658:0:b0:53f:2671:e0f4 with SMTP id z24-20020aa7c658000000b0053f2671e0f4mr417575edr.38.1700601321030; Tue, 21 Nov 2023 13:15:21 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1700601321; cv=none; d=google.com; s=arc-20160816; b=SOpAGi719NQ+4EUpnEh40jKPvZ0FMXXMbMYUV+v4efBtbplwKymjijQdbkY1M8uEZH 9n0sTgQfaX+06qE63TQl7GjVtiJtSdj+TAXuNvyMXcF6GJMse8wk7YRk1A8X3X8SQEoj saXEv8nSul0pqMH6DhpGaCv8xyChdxWoYoTBwdPHo0N2e/xynG8iH9VVY9D8ZogbK3qn jXd/D0KU1SjgjiBlPpQV06o5dnSaRzaJ51WQSpd814wdWOIcvTn/gudSNjK7jLkVcoam DN66Y2FDX9k6mAfeX79fO933ZkKjBF5k5BdDZDcT17eA5cYssl3+51uFPICP995MzUep psWA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=aN4oYknymk6o9F1EZjxeX2g1SIZRWj70Xdf1ShppncE=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=zmgKDM+jHpKTFwHsl8a9OAwx5tXyz1KwWIBTJZQHgfYnMW+6tUQ1Dp5qEQASX9XT1g eK6a3KprF+Bd2twjEhKeMdmNGWgQ3aQJG03PZEpgvI9cAnbtpOoawmn05CD3H+bJXaCd +vAp5yRCC/zUq8UlRTOidOpgeQSkNQ3NBi33UpJ4OyakAAajcnZfi6l+LC3a+0z5s9KR iyqmqczL7Ya3Za8X7cc3Sk9zEz+lbK+wmbp5zjClcUSdjD+TwdDDWghTC0pg/NNG/qp6 MR35AYiiP96F608dEJyciWpVTI4xTPTO1Yc4jKgX5GdVnrhtPndt/8IC4cn4NllLg1rd fIBQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=J1JQqkDx; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. 
[79.124.17.100]) by mx.google.com with ESMTP id s22-20020a056402521600b00548491d3514si6132580edd.79.2023.11.21.13.15.20; Tue, 21 Nov 2023 13:15:21 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b=J1JQqkDx; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 64EE768CCBA; Tue, 21 Nov 2023 23:14:58 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pf1-f174.google.com (mail-pf1-f174.google.com [209.85.210.174]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id C2C4568CCA4 for ; Tue, 21 Nov 2023 23:14:50 +0200 (EET) Received: by mail-pf1-f174.google.com with SMTP id d2e1a72fcca58-6cbc8199a2aso455934b3a.1 for ; Tue, 21 Nov 2023 13:14:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1700601289; x=1701206089; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=TSNmKZwoCMkjxuj7+W7roudmvrQ+hNf3DTrFKouFeZ4=; b=J1JQqkDxP4go+vyIHNhqP3+Vabk+a6FYSx/B716by54VsDQuBbSk4l75dT7iHiPGrH dorx7Za58JS4NTa18JXWUyBt+sFmQSvI/IzvxQ9Aw4l8iRzTi0wXdONQPXmHhfVyAUMH Ttrw5xud4Nhv+KnamCmCnwipOOufyVr4efIXfbPZSEh+v0bnDG/W13/xXgM1mPNeD/jW CSCgTf1SObRbs/6piynuNl14BAc1WlGj1S2nQYvL/JinH4hECnRtI1NO+qXFU3xunxZo Syvglj8I8vBuEV4i6IZ8tD9GE7ro8ntemx8a8350MZb2m7licXQw8twPG8odlNovFbi2 fnRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700601289; x=1701206089; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TSNmKZwoCMkjxuj7+W7roudmvrQ+hNf3DTrFKouFeZ4=; b=WYXrPu/PGjmc8Qqr2zMSAD3MZX9HqSNfLnSprp+LY4d5+50Qave7oQc+MNNI6Kg2Zb 6qgNM3wpWYAV3Y81rbFXMD/0MyS56iNN5eHlNdPHMCJ9s1YYjlZu11xp1j5J9V33a6xR d31rJ858CZNNvrT7W+XXbth3t1/8tIgVVVqDxRCfNpHAmch8CuffyCQ0wRVJoCt/TKc2 0PiiGsOGouwwMb1MyD0sZnibAdbUzq8RHeHlFUm5V0iDMnvtc1i37AxpxKKdDAlAoT2j 5iDgb6p5UcB07kWusyJMtfC9mdjIS58AsKh3tQU4VxZByQJlqKGNh7sreERGFhH04rWu jBzA== X-Gm-Message-State: AOJu0Yw2KR+TZa4xVvBgn7ZFYf7ViKI2El/kvMfe1LyHUe/JggFH5Kdc WuTZmxJaaY19AVt4CsVxEQyYXf+6lGc= X-Received: by 2002:a05:6a00:1c9c:b0:6cb:a60c:2143 with SMTP id y28-20020a056a001c9c00b006cba60c2143mr458271pfw.9.1700601288655; Tue, 21 Nov 2023 13:14:48 -0800 (PST) Received: from localhost.localdomain (host197.190-225-105.telecom.net.ar. [190.225.105.197]) by smtp.gmail.com with ESMTPSA id bn2-20020a056a00324200b0069ea08a2a99sm8412505pfb.211.2023.11.21.13.14.47 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Nov 2023 13:14:48 -0800 (PST) From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 21 Nov 2023 18:14:39 -0300 Message-ID: <20231121211442.8723-3-jamrial@gmail.com> X-Mailer: git-send-email 2.42.1 In-Reply-To: <20231121211442.8723-1-jamrial@gmail.com> References: <20231121211442.8723-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 2/5] avutil/mem: add av_dynarray2_add_nofree X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: dT0kj5zs5Xl6 Signed-off-by: James Almer --- libavutil/mem.c | 17 +++++++++++++++++ libavutil/mem.h | 32 +++++++++++++++++++++++++++++--- 2 files 
changed, 46 insertions(+), 3 deletions(-) diff --git a/libavutil/mem.c b/libavutil/mem.c index 36b8940a0c..bd37710968 100644 --- a/libavutil/mem.c +++ b/libavutil/mem.c @@ -356,6 +356,23 @@ void *av_dynarray2_add(void **tab_ptr, int *nb_ptr, size_t elem_size, return tab_elem_data; } +void *av_dynarray2_add_nofree(void **tab_ptr, int *nb_ptr, size_t elem_size, + const uint8_t *elem_data) +{ + uint8_t *tab_elem_data = NULL; + + FF_DYNARRAY_ADD(INT_MAX, elem_size, *tab_ptr, *nb_ptr, { + tab_elem_data = (uint8_t *)*tab_ptr + (*nb_ptr) * elem_size; + if (elem_data) + memcpy(tab_elem_data, elem_data, elem_size); + else if (CONFIG_MEMORY_POISONING) + memset(tab_elem_data, FF_MEMORY_POISON, elem_size); + }, { + return NULL; + }); + return tab_elem_data; +} + static void fill16(uint8_t *dst, int len) { uint32_t v = AV_RN16(dst - 2); diff --git a/libavutil/mem.h b/libavutil/mem.h index ab7648ac57..c0161be243 100644 --- a/libavutil/mem.h +++ b/libavutil/mem.h @@ -519,7 +519,7 @@ void av_memcpy_backptr(uint8_t *dst, int back, int cnt); * @param[in,out] tab_ptr Pointer to the array to grow * @param[in,out] nb_ptr Pointer to the number of elements in the array * @param[in] elem Element to add - * @see av_dynarray_add_nofree(), av_dynarray2_add() + * @see av_dynarray_add_nofree(), av_dynarray2_add(), av_dynarray2_add_nofree() */ void av_dynarray_add(void *tab_ptr, int *nb_ptr, void *elem); @@ -531,7 +531,7 @@ void av_dynarray_add(void *tab_ptr, int *nb_ptr, void *elem); * instead and leave current buffer untouched. 
* * @return >=0 on success, negative otherwise - * @see av_dynarray_add(), av_dynarray2_add() + * @see av_dynarray_add(), av_dynarray2_add(), av_dynarray2_add_nofree() */ av_warn_unused_result int av_dynarray_add_nofree(void *tab_ptr, int *nb_ptr, void *elem); @@ -557,11 +557,37 @@ int av_dynarray_add_nofree(void *tab_ptr, int *nb_ptr, void *elem); * * @return Pointer to the data of the element to copy in the newly allocated * space - * @see av_dynarray_add(), av_dynarray_add_nofree() + * @see av_dynarray2_add_nofree(), av_dynarray_add(), av_dynarray_add_nofree() */ void *av_dynarray2_add(void **tab_ptr, int *nb_ptr, size_t elem_size, const uint8_t *elem_data); +/** + * Add an element of size `elem_size` to a dynamic array. + * + * The array is reallocated when its number of elements reaches powers of 2. + * Therefore, the amortized cost of adding an element is constant. + * + * In case of success, the pointer to the array is updated in order to + * point to the new grown array, and the number pointed to by `nb_ptr` + * is incremented. + * In case of failure, the array and `nb_ptr` are left untouched, and NULL + * is returned. + * + * @param[in,out] tab_ptr Pointer to the array to grow + * @param[in,out] nb_ptr Pointer to the number of elements in the array + * @param[in] elem_size Size in bytes of an element in the array + * @param[in] elem_data Pointer to the data of the element to add. If + * `NULL`, the space of the newly added element is + * allocated but left uninitialized. + * + * @return Pointer to the data of the element to copy in the newly allocated + * space on success, NULL otherwise. 
+ * @see av_dynarray2_add(), av_dynarray_add(), av_dynarray_add_nofree() + */ +void *av_dynarray2_add_nofree(void **tab_ptr, int *nb_ptr, size_t elem_size, + const uint8_t *elem_data); + /** * @} */ From patchwork Tue Nov 21 21:14:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44743 Delivered-To: ffmpegpatchwork2@gmail.com Received: by 2002:a05:6a20:8c2a:b0:181:818d:5e7f with SMTP id j42csp849118pzh; Tue, 21 Nov 2023 13:15:42 -0800 (PST) X-Google-Smtp-Source: AGHT+IH21/nZqfO+Pjrr4BYhlzFf+t0R+cktX0J9f2SoAIPwZ+JQvag5aqfDU0V/5PqQvIUA7qMW X-Received: by 2002:a17:907:d046:b0:9be:77cd:4c2c with SMTP id vb6-20020a170907d04600b009be77cd4c2cmr77373ejc.28.1700601342393; Tue, 21 Nov 2023 13:15:42 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1700601342; cv=none; d=google.com; s=arc-20160816; b=b5I4UQS3fDM2kPsVIw0FadwGzr3TxB+yrAE4SMl710UyFpp7SIglHddqEvp2txoBta zO6ZLhtOroGJLtO0cop0hPQCVuJG+bBP7PgBBZCD8z9ifmW0xCzfHIwWdfakC7Q0Wrro HdEQVNthcPHW6J9dC/coKMFlTi5rK2xax7bMxMBtshRj5g3V1XanJrs+UU1RKvTfUyjN IiVmUREpt65AmvUQTCTWa76CqwdvZ30McoLuOPtK+7GTGHhLxAIe5elBfQzlO2wmY2Je 0MzKCAEYcQzoQYcAVPgagoar+tneBvBGMHVYgFid6hntbivReszTkYEA5gGtnXVppaL7 WO5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:content-transfer-encoding:reply-to:list-subscribe :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:subject:mime-version:references:in-reply-to:message-id :date:to:from:dkim-signature:delivered-to; bh=kMUq3/t99q5VpY4FCkXcpGayjHVuyVwzIxdDi1b8K9A=; fh=YOA8vD9MJZuwZ71F/05pj6KdCjf6jQRmzLS+CATXUQk=; b=z/R9ZIZngNm14b3TNqsx04NmOFiBYc1rYmW9Djhkxyi269hVUH5XatLbNH6gTHcAvx Yf+OVi0sH1u/40rG1D/DvRYkYrlpWlNAN0uUFThV6obOVvjdLqpfzXb85f8n4dZSB9Vl enayGwV8XnZkeAF3CsLib3XKHub8kzLYDVuztYyQAQRG/71Bfy/R2BVJYFN6TPF6xuvj fOQwaYH46T2W71MCrZMCLU2zEcllQbSoc/m9opDcKd9nGFKltutvrQKjX93vki3KhCw8 
xMR5GbkDAs4kWZHSzeVydosgeI7NsxKmR0wcV9W3RQz4G8faLx8kuEau2LCZTsBwpI3S 3kug== ARC-Authentication-Results: i=1; mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="nrFb/f8I"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from ffbox0-bg.mplayerhq.hu (ffbox0-bg.ffmpeg.org. [79.124.17.100]) by mx.google.com with ESMTP id s23-20020a170906a19700b009b916eae8basi5741508ejy.882.2023.11.21.13.15.41; Tue, 21 Nov 2023 13:15:42 -0800 (PST) Received-SPF: pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) client-ip=79.124.17.100; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@gmail.com header.s=20230601 header.b="nrFb/f8I"; spf=pass (google.com: domain of ffmpeg-devel-bounces@ffmpeg.org designates 79.124.17.100 as permitted sender) smtp.mailfrom=ffmpeg-devel-bounces@ffmpeg.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from [127.0.1.1] (localhost [127.0.0.1]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTP id 8A64A68CCEB; Tue, 21 Nov 2023 23:15:03 +0200 (EET) X-Original-To: ffmpeg-devel@ffmpeg.org Delivered-To: ffmpeg-devel@ffmpeg.org Received: from mail-pg1-f173.google.com (mail-pg1-f173.google.com [209.85.215.173]) by ffbox0-bg.mplayerhq.hu (Postfix) with ESMTPS id 6792968CAA2 for ; Tue, 21 Nov 2023 23:14:54 +0200 (EET) Received: by mail-pg1-f173.google.com with SMTP id 41be03b00d2f7-5bd5809f63aso3507983a12.3 for ; Tue, 21 Nov 2023 13:14:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1700601291; x=1701206091; darn=ffmpeg.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id 
:reply-to; bh=QQcx+lZ98EebRaAQZ1xbGzSbm7uHEqpzzyVsG4Mg9BA=; b=nrFb/f8IR92w3IwrwgOrPfIGcxBXksRB88w80eEjsv8ZdfDQ7/MDDmP63OaBaYPguB UddHVkgliZmDrTCeghWD+anifooJ8omMAOEbHdRRVaaQH5cpHxnFwNNgG5/MPbzD57j8 LqvUw0HlZvZu4Uh4PDnSlOr1rzVuy9mSgLPR5y0uQU5yzla+uw30WA91lmlqg6LXJ+76 ozAa+98+G9o7Id+S8UG+cvqWK+Ogd3f5ssmZckiERzvq5jGssuygXDA/p/vk5fak6Cm5 p2QPxQluovTLeo/p0TRV28p8A4yrikh59x2mBlTPHL0L03I3y7vBIG+8PoGFVoarAeyd AQJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700601291; x=1701206091; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QQcx+lZ98EebRaAQZ1xbGzSbm7uHEqpzzyVsG4Mg9BA=; b=s4zGVWMUvUy4HUllIUuIcR7tVBM36UrGnPvu/9DoewbzZyYSTAcPgtRPbeAQ9ny0E/ yveMX/MeeUL3n12UdegyVtngKmAKRWaxQrQ/4XgaoVeIEuDJx1mDgfvcUOgjIO8Y686/ p7/coh7UwQjD9gkAssBe3Gq00/TPHzjX+lcGSsA/2eaTTnW3HS63y12Vv3S0b8sU5bbB 50pMaZVo5Zw9D57Rf8E0jYdYaAnF1lUIDR7tHMqlSL88GAJULcNPx8e0my7uosUbz3/E /Fkva/oeLUcEAYmoPt0HTU6JeFf8wOCnQ6AukyEQiuI4Ck4MzqwwU65RBqJ3d5Wb+YZq U0Hg== X-Gm-Message-State: AOJu0Yw+83GuHrUhF6vYRJwDJ/KdefKEYCDTjQm/U31j7BLH47UlRr+l VsWAII1oCMQdHPFjhQpTWfhJ801H74M= X-Received: by 2002:a05:6a20:12d6:b0:16c:b5ce:50f with SMTP id v22-20020a056a2012d600b0016cb5ce050fmr293152pzg.32.1700601290347; Tue, 21 Nov 2023 13:14:50 -0800 (PST) Received: from localhost.localdomain (host197.190-225-105.telecom.net.ar. 
[190.225.105.197]) by smtp.gmail.com with ESMTPSA id bn2-20020a056a00324200b0069ea08a2a99sm8412505pfb.211.2023.11.21.13.14.48 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 21 Nov 2023 13:14:49 -0800 (PST) From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 21 Nov 2023 18:14:40 -0300 Message-ID: <20231121211442.8723-4-jamrial@gmail.com> X-Mailer: git-send-email 2.42.1 In-Reply-To: <20231121211442.8723-1-jamrial@gmail.com> References: <20231121211442.8723-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 3/5] avformat: Immersive Audio Model and Formats demuxer X-BeenThere: ffmpeg-devel@ffmpeg.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: FFmpeg development discussions and patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: FFmpeg development discussions and patches Errors-To: ffmpeg-devel-bounces@ffmpeg.org Sender: "ffmpeg-devel" X-TUID: e8iyhwyos1RV Signed-off-by: James Almer --- libavcodec/avpacket.c | 3 + libavcodec/packet.h | 24 + libavformat/Makefile | 2 + libavformat/allformats.c | 1 + libavformat/avformat.c | 10 +- libavformat/avformat.h | 8 +- libavformat/dump.c | 107 +++- libavformat/iamf.c | 627 ++++++++++++++++++ libavformat/iamf.h | 379 +++++++++++ libavformat/iamf_internal.h | 190 ++++++ libavformat/iamf_parse.c | 1193 +++++++++++++++++++++++++++++++++++ libavformat/iamfdec.c | 533 ++++++++++++++++ libavformat/options.c | 51 +- 13 files changed, 3105 insertions(+), 23 deletions(-) create mode 100644 libavformat/iamf.c create mode 100644 libavformat/iamf.h create mode 100644 libavformat/iamf_internal.h create mode 100644 libavformat/iamf_parse.c create mode 100644 libavformat/iamfdec.c diff --git a/libavcodec/avpacket.c b/libavcodec/avpacket.c index e29725c2d2..0f8c9b77ae 100644 --- a/libavcodec/avpacket.c +++ b/libavcodec/avpacket.c @@ -301,6 +301,9 @@ const char *av_packet_side_data_name(enum AVPacketSideDataType type) case AV_PKT_DATA_DOVI_CONF: return 
"DOVI configuration record"; case AV_PKT_DATA_S12M_TIMECODE: return "SMPTE ST 12-1:2014 timecode"; case AV_PKT_DATA_DYNAMIC_HDR10_PLUS: return "HDR10+ Dynamic Metadata (SMPTE 2094-40)"; + case AV_PKT_DATA_IAMF_MIX_GAIN_PARAM: return "IAMF Mix Gain Parameter Data"; + case AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM: return "IAMF Demixing Info Parameter Data"; + case AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM: return "IAMF Recon Gain Info Parameter Data"; } return NULL; } diff --git a/libavcodec/packet.h b/libavcodec/packet.h index b19409b719..2c57d262c6 100644 --- a/libavcodec/packet.h +++ b/libavcodec/packet.h @@ -299,6 +299,30 @@ enum AVPacketSideDataType { */ AV_PKT_DATA_DYNAMIC_HDR10_PLUS, + /** + * IAMF Mix Gain Parameter Data associated with the audio frame. This metadata + * is in the form of the AVIAMFParamDefinition struct and contains information + * defined in sections 3.6.1 and 3.8.1 of the Immersive Audio Model and + * Formats standard. + */ + AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, + + /** + * IAMF Demixing Info Parameter Data associated with the audio frame. This + * metadata is in the form of the AVIAMFParamDefinition struct and contains + * information defined in sections 3.6.1 and 3.8.2 of the Immersive Audio Model + * and Formats standard. + */ + AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, + + /** + * IAMF Recon Gain Info Parameter Data associated with the audio frame. This + * metadata is in the form of the AVIAMFParamDefinition struct and contains + * information defined in sections 3.6.1 and 3.8.3 of the Immersive Audio Model + * and Formats standard. + */ + AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, + /** * The number of side data types. 
* This is not part of the public API/ABI in the sense that it may diff --git a/libavformat/Makefile b/libavformat/Makefile index 329055ccfd..472bbaf7cf 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -3,6 +3,7 @@ DESC = FFmpeg container format library HEADERS = avformat.h \ avio.h \ + iamf.h \ version.h \ version_major.h \ @@ -258,6 +259,7 @@ OBJS-$(CONFIG_EVC_MUXER) += rawenc.o OBJS-$(CONFIG_HLS_DEMUXER) += hls.o hls_sample_encryption.o OBJS-$(CONFIG_HLS_MUXER) += hlsenc.o hlsplaylist.o avc.o OBJS-$(CONFIG_HNM_DEMUXER) += hnm.o +OBJS-$(CONFIG_IAMF_DEMUXER) += iamfdec.o iamf_parse.o iamf.o OBJS-$(CONFIG_ICO_DEMUXER) += icodec.o OBJS-$(CONFIG_ICO_MUXER) += icoenc.o OBJS-$(CONFIG_IDCIN_DEMUXER) += idcin.o diff --git a/libavformat/allformats.c b/libavformat/allformats.c index d4b505a5a3..63ca44bacd 100644 --- a/libavformat/allformats.c +++ b/libavformat/allformats.c @@ -212,6 +212,7 @@ extern const FFOutputFormat ff_hevc_muxer; extern const AVInputFormat ff_hls_demuxer; extern const FFOutputFormat ff_hls_muxer; extern const AVInputFormat ff_hnm_demuxer; +extern const AVInputFormat ff_iamf_demuxer; extern const AVInputFormat ff_ico_demuxer; extern const FFOutputFormat ff_ico_muxer; extern const AVInputFormat ff_idcin_demuxer; diff --git a/libavformat/avformat.c b/libavformat/avformat.c index 4e31924c71..7084fd3136 100644 --- a/libavformat/avformat.c +++ b/libavformat/avformat.c @@ -37,6 +37,7 @@ #include "avformat.h" #include "avio.h" #include "demux.h" +#include "iamf.h" #include "mux.h" #include "internal.h" @@ -91,7 +92,14 @@ void ff_free_stream_group(AVStreamGroup **pstg) av_dict_free(&stg->metadata); av_freep(&stg->priv_data); switch (stg->type) { - // Structs in the union are freed here + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: { + avformat_iamf_audio_element_free(&stg->params.iamf_audio_element); + break; + } + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: { + avformat_iamf_mix_presentation_free(&stg->params.iamf_mix_presentation); + 
break; + } default: break; } diff --git a/libavformat/avformat.h b/libavformat/avformat.h index 0b4ae096d5..ca3e4a1ce2 100644 --- a/libavformat/avformat.h +++ b/libavformat/avformat.h @@ -1020,8 +1020,13 @@ typedef struct AVStream { enum AVStreamGroupParamsType { AV_STREAM_GROUP_PARAMS_NONE, + AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT, + AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION, }; +struct AVIAMFAudioElement; +struct AVIAMFMixPresentation; + typedef struct AVStreamGroup { /** * A class for @ref avoptions. Set by avformat_stream_group_create(). @@ -1055,7 +1060,8 @@ typedef struct AVStreamGroup { * Group type-specific parameters */ union { - uintptr_t dummy; // Placeholder + struct AVIAMFAudioElement *iamf_audio_element; + struct AVIAMFMixPresentation *iamf_mix_presentation; } params; /** diff --git a/libavformat/dump.c b/libavformat/dump.c index ededeedaa9..c618856429 100644 --- a/libavformat/dump.c +++ b/libavformat/dump.c @@ -38,6 +38,7 @@ #include "libavcodec/avcodec.h" #include "avformat.h" +#include "iamf.h" #include "internal.h" #define HEXDUMP_PRINT(...) 
\ @@ -134,28 +135,36 @@ static void print_fps(double d, const char *postfix) av_log(NULL, AV_LOG_INFO, "%1.0fk %s", d / 1000, postfix); } -static void dump_metadata(void *ctx, const AVDictionary *m, const char *indent) +static void dump_dictionary(void *ctx, const AVDictionary *m, + const char *name, const char *indent) { - if (m && !(av_dict_count(m) == 1 && av_dict_get(m, "language", NULL, 0))) { - const AVDictionaryEntry *tag = NULL; - - av_log(ctx, AV_LOG_INFO, "%sMetadata:\n", indent); - while ((tag = av_dict_iterate(m, tag))) - if (strcmp("language", tag->key)) { - const char *p = tag->value; - av_log(ctx, AV_LOG_INFO, - "%s %-16s: ", indent, tag->key); - while (*p) { - size_t len = strcspn(p, "\x8\xa\xb\xc\xd"); - av_log(ctx, AV_LOG_INFO, "%.*s", (int)(FFMIN(255, len)), p); - p += len; - if (*p == 0xd) av_log(ctx, AV_LOG_INFO, " "); - if (*p == 0xa) av_log(ctx, AV_LOG_INFO, "\n%s %-16s: ", indent, ""); - if (*p) p++; - } - av_log(ctx, AV_LOG_INFO, "\n"); + const AVDictionaryEntry *tag = NULL; + + if (!m) + return; + + av_log(ctx, AV_LOG_INFO, "%s%s:\n", indent, name); + while ((tag = av_dict_iterate(m, tag))) + if (strcmp("language", tag->key)) { + const char *p = tag->value; + av_log(ctx, AV_LOG_INFO, + "%s %-16s: ", indent, tag->key); + while (*p) { + size_t len = strcspn(p, "\x8\xa\xb\xc\xd"); + av_log(ctx, AV_LOG_INFO, "%.*s", (int)(FFMIN(255, len)), p); + p += len; + if (*p == 0xd) av_log(ctx, AV_LOG_INFO, " "); + if (*p == 0xa) av_log(ctx, AV_LOG_INFO, "\n%s %-16s: ", indent, ""); + if (*p) p++; } - } + av_log(ctx, AV_LOG_INFO, "\n"); + } +} + +static void dump_metadata(void *ctx, const AVDictionary *m, const char *indent) +{ + if (m && !(av_dict_count(m) == 1 && av_dict_get(m, "language", NULL, 0))) + dump_dictionary(ctx, m, "Metadata", indent); } /* param change side data*/ @@ -639,6 +648,64 @@ static void dump_stream_group(const AVFormatContext *ic, uint8_t *printed, av_log(NULL, AV_LOG_INFO, " Stream group #%d:%d[0x%"PRIx64"]:", index, i, stg->id); 
switch (stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: { + const AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element; + av_log(NULL, AV_LOG_INFO, " IAMF Audio Element\n"); + dump_metadata(NULL, stg->metadata, " "); + for (int j = 0; j < audio_element->num_layers; j++) { + const AVIAMFLayer *layer = audio_element->layers[j]; + int channel_count = layer->ch_layout.nb_channels; + av_log(NULL, AV_LOG_INFO, " Layer %d:", j); + ret = av_channel_layout_describe(&layer->ch_layout, buf, sizeof(buf)); + if (ret >= 0) + av_log(NULL, AV_LOG_INFO, " %s", buf); + av_log(NULL, AV_LOG_INFO, "\n"); + for (int k = 0; channel_count > 0 && k < stg->nb_streams; k++) { + AVStream *st = stg->streams[k]; + dump_stream_format(ic, st->index, i, index, is_output); + printed[st->index] = 1; + channel_count -= st->codecpar->ch_layout.nb_channels; + } + } + break; + } + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: { + const AVIAMFMixPresentation *mix_presentation = stg->params.iamf_mix_presentation; + av_log(NULL, AV_LOG_INFO, " IAMF Mix Presentation\n"); + dump_metadata(NULL, stg->metadata, " "); + dump_dictionary(NULL, mix_presentation->annotations, "Annotations", " "); + for (int j = 0; j < mix_presentation->num_submixes; j++) { + AVIAMFSubmix *sub_mix = mix_presentation->submixes[j]; + av_log(NULL, AV_LOG_INFO, " Submix %d:\n", j); + for (int k = 0; k < sub_mix->num_elements; k++) { + const AVIAMFSubmixElement *submix_element = sub_mix->elements[k]; + const AVStreamGroup *audio_element = NULL; + for (int l = 0; l < ic->nb_stream_groups; l++) + if (ic->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT && + ic->stream_groups[l]->id == submix_element->audio_element_id) { + audio_element = ic->stream_groups[l]; + break; + } + if (audio_element) { + av_log(NULL, AV_LOG_INFO, " IAMF Audio Element #%d:%d[0x%"PRIx64"]\n", + index, audio_element->index, audio_element->id); + dump_dictionary(NULL, submix_element->annotations, "Annotations", " "); 
+ } + } + for (int k = 0; k < sub_mix->num_layouts; k++) { + const AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[k]; + av_log(NULL, AV_LOG_INFO, " Layout #%d:", k); + if (submix_layout->layout_type == 2) { + ret = av_channel_layout_describe(&submix_layout->sound_system, buf, sizeof(buf)); + if (ret >= 0) + av_log(NULL, AV_LOG_INFO, " %s", buf); + } else if (submix_layout->layout_type == 3) + av_log(NULL, AV_LOG_INFO, " Binaural"); + av_log(NULL, AV_LOG_INFO, "\n"); + } + } + break; + } default: break; } diff --git a/libavformat/iamf.c b/libavformat/iamf.c new file mode 100644 index 0000000000..701f3ced68 --- /dev/null +++ b/libavformat/iamf.c @@ -0,0 +1,627 @@ +/* + * Immersive Audio Model and Formats helper functions and defines + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include +#include + +#include "libavutil/avassert.h" +#include "libavutil/error.h" +#include "libavutil/log.h" +#include "libavutil/mem.h" +#include "libavutil/opt.h" + +#include "iamf.h" +#include "iamf_internal.h" + +const AVChannelLayout ff_iamf_scalable_ch_layouts[10] = { + AV_CHANNEL_LAYOUT_MONO, + AV_CHANNEL_LAYOUT_STEREO, + // "Loudspeaker configuration for Sound System B" + AV_CHANNEL_LAYOUT_5POINT1_BACK, + // "Loudspeaker configuration for Sound System C" + AV_CHANNEL_LAYOUT_5POINT1POINT2_BACK, + // "Loudspeaker configuration for Sound System D" + AV_CHANNEL_LAYOUT_5POINT1POINT4_BACK, + // "Loudspeaker configuration for Sound System I" + AV_CHANNEL_LAYOUT_7POINT1, + // "Loudspeaker configuration for Sound System I" + Ltf + Rtf + AV_CHANNEL_LAYOUT_7POINT1POINT2, + // "Loudspeaker configuration for Sound System J" + AV_CHANNEL_LAYOUT_7POINT1POINT4_BACK, + // Front subset of "Loudspeaker configuration for Sound System J" + AV_CHANNEL_LAYOUT_3POINT1POINT2, + // Binaural + AV_CHANNEL_LAYOUT_STEREO, +}; + +const struct IAMFSoundSystemMap ff_iamf_sound_system_map[13] = { + { SOUND_SYSTEM_A_0_2_0, AV_CHANNEL_LAYOUT_STEREO }, + { SOUND_SYSTEM_B_0_5_0, AV_CHANNEL_LAYOUT_5POINT1_BACK }, + { SOUND_SYSTEM_C_2_5_0, AV_CHANNEL_LAYOUT_5POINT1POINT2_BACK }, + { SOUND_SYSTEM_D_4_5_0, AV_CHANNEL_LAYOUT_5POINT1POINT4_BACK }, + { SOUND_SYSTEM_E_4_5_1, + { + .nb_channels = 11, + .order = AV_CHANNEL_ORDER_NATIVE, + .u.mask = AV_CH_LAYOUT_5POINT1POINT4_BACK | AV_CH_BOTTOM_FRONT_CENTER, + }, + }, + { SOUND_SYSTEM_F_3_7_0, AV_CHANNEL_LAYOUT_7POINT2POINT3 }, + { SOUND_SYSTEM_G_4_9_0, AV_CHANNEL_LAYOUT_9POINT1POINT4_BACK }, + { SOUND_SYSTEM_H_9_10_3, AV_CHANNEL_LAYOUT_22POINT2 }, + { SOUND_SYSTEM_I_0_7_0, AV_CHANNEL_LAYOUT_7POINT1 }, + { 
SOUND_SYSTEM_J_4_7_0, AV_CHANNEL_LAYOUT_7POINT1POINT4_BACK }, + { SOUND_SYSTEM_10_2_7_0, AV_CHANNEL_LAYOUT_7POINT1POINT2 }, + { SOUND_SYSTEM_11_2_3_0, AV_CHANNEL_LAYOUT_3POINT1POINT2 }, + { SOUND_SYSTEM_12_0_1_0, AV_CHANNEL_LAYOUT_MONO }, +}; + +#define IAMF_ADD_FUNC_TEMPLATE(parent_type, parent_name, child_type, child_name, suffix) \ +int avformat_iamf_ ## parent_name ## _add_ ## child_name(parent_type *parent_name, AVDictionary **options) \ +{ \ + child_type **child_name ## suffix, *child_name; \ + \ + if (parent_name->num_## child_name ## suffix == UINT_MAX) \ + return AVERROR(EINVAL); \ + \ + child_name ## suffix = av_realloc_array(parent_name->child_name ## suffix, \ + parent_name->num_## child_name ## suffix + 1, \ + sizeof(*parent_name->child_name ## suffix)); \ + if (!child_name ## suffix) \ + return AVERROR(ENOMEM); \ + \ + parent_name->child_name ## suffix = child_name ## suffix; \ + \ + child_name = parent_name->child_name ## suffix[parent_name->num_## child_name ## suffix] \ + = av_mallocz(sizeof(*child_name)); \ + if (!child_name) \ + return AVERROR(ENOMEM); \ + \ + child_name->av_class = &child_name ## _class; \ + av_opt_set_defaults(child_name); \ + if (options) { \ + int ret = av_opt_set_dict2(child_name, options, AV_OPT_SEARCH_CHILDREN); \ + if (ret < 0) { \ + av_freep(&parent_name->child_name ## suffix[parent_name->num_## child_name ## suffix]); \ + return ret; \ + } \ + } \ + parent_name->num_## child_name ## suffix++; \ + \ + return 0; \ +} + +#define FLAGS AV_OPT_FLAG_ENCODING_PARAM + +// +// Param Definition +// +#define OFFSET(x) offsetof(AVIAMFMixGainParameterData, x) +static const AVOption mix_gain_options[] = { + { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS }, + { "animation_type", "set animation_type", OFFSET(animation_type), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 2, FLAGS }, + { "start_point_value", "set start_point_value", OFFSET(start_point_value), 
AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS }, + { "end_point_value", "set end_point_value", OFFSET(end_point_value), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS }, + { "control_point_value", "set control_point_value", OFFSET(control_point_value), AV_OPT_TYPE_RATIONAL, {.dbl = 0 }, -128.0, 128.0, FLAGS }, + { "control_point_relative_time", "set control_point_relative_time", OFFSET(control_point_relative_time), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, UINT8_MAX, FLAGS }, + { NULL }, +}; + +static const AVClass mix_gain_class = { + .class_name = "AVIAMFMixGainParameterData", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = mix_gain_options, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFDemixingInfoParameterData, x) +static const AVOption demixing_info_options[] = { + { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS }, + { "dmixp_mode", "set dmixp_mode", OFFSET(dmixp_mode), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 6, FLAGS }, + { NULL }, +}; + +static const AVClass demixing_info_class = { + .class_name = "AVIAMFDemixingInfoParameterData", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = demixing_info_options, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFReconGainParameterData, x) +static const AVOption recon_gain_options[] = { + { "subblock_duration", "set subblock_duration", OFFSET(subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 1 }, 1, UINT_MAX, FLAGS }, + { NULL }, +}; + +static const AVClass recon_gain_class = { + .class_name = "AVIAMFReconGainParameterData", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = recon_gain_options, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFParamDefinition, x) +static const AVOption param_definition_options[] = { + { "parameter_id", "set parameter_id", OFFSET(parameter_id), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS }, + { 
"parameter_rate", "set parameter_rate", OFFSET(parameter_rate), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS }, + { "param_definition_mode", "set param_definition_mode", OFFSET(param_definition_mode), AV_OPT_TYPE_INT, {.i64 = 1 }, 0, 1, FLAGS }, + { "duration", "set duration", OFFSET(duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS }, + { "constant_subblock_duration", "set constant_subblock_duration", OFFSET(constant_subblock_duration), AV_OPT_TYPE_INT64, {.i64 = 0 }, 0, UINT_MAX, FLAGS }, + { NULL }, +}; + +static const AVClass *param_definition_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + switch(i) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + ret = &mix_gain_class; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + ret = &demixing_info_class; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + ret = &recon_gain_class; + break; + default: + break; + } + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass param_definition_class = { + .class_name = "AVIAMFParamDefinition", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = param_definition_options, + .child_class_iterate = param_definition_child_iterate, +}; + +const AVClass *avformat_iamf_param_definition_get_class(void) +{ + return &param_definition_class; +} + +AVIAMFParamDefinition *avformat_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType type, AVDictionary **options, + unsigned int num_subblocks, AVDictionary **subblock_options, + size_t *out_size) +{ + + struct MixGainStruct { + AVIAMFParamDefinition p; + AVIAMFMixGainParameterData m; + }; + struct DemixStruct { + AVIAMFParamDefinition p; + AVIAMFDemixingInfoParameterData d; + }; + struct ReconGainStruct { + AVIAMFParamDefinition p; + AVIAMFReconGainParameterData r; + }; + size_t subblocks_offset, subblock_size; + size_t size; + AVIAMFParamDefinition *par; + + switch (type) { + case 
AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + subblocks_offset = offsetof(struct MixGainStruct, m); + subblock_size = sizeof(AVIAMFMixGainParameterData); + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + subblocks_offset = offsetof(struct DemixStruct, d); + subblock_size = sizeof(AVIAMFDemixingInfoParameterData); + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + subblocks_offset = offsetof(struct ReconGainStruct, r); + subblock_size = sizeof(AVIAMFReconGainParameterData); + break; + default: + return NULL; + } + + size = subblocks_offset; + if (num_subblocks > (SIZE_MAX - size) / subblock_size) + return NULL; + size += subblock_size * num_subblocks; + + par = av_mallocz(size); + if (!par) + return NULL; + + par->av_class = &param_definition_class; + av_opt_set_defaults(par); + if (options) { + int ret = av_opt_set_dict(par, options); + if (ret < 0) { + av_free(par); + return NULL; + } + } + par->param_definition_type = type; + par->num_subblocks = num_subblocks; + par->subblock_size = subblock_size; + par->subblocks_offset = subblocks_offset; + + for (int i = 0; i < num_subblocks; i++) { + void *subblock = avformat_iamf_param_definition_get_subblock(par, i); + + switch (type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + ((AVIAMFMixGainParameterData *)subblock)->av_class = &mix_gain_class; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + ((AVIAMFDemixingInfoParameterData *)subblock)->av_class = &demixing_info_class; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + ((AVIAMFReconGainParameterData *)subblock)->av_class = &recon_gain_class; + break; + default: + av_assert0(0); + } + + av_opt_set_defaults(subblock); + if (subblock_options && subblock_options[i]) { + int ret = av_opt_set_dict(subblock, &subblock_options[i]); + if (ret < 0) { + av_free(par); + return NULL; + } + } + } + + if (out_size) + *out_size = size; + + return par; +} + +// +// Audio Element +// +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFLayer, x) +static const 
AVOption layer_options[] = { + { "ch_layout", "set ch_layout", OFFSET(ch_layout), AV_OPT_TYPE_CHLAYOUT, {.str = NULL }, 0, 0, FLAGS }, + { "recon_gain_is_present", "set recon_gain_is_present", OFFSET(recon_gain_is_present), AV_OPT_TYPE_BOOL, {.i64 = 0 }, 0, 1, FLAGS }, + { "output_gain_flags", "set output_gain_flags", OFFSET(output_gain_flags), AV_OPT_TYPE_FLAGS, + {.i64 = 0 }, 0, (1 << 6) - 1, FLAGS, "output_gain_flags" }, + {"FL", "Left channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 5 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"FR", "Right channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 4 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"BL", "Left surround channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 3 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"BR", "Right surround channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 2 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"TFL", "Left top front channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 1 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + {"TFR", "Right top front channel", 0, AV_OPT_TYPE_CONST, + {.i64 = 1 << 0 }, INT_MIN, INT_MAX, FLAGS, "output_gain_flags"}, + { "output_gain", "set output_gain", OFFSET(output_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "ambisonics_mode", "set ambisonics_mode", OFFSET(ambisonics_mode), AV_OPT_TYPE_INT, + { .i64 = AV_IAMF_AMBISONICS_MODE_MONO }, + AV_IAMF_AMBISONICS_MODE_MONO, AV_IAMF_AMBISONICS_MODE_PROJECTION, FLAGS, "ambisonics_mode" }, + { "mono", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_AMBISONICS_MODE_MONO }, .unit = "ambisonics_mode" }, + { "projection", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_AMBISONICS_MODE_PROJECTION }, .unit = "ambisonics_mode" }, + { NULL }, +}; + +static const AVClass layer_class = { + .class_name = "AVIAMFLayer", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = layer_options, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFAudioElement, x) 
+static const AVOption audio_element_options[] = { + { "audio_element_type", "set audio_element_type", OFFSET(audio_element_type), AV_OPT_TYPE_INT, + {.i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL }, + AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, FLAGS, "audio_element_type" }, + { "channel", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL }, .unit = "audio_element_type" }, + { "scene", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE }, .unit = "audio_element_type" }, + { "default_w", "set default_w", OFFSET(default_w), AV_OPT_TYPE_INT, {.i64 = 0 }, 0, 10, FLAGS }, + { NULL }, +}; + +static const AVClass *audio_element_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + if (i) + ret = &layer_class; + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass audio_element_class = { + .class_name = "AVIAMFAudioElement", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = audio_element_options, + .child_class_iterate = audio_element_child_iterate, +}; + +const AVClass *avformat_iamf_audio_element_get_class(void) +{ + return &audio_element_class; +} + +AVIAMFAudioElement *avformat_iamf_audio_element_alloc(void) +{ + AVIAMFAudioElement *audio_element = av_mallocz(sizeof(*audio_element)); + + if (audio_element) { + audio_element->av_class = &audio_element_class; + av_opt_set_defaults(audio_element); + } + + return audio_element; +} + +IAMF_ADD_FUNC_TEMPLATE(AVIAMFAudioElement, audio_element, AVIAMFLayer, layer, s) + +void avformat_iamf_audio_element_free(AVIAMFAudioElement **paudio_element) +{ + AVIAMFAudioElement *audio_element = *paudio_element; + + if (!audio_element) + return; + + for (int i = 0; i < audio_element->num_layers; i++) { + AVIAMFLayer *layer = audio_element->layers[i]; + av_opt_free(layer); + av_free(layer->demixing_matrix); + av_free(layer); + } + av_free(audio_element->layers); + + 
av_free(audio_element->demixing_info); + av_free(audio_element->recon_gain_info); + av_freep(paudio_element); +} + +// +// Mix Presentation +// +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFSubmixElement, x) +static const AVOption submix_element_options[] = { + { "headphones_rendering_mode", "Headphones rendering mode", OFFSET(headphones_rendering_mode), AV_OPT_TYPE_INT, + { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO }, + AV_IAMF_HEADPHONES_MODE_STEREO, AV_IAMF_HEADPHONES_MODE_BINAURAL, FLAGS, "headphones_rendering_mode" }, + { "stereo", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_HEADPHONES_MODE_STEREO }, .unit = "headphones_rendering_mode" }, + { "binaural", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_HEADPHONES_MODE_BINAURAL }, .unit = "headphones_rendering_mode" }, + { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "annotations", "Annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, { .str = NULL }, 0, 0, FLAGS }, + { NULL }, +}; + +static void *submix_element_child_next(void *obj, void *prev) +{ + AVIAMFSubmixElement *submix_element = obj; + if (!prev) + return submix_element->element_mix_config; + + return NULL; +} + +static const AVClass *submix_element_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + if (i) + ret = &param_definition_class; + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass element_class = { + .class_name = "AVIAMFSubmixElement", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = submix_element_options, + .child_next = submix_element_child_next, + .child_class_iterate = submix_element_child_iterate, +}; + +IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixElement, element, s) + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFSubmixLayout, x) +static const AVOption submix_layout_options[] = { + { "layout_type", "Layout type",
OFFSET(layout_type), AV_OPT_TYPE_INT, + { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS }, + AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS, AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL, FLAGS, "layout_type" }, + { "loudspeakers", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS }, .unit = "layout_type" }, + { "binaural", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL }, .unit = "layout_type" }, + { "sound_system", "Sound System", OFFSET(sound_system), AV_OPT_TYPE_CHLAYOUT, { .str = NULL }, 0, 0, FLAGS }, + { "integrated_loudness", "Integrated loudness", OFFSET(integrated_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "digital_peak", "Digital peak", OFFSET(digital_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "true_peak", "True peak", OFFSET(true_peak), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "dialog_anchored_loudness", "Anchored loudness (Dialog)", OFFSET(dialogue_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { "album_anchored_loudness", "Anchored loudness (Album)", OFFSET(album_anchored_loudness), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { NULL }, +}; + +static const AVClass layout_class = { + .class_name = "AVIAMFSubmixLayout", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = submix_layout_options, +}; + +IAMF_ADD_FUNC_TEMPLATE(AVIAMFSubmix, submix, AVIAMFSubmixLayout, layout, s) + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFSubmix, x) +static const AVOption submix_presentation_options[] = { + { "default_mix_gain", "Default mix gain", OFFSET(default_mix_gain), AV_OPT_TYPE_RATIONAL, { .dbl = 0 }, -128.0, 128.0, FLAGS }, + { NULL }, +}; + +static void *submix_presentation_child_next(void *obj, void *prev) +{ + AVIAMFSubmix *sub_mix = obj; + if (!prev) + return sub_mix->output_mix_config; + + return NULL; +} + +static const AVClass 
*submix_presentation_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + switch(i) { + case 0: + ret = &element_class; + break; + case 1: + ret = &layout_class; + break; + case 2: + ret = &param_definition_class; + break; + default: + break; + } + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass submix_class = { + .class_name = "AVIAMFSubmix", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = submix_presentation_options, + .child_next = submix_presentation_child_next, + .child_class_iterate = submix_presentation_child_iterate, +}; + +#undef OFFSET +#define OFFSET(x) offsetof(AVIAMFMixPresentation, x) +static const AVOption mix_presentation_options[] = { + { "annotations", "set annotations", OFFSET(annotations), AV_OPT_TYPE_DICT, {.str = NULL }, 0, 0, FLAGS }, + { NULL }, +}; + +#undef OFFSET +#undef FLAGS + +static const AVClass *mix_presentation_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + if (i) + ret = &submix_class; + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + +static const AVClass mix_presentation_class = { + .class_name = "AVIAMFMixPresentation", + .item_name = av_default_item_name, + .version = LIBAVUTIL_VERSION_INT, + .option = mix_presentation_options, + .child_class_iterate = mix_presentation_child_iterate, +}; + +const AVClass *avformat_iamf_mix_presentation_get_class(void) +{ + return &mix_presentation_class; +} + +AVIAMFMixPresentation *avformat_iamf_mix_presentation_alloc(void) +{ + AVIAMFMixPresentation *mix_presentation = av_mallocz(sizeof(*mix_presentation)); + + if (mix_presentation) { + mix_presentation->av_class = &mix_presentation_class; + av_opt_set_defaults(mix_presentation); + } + + return mix_presentation; +} + +IAMF_ADD_FUNC_TEMPLATE(AVIAMFMixPresentation, mix_presentation, AVIAMFSubmix, submix, es) + +void
avformat_iamf_mix_presentation_free(AVIAMFMixPresentation **pmix_presentation) +{ + AVIAMFMixPresentation *mix_presentation = *pmix_presentation; + + if (!mix_presentation) + return; + + for (int i = 0; i < mix_presentation->num_submixes; i++) { + AVIAMFSubmix *sub_mix = mix_presentation->submixes[i]; + for (int j = 0; j < sub_mix->num_elements; j++) { + AVIAMFSubmixElement *submix_element = sub_mix->elements[j]; + av_opt_free(submix_element); + av_free(submix_element->element_mix_config); + av_free(submix_element); + } + av_free(sub_mix->elements); + for (int j = 0; j < sub_mix->num_layouts; j++) { + AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[j]; + av_opt_free(submix_layout); + av_free(submix_layout); + } + av_free(sub_mix->layouts); + av_free(sub_mix->output_mix_config); + av_free(sub_mix); + } + av_opt_free(mix_presentation); + av_free(mix_presentation->submixes); + + av_freep(pmix_presentation); +} diff --git a/libavformat/iamf.h b/libavformat/iamf.h new file mode 100644 index 0000000000..2479795290 --- /dev/null +++ b/libavformat/iamf.h @@ -0,0 +1,379 @@ +/* + * Immersive Audio Model and Formats helper functions and defines + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFORMAT_IAMF_H +#define AVFORMAT_IAMF_H + +/** + * @file + * Immersive Audio Model and Formats API header + */ + +#include +#include + +#include "libavutil/attributes.h" +#include "libavutil/avassert.h" +#include "libavutil/channel_layout.h" +#include "libavutil/dict.h" +#include "libavutil/rational.h" + +struct AVStreamGroup; + +enum AVIAMFAudioElementType { + AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, + AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, +}; + +/** + * @defgroup lavf_iamf_params Parameter Definition + * @{ + * Parameters as defined in section 3.6.1 and 3.8 + * @} + * @defgroup lavf_iamf_audio Audio Element + * @{ + * Audio Elements as defined in section 3.6 + * @} + * @defgroup lavf_iamf_mix Mix Presentation + * @{ + * Mix Presentations as defined in section 3.7 + * @} + * + * @} + * @addtogroup lavf_iamf_params + * @{ + */ +enum AVIAMFAnimationType { + AV_IAMF_ANIMATION_TYPE_STEP, + AV_IAMF_ANIMATION_TYPE_LINEAR, + AV_IAMF_ANIMATION_TYPE_BEZIER, +}; + +/** + * Mix Gain Parameter Data as defined in section 3.8.1 + * + * Subblocks in AVIAMFParamDefinition use this struct when the value of + * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is + * AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN.
+ */ +typedef struct AVIAMFMixGainParameterData { + const AVClass *av_class; + + // AVOption enabled fields + unsigned int subblock_duration; + enum AVIAMFAnimationType animation_type; + AVRational start_point_value; + AVRational end_point_value; + AVRational control_point_value; + unsigned int control_point_relative_time; +} AVIAMFMixGainParameterData; + +/** + * Demixing Info Parameter Data as defined in section 3.8.2 + * + * Subblocks in AVIAMFParamDefinition use this struct when the value of + * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is + * AV_IAMF_PARAMETER_DEFINITION_DEMIXING. + */ +typedef struct AVIAMFDemixingInfoParameterData { + const AVClass *av_class; + + // AVOption enabled fields + unsigned int subblock_duration; + unsigned int dmixp_mode; +} AVIAMFDemixingInfoParameterData; + +/** + * Recon Gain Info Parameter Data as defined in section 3.8.3 + * + * Subblocks in AVIAMFParamDefinition use this struct when the value of + * @ref AVIAMFParamDefinition.param_definition_type param_definition_type is + * AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN.
+ */ +typedef struct AVIAMFReconGainParameterData { + const AVClass *av_class; + + // AVOption enabled fields + unsigned int subblock_duration; + // End of AVOption enabled fields + uint8_t recon_gain[6][12]; +} AVIAMFReconGainParameterData; + +enum AVIAMFParamDefinitionType { + AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, + AV_IAMF_PARAMETER_DEFINITION_DEMIXING, + AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN, +}; + +/** + * Parameters as defined in section 3.6.1 + */ +typedef struct AVIAMFParamDefinition { + const AVClass *av_class; + + size_t subblocks_offset; + size_t subblock_size; + + enum AVIAMFParamDefinitionType param_definition_type; + unsigned int num_subblocks; + + // AVOption enabled fields + unsigned int parameter_id; + unsigned int parameter_rate; + unsigned int param_definition_mode; + unsigned int duration; + unsigned int constant_subblock_duration; +} AVIAMFParamDefinition; + +const AVClass *avformat_iamf_param_definition_get_class(void); + +AVIAMFParamDefinition *avformat_iamf_param_definition_alloc(enum AVIAMFParamDefinitionType param_definition_type, + AVDictionary **options, + unsigned int num_subblocks, AVDictionary **subblock_options, + size_t *size); + +/** + * Get the subblock at the specified {@code idx}. Must be between 0 and num_subblocks - 1. + * + * The @ref AVIAMFParamDefinition.param_definition_type "param definition type" defines + * the struct type of the returned pointer. + */ +static av_always_inline void* +avformat_iamf_param_definition_get_subblock(AVIAMFParamDefinition *par, unsigned int idx) +{ + av_assert0(idx < par->num_subblocks); + return (void *)((uint8_t *)par + par->subblocks_offset + idx * par->subblock_size); +} + +/** + * @} + * @addtogroup lavf_iamf_audio + * @{ + */ + +enum AVIAMFAmbisonicsMode { + AV_IAMF_AMBISONICS_MODE_MONO, + AV_IAMF_AMBISONICS_MODE_PROJECTION, +}; + +/** + * A layer defining a Channel Layout in the Audio Element. 
+ * + * When audio_element_type is AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, this + * corresponds to a Scalable Channel Layout layer as defined in section 3.6.2. + * For AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, it is an Ambisonics channel + * layout as defined in section 3.6.3. + */ +typedef struct AVIAMFLayer { + const AVClass *av_class; + + // AVOption enabled fields + AVChannelLayout ch_layout; + + unsigned int recon_gain_is_present; + /** + * Output gain flags as defined in section 3.6.2 + * + * This field is defined only if audio_element_type is + * AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL, must be 0 otherwise. + */ + unsigned int output_gain_flags; + /** + * Output gain as defined in section 3.6.2 + * + * Must be 0 if @ref output_gain_flags is 0. + */ + AVRational output_gain; + /** + * Ambisonics mode as defined in section 3.6.3 + * + * This field is defined only if audio_element_type is + * AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, must be 0 otherwise. + * + * If 0, channel_mapping is defined implicitly (Ambisonic Order) + * or explicitly (Custom Order with ambi channels) in @ref ch_layout. + * If 1, @ref demixing_matrix must be set. + */ + enum AVIAMFAmbisonicsMode ambisonics_mode; + + // End of AVOption enabled fields + /** + * Demixing matrix as defined in section 3.6.3 + * + * Set only if @ref ambisonics_mode == 1, must be NULL otherwise. + */ + AVRational *demixing_matrix; +} AVIAMFLayer; + +typedef struct AVIAMFAudioElement { + const AVClass *av_class; + + AVIAMFLayer **layers; + /** + * Number of layers, or channel groups, in the Audio Element. + * For audio_element_type AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE, there + * must be exactly 1. + * + * Set by avformat_iamf_audio_element_add_layer(), must not be + * modified by any other code.
+ */ + unsigned int num_layers; + + unsigned int codec_config_id; + + AVIAMFParamDefinition *demixing_info; + AVIAMFParamDefinition *recon_gain_info; + + // AVOption enabled fields + /** + * Audio element type as defined in section 3.6 + */ + enum AVIAMFAudioElementType audio_element_type; + + /** + * Default weight value as defined in section 3.6 + */ + unsigned int default_w; +} AVIAMFAudioElement; + +const AVClass *avformat_iamf_audio_element_get_class(void); + +AVIAMFAudioElement *avformat_iamf_audio_element_alloc(void); + +int avformat_iamf_audio_element_add_layer(AVIAMFAudioElement *audio_element, AVDictionary **options); + +void avformat_iamf_audio_element_free(AVIAMFAudioElement **audio_element); + +/** + * @} + * @addtogroup lavf_iamf_mix + * @{ + */ + +enum AVIAMFHeadphonesMode { + AV_IAMF_HEADPHONES_MODE_STEREO, + AV_IAMF_HEADPHONES_MODE_BINAURAL, +}; + +typedef struct AVIAMFSubmixElement { + const AVClass *av_class; + + unsigned int audio_element_id; + + AVIAMFParamDefinition *element_mix_config; + + // AVOption enabled fields + enum AVIAMFHeadphonesMode headphones_rendering_mode; + + AVRational default_mix_gain; + + /** + * A dictionary of strings describing the submix. Must have the same + * number of entries as @ref AVIAMFMixPresentation.annotations "the + * mix's annotations".
+ * + * decoding: set by libavformat + * encoding: set by the user + */ + AVDictionary *annotations; +} AVIAMFSubmixElement; + +enum AVIAMFSubmixLayoutType { + AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS = 2, + AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL = 3, +}; + +typedef struct AVIAMFSubmixLayout { + const AVClass *av_class; + + // AVOption enabled fields + enum AVIAMFSubmixLayoutType layout_type; + AVChannelLayout sound_system; + AVRational integrated_loudness; + AVRational digital_peak; + AVRational true_peak; + AVRational dialogue_anchored_loudness; + AVRational album_anchored_loudness; +} AVIAMFSubmixLayout; + +typedef struct AVIAMFSubmix { + const AVClass *av_class; + + AVIAMFSubmixElement **elements; + /** + * Set by avformat_iamf_submix_add_element(), must not be + * modified by any other code. + */ + unsigned int num_elements; + + AVIAMFSubmixLayout **layouts; + /** + * Set by avformat_iamf_submix_add_layout(), must not be + * modified by any other code. + */ + unsigned int num_layouts; + + AVIAMFParamDefinition *output_mix_config; + + // AVOption enabled fields + AVRational default_mix_gain; +} AVIAMFSubmix; + +typedef struct AVIAMFMixPresentation { + const AVClass *av_class; + + AVIAMFSubmix **submixes; + /** + * Number of submixes in the presentation. + * + * Set by avformat_iamf_mix_presentation_add_submix(), must not be + * modified by any other code. + */ + unsigned int num_submixes; + + // AVOption enabled fields + /** + * A dictionary of strings describing the mix. Must have the same + * number of entries as every @ref AVIAMFSubmixElement.annotations + * "Submix element annotations".
+ * + * decoding: set by libavformat + * encoding: set by the user + */ + AVDictionary *annotations; +} AVIAMFMixPresentation; + +const AVClass *avformat_iamf_mix_presentation_get_class(void); + +AVIAMFMixPresentation *avformat_iamf_mix_presentation_alloc(void); + +int avformat_iamf_mix_presentation_add_submix(AVIAMFMixPresentation *mix_presentation, + AVDictionary **options); + +int avformat_iamf_submix_add_element(AVIAMFSubmix *submix, AVDictionary **options); + +int avformat_iamf_submix_add_layout(AVIAMFSubmix *submix, AVDictionary **options); + +void avformat_iamf_mix_presentation_free(AVIAMFMixPresentation **mix_presentation); +/** + * @} + */ + +#endif /* AVFORMAT_IAMF_H */ diff --git a/libavformat/iamf_internal.h b/libavformat/iamf_internal.h new file mode 100644 index 0000000000..6d40e5920a --- /dev/null +++ b/libavformat/iamf_internal.h @@ -0,0 +1,190 @@ +/* + * Immersive Audio Model and Formats helper functions and defines + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVFORMAT_IAMF_INTERNAL_H +#define AVFORMAT_IAMF_INTERNAL_H + +#include + +#include "libavutil/channel_layout.h" +#include "libavcodec/codec_id.h" +#include "avformat.h" +#include "avio.h" +#include "iamf.h" + +#define MAX_IAMF_OBU_HEADER_SIZE (1 + 8 * 3) + +// OBU types (section 3.2). +enum IAMF_OBU_Type { + IAMF_OBU_IA_CODEC_CONFIG = 0, + IAMF_OBU_IA_AUDIO_ELEMENT = 1, + IAMF_OBU_IA_MIX_PRESENTATION = 2, + IAMF_OBU_IA_PARAMETER_BLOCK = 3, + IAMF_OBU_IA_TEMPORAL_DELIMITER = 4, + IAMF_OBU_IA_AUDIO_FRAME = 5, + IAMF_OBU_IA_AUDIO_FRAME_ID0 = 6, + IAMF_OBU_IA_AUDIO_FRAME_ID1 = 7, + IAMF_OBU_IA_AUDIO_FRAME_ID2 = 8, + IAMF_OBU_IA_AUDIO_FRAME_ID3 = 9, + IAMF_OBU_IA_AUDIO_FRAME_ID4 = 10, + IAMF_OBU_IA_AUDIO_FRAME_ID5 = 11, + IAMF_OBU_IA_AUDIO_FRAME_ID6 = 12, + IAMF_OBU_IA_AUDIO_FRAME_ID7 = 13, + IAMF_OBU_IA_AUDIO_FRAME_ID8 = 14, + IAMF_OBU_IA_AUDIO_FRAME_ID9 = 15, + IAMF_OBU_IA_AUDIO_FRAME_ID10 = 16, + IAMF_OBU_IA_AUDIO_FRAME_ID11 = 17, + IAMF_OBU_IA_AUDIO_FRAME_ID12 = 18, + IAMF_OBU_IA_AUDIO_FRAME_ID13 = 19, + IAMF_OBU_IA_AUDIO_FRAME_ID14 = 20, + IAMF_OBU_IA_AUDIO_FRAME_ID15 = 21, + IAMF_OBU_IA_AUDIO_FRAME_ID16 = 22, + IAMF_OBU_IA_AUDIO_FRAME_ID17 = 23, + // 24~30 reserved. 
+ IAMF_OBU_IA_SEQUENCE_HEADER = 31, +}; + +typedef struct IAMFCodecConfig { + unsigned codec_config_id; + enum AVCodecID codec_id; + uint32_t codec_tag; + unsigned nb_samples; + int seek_preroll; + uint8_t *extradata; + int extradata_size; + int sample_rate; +} IAMFCodecConfig; + +typedef struct IAMFLayer { + unsigned int substream_count; + unsigned int coupled_substream_count; +} IAMFLayer; + +typedef struct IAMFSubStream { + unsigned int audio_substream_id; + + // demux + AVCodecParameters *codecpar; +} IAMFSubStream; + +typedef struct IAMFAudioElement { + AVIAMFAudioElement *element; + unsigned int audio_element_id; + + IAMFSubStream *substreams; + unsigned int nb_substreams; + + const IAMFCodecConfig *codec_config; + + // mux + IAMFLayer *layers; + unsigned int nb_layers; +} IAMFAudioElement; + +typedef struct IAMFMixPresentation { + AVIAMFMixPresentation *mix; + unsigned int mix_presentation_id; + + // demux + unsigned int count_label; + char **language_label; +} IAMFMixPresentation; + +typedef struct IAMFParamDefinition { + const AVIAMFAudioElement *audio_element; + AVIAMFParamDefinition *param; + size_t param_size; +} IAMFParamDefinition; + +typedef struct IAMFContext { + IAMFCodecConfig *codec_configs; + int nb_codec_configs; + IAMFAudioElement *audio_elements; + int nb_audio_elements; + IAMFMixPresentation *mix_presentations; + int nb_mix_presentations; + IAMFParamDefinition *param_definitions; + int nb_param_definitions; +} IAMFContext; + +enum IAMF_Anchor_Element { + IAMF_ANCHOR_ELEMENT_UNKNWONW, + IAMF_ANCHOR_ELEMENT_DIALOGUE, + IAMF_ANCHOR_ELEMENT_ALBUM, +}; + +enum IAMF_Sound_System { + SOUND_SYSTEM_A_0_2_0 = 0, // "Loudspeaker configuration for Sound System A" + SOUND_SYSTEM_B_0_5_0 = 1, // "Loudspeaker configuration for Sound System B" + SOUND_SYSTEM_C_2_5_0 = 2, // "Loudspeaker configuration for Sound System C" + SOUND_SYSTEM_D_4_5_0 = 3, // "Loudspeaker configuration for Sound System D" + SOUND_SYSTEM_E_4_5_1 = 4, // "Loudspeaker configuration for 
Sound System E" + SOUND_SYSTEM_F_3_7_0 = 5, // "Loudspeaker configuration for Sound System F" + SOUND_SYSTEM_G_4_9_0 = 6, // "Loudspeaker configuration for Sound System G" + SOUND_SYSTEM_H_9_10_3 = 7, // "Loudspeaker configuration for Sound System H" + SOUND_SYSTEM_I_0_7_0 = 8, // "Loudspeaker configuration for Sound System I" + SOUND_SYSTEM_J_4_7_0 = 9, // "Loudspeaker configuration for Sound System J" + SOUND_SYSTEM_10_2_7_0 = 10, // "Loudspeaker configuration for Sound System I" + Ltf + Rtf + SOUND_SYSTEM_11_2_3_0 = 11, // Front subset of "Loudspeaker configuration for Sound System J" + SOUND_SYSTEM_12_0_1_0 = 12, // Mono +}; + +struct IAMFSoundSystemMap { + enum IAMF_Sound_System id; + AVChannelLayout layout; +}; + +static inline int iamf_leb(AVIOContext *pb, unsigned *len) { + int more, i = 0; + *len = 0; + + do { + unsigned bits; + int byte = avio_r8(pb); + if (pb->error) + return pb->error; + if (pb->eof_reached) + return AVERROR_INVALIDDATA; + more = byte & 0x80; + bits = byte & 0x7f; + if (i <= 3 || (i == 4 && bits < (1 << 4))) + *len |= bits << (i * 7); + else if (bits) + return AVERROR_INVALIDDATA; + if (++i == 8 && more) + return AVERROR_INVALIDDATA; + } while (more); + + return i; +} + +extern const AVChannelLayout ff_iamf_scalable_ch_layouts[10]; +extern const struct IAMFSoundSystemMap ff_iamf_sound_system_map[13]; + +int ff_iamf_parse_obu_header(const uint8_t *buf, int buf_size, + unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type, + unsigned *skip_samples, unsigned *discard_padding); + +int ff_iamfdec_read_descriptors(IAMFContext *c,AVIOContext *pb, + int size, void *log_ctx); + +void ff_iamf_uninit_context(IAMFContext *c); + +#endif /* AVFORMAT_IAMF_INTERNAL_H */ diff --git a/libavformat/iamf_parse.c b/libavformat/iamf_parse.c new file mode 100644 index 0000000000..e8f35dc9c7 --- /dev/null +++ b/libavformat/iamf_parse.c @@ -0,0 +1,1193 @@ +/* + * Immersive Audio Model and Formats parsing + * Copyright (c) 2023 James Almer + * + * This 
file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/avassert.h" +#include "libavutil/common.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/log.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/flac.h" +#include "libavcodec/mpeg4audio.h" +#include "libavcodec/put_bits.h" +#include "avio_internal.h" +#include "iamf.h" +#include "iamf_internal.h" +#include "isom.h" + +static inline unsigned get_leb128(GetBitContext *gb) { + int more, i = 0; + unsigned len = 0; + + do { + unsigned bits; + int byte = get_bits(gb, 8); + more = byte & 0x80; + bits = byte & 0x7f; + if (i <= 3 || (i == 4 && bits < (1 << 4))) + len |= bits << (i * 7); + else if (bits) + return AVERROR_INVALIDDATA; + if (++i == 8 && more) + return AVERROR_INVALIDDATA; + } while (more); + + return len; +} + +static int opus_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + int left = len - avio_tell(pb); + + if (left < 11) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left + 8); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + AV_WB32(codec_config->extradata, MKBETAG('O','p','u','s')); + AV_WB32(codec_config->extradata + 4, MKBETAG('H','e','a','d')); + codec_config->extradata_size = avio_read(pb, 
codec_config->extradata + 8, left); + if (codec_config->extradata_size < left) + return AVERROR_INVALIDDATA; + + codec_config->extradata_size += 8; + codec_config->sample_rate = 48000; + + return 0; +} + +static int aac_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len, void *logctx) +{ + MPEG4AudioConfig cfg = { 0 }; + int object_type_id, codec_id, stream_type; + int ret, tag, left; + + tag = avio_r8(pb); + if (tag != MP4DecConfigDescrTag) + return AVERROR_INVALIDDATA; + + object_type_id = avio_r8(pb); + if (object_type_id != 0x40) + return AVERROR_INVALIDDATA; + + stream_type = avio_r8(pb); + if (((stream_type >> 2) != 5) || ((stream_type >> 1) & 1)) + return AVERROR_INVALIDDATA; + + avio_skip(pb, 3); // buffer size db + avio_skip(pb, 4); // rc_max_rate + avio_skip(pb, 4); // avg bitrate + + codec_id = ff_codec_get_id(ff_mp4_obj_type, object_type_id); + if (codec_id && codec_id != codec_config->codec_id) + return AVERROR_INVALIDDATA; + + tag = avio_r8(pb); + if (tag != MP4DecSpecificDescrTag) + return AVERROR_INVALIDDATA; + + left = len - avio_tell(pb); + if (left <= 0) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + codec_config->extradata_size = avio_read(pb, codec_config->extradata, left); + if (codec_config->extradata_size < left) + return AVERROR_INVALIDDATA; + + ret = avpriv_mpeg4audio_get_config2(&cfg, codec_config->extradata, + codec_config->extradata_size, 1, logctx); + if (ret < 0) + return ret; + + codec_config->sample_rate = cfg.sample_rate; + + return 0; +} + +static int flac_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + int left; + + avio_skip(pb, 4); // METADATA_BLOCK_HEADER + + left = len - avio_tell(pb); + if (left < FLAC_STREAMINFO_SIZE) + return AVERROR_INVALIDDATA; + + codec_config->extradata = av_malloc(left); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + + 
codec_config->extradata_size = avio_read(pb, codec_config->extradata, left); + if (codec_config->extradata_size < left) + return AVERROR_INVALIDDATA; + + codec_config->sample_rate = AV_RB24(codec_config->extradata + 10) >> 4; + + return 0; +} + +static int ipcm_decoder_config(IAMFCodecConfig *codec_config, + AVIOContext *pb, int len) +{ + static const enum AVCodecID sample_fmt[2][3] = { + { AV_CODEC_ID_PCM_S16BE, AV_CODEC_ID_PCM_S24BE, AV_CODEC_ID_PCM_S32BE }, + { AV_CODEC_ID_PCM_S16LE, AV_CODEC_ID_PCM_S24LE, AV_CODEC_ID_PCM_S32LE }, + }; + int sample_format = avio_r8(pb); // 0 = BE, 1 = LE + int sample_size = (avio_r8(pb) / 8 - 2); // 16, 24, 32 + if (sample_format > 1 || sample_size < 0 || sample_size > 2) + return AVERROR_INVALIDDATA; + + codec_config->codec_id = sample_fmt[sample_format][sample_size]; + codec_config->sample_rate = avio_rb32(pb); + + if (len - avio_tell(pb)) + return AVERROR_INVALIDDATA; + + return 0; +} + +static int codec_config_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + IAMFCodecConfig *codec_config = NULL; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + enum AVCodecID avcodec_id; + unsigned codec_config_id, nb_samples, codec_id; + int16_t seek_preroll; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + ret = iamf_leb(pbc, &codec_config_id); + if (ret < 0) + goto fail; + + codec_id = avio_rb32(pbc); + ret = iamf_leb(pbc, &nb_samples); + if (ret < 0) + goto fail; + + seek_preroll = avio_rb16(pbc); + + switch(codec_id) { + case MKBETAG('O','p','u','s'): + avcodec_id = AV_CODEC_ID_OPUS; + break; + case MKBETAG('m','p','4','a'): + avcodec_id = AV_CODEC_ID_AAC; + break; + case MKBETAG('f','L','a','C'): + avcodec_id = AV_CODEC_ID_FLAC; + break; + default: + avcodec_id = AV_CODEC_ID_NONE; + break; + } + + for (int i =
0; i < c->nb_codec_configs; i++) + if (c->codec_configs[i].codec_config_id == codec_config_id) { + ret = AVERROR_INVALIDDATA; + goto fail; + } + + codec_config = av_dynarray2_add_nofree((void **)&c->codec_configs, &c->nb_codec_configs, + sizeof(*c->codec_configs), NULL); + if (!codec_config) { + ret = AVERROR(ENOMEM); + goto fail; + } + + memset(codec_config, 0, sizeof(*codec_config)); + + codec_config->codec_config_id = codec_config_id; + codec_config->codec_id = avcodec_id; + codec_config->nb_samples = nb_samples; + codec_config->seek_preroll = seek_preroll; + + switch(codec_id) { + case MKBETAG('O','p','u','s'): + ret = opus_decoder_config(codec_config, pbc, len); + break; + case MKBETAG('m','p','4','a'): + ret = aac_decoder_config(codec_config, pbc, len, s); + break; + case MKBETAG('f','L','a','C'): + ret = flac_decoder_config(codec_config, pbc, len); + break; + case MKBETAG('i','p','c','m'): + ret = ipcm_decoder_config(codec_config, pbc, len); + break; + default: + break; + } + if (ret < 0) + goto fail; + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in codec_config_obu. 
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + return ret; +} + +static int update_extradata(AVCodecParameters *codecpar) +{ + GetBitContext gb; + PutBitContext pb; + int ret; + + switch(codecpar->codec_id) { + case AV_CODEC_ID_OPUS: + AV_WB8(codecpar->extradata + 9, codecpar->ch_layout.nb_channels); + break; + case AV_CODEC_ID_AAC: { + uint8_t buf[5]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codecpar->extradata, codecpar->extradata_size); + if (ret < 0) + return ret; + + ret = get_bits(&gb, 5); + put_bits(&pb, 5, ret); + if (ret == AOT_ESCAPE) // violates section 3.11.2, but better check for it + put_bits(&pb, 6, get_bits(&gb, 6)); + ret = get_bits(&gb, 4); + put_bits(&pb, 4, ret); + if (ret == 0x0f) + put_bits(&pb, 24, get_bits(&gb, 24)); + + skip_bits(&gb, 4); + put_bits(&pb, 4, codecpar->ch_layout.nb_channels); // set channel config + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codecpar->extradata, buf, sizeof(buf)); + break; + } + case AV_CODEC_ID_FLAC: { + uint8_t buf[13]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codecpar->extradata, codecpar->extradata_size); + if (ret < 0) + return ret; + + put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize + put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize + put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate + skip_bits(&gb, 3); + put_bits(&pb, 3, codecpar->ch_layout.nb_channels - 1); + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codecpar->extradata, buf, sizeof(buf)); + break; + } + } + + return 0; +} + +static int scalable_channel_layout_config(void *s, AVIOContext *pb, + IAMFAudioElement *audio_element, + const IAMFCodecConfig *codec_config) +{ + int num_layers, k = 0; + + num_layers = avio_r8(pb) >> 5; // get_bits(&gb, 3); + // skip_bits(&gb, 5); //reserved + + if (num_layers > 6) + return 
AVERROR_INVALIDDATA; + + for (int i = 0; i < num_layers; i++) { + AVIAMFLayer *layer; + int loudspeaker_layout, output_gain_is_present_flag; + int substream_count, coupled_substream_count; + int ret, byte = avio_r8(pb); + + ret = avformat_iamf_audio_element_add_layer(audio_element->element, NULL); + if (ret < 0) + return ret; + + loudspeaker_layout = byte >> 4; // get_bits(&gb, 4); + output_gain_is_present_flag = (byte >> 3) & 1; //get_bits1(&gb); + layer = audio_element->element->layers[i]; + layer->recon_gain_is_present = (byte >> 2) & 1; + substream_count = avio_r8(pb); + coupled_substream_count = avio_r8(pb); + + if (output_gain_is_present_flag) { + layer->output_gain_flags = avio_r8(pb) >> 2; // get_bits(&gb, 6); + layer->output_gain = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + } + + if (loudspeaker_layout < 10) + av_channel_layout_copy(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[loudspeaker_layout]); + else + layer->ch_layout = (AVChannelLayout){ .order = AV_CHANNEL_ORDER_UNSPEC, + .nb_channels = substream_count + + coupled_substream_count }; + + for (int j = 0; j < substream_count; j++) { + IAMFSubStream *substream = &audio_element->substreams[k++]; + + substream->codecpar->ch_layout = coupled_substream_count-- > 0 ? 
(AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO : + (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + + } + + return 0; +} + +static int ambisonics_config(void *s, AVIOContext *pb, + IAMFAudioElement *audio_element, + const IAMFCodecConfig *codec_config) +{ + AVIAMFLayer *layer; + unsigned ambisonics_mode; + int output_channel_count, substream_count, order; + int ret; + + ret = iamf_leb(pb, &ambisonics_mode); + if (ret < 0) + return ret; + + if (ambisonics_mode > 1) + return 0; + + output_channel_count = avio_r8(pb); // C + substream_count = avio_r8(pb); // N + if (audio_element->nb_substreams != substream_count) + return AVERROR_INVALIDDATA; + + order = floor(sqrt(output_channel_count - 1)); + /* incomplete order - some harmonics are missing */ + if ((order + 1) * (order + 1) != output_channel_count) + return AVERROR_INVALIDDATA; + + ret = avformat_iamf_audio_element_add_layer(audio_element->element, NULL); + if (ret < 0) + return ret; + + layer = audio_element->element->layers[0]; + layer->ambisonics_mode = ambisonics_mode; + if (ambisonics_mode == 0) { + for (int i = 0; i < substream_count; i++) { + IAMFSubStream *substream = &audio_element->substreams[i]; + + substream->codecpar->ch_layout = (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + + layer->ch_layout.order = AV_CHANNEL_ORDER_CUSTOM; + layer->ch_layout.nb_channels = output_channel_count; + layer->ch_layout.u.map = av_calloc(output_channel_count, sizeof(*layer->ch_layout.u.map)); + if (!layer->ch_layout.u.map) + return AVERROR(ENOMEM); + + for (int i = 0; i < output_channel_count; i++) + layer->ch_layout.u.map[i].id = avio_r8(pb) + AV_CHAN_AMBISONIC_BASE; + } else { + int coupled_substream_count = avio_r8(pb); // M + int nb_demixing_matrix = substream_count + coupled_substream_count; + int demixing_matrix_size = nb_demixing_matrix * output_channel_count; + + 
layer->ch_layout = (AVChannelLayout){ .order = AV_CHANNEL_ORDER_AMBISONIC, .nb_channels = output_channel_count }; + layer->demixing_matrix = av_malloc_array(demixing_matrix_size, sizeof(*layer->demixing_matrix)); + if (!layer->demixing_matrix) + return AVERROR(ENOMEM); + + for (int i = 0; i < demixing_matrix_size; i++) + layer->demixing_matrix[i] = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + + for (int i = 0; i < substream_count; i++) { + IAMFSubStream *substream = &audio_element->substreams[i]; + + substream->codecpar->ch_layout = coupled_substream_count-- > 0 ? (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO : + (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO; + + + ret = update_extradata(substream->codecpar); + if (ret < 0) + return ret; + } + } + + return 0; +} + +static int param_parse(void *s, IAMFContext *c, AVIOContext *pb, + unsigned int param_definition_type, + const AVIAMFAudioElement *audio_element, + AVIAMFParamDefinition **out_param_definition) +{ + IAMFParamDefinition *param_definition = NULL; + AVIAMFParamDefinition *param; + unsigned int parameter_id, parameter_rate, param_definition_mode; + unsigned int duration = 0, constant_subblock_duration = 0, num_subblocks = 0; + size_t param_size; + int ret; + + ret = iamf_leb(pb, &parameter_id); + if (ret < 0) + return ret; + + for (int i = 0; i < c->nb_param_definitions; i++) + if (c->param_definitions[i].param->parameter_id == parameter_id) { + param_definition = &c->param_definitions[i]; + break; + } + + ret = iamf_leb(pb, &parameter_rate); + if (ret < 0) + return ret; + + param_definition_mode = avio_r8(pb) >> 7; + + if (param_definition_mode == 0) { + ret = iamf_leb(pb, &duration); + if (ret < 0) + return ret; + + ret = iamf_leb(pb, &constant_subblock_duration); + if (ret < 0) + return ret; + + if (constant_subblock_duration == 0) { + ret = iamf_leb(pb, &num_subblocks); + if (ret < 0) + return ret; + } else + num_subblocks = duration / constant_subblock_duration; + } + + param =
avformat_iamf_param_definition_alloc(param_definition_type, NULL, num_subblocks, NULL, &param_size); + if (!param) + return AVERROR(ENOMEM); + + for (int i = 0; i < num_subblocks; i++) { + void *subblock = avformat_iamf_param_definition_get_subblock(param, i); + unsigned int subblock_duration = constant_subblock_duration; + + if (constant_subblock_duration == 0) { + ret = iamf_leb(pb, &subblock_duration); + if (ret < 0) { + av_free(param); + return ret; + } + } + + switch (param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + AVIAMFMixGainParameterData *mix = subblock; + mix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + AVIAMFDemixingInfoParameterData *demix = subblock; + demix->subblock_duration = subblock_duration; + // DemixingInfoParameterData + demix->dmixp_mode = avio_r8(pb) >> 5; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + AVIAMFReconGainParameterData *recon = subblock; + recon->subblock_duration = subblock_duration; + break; + } + default: + av_free(param); + return AVERROR_INVALIDDATA; + } + } + + param->parameter_id = parameter_id; + param->parameter_rate = parameter_rate; + param->param_definition_mode = param_definition_mode; + param->duration = duration; + param->constant_subblock_duration = constant_subblock_duration; + param->num_subblocks = num_subblocks; + + if (param_definition) { + if (param_definition->param_size != param_size || memcmp(param_definition->param, param, param_size)) { + av_log(s, AV_LOG_ERROR, "Inconsistent parameters for parameter_id %u\n", parameter_id); + av_free(param); + return AVERROR_INVALIDDATA; + } + } else { + param_definition = av_dynarray2_add_nofree((void **)&c->param_definitions, &c->nb_param_definitions, + sizeof(*c->param_definitions), NULL); + if (!param_definition) { + av_free(param); + return AVERROR(ENOMEM); + } + param_definition->param = param; + param_definition->param_size = param_size; +
param_definition->audio_element = audio_element; + } + + av_assert0(out_param_definition); + *out_param_definition = param; + + return 0; +} + +static int audio_element_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + const IAMFCodecConfig *codec_config = NULL; + AVIAMFAudioElement *element; + IAMFAudioElement *audio_element; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + unsigned audio_element_id, codec_config_id, num_substreams, num_parameters; + int audio_element_type, ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + ret = iamf_leb(pbc, &audio_element_id); + if (ret < 0) + goto fail; + + for (int i = 0; i < c->nb_audio_elements; i++) + if (c->audio_elements[i].audio_element_id == audio_element_id) { + av_log(s, AV_LOG_ERROR, "Duplicate audio_element_id %d\n", audio_element_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + audio_element_type = avio_r8(pbc) >> 5; + + ret = iamf_leb(pbc, &codec_config_id); + if (ret < 0) + goto fail; + + for (int i = 0; i < c->nb_codec_configs; i++) { + if (c->codec_configs[i].codec_config_id == codec_config_id) { + codec_config = &c->codec_configs[i]; + break; + } + } + + if (!codec_config) { + av_log(s, AV_LOG_ERROR, "Non existent codec config id %d referenced in an audio element\n", codec_config_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + if (codec_config->codec_id == AV_CODEC_ID_NONE) { + av_log(s, AV_LOG_DEBUG, "Unknown codec id referenced in an audio element.
Ignoring\n"); + ret = 0; + goto fail; + } + + ret = iamf_leb(pbc, &num_substreams); + if (ret < 0) + goto fail; + + audio_element = av_dynarray2_add_nofree((void **)&c->audio_elements, &c->nb_audio_elements, + sizeof(*c->audio_elements), NULL); + if (!audio_element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + memset(audio_element, 0, sizeof(*audio_element)); + + audio_element->codec_config = codec_config; + audio_element->audio_element_id = audio_element_id; + element = audio_element->element = avformat_iamf_audio_element_alloc(); + if (!element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + element->codec_config_id = codec_config_id; + element->audio_element_type = audio_element_type; + + for (int i = 0; i < num_substreams; i++) { + IAMFSubStream *substream; + + substream = av_dynarray2_add_nofree((void **)&audio_element->substreams, &audio_element->nb_substreams, + sizeof(*audio_element->substreams), NULL); + if (!substream) { + ret = AVERROR(ENOMEM); + goto fail; + } + + substream->codecpar = avcodec_parameters_alloc(); + if (!substream->codecpar) { + ret = AVERROR(ENOMEM); + goto fail; + } + + ret = iamf_leb(pbc, &substream->audio_substream_id); + if (ret < 0) + goto fail; + + substream->codecpar->codec_type = AVMEDIA_TYPE_AUDIO; + substream->codecpar->codec_id = codec_config->codec_id; + substream->codecpar->frame_size = codec_config->nb_samples; + substream->codecpar->sample_rate = codec_config->sample_rate; + substream->codecpar->seek_preroll = codec_config->seek_preroll; + + switch(substream->codecpar->codec_id) { + case AV_CODEC_ID_AAC: + case AV_CODEC_ID_FLAC: + case AV_CODEC_ID_OPUS: + substream->codecpar->extradata = av_malloc(codec_config->extradata_size + AV_INPUT_BUFFER_PADDING_SIZE); + if (!substream->codecpar->extradata) { + ret = AVERROR(ENOMEM); + goto fail; + } + memcpy(substream->codecpar->extradata, codec_config->extradata, codec_config->extradata_size); + memset(substream->codecpar->extradata + codec_config->extradata_size, 0, 
AV_INPUT_BUFFER_PADDING_SIZE); + substream->codecpar->extradata_size = codec_config->extradata_size; + break; + } + } + + ret = iamf_leb(pbc, &num_parameters); + if (ret < 0) + goto fail; + + if (num_parameters && audio_element_type != 0) { + av_log(s, AV_LOG_ERROR, "Audio Element parameter count %u is invalid" + " for Scene representations\n", num_parameters); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + for (int i = 0; i < num_parameters; i++) { + unsigned param_definition_type; + + ret = iamf_leb(pbc, &param_definition_type); + if (ret < 0) + goto fail; + + if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN) { + ret = AVERROR_INVALIDDATA; + goto fail; + } else if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_DEMIXING) { + ret = param_parse(s, c, pbc, param_definition_type, element, &element->demixing_info); + if (ret < 0) + goto fail; + + element->default_w = avio_r8(pbc) >> 4; + } else if (param_definition_type == AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { + ret = param_parse(s, c, pbc, param_definition_type, element, &element->recon_gain_info); + if (ret < 0) + goto fail; + } else { + unsigned param_definition_size; + ret = iamf_leb(pbc, &param_definition_size); + if (ret < 0) + goto fail; + + avio_skip(pbc, param_definition_size); + } + } + + if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(s, pbc, audio_element, codec_config); + if (ret < 0) + goto fail; + } else if (audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) { + ret = ambisonics_config(s, pbc, audio_element, codec_config); + if (ret < 0) + goto fail; + } else { + unsigned audio_element_config_size; + ret = iamf_leb(pbc, &audio_element_config_size); + if (ret < 0) + goto fail; + } + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in audio_element_obu.
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + + return ret; +} + +static int label_string(AVIOContext *pb, char **label) +{ + uint8_t buf[128]; + + avio_get_str(pb, sizeof(buf), buf, sizeof(buf)); + + if (pb->error) + return pb->error; + if (pb->eof_reached) + return AVERROR_INVALIDDATA; + *label = av_strdup(buf); + if (!*label) + return AVERROR(ENOMEM); + + return 0; +} + +static int mix_presentation_obu(void *s, IAMFContext *c, AVIOContext *pb, int len) +{ + AVIAMFMixPresentation *mix; + IAMFMixPresentation *mix_presentation; + FFIOContext b; + AVIOContext *pbc; + uint8_t *buf; + unsigned mix_presentation_id; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pbc = &b.pub; + + ret = iamf_leb(pbc, &mix_presentation_id); + if (ret < 0) + goto fail; + + for (int i = 0; i < c->nb_mix_presentations; i++) + if (c->mix_presentations[i].mix_presentation_id == mix_presentation_id) { + av_log(s, AV_LOG_ERROR, "Duplicate mix_presentation_id %d\n", mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + mix_presentation = av_dynarray2_add_nofree((void **)&c->mix_presentations, &c->nb_mix_presentations, + sizeof(*c->mix_presentations), NULL); + if (!mix_presentation) { + ret = AVERROR(ENOMEM); + goto fail; + } + + memset(mix_presentation, 0, sizeof(*mix_presentation)); + + mix_presentation->mix_presentation_id = mix_presentation_id; + mix = mix_presentation->mix = avformat_iamf_mix_presentation_alloc(); + if (!mix) { + ret = AVERROR(ENOMEM); + goto fail; + } + + ret = iamf_leb(pbc, &mix_presentation->count_label); + if (ret < 0) + goto fail; + + mix_presentation->language_label = av_calloc(mix_presentation->count_label, + sizeof(*mix_presentation->language_label)); + if (!mix_presentation->language_label) { + ret = 
AVERROR(ENOMEM); + goto fail; + } + + for (int i = 0; i < mix_presentation->count_label; i++) { + ret = label_string(pbc, &mix_presentation->language_label[i]); + if (ret < 0) + goto fail; + } + + for (int i = 0; i < mix_presentation->count_label; i++) { + char *annotation = NULL; + ret = label_string(pbc, &annotation); + if (ret < 0) + goto fail; + ret = av_dict_set(&mix->annotations, mix_presentation->language_label[i], annotation, + AV_DICT_DONT_STRDUP_VAL | AV_DICT_DONT_OVERWRITE); + if (ret < 0) + goto fail; + } + + ret = iamf_leb(pbc, &mix->num_submixes); + if (ret < 0) + goto fail; + + mix->submixes = av_calloc(mix->num_submixes, sizeof(*mix->submixes)); + if (!mix->submixes) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int i = 0; i < mix->num_submixes; i++) { + AVIAMFSubmix *sub_mix; + + sub_mix = mix->submixes[i] = av_mallocz(sizeof(*sub_mix)); + if (!sub_mix) { + ret = AVERROR(ENOMEM); + goto fail; + } + + ret = iamf_leb(pbc, &sub_mix->num_elements); + if (ret < 0) + goto fail; + + sub_mix->elements = av_calloc(sub_mix->num_elements, sizeof(*sub_mix->elements)); + if (!sub_mix->elements) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int j = 0; j < sub_mix->num_elements; j++) { + AVIAMFSubmixElement *submix_element; + IAMFAudioElement *audio_element = NULL; + unsigned int rendering_config_extension_size; + + submix_element = sub_mix->elements[j] = av_mallocz(sizeof(*submix_element)); + if (!submix_element) { + ret = AVERROR(ENOMEM); + goto fail; + } + + ret = iamf_leb(pbc, &submix_element->audio_element_id); + if (ret < 0) + goto fail; + + for (int k = 0; k < c->nb_audio_elements; k++) + if (c->audio_elements[k].audio_element_id == submix_element->audio_element_id) { + audio_element = &c->audio_elements[k]; + break; + } + + if (!audio_element) { + av_log(s, AV_LOG_ERROR, "Invalid Audio Element with id %u referenced by Mix Parameters %u\n", + submix_element->audio_element_id, mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } 
+ + for (int k = 0; k < mix_presentation->count_label; k++) { + char *annotation = NULL; + ret = label_string(pbc, &annotation); + if (ret < 0) + goto fail; + ret = av_dict_set(&submix_element->annotations, mix_presentation->language_label[k], annotation, + AV_DICT_DONT_STRDUP_VAL | AV_DICT_DONT_OVERWRITE); + if (ret < 0) + goto fail; + } + + submix_element->headphones_rendering_mode = avio_r8(pbc) >> 6; + + ret = iamf_leb(pbc, &rendering_config_extension_size); + if (ret < 0) + goto fail; + avio_skip(pbc, rendering_config_extension_size); + + ret = param_parse(s, c, pbc, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, + audio_element->element, + &submix_element->element_mix_config); + if (ret < 0) + goto fail; + submix_element->default_mix_gain = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + } + ret = param_parse(s, c, pbc, AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL, &sub_mix->output_mix_config); + if (ret < 0) + goto fail; + sub_mix->default_mix_gain = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + + ret = iamf_leb(pbc, &sub_mix->num_layouts); + if (ret < 0) + goto fail; + + sub_mix->layouts = av_calloc(sub_mix->num_layouts, sizeof(*sub_mix->layouts)); + if (!sub_mix->layouts) { + ret = AVERROR(ENOMEM); + goto fail; + } + + for (int j = 0; j < sub_mix->num_layouts; j++) { + AVIAMFSubmixLayout *submix_layout; + int info_type; + int byte = avio_r8(pbc); + + submix_layout = sub_mix->layouts[j] = av_mallocz(sizeof(*submix_layout)); + if (!submix_layout) { + ret = AVERROR(ENOMEM); + goto fail; + } + + submix_layout->layout_type = byte >> 6; + if (submix_layout->layout_type < AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS || + submix_layout->layout_type > AV_IAMF_SUBMIX_LAYOUT_TYPE_BINAURAL) { + av_log(s, AV_LOG_ERROR, "Invalid Layout type %u in a submix from Mix Presentation %u\n", + submix_layout->layout_type, mix_presentation_id); + ret = AVERROR_INVALIDDATA; + goto fail; + } + + if (submix_layout->layout_type == 2) { + int sound_system; + sound_system = (byte >> 2)
& 0xF; + av_channel_layout_copy(&submix_layout->sound_system, &ff_iamf_sound_system_map[sound_system].layout); + } + + info_type = avio_r8(pbc); + submix_layout->integrated_loudness = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + submix_layout->digital_peak = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + + if (info_type & 1) + submix_layout->true_peak = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + if (info_type & 2) { + unsigned int num_anchored_loudness = avio_r8(pbc); + + for (int k = 0; k < num_anchored_loudness; k++) { + unsigned int anchor_element = avio_r8(pbc); + AVRational anchored_loudness = av_make_q(sign_extend(avio_rb16(pbc), 16), 1 << 8); + if (anchor_element == IAMF_ANCHOR_ELEMENT_DIALOGUE) + submix_layout->dialogue_anchored_loudness = anchored_loudness; + else if (anchor_element <= IAMF_ANCHOR_ELEMENT_ALBUM) + submix_layout->album_anchored_loudness = anchored_loudness; + else + av_log(s, AV_LOG_DEBUG, "Unknown anchor_element. Ignoring\n"); + } + } + + if (info_type & 0xFC) { + unsigned int info_type_size; + ret = iamf_leb(pbc, &info_type_size); + if (ret < 0) + goto fail; + + avio_skip(pbc, info_type_size); + } + } + } + + len -= avio_tell(pbc); + if (len) + av_log(s, AV_LOG_WARNING, "Underread in mix_presentation_obu. 
%d bytes left at the end\n", len); + + ret = 0; +fail: + av_free(buf); + + return ret; +} + +int ff_iamf_parse_obu_header(const uint8_t *buf, int buf_size, + unsigned *obu_size, int *start_pos, enum IAMF_OBU_Type *type, + unsigned *skip_samples, unsigned *discard_padding) +{ + GetBitContext gb; + int ret, extension_flag, trimming, start; + unsigned skip = 0, discard = 0; + unsigned size; + + ret = init_get_bits8(&gb, buf, FFMIN(buf_size, MAX_IAMF_OBU_HEADER_SIZE)); + if (ret < 0) + return ret; + + *type = get_bits(&gb, 5); + /*redundant =*/ get_bits1(&gb); + trimming = get_bits1(&gb); + extension_flag = get_bits1(&gb); + + *obu_size = get_leb128(&gb); + if (*obu_size > INT_MAX) + return AVERROR_INVALIDDATA; + + start = get_bits_count(&gb) / 8; + + if (trimming) { + discard = get_leb128(&gb); // num_samples_to_trim_at_end + skip = get_leb128(&gb); // num_samples_to_trim_at_start + } + + if (skip_samples) + *skip_samples = skip; + if (discard_padding) + *discard_padding = discard; + + if (extension_flag) { + unsigned extension_bytes = get_leb128(&gb); + if (extension_bytes > INT_MAX / 8) + return AVERROR_INVALIDDATA; + skip_bits_long(&gb, extension_bytes * 8); + } + + if (get_bits_left(&gb) < 0) + return AVERROR_INVALIDDATA; + + size = *obu_size + start; + if (size > INT_MAX) + return AVERROR_INVALIDDATA; + + *obu_size -= get_bits_count(&gb) / 8 - start; + *start_pos = size - *obu_size; + + return size; +} + +int ff_iamfdec_read_descriptors(IAMFContext *c, AVIOContext *pb, + int max_size, void *log_ctx) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE + AV_INPUT_BUFFER_PADDING_SIZE]; + int ret; + + while (1) { + unsigned obu_size; + enum IAMF_OBU_Type type; + int start_pos, len, size; + + if ((ret = ffio_ensure_seekback(pb, FFMIN(MAX_IAMF_OBU_HEADER_SIZE, max_size))) < 0) + return ret; + size = avio_read(pb, header, FFMIN(MAX_IAMF_OBU_HEADER_SIZE, max_size)); + if (size < 0) + return size; + + len = ff_iamf_parse_obu_header(header, size, &obu_size, &start_pos, &type,
NULL, NULL); + if (len < 0 || obu_size > max_size) { + av_log(log_ctx, AV_LOG_ERROR, "Failed to read obu\n"); + avio_seek(pb, -size, SEEK_CUR); + return len; + } + + if (type >= IAMF_OBU_IA_PARAMETER_BLOCK && type < IAMF_OBU_IA_SEQUENCE_HEADER) { + avio_seek(pb, -size, SEEK_CUR); + break; + } + + avio_seek(pb, -(size - start_pos), SEEK_CUR); + switch (type) { + case IAMF_OBU_IA_CODEC_CONFIG: + ret = codec_config_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_AUDIO_ELEMENT: + ret = audio_element_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_MIX_PRESENTATION: + ret = mix_presentation_obu(log_ctx, c, pb, obu_size); + break; + case IAMF_OBU_IA_TEMPORAL_DELIMITER: + break; + default: { + int64_t offset = avio_skip(pb, obu_size); + if (offset < 0) + ret = offset; + break; + } + } + if (ret < 0) + return ret; + max_size -= obu_size; + if (!max_size) + break; + } + + return 0; +} + +void ff_iamf_uninit_context(IAMFContext *c) +{ + if (!c) + return; + + for (int i = 0; i < c->nb_codec_configs; i++) + av_free(c->codec_configs[i].extradata); + av_freep(&c->codec_configs); + c->nb_codec_configs = 0; + + for (int i = 0; i < c->nb_audio_elements; i++) { + IAMFAudioElement *audio_element = &c->audio_elements[i]; + for (int j = 0; j < audio_element->nb_substreams; j++) + avcodec_parameters_free(&audio_element->substreams[j].codecpar); + av_free(audio_element->substreams); + av_free(audio_element->layers); + avformat_iamf_audio_element_free(&audio_element->element); + } + av_freep(&c->audio_elements); + c->nb_audio_elements = 0; + + for (int i = 0; i < c->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = &c->mix_presentations[i]; + for (int j = 0; j < mix_presentation->count_label; j++) + av_free(mix_presentation->language_label[j]); + av_free(mix_presentation->language_label); + avformat_iamf_mix_presentation_free(&mix_presentation->mix); + } + av_freep(&c->mix_presentations); + c->nb_mix_presentations = 0; + +
av_freep(&c->param_definitions); + c->nb_param_definitions = 0; +} diff --git a/libavformat/iamfdec.c b/libavformat/iamfdec.c new file mode 100644 index 0000000000..fbf36f1e47 --- /dev/null +++ b/libavformat/iamfdec.c @@ -0,0 +1,533 @@ +/* + * Immersive Audio Model and Formats demuxer + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "config_components.h" + +#include "libavutil/avassert.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/log.h" +#include "libavcodec/mathops.h" +#include "avformat.h" +#include "avio_internal.h" +#include "demux.h" +#include "iamf.h" +#include "iamf_internal.h" +#include "internal.h" + +typedef struct IAMFDemuxContext { + IAMFContext iamf; + + // Packet side data + AVIAMFParamDefinition *mix; + size_t mix_size; + AVIAMFParamDefinition *demix; + size_t demix_size; + AVIAMFParamDefinition *recon; + size_t recon_size; +} IAMFDemuxContext; + +static AVStream *find_stream_by_id(AVFormatContext *s, int id) +{ + for (int i = 0; i < s->nb_streams; i++) + if (s->streams[i]->id == id) + return s->streams[i]; + + av_log(s, AV_LOG_ERROR, "Invalid stream id %d\n", id); + return NULL; +} + +static int audio_frame_obu(AVFormatContext *s, AVPacket *pkt, int len, + enum 
IAMF_OBU_Type type, + unsigned skip_samples, unsigned discard_padding, + int id_in_bitstream) +{ + const IAMFDemuxContext *const c = s->priv_data; + AVStream *st; + int ret, audio_substream_id; + + if (id_in_bitstream) { + unsigned explicit_audio_substream_id; + ret = iamf_leb(s->pb, &explicit_audio_substream_id); + if (ret < 0) + return ret; + len -= ret; + audio_substream_id = explicit_audio_substream_id; + } else + audio_substream_id = type - IAMF_OBU_IA_AUDIO_FRAME_ID0; + + st = find_stream_by_id(s, audio_substream_id); + if (!st) + return AVERROR_INVALIDDATA; + + ret = av_get_packet(s->pb, pkt, len); + if (ret < 0) + return ret; + if (ret != len) + return AVERROR_INVALIDDATA; + + if (skip_samples || discard_padding) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_SKIP_SAMPLES, 10); + if (!side_data) + return AVERROR(ENOMEM); + AV_WL32(side_data, skip_samples); + AV_WL32(side_data + 4, discard_padding); + } + if (c->mix) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, c->mix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->mix, c->mix_size); + } + if (c->demix) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, c->demix_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->demix, c->demix_size); + } + if (c->recon) { + uint8_t *side_data = av_packet_new_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, c->recon_size); + if (!side_data) + return AVERROR(ENOMEM); + memcpy(side_data, c->recon, c->recon_size); + } + + pkt->stream_index = st->index; + return 0; +} + +static const IAMFParamDefinition *get_param_definition(AVFormatContext *s, unsigned int parameter_id) +{ + const IAMFDemuxContext *const c = s->priv_data; + const IAMFContext *const iamf = &c->iamf; + const IAMFParamDefinition *param_definition = NULL; + + for (int i = 0; i < iamf->nb_param_definitions; i++) + if 
(iamf->param_definitions[i].param->parameter_id == parameter_id) { + param_definition = &iamf->param_definitions[i]; + break; + } + + return param_definition; +} + +static int parameter_block_obu(AVFormatContext *s, int len) +{ + IAMFDemuxContext *const c = s->priv_data; + const IAMFParamDefinition *param_definition; + const AVIAMFParamDefinition *param; + AVIAMFParamDefinition *out_param = NULL; + FFIOContext b; + AVIOContext *pb; + uint8_t *buf; + unsigned int duration, constant_subblock_duration; + unsigned int num_subblocks; + unsigned int parameter_id; + size_t out_param_size; + int ret; + + buf = av_malloc(len); + if (!buf) + return AVERROR(ENOMEM); + + ret = avio_read(s->pb, buf, len); + if (ret != len) { + if (ret >= 0) + ret = AVERROR_INVALIDDATA; + goto fail; + } + + ffio_init_context(&b, buf, len, 0, NULL, NULL, NULL, NULL); + pb = &b.pub; + + ret = iamf_leb(pb, &parameter_id); + if (ret < 0) + goto fail; + + param_definition = get_param_definition(s, parameter_id); + if (!param_definition) { + av_log(s, AV_LOG_VERBOSE, "Non existent parameter_id %d referenced in a parameter block.
Ignoring\n", + parameter_id); + ret = 0; + goto fail; + } + + param = param_definition->param; + if (param->param_definition_mode) { + ret = iamf_leb(pb, &duration); + if (ret < 0) + goto fail; + + ret = iamf_leb(pb, &constant_subblock_duration); + if (ret < 0) + goto fail; + + if (constant_subblock_duration == 0) { + ret = iamf_leb(pb, &num_subblocks); + if (ret < 0) + goto fail; + } else + num_subblocks = duration / constant_subblock_duration; + } else { + duration = param->duration; + constant_subblock_duration = param->constant_subblock_duration; + num_subblocks = param->num_subblocks; + if (!num_subblocks) + num_subblocks = duration / constant_subblock_duration; + } + + out_param = avformat_iamf_param_definition_alloc(param->param_definition_type, NULL, num_subblocks, + NULL, &out_param_size); + if (!out_param) { + ret = AVERROR(ENOMEM); + goto fail; + } + + out_param->parameter_id = param->parameter_id; + out_param->param_definition_type = param->param_definition_type; + out_param->parameter_rate = param->parameter_rate; + out_param->param_definition_mode = param->param_definition_mode; + out_param->duration = duration; + out_param->constant_subblock_duration = constant_subblock_duration; + out_param->num_subblocks = num_subblocks; + + for (int i = 0; i < num_subblocks; i++) { + void *subblock = avformat_iamf_param_definition_get_subblock(out_param, i); + unsigned int subblock_duration; + + if (param->param_definition_mode && !constant_subblock_duration) { + ret = iamf_leb(pb, &subblock_duration); + if (ret < 0) + goto fail; + } else { + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + subblock_duration = ((AVIAMFMixGainParameterData *)subblock)->subblock_duration; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + subblock_duration = ((AVIAMFDemixingInfoParameterData *)subblock)->subblock_duration; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + subblock_duration = ((AVIAMFReconGainParameterData 
*)subblock)->subblock_duration; + break; + default: + av_assert0(0); + } + } + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + AVIAMFMixGainParameterData *mix = subblock; + + ret = iamf_leb(pb, &mix->animation_type); + if (ret < 0) + goto fail; + + if (mix->animation_type > AV_IAMF_ANIMATION_TYPE_BEZIER) { + ret = 0; + av_free(out_param); + goto fail; + } + + mix->start_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + mix->end_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + if (mix->animation_type == AV_IAMF_ANIMATION_TYPE_BEZIER) { + mix->control_point_value = av_make_q(sign_extend(avio_rb16(pb), 16), 1 << 8); + mix->control_point_relative_time = avio_r8(pb); + } + mix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + AVIAMFDemixingInfoParameterData *demix = subblock; + + demix->dmixp_mode = avio_r8(pb) >> 5; + demix->subblock_duration = subblock_duration; + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + AVIAMFReconGainParameterData *recon = subblock; + const AVIAMFAudioElement *audio_element = param_definition->audio_element; + + av_assert0(audio_element); + for (int i = 0; i < audio_element->num_layers; i++) { + const AVIAMFLayer *layer = audio_element->layers[i]; + if (layer->recon_gain_is_present) { + unsigned int recon_gain_flags, bitcount; + ret = iamf_leb(pb, &recon_gain_flags); + if (ret < 0) + goto fail; + + bitcount = 7 + 5 * !!(recon_gain_flags & 0x80); + recon_gain_flags = (recon_gain_flags & 0x7F) | ((recon_gain_flags & 0xFF00) >> 1); + for (int j = 0; j < bitcount; j++) { + if (recon_gain_flags & (1 << j)) + recon->recon_gain[i][j] = avio_r8(pb); + } + } + } + recon->subblock_duration = subblock_duration; + break; + } + default: + av_assert0(0); + } + } + + len -= avio_tell(pb); + if (len) { + int level = (s->error_recognition & AV_EF_EXPLODE) 
? AV_LOG_ERROR : AV_LOG_WARNING; + av_log(s, level, "Underread in parameter_block_obu. %d bytes left at the end\n", len); + } + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: + av_free(c->mix); + c->mix = out_param; + c->mix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: + av_free(c->demix); + c->demix = out_param; + c->demix_size = out_param_size; + break; + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: + av_free(c->recon); + c->recon = out_param; + c->recon_size = out_param_size; + break; + default: + av_assert0(0); + } + + ret = 0; +fail: + if (ret < 0) + av_free(out_param); + av_free(buf); + + return ret; +} + +static int iamf_read_packet(AVFormatContext *s, AVPacket *pkt) +{ + IAMFDemuxContext *const c = s->priv_data; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE + AV_INPUT_BUFFER_PADDING_SIZE]; + unsigned obu_size; + int ret; + + while (1) { + enum IAMF_OBU_Type type; + unsigned skip_samples, discard_padding; + int len, size, start_pos; + + if ((ret = ffio_ensure_seekback(s->pb, MAX_IAMF_OBU_HEADER_SIZE)) < 0) + return ret; + size = avio_read(s->pb, header, MAX_IAMF_OBU_HEADER_SIZE); + if (size < 0) + return size; + + len = ff_iamf_parse_obu_header(header, size, &obu_size, &start_pos, &type, + &skip_samples, &discard_padding); + if (len < 0) { + av_log(s, AV_LOG_ERROR, "Failed to read obu\n"); + return len; + } + avio_seek(s->pb, -(size - start_pos), SEEK_CUR); + + if (type == IAMF_OBU_IA_AUDIO_FRAME) + return audio_frame_obu(s, pkt, obu_size, type, + skip_samples, discard_padding, 1); + else if (type >= IAMF_OBU_IA_AUDIO_FRAME_ID0 && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17) + return audio_frame_obu(s, pkt, obu_size, type, + skip_samples, discard_padding, 0); + else if (type == IAMF_OBU_IA_PARAMETER_BLOCK) { + ret = parameter_block_obu(s, obu_size); + if (ret < 0) + return ret; + } else if (type == IAMF_OBU_IA_TEMPORAL_DELIMITER) { + av_freep(&c->mix); + c->mix_size = 0; + av_freep(&c->demix); + 
c->demix_size = 0; + av_freep(&c->recon); + c->recon_size = 0; + } else { + int64_t offset = avio_skip(s->pb, obu_size); + if (offset < 0) { + ret = offset; + break; + } + } + } + + return ret; +} + +//return < 0 if we need more data +static int get_score(const uint8_t *buf, int buf_size, enum IAMF_OBU_Type type, int *seq) +{ + if (type == IAMF_OBU_IA_SEQUENCE_HEADER) { + if (buf_size < 4 || AV_RB32(buf) != MKBETAG('i','a','m','f')) + return 0; + *seq = 1; + return -1; + } + if (type >= IAMF_OBU_IA_CODEC_CONFIG && type <= IAMF_OBU_IA_TEMPORAL_DELIMITER) + return *seq ? -1 : 0; + if (type >= IAMF_OBU_IA_AUDIO_FRAME && type <= IAMF_OBU_IA_AUDIO_FRAME_ID17) + return *seq ? AVPROBE_SCORE_EXTENSION + 1 : 0; + return 0; +} + +static int iamf_probe(const AVProbeData *p) +{ + unsigned obu_size; + enum IAMF_OBU_Type type; + int seq = 0, cnt = 0, start_pos; + int ret; + + while (1) { + int size = ff_iamf_parse_obu_header(p->buf + cnt, p->buf_size - cnt, + &obu_size, &start_pos, &type, + NULL, NULL); + if (size < 0) + return 0; + + ret = get_score(p->buf + cnt + start_pos, + p->buf_size - cnt - start_pos, + type, &seq); + if (ret >= 0) + return ret; + + cnt += FFMIN(size, p->buf_size - cnt); + } + return 0; +} + +static int iamf_read_header(AVFormatContext *s) +{ + IAMFDemuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + int ret; + + ret = ff_iamfdec_read_descriptors(iamf, s->pb, INT_MAX, s); + if (ret < 0) + return ret; + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + IAMFAudioElement *audio_element = &iamf->audio_elements[i]; + AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT, NULL); + + if (!stg) + return AVERROR(ENOMEM); + + stg->id = audio_element->audio_element_id; + stg->params.iamf_audio_element = audio_element->element; + audio_element->element = NULL; + + for (int j = 0; j < audio_element->nb_substreams; j++) { + IAMFSubStream *substream = &audio_element->substreams[j]; + AVStream *st = 
avformat_new_stream(s, NULL); + + if (!st) + return AVERROR(ENOMEM); + + ret = avformat_stream_group_add_stream(stg, st); + if (ret < 0) + return ret; + + ret = avcodec_parameters_copy(st->codecpar, substream->codecpar); + if (ret < 0) + return ret; + + st->id = substream->audio_substream_id; + avpriv_set_pts_info(st, 64, 1, st->codecpar->sample_rate); + } + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = &iamf->mix_presentations[i]; + AVStreamGroup *stg = avformat_stream_group_create(s, AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION, NULL); + const AVIAMFMixPresentation *mix = mix_presentation->mix; + + if (!stg) + return AVERROR(ENOMEM); + + stg->id = mix_presentation->mix_presentation_id; + stg->params.iamf_mix_presentation = mix_presentation->mix; + mix_presentation->mix = NULL; + + for (int j = 0; j < mix->num_submixes; j++) { + AVIAMFSubmix *sub_mix = mix->submixes[j]; + + for (int k = 0; k < sub_mix->num_elements; k++) { + AVIAMFSubmixElement *submix_element = sub_mix->elements[k]; + AVStreamGroup *audio_element = NULL; + + for (int l = 0; l < s->nb_stream_groups; l++) + if (s->stream_groups[l]->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT && + s->stream_groups[l]->id == submix_element->audio_element_id) { + audio_element = s->stream_groups[l]; + break; + } + av_assert0(audio_element); + + for (int l = 0; l < audio_element->nb_streams; l++) { + ret = avformat_stream_group_add_stream(stg, audio_element->streams[l]); + if (ret < 0 && ret != AVERROR(EEXIST)) + return ret; + } + } + } + } + + return 0; +} + +static int iamf_read_close(AVFormatContext *s) +{ + IAMFDemuxContext *const c = s->priv_data; + + ff_iamf_uninit_context(&c->iamf); + + av_freep(&c->mix); + c->mix_size = 0; + av_freep(&c->demix); + c->demix_size = 0; + av_freep(&c->recon); + c->recon_size = 0; + + return 0; +} + +const AVInputFormat ff_iamf_demuxer = { + .name = "iamf", + .long_name = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model 
and Formats"), + .priv_data_size = sizeof(IAMFDemuxContext), + .flags_internal = FF_FMT_INIT_CLEANUP, + .read_probe = iamf_probe, + .read_header = iamf_read_header, + .read_packet = iamf_read_packet, + .read_close = iamf_read_close, + .extensions = "iamf", + .flags = AVFMT_GENERIC_INDEX | AVFMT_NO_BYTE_SEEK | AVFMT_NOTIMESTAMPS | AVFMT_SHOW_IDS, +}; diff --git a/libavformat/options.c b/libavformat/options.c index 9ddc28842c..6227dc3799 100644 --- a/libavformat/options.c +++ b/libavformat/options.c @@ -20,6 +20,7 @@ #include "avformat.h" #include "avio_internal.h" #include "demux.h" +#include "iamf.h" #include "internal.h" #include "libavcodec/avcodec.h" @@ -326,6 +327,43 @@ fail: return NULL; } +static void *stream_group_child_next(void *obj, void *prev) +{ + AVStreamGroup *stg = obj; + if (!prev) { + switch(stg->type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + return stg->params.iamf_audio_element; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + return stg->params.iamf_mix_presentation; + default: + break; + } + } + return NULL; +} + +static const AVClass *stream_group_child_iterate(void **opaque) +{ + uintptr_t i = (uintptr_t)*opaque; + const AVClass *ret = NULL; + + switch(i) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + ret = avformat_iamf_audio_element_get_class(); + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + ret = avformat_iamf_mix_presentation_get_class(); + break; + default: + break; + } + + if (ret) + *opaque = (void*)(i + 1); + return ret; +} + static const AVOption stream_group_options[] = { {"id", "Set group id", offsetof(AVStreamGroup, id), AV_OPT_TYPE_INT64, {.i64 = 0}, 0, INT64_MAX, AV_OPT_FLAG_ENCODING_PARAM }, { NULL } @@ -336,6 +374,8 @@ static const AVClass stream_group_class = { .item_name = av_default_item_name, .version = LIBAVUTIL_VERSION_INT, .option = stream_group_options, + .child_next = stream_group_child_next, + .child_class_iterate = stream_group_child_iterate, }; const AVClass 
*av_stream_group_get_class(void) @@ -366,7 +406,16 @@ AVStreamGroup *avformat_stream_group_create(AVFormatContext *s, av_opt_set_defaults(stg); stg->type = type; switch (type) { - // Structs in the union are allocated here + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + stg->params.iamf_audio_element = avformat_iamf_audio_element_alloc(); + if (!stg->params.iamf_audio_element) + goto fail; + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + stg->params.iamf_mix_presentation = avformat_iamf_mix_presentation_alloc(); + if (!stg->params.iamf_mix_presentation) + goto fail; + break; default: goto fail; }
From patchwork Tue Nov 21 21:14:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Almer X-Patchwork-Id: 44740 From: James Almer To: ffmpeg-devel@ffmpeg.org Date: Tue, 21 Nov 2023 18:14:41 -0300 Message-ID: <20231121211442.8723-5-jamrial@gmail.com> X-Mailer: git-send-email 2.42.1 In-Reply-To: <20231121211442.8723-1-jamrial@gmail.com> References: <20231121211442.8723-1-jamrial@gmail.com> MIME-Version: 1.0 Subject: [FFmpeg-devel] [PATCH 4/5] avformat: Immersive Audio Model and Formats muxer
Signed-off-by: James Almer --- libavformat/Makefile | 1 + libavformat/allformats.c | 1 + libavformat/iamfenc.c | 1085 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 1087 insertions(+) create mode 100644 libavformat/iamfenc.c diff --git a/libavformat/Makefile b/libavformat/Makefile index 472bbaf7cf..70c43ddde3 100644 --- a/libavformat/Makefile +++ b/libavformat/Makefile @@ -260,6 +260,7 @@ OBJS-$(CONFIG_HLS_DEMUXER) += hls.o hls_sample_encryption.o OBJS-$(CONFIG_HLS_MUXER) += hlsenc.o hlsplaylist.o avc.o OBJS-$(CONFIG_HNM_DEMUXER) += hnm.o OBJS-$(CONFIG_IAMF_DEMUXER) += iamfdec.o iamf_parse.o iamf.o +OBJS-$(CONFIG_IAMF_MUXER) += iamfenc.o iamf.o OBJS-$(CONFIG_ICO_DEMUXER) += icodec.o OBJS-$(CONFIG_ICO_MUXER) += icoenc.o OBJS-$(CONFIG_IDCIN_DEMUXER) += idcin.o diff --git a/libavformat/allformats.c b/libavformat/allformats.c index 63ca44bacd..7529aed4a4 100644 --- a/libavformat/allformats.c +++ b/libavformat/allformats.c @@ -213,6 +213,7 @@ extern const AVInputFormat ff_hls_demuxer; extern const FFOutputFormat 
ff_hls_muxer; extern const AVInputFormat ff_hnm_demuxer; extern const AVInputFormat ff_iamf_demuxer; +extern const FFOutputFormat ff_iamf_muxer; extern const AVInputFormat ff_ico_demuxer; extern const FFOutputFormat ff_ico_muxer; extern const AVInputFormat ff_idcin_demuxer; diff --git a/libavformat/iamfenc.c b/libavformat/iamfenc.c new file mode 100644 index 0000000000..29933d56eb --- /dev/null +++ b/libavformat/iamfenc.c @@ -0,0 +1,1085 @@ +/* + * IAMF muxer + * Copyright (c) 2023 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#include "libavutil/avassert.h" +#include "libavutil/common.h" +#include "libavutil/internal.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/opt.h" +#include "libavcodec/get_bits.h" +#include "libavcodec/flac.h" +#include "libavcodec/mpeg4audio.h" +#include "libavcodec/put_bits.h" +#include "avformat.h" +#include "avio_internal.h" +#include "iamf.h" +#include "iamf_internal.h" +#include "internal.h" +#include "mux.h" + +typedef struct IAMFMuxContext { + IAMFContext iamf; + + int first_stream_id; +} IAMFMuxContext; + +static int update_extradata(AVFormatContext *s, IAMFCodecConfig *codec_config) +{ + GetBitContext gb; + PutBitContext pb; + int ret; + + switch(codec_config->codec_id) { + case AV_CODEC_ID_OPUS: + if (codec_config->extradata_size < 19) + return AVERROR_INVALIDDATA; + codec_config->extradata_size -= 8; + memmove(codec_config->extradata, codec_config->extradata + 8, codec_config->extradata_size); + AV_WB8(codec_config->extradata + 1, 2); // set channels to stereo + break; + case AV_CODEC_ID_FLAC: { + uint8_t buf[13]; + + init_put_bits(&pb, buf, sizeof(buf)); + ret = init_get_bits8(&gb, codec_config->extradata, codec_config->extradata_size); + if (ret < 0) + return ret; + + put_bits32(&pb, get_bits_long(&gb, 32)); // min/max blocksize + put_bits64(&pb, 48, get_bits64(&gb, 48)); // min/max framesize + put_bits(&pb, 20, get_bits(&gb, 20)); // samplerate + skip_bits(&gb, 3); + put_bits(&pb, 3, 1); // set channels to stereo + ret = put_bits_left(&pb); + put_bits(&pb, ret, get_bits(&gb, ret)); + flush_put_bits(&pb); + + memcpy(codec_config->extradata, buf, sizeof(buf)); + break; + } + default: + break; + } + + return 0; +} + +static int fill_codec_config(AVFormatContext *s, const AVStreamGroup *stg, + 
IAMFCodecConfig *codec_config) +{ + const AVIAMFAudioElement *iamf = stg->params.iamf_audio_element; + const AVStream *st = stg->streams[0]; + int ret; + + av_freep(&codec_config->extradata); + codec_config->extradata_size = 0; + + codec_config->codec_config_id = iamf->codec_config_id; + codec_config->codec_id = st->codecpar->codec_id; + codec_config->sample_rate = st->codecpar->sample_rate; + codec_config->codec_tag = st->codecpar->codec_tag; + codec_config->nb_samples = st->codecpar->frame_size; + codec_config->seek_preroll = st->codecpar->seek_preroll; + if (st->codecpar->extradata_size) { + codec_config->extradata = av_memdup(st->codecpar->extradata, st->codecpar->extradata_size); + if (!codec_config->extradata) + return AVERROR(ENOMEM); + codec_config->extradata_size = st->codecpar->extradata_size; + ret = update_extradata(s, codec_config); + if (ret < 0) + return ret; + } + + return 0; +} + +static IAMFParamDefinition *get_param_definition(AVFormatContext *s, unsigned int parameter_id) +{ + const IAMFMuxContext *const c = s->priv_data; + const IAMFContext *const iamf = &c->iamf; + IAMFParamDefinition *param_definition = NULL; + + for (int i = 0; i < iamf->nb_param_definitions; i++) + if (iamf->param_definitions[i].param->parameter_id == parameter_id) { + param_definition = &iamf->param_definitions[i]; + break; + } + + return param_definition; +} + +static IAMFParamDefinition *add_param_definition(AVFormatContext *s, AVIAMFParamDefinition *param) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + IAMFParamDefinition *param_definition = av_dynarray2_add_nofree((void **)&iamf->param_definitions, + &iamf->nb_param_definitions, + sizeof(*iamf->param_definitions), NULL); + if (!param_definition) + return NULL; + param_definition->param = param; + param_definition->audio_element = NULL; + + return param_definition; +} + +static int iamf_init(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = 
&c->iamf; + int stream_id = 0, ret; + + if (!s->nb_streams) { + av_log(s, AV_LOG_ERROR, "There must be at least one stream\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < s->nb_streams; i++) { + if (s->streams[i]->codecpar->codec_type != AVMEDIA_TYPE_AUDIO || + (s->streams[i]->codecpar->codec_tag != MKTAG('m','p','4','a') && + s->streams[i]->codecpar->codec_tag != MKTAG('O','p','u','s') && + s->streams[i]->codecpar->codec_tag != MKTAG('f','L','a','C') && + s->streams[i]->codecpar->codec_tag != MKTAG('i','p','c','m'))) { + av_log(s, AV_LOG_ERROR, "Unsupported codec id %s\n", + avcodec_get_name(s->streams[i]->codecpar->codec_id)); + return AVERROR(EINVAL); + } + + if (s->streams[i]->codecpar->ch_layout.nb_channels > 2) { + av_log(s, AV_LOG_ERROR, "Unsupported channel layout on stream #%d\n", i); + return AVERROR(EINVAL); + } + + if (!s->streams[i]->id) + s->streams[i]->id = ++stream_id; + } + + if (!s->nb_stream_groups) { + av_log(s, AV_LOG_ERROR, "There must be at least two stream groups\n"); + return AVERROR(EINVAL); + } + + for (int i = 0; i < s->nb_stream_groups; i++) { + AVStreamGroup *stg = s->stream_groups[i]; + + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + iamf->nb_audio_elements++; + if (stg->type == AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + iamf->nb_mix_presentations++; + } + if ((iamf->nb_audio_elements < 1 || iamf->nb_audio_elements > 2) || iamf->nb_mix_presentations < 1) { + av_log(s, AV_LOG_ERROR, "There must be >= 1 and <= 2 IAMF_AUDIO_ELEMENT and at least one IAMF_MIX_PRESENTATION stream groups\n"); + return AVERROR(EINVAL); + } + + iamf->audio_elements = av_calloc(iamf->nb_audio_elements, sizeof(*iamf->audio_elements)); + iamf->mix_presentations = av_calloc(iamf->nb_mix_presentations, sizeof(*iamf->mix_presentations)); + + if (!iamf->audio_elements || !iamf->mix_presentations) { + iamf->nb_audio_elements = iamf->nb_mix_presentations = 0; + return AVERROR(ENOMEM); + } + + for (int i = 0, idx = 0; i < 
s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + AVIAMFAudioElement *iamf_audio_element; + IAMFAudioElement *audio_element; + IAMFCodecConfig *codec_config = NULL; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT) + continue; + + iamf_audio_element = stg->params.iamf_audio_element; + if (iamf_audio_element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_SCENE) { + const AVIAMFLayer *layer = iamf_audio_element->layers[0]; + if (iamf_audio_element->num_layers != 1) { + av_log(s, AV_LOG_ERROR, "Invalid amount of layers for SCENE_BASED audio element. Must be 1\n"); + return AVERROR(EINVAL); + } + if (layer->ch_layout.order != AV_CHANNEL_ORDER_CUSTOM && + layer->ch_layout.order != AV_CHANNEL_ORDER_AMBISONIC) { + av_log(s, AV_LOG_ERROR, "Invalid channel layout for SCENE_BASED audio element\n"); + return AVERROR(EINVAL); + } + for (int j = 0; j < stg->nb_streams; j++) { + if (stg->streams[j]->codecpar->ch_layout.nb_channels > 1) { + av_log(s, AV_LOG_ERROR, "PROJECTION mode ambisonics not supported\n"); + return AVERROR_PATCHWELCOME; + } + } + } else + for (int k, j = 0; j < iamf_audio_element->num_layers; j++) { + const AVIAMFLayer *layer = iamf_audio_element->layers[j]; + for (k = 0; k < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); k++) + if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[k])) + break; + + if (k >= FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts)) { + av_log(s, AV_LOG_ERROR, "Unsupported channel layout in stream group #%d\n", i); + return AVERROR(EINVAL); + } + } + + for (int j = 0; j < iamf->nb_codec_configs; j++) { + if (iamf->codec_configs[j].codec_config_id == iamf_audio_element->codec_config_id) { + codec_config = &iamf->codec_configs[j]; + break; + } + } + + if (!codec_config) { + codec_config = av_dynarray2_add_nofree((void **)&iamf->codec_configs, &iamf->nb_codec_configs, + sizeof(*iamf->codec_configs), NULL); + if (!codec_config) + return AVERROR(ENOMEM); + 
memset(codec_config, 0, sizeof(*codec_config)); + + } + + ret = fill_codec_config(s, stg, codec_config); + if (ret < 0) + return ret; + + audio_element = &iamf->audio_elements[idx++]; + audio_element->element = iamf_audio_element; + audio_element->audio_element_id = stg->id; + audio_element->codec_config = codec_config; + + audio_element->substreams = av_calloc(stg->nb_streams, sizeof(*audio_element->substreams)); + if (!audio_element->substreams) + return AVERROR(ENOMEM); + audio_element->nb_substreams = stg->nb_streams; + + for (int j = 0, k = 0; j < iamf_audio_element->num_layers; j++) { + IAMFLayer *layer = av_dynarray2_add_nofree((void **)&audio_element->layers, &audio_element->nb_layers, + sizeof(*audio_element->layers), NULL); + int nb_channels = iamf_audio_element->layers[j]->ch_layout.nb_channels; + + if (!layer) + return AVERROR(ENOMEM); + memset(layer, 0, sizeof(*layer)); + + if (j) + nb_channels -= iamf_audio_element->layers[j - 1]->ch_layout.nb_channels; + for (; nb_channels > 0 && k < stg->nb_streams; k++) { + const AVStream *st = stg->streams[k]; + IAMFSubStream *substream = &audio_element->substreams[k]; + + substream->audio_substream_id = st->id; + layer->substream_count++; + layer->coupled_substream_count += st->codecpar->ch_layout.nb_channels == 2; + nb_channels -= st->codecpar->ch_layout.nb_channels; + } + if (nb_channels) { + av_log(s, AV_LOG_ERROR, "Invalid channel count across substreams in layer %u from stream group %u\n", + j, stg->index); + return AVERROR(EINVAL); + } + } + + if (iamf_audio_element->demixing_info) { + AVIAMFParamDefinition *param = iamf_audio_element->demixing_info; + IAMFParamDefinition *param_definition = get_param_definition(s, param->parameter_id); + + if (param->num_subblocks != 1) { + av_log(s, AV_LOG_ERROR, "num_subblocks in demixing_info for stream group %u is not 1\n", stg->index); + return AVERROR(EINVAL); + } + if (!param_definition) { + param_definition = add_param_definition(s, param); + if (!param_definition) 
+ return AVERROR(ENOMEM); + } + param_definition->audio_element = iamf_audio_element; + } + if (iamf_audio_element->recon_gain_info) { + AVIAMFParamDefinition *param = iamf_audio_element->recon_gain_info; + IAMFParamDefinition *param_definition = get_param_definition(s, param->parameter_id); + + if (param->num_subblocks != 1) { + av_log(s, AV_LOG_ERROR, "num_subblocks in recon_gain_info for stream group %u is not 1\n", stg->index); + return AVERROR(EINVAL); + } + + if (!param_definition) { + param_definition = add_param_definition(s, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + param_definition->audio_element = iamf_audio_element; + } + } + + for (int i = 0, idx = 0; i < s->nb_stream_groups; i++) { + const AVStreamGroup *stg = s->stream_groups[i]; + IAMFMixPresentation *mix_presentation; + + if (stg->type != AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION) + continue; + + mix_presentation = &iamf->mix_presentations[idx++]; + mix_presentation->mix = stg->params.iamf_mix_presentation; + + for (int i = 0; i < stg->params.iamf_mix_presentation->num_submixes; i++) { + AVIAMFSubmix *submix = stg->params.iamf_mix_presentation->submixes[i]; + AVIAMFParamDefinition *param = submix->output_mix_config; + IAMFParamDefinition *param_definition; + + if (!param) { + av_log(s, AV_LOG_ERROR, "output_mix_config is not present in submix %u from Mix Presentation ID %"PRId64"\n", i, stg->id); + return AVERROR(EINVAL); + } + + param_definition = get_param_definition(s, param->parameter_id); + if (!param_definition) { + param_definition = add_param_definition(s, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + + for (int j = 0; j < submix->num_elements; j++) { + const AVIAMFAudioElement *iamf_audio_element = NULL; + const AVIAMFSubmixElement *element = submix->elements[j]; + param = element->element_mix_config; + + if (!param) { + av_log(s, AV_LOG_ERROR, "element_mix_config is not present for element %u in submix %u from Mix Presentation ID %"PRId64"\n", 
j, i, stg->id); + return AVERROR(EINVAL); + } + param_definition = get_param_definition(s, param->parameter_id); + if (!param_definition) { + param_definition = add_param_definition(s, param); + if (!param_definition) + return AVERROR(ENOMEM); + } + for (int k = 0; k < iamf->nb_audio_elements; k++) + if (iamf->audio_elements[k].audio_element_id == element->audio_element_id) { + iamf_audio_element = iamf->audio_elements[k].element; + break; + } + param_definition->audio_element = iamf_audio_element; + } + } + } + + c->first_stream_id = s->streams[0]->id; + + return 0; +} + +static void leb(AVIOContext *pb, unsigned value) +{ + int len, i; + uint8_t byte; + + len = (av_log2(value) + 7) / 7; + + for (i = 0; i < len; i++) { + byte = value >> (7 * i) & 0x7f; + if (i < len - 1) + byte |= 0x80; + + avio_w8(pb, byte); + } +} + +static int iamf_write_codec_config(AVFormatContext *s, const IAMFCodecConfig *codec_config) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + PutBitContext pb; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + leb(dyn_bc, codec_config->codec_config_id); + avio_wl32(dyn_bc, codec_config->codec_tag); + + leb(dyn_bc, codec_config->nb_samples); + avio_wb16(dyn_bc, codec_config->seek_preroll); + + switch(codec_config->codec_id) { + case AV_CODEC_ID_OPUS: + avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size); + break; + case AV_CODEC_ID_AAC: + return AVERROR_PATCHWELCOME; + case AV_CODEC_ID_FLAC: + avio_w8(dyn_bc, 0x80); + avio_wb24(dyn_bc, codec_config->extradata_size); + avio_write(dyn_bc, codec_config->extradata, codec_config->extradata_size); + break; + case AV_CODEC_ID_PCM_S16LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 16); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S24LE: + avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 24); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S32LE: + 
avio_w8(dyn_bc, 0); + avio_w8(dyn_bc, 32); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S16BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 16); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S24BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 24); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + case AV_CODEC_ID_PCM_S32BE: + avio_w8(dyn_bc, 1); + avio_w8(dyn_bc, 32); + avio_wb32(dyn_bc, codec_config->sample_rate); + break; + default: + break; + } + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_CODEC_CONFIG); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + leb(s->pb, dyn_size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static inline int rescale_rational(AVRational q, int b) +{ + return av_clip_int16(av_rescale(q.num, b, q.den)); +} + +static int scalable_channel_layout_config(AVFormatContext *s, AVIOContext *dyn_bc, + const IAMFAudioElement *audio_element) +{ + const AVIAMFAudioElement *element = audio_element->element; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pb; + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 3, element->num_layers); + put_bits(&pb, 5, 0); + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + for (int i = 0; i < element->num_layers; i++) { + AVIAMFLayer *layer = element->layers[i]; + int layout; + for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_scalable_ch_layouts); layout++) { + if (!av_channel_layout_compare(&layer->ch_layout, &ff_iamf_scalable_ch_layouts[layout])) + break; + } + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 4, layout); + put_bits(&pb, 1, !!layer->output_gain_flags); + put_bits(&pb, 1, layer->recon_gain_is_present); + put_bits(&pb, 2, 0); // reserved + put_bits(&pb, 8, audio_element->layers[i].substream_count); + 
put_bits(&pb, 8, audio_element->layers[i].coupled_substream_count); + // av_log(s, AV_LOG_WARNING, "k %d, substream_count %d, coupled_substream_count %d\n", k, layer->substream_count, coupled_substream_count); + if (layer->output_gain_flags) { + put_bits(&pb, 6, layer->output_gain_flags); + put_bits(&pb, 2, 0); + put_bits(&pb, 16, rescale_rational(layer->output_gain, 1 << 8)); + } + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + } + + return 0; +} + +static int ambisonics_config(AVFormatContext *s, AVIOContext *dyn_bc, + const IAMFAudioElement *audio_element) +{ + const AVIAMFAudioElement *element = audio_element->element; + AVIAMFLayer *layer = element->layers[0]; + + leb(dyn_bc, 0); // ambisonics_mode + leb(dyn_bc, layer->ch_layout.nb_channels); // output_channel_count + leb(dyn_bc, audio_element->nb_substreams); // substream_count + + if (layer->ch_layout.order == AV_CHANNEL_ORDER_AMBISONIC) + for (int i = 0; i < layer->ch_layout.nb_channels; i++) + avio_w8(dyn_bc, i); + else + for (int i = 0; i < layer->ch_layout.nb_channels; i++) + avio_w8(dyn_bc, layer->ch_layout.u.map[i].id); + + return 0; +} + +static int param_definition(AVFormatContext *s, AVIOContext *dyn_bc, + AVIAMFParamDefinition *param) +{ + leb(dyn_bc, param->parameter_id); + leb(dyn_bc, param->parameter_rate); + avio_w8(dyn_bc, !!param->param_definition_mode << 7); // param_definition_mode + if (!param->param_definition_mode) { + leb(dyn_bc, param->duration); // duration + leb(dyn_bc, param->constant_subblock_duration); // constant_subblock_duration + if (param->constant_subblock_duration == 0) { + leb(dyn_bc, param->num_subblocks); + for (int i = 0; i < param->num_subblocks; i++) { + const void *subblock = avformat_iamf_param_definition_get_subblock(param, i); + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + const AVIAMFMixGainParameterData *mix = subblock; + leb(dyn_bc, mix->subblock_duration); + break; + } + case 
AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + const AVIAMFDemixingInfoParameterData *demix = subblock; + leb(dyn_bc, demix->subblock_duration); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + const AVIAMFReconGainParameterData *recon = subblock; + leb(dyn_bc, recon->subblock_duration); + break; + } + } + } + } + } + + return 0; +} + +static int iamf_write_audio_element(AVFormatContext *s, const IAMFAudioElement *audio_element) +{ + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + const AVIAMFAudioElement *element = audio_element->element; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + PutBitContext pb; + int param_definition_types = AV_IAMF_PARAMETER_DEFINITION_DEMIXING, dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + leb(dyn_bc, audio_element->audio_element_id); + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 3, element->audio_element_type); + put_bits(&pb, 5, 0); + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + + leb(dyn_bc, audio_element->codec_config->codec_config_id); + leb(dyn_bc, audio_element->nb_substreams); + + for (int i = 0; i < audio_element->nb_substreams; i++) + leb(dyn_bc, audio_element->substreams[i].audio_substream_id); + + if (audio_element->nb_layers == 1) + param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_DEMIXING; + if (audio_element->nb_layers > 1) + param_definition_types |= AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN; + if (audio_element->codec_config->codec_tag == MKTAG('f','L','a','C') || + audio_element->codec_config->codec_tag == MKTAG('i','p','c','m')) + param_definition_types &= ~AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN; + + leb(dyn_bc, av_popcount(param_definition_types)); // num_parameters + + if (param_definition_types & 1) { + AVIAMFParamDefinition *param = element->demixing_info; + AVIAMFDemixingInfoParameterData *demix; + + if (!param) { + av_log(s, AV_LOG_ERROR, "demixing_info needed but not set in Stream Group #%u\n", + 
audio_element->audio_element_id); + return AVERROR(EINVAL); + } + + demix = avformat_iamf_param_definition_get_subblock(param, 0); + leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_DEMIXING); // param_definition_type + param_definition(s, dyn_bc, param); + + avio_w8(dyn_bc, demix->dmixp_mode << 5); // dmixp_mode + avio_w8(dyn_bc, element->default_w << 4); // default_w + } + if (param_definition_types & 2) { + AVIAMFParamDefinition *param = element->recon_gain_info; + + if (!param) { + av_log(s, AV_LOG_ERROR, "recon_gain_info needed but not set in Stream Group #%u\n", + audio_element->audio_element_id); + return AVERROR(EINVAL); + } + leb(dyn_bc, AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN); // param_definition_type + param_definition(s, dyn_bc, param); + } + + if (element->audio_element_type == AV_IAMF_AUDIO_ELEMENT_TYPE_CHANNEL) { + ret = scalable_channel_layout_config(s, dyn_bc, audio_element); + if (ret < 0) + return ret; + } else { + ret = ambisonics_config(s, dyn_bc, audio_element); + if (ret < 0) + return ret; + } + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_AUDIO_ELEMENT); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + leb(s->pb, dyn_size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static int iamf_write_mixing_presentation(AVFormatContext *s, const IAMFMixPresentation *mix_presentation) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + AVIAMFMixPresentation *mix = mix_presentation->mix; + const AVDictionaryEntry *tag = NULL; + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; + int dyn_size; + + int ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + leb(dyn_bc, mix_presentation->mix_presentation_id); // mix_presentation_id + leb(dyn_bc, av_dict_count(mix->annotations)); // 
count_label + + while ((tag = av_dict_iterate(mix->annotations, tag))) + avio_put_str(dyn_bc, tag->key); + while ((tag = av_dict_iterate(mix->annotations, tag))) + avio_put_str(dyn_bc, tag->value); + + leb(dyn_bc, mix->num_submixes); + for (int i = 0; i < mix->num_submixes; i++) { + const AVIAMFSubmix *sub_mix = mix->submixes[i]; + + leb(dyn_bc, sub_mix->num_elements); + for (int j = 0; j < sub_mix->num_elements; j++) { + const IAMFAudioElement *audio_element = NULL; + const AVIAMFSubmixElement *submix_element = sub_mix->elements[j]; + + for (int k = 0; k < iamf->nb_audio_elements; k++) + if (iamf->audio_elements[k].audio_element_id == submix_element->audio_element_id) { + audio_element = &iamf->audio_elements[k]; + break; + } + + av_assert0(audio_element); + leb(dyn_bc, submix_element->audio_element_id); + + if (av_dict_count(submix_element->annotations) != av_dict_count(mix->annotations)) { + av_log(s, AV_LOG_ERROR, "Inconsistent amount of labels in submix %d from Mix Presentation id #%u\n", + j, audio_element->audio_element_id); + return AVERROR(EINVAL); + } + while ((tag = av_dict_iterate(submix_element->annotations, tag))) + avio_put_str(dyn_bc, tag->value); + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 2, submix_element->headphones_rendering_mode); + put_bits(&pb, 6, 0); // reserved + flush_put_bits(&pb); + avio_write(dyn_bc, header, put_bytes_count(&pb, 1)); + leb(dyn_bc, 0); // rendering_config_extension_size + param_definition(s, dyn_bc, submix_element->element_mix_config); + avio_wb16(dyn_bc, rescale_rational(submix_element->default_mix_gain, 1 << 8)); + } + param_definition(s, dyn_bc, sub_mix->output_mix_config); + avio_wb16(dyn_bc, rescale_rational(sub_mix->default_mix_gain, 1 << 8)); + + leb(dyn_bc, sub_mix->num_layouts); // num_layouts + for (int i = 0; i < sub_mix->num_layouts; i++) { + AVIAMFSubmixLayout *submix_layout = sub_mix->layouts[i]; + int layout, info_type; + int dialogue = submix_layout->dialogue_anchored_loudness.num && 
+                           submix_layout->dialogue_anchored_loudness.den;
+            int album = submix_layout->album_anchored_loudness.num &&
+                        submix_layout->album_anchored_loudness.den;
+
+            if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) {
+                for (layout = 0; layout < FF_ARRAY_ELEMS(ff_iamf_sound_system_map); layout++) {
+                    if (!av_channel_layout_compare(&submix_layout->sound_system, &ff_iamf_sound_system_map[layout].layout))
+                        break;
+                }
+                if (layout == FF_ARRAY_ELEMS(ff_iamf_sound_system_map)) {
+                    av_log(s, AV_LOG_ERROR, "Invalid Sound System value in a submix\n");
+                    return AVERROR(EINVAL);
+                }
+            }
+            init_put_bits(&pb, header, sizeof(header));
+            put_bits(&pb, 2, submix_layout->layout_type); // layout_type
+            if (submix_layout->layout_type == AV_IAMF_SUBMIX_LAYOUT_TYPE_LOUDSPEAKERS) {
+                put_bits(&pb, 4, ff_iamf_sound_system_map[layout].id); // sound_system
+                put_bits(&pb, 2, 0); // reserved
+            } else
+                put_bits(&pb, 6, 0); // reserved
+            flush_put_bits(&pb);
+            avio_write(dyn_bc, header, put_bytes_count(&pb, 1));
+
+            info_type = (submix_layout->true_peak.num && submix_layout->true_peak.den);
+            info_type |= (dialogue || album) << 1;
+            avio_w8(dyn_bc, info_type);
+            avio_wb16(dyn_bc, rescale_rational(submix_layout->integrated_loudness, 1 << 8));
+            avio_wb16(dyn_bc, rescale_rational(submix_layout->digital_peak, 1 << 8));
+            if (info_type & 1)
+                avio_wb16(dyn_bc, rescale_rational(submix_layout->true_peak, 1 << 8));
+            if (info_type & 2) {
+                avio_w8(dyn_bc, dialogue + album); // num_anchored_loudness
+                if (dialogue) {
+                    avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_DIALOGUE);
+                    avio_wb16(dyn_bc, rescale_rational(submix_layout->dialogue_anchored_loudness, 1 << 8));
+                }
+                if (album) {
+                    avio_w8(dyn_bc, IAMF_ANCHOR_ELEMENT_ALBUM);
+                    avio_wb16(dyn_bc, rescale_rational(submix_layout->album_anchored_loudness, 1 << 8));
+                }
+            }
+        }
+    }
+
+    init_put_bits(&pb, header, sizeof(header));
+    put_bits(&pb, 5, IAMF_OBU_IA_MIX_PRESENTATION);
+    put_bits(&pb, 3, 0);
+    flush_put_bits(&pb);
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    avio_write(s->pb, header, put_bytes_count(&pb, 1));
+    leb(s->pb, dyn_size);
+    avio_write(s->pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+
+    return 0;
+}
+
+static int iamf_write_header(AVFormatContext *s)
+{
+    IAMFMuxContext *const c = s->priv_data;
+    IAMFContext *const iamf = &c->iamf;
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    PutBitContext pb;
+    AVIOContext *dyn_bc;
+    uint8_t *dyn_buf = NULL;
+    int dyn_size;
+
+    int ret = avio_open_dyn_buf(&dyn_bc);
+    if (ret < 0)
+        return ret;
+
+    // Sequence Header
+    init_put_bits(&pb, header, sizeof(header));
+    put_bits(&pb, 5, IAMF_OBU_IA_SEQUENCE_HEADER);
+    put_bits(&pb, 3, 0);
+    flush_put_bits(&pb);
+
+    avio_write(dyn_bc, header, put_bytes_count(&pb, 1));
+    leb(dyn_bc, 6);
+    avio_wb32(dyn_bc, MKBETAG('i','a','m','f'));
+    avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // primary_profile
+    avio_w8(dyn_bc, iamf->nb_audio_elements > 1); // additional_profile
+
+    dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf);
+    avio_write(s->pb, dyn_buf, dyn_size);
+    av_free(dyn_buf);
+
+    for (int i = 0; i < iamf->nb_codec_configs; i++) {
+        ret = iamf_write_codec_config(s, &iamf->codec_configs[i]);
+        if (ret < 0)
+            return ret;
+    }
+
+    for (int i = 0; i < iamf->nb_audio_elements; i++) {
+        ret = iamf_write_audio_element(s, &iamf->audio_elements[i]);
+        if (ret < 0)
+            return ret;
+    }
+
+    for (int i = 0; i < iamf->nb_mix_presentations; i++) {
+        ret = iamf_write_mixing_presentation(s, &iamf->mix_presentations[i]);
+        if (ret < 0)
+            return ret;
+    }
+
+    return 0;
+}
+
+static int write_parameter_block(AVFormatContext *s, AVIAMFParamDefinition *param)
+{
+    uint8_t header[MAX_IAMF_OBU_HEADER_SIZE];
+    IAMFParamDefinition *param_definition = get_param_definition(s, param->parameter_id);
+    PutBitContext pb;
+    AVIOContext *dyn_bc;
+    uint8_t *dyn_buf = NULL;
+    int dyn_size,
ret; + + if (param->param_definition_type > AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN) { + av_log(s, AV_LOG_DEBUG, "Ignoring side data with unknown param_definition_type %u\n", + param->param_definition_type); + return 0; + } + + if (!param_definition) { + av_log(s, AV_LOG_ERROR, "Non-existent Parameter Definition with ID %u referenced by a packet\n", + param->parameter_id); + return AVERROR(EINVAL); + } + + if (param->param_definition_type != param_definition->param->param_definition_type || + param->param_definition_mode != param_definition->param->param_definition_mode) { + av_log(s, AV_LOG_ERROR, "Inconsistent param_definition_mode or param_definition_type values " + "for Parameter Definition with ID %u in a packet\n", + param->parameter_id); + return AVERROR(EINVAL); + } + + ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + // Sequence Header + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, IAMF_OBU_IA_PARAMETER_BLOCK); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + + leb(dyn_bc, param->parameter_id); + if (param->param_definition_mode) { + leb(dyn_bc, param->duration); + leb(dyn_bc, param->constant_subblock_duration); + if (param->constant_subblock_duration == 0) + leb(dyn_bc, param->num_subblocks); + } + + for (int i = 0; i < param->num_subblocks; i++) { + const void *subblock = avformat_iamf_param_definition_get_subblock(param, i); + + switch (param->param_definition_type) { + case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: { + const AVIAMFMixGainParameterData *mix = subblock; + if (param->param_definition_mode && param->constant_subblock_duration == 0) + leb(dyn_bc, mix->subblock_duration); + + leb(dyn_bc, mix->animation_type); + + avio_wb16(dyn_bc, rescale_rational(mix->start_point_value, 1 << 8)); + if (mix->animation_type >= AV_IAMF_ANIMATION_TYPE_LINEAR) + avio_wb16(dyn_bc, rescale_rational(mix->end_point_value, 1 << 8)); + if (mix->animation_type == 
AV_IAMF_ANIMATION_TYPE_BEZIER) { + avio_wb16(dyn_bc, rescale_rational(mix->control_point_value, 1 << 8)); + avio_w8(dyn_bc, mix->control_point_relative_time); + } + break; + } + case AV_IAMF_PARAMETER_DEFINITION_DEMIXING: { + const AVIAMFDemixingInfoParameterData *demix = subblock; + if (param->param_definition_mode && param->constant_subblock_duration == 0) + leb(dyn_bc, demix->subblock_duration); + + avio_w8(dyn_bc, demix->dmixp_mode << 5); + break; + } + case AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN: { + const AVIAMFReconGainParameterData *recon = subblock; + const AVIAMFAudioElement *audio_element = param_definition->audio_element; + + if (param->param_definition_mode && param->constant_subblock_duration == 0) + leb(dyn_bc, recon->subblock_duration); + + if (!audio_element) { + av_log(s, AV_LOG_ERROR, "Invalid Parameter Definition with ID %u referenced by a packet\n", param->parameter_id); + return AVERROR(EINVAL); + } + + for (int j = 0; j < audio_element->num_layers; j++) { + const AVIAMFLayer *layer = audio_element->layers[j]; + + if (layer->recon_gain_is_present) { + unsigned int recon_gain_flags = 0; + int k = 0; + + for (; k < 7; k++) + recon_gain_flags |= (1 << k) * !!recon->recon_gain[j][k]; + for (; k < 12; k++) + recon_gain_flags |= (2 << k) * !!recon->recon_gain[j][k]; + if (recon_gain_flags >> 8) + recon_gain_flags |= (1 << k); + + leb(dyn_bc, recon_gain_flags); + for (k = 0; k < 12; k++) { + if (recon->recon_gain[j][k]) + avio_w8(dyn_bc, recon->recon_gain[j][k]); + } + } + } + break; + } + default: + av_assert0(0); + } + } + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + leb(s->pb, dyn_size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + + return 0; +} + +static int iamf_write_packet(AVFormatContext *s, AVPacket *pkt) +{ + const IAMFMuxContext *const c = s->priv_data; + AVStream *st = s->streams[pkt->stream_index]; + uint8_t header[MAX_IAMF_OBU_HEADER_SIZE]; + PutBitContext pb; + AVIOContext *dyn_bc; + uint8_t *dyn_buf = NULL; 
+ int dyn_size; + int ret, type = st->id <= 17 ? st->id + IAMF_OBU_IA_AUDIO_FRAME_ID0 : IAMF_OBU_IA_AUDIO_FRAME; + + if (s->nb_stream_groups && st->id == c->first_stream_id) { + AVIAMFParamDefinition *mix = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_MIX_GAIN_PARAM, NULL); + AVIAMFParamDefinition *demix = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_DEMIXING_INFO_PARAM, NULL); + AVIAMFParamDefinition *recon = + (AVIAMFParamDefinition *)av_packet_get_side_data(pkt, AV_PKT_DATA_IAMF_RECON_GAIN_INFO_PARAM, NULL); + + if (mix) { + ret = write_parameter_block(s, mix); + if (ret < 0) + return ret; + } + if (demix) { + ret = write_parameter_block(s, demix); + if (ret < 0) + return ret; + } + if (recon) { + ret = write_parameter_block(s, recon); + if (ret < 0) + return ret; + } + } + + ret = avio_open_dyn_buf(&dyn_bc); + if (ret < 0) + return ret; + + init_put_bits(&pb, header, sizeof(header)); + put_bits(&pb, 5, type); + put_bits(&pb, 3, 0); + flush_put_bits(&pb); + avio_write(s->pb, header, put_bytes_count(&pb, 1)); + + if (st->id > 17) + leb(dyn_bc, st->id); + + dyn_size = avio_close_dyn_buf(dyn_bc, &dyn_buf); + leb(s->pb, dyn_size + pkt->size); + avio_write(s->pb, dyn_buf, dyn_size); + av_free(dyn_buf); + avio_write(s->pb, pkt->data, pkt->size); + + return 0; +} + +static void iamf_deinit(AVFormatContext *s) +{ + IAMFMuxContext *const c = s->priv_data; + IAMFContext *const iamf = &c->iamf; + + for (int i = 0; i < iamf->nb_audio_elements; i++) { + IAMFAudioElement *audio_element = &iamf->audio_elements[i]; + audio_element->element = NULL; + } + + for (int i = 0; i < iamf->nb_mix_presentations; i++) { + IAMFMixPresentation *mix_presentation = &iamf->mix_presentations[i]; + mix_presentation->mix = NULL; + } + + ff_iamf_uninit_context(iamf); + + return; +} + +static const AVCodecTag iamf_codec_tags[] = { + { AV_CODEC_ID_AAC, MKTAG('m','p','4','a') }, + { AV_CODEC_ID_FLAC, MKTAG('f','L','a','C') }, + { 
AV_CODEC_ID_OPUS, MKTAG('O','p','u','s') },
+    { AV_CODEC_ID_PCM_S16LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S16BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S24LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S24BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S32LE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_PCM_S32BE, MKTAG('i','p','c','m') },
+    { AV_CODEC_ID_NONE, MKTAG('i','p','c','m') }
+};
+
+const FFOutputFormat ff_iamf_muxer = {
+    .p.name = "iamf",
+    .p.long_name = NULL_IF_CONFIG_SMALL("Raw Immersive Audio Model and Formats"),
+    .p.extensions = "iamf",
+    .priv_data_size = sizeof(IAMFMuxContext),
+    .p.audio_codec = AV_CODEC_ID_OPUS,
+    .init = iamf_init,
+    .deinit = iamf_deinit,
+    .write_header = iamf_write_header,
+    .write_packet = iamf_write_packet,
+    .p.codec_tag = (const AVCodecTag* const []){ iamf_codec_tags, NULL },
+    .p.flags = AVFMT_GLOBALHEADER | AVFMT_NOTIMESTAMPS,
+    // .p.priv_class = &iamf_class,
+};

From patchwork Tue Nov 21 21:14:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Almer
X-Patchwork-Id: 44744
From: James Almer
To: ffmpeg-devel@ffmpeg.org
Date: Tue, 21 Nov 2023 18:14:42 -0300
Message-ID: <20231121211442.8723-6-jamrial@gmail.com>
In-Reply-To: <20231121211442.8723-1-jamrial@gmail.com>
References: <20231121211442.8723-1-jamrial@gmail.com>
Subject: [FFmpeg-devel] [PATCH 5/5] ffmpeg: add support for muxing AVStreamGroups

Signed-off-by: James Almer
---
Example command line, remuxing an existing iamf file using the avoptions defined for
iamf in libavformat. This creates two stream groups, one Audio Element and one Mixing Presentation. The first defines two layers, one stereo and one 5.1. The latter defines two submixes, one for standard loudspeakers output with two layouts, also stereo and 5.1, and one for Binaural output. ./ffmpeg -i iamf_test_000059.iamf \ -stream_group "type=iamf_audio_element:id=1:st=0:st=1:st=2:st=3:default_w=10,demixing=dmixp_mode=1:parameter_id=998,recon_gain=parameter_id=101,layer=ch_layout=stereo,layer=ch_layout=5.1:recon_gain_is_present=true" \ -stream_group "type=iamf_mix_presentation:id=3:stg=0:annotations=en-us=Mix_Presentation,submix=parameter_id=100:default_mix_gain=1.1|element=stg=0:parameter_id=100:headphones_rendering_mode=stereo:annotations=en-us=Standard_submix|layout=sound_system=stereo:integrated_loudness=1.0|layout=sound_system=5.1,submix=parameter_id=100|element=stg=0:parameter_id=100:headphones_rendering_mode=binaural:default_mix_gain=1.0:annotations=en-us=Binaural_submix|layout=layout_type=binaural" \ -c:a copy -map 0 -y test.iamf fftools/ffmpeg.h | 2 + fftools/ffmpeg_mux_init.c | 327 ++++++++++++++++++++++++++++++++++++++ fftools/ffmpeg_opt.c | 2 + 3 files changed, 331 insertions(+) diff --git a/fftools/ffmpeg.h b/fftools/ffmpeg.h index 41935d39d5..057535adbb 100644 --- a/fftools/ffmpeg.h +++ b/fftools/ffmpeg.h @@ -262,6 +262,8 @@ typedef struct OptionsContext { int nb_disposition; SpecifierOpt *program; int nb_program; + SpecifierOpt *stream_groups; + int nb_stream_groups; SpecifierOpt *time_bases; int nb_time_bases; SpecifierOpt *enc_time_bases; diff --git a/fftools/ffmpeg_mux_init.c b/fftools/ffmpeg_mux_init.c index 63a25a350f..62e3e4aa86 100644 --- a/fftools/ffmpeg_mux_init.c +++ b/fftools/ffmpeg_mux_init.c @@ -27,6 +27,7 @@ #include "libavformat/avformat.h" #include "libavformat/avio.h" +#include "libavformat/iamf.h" #include "libavcodec/avcodec.h" @@ -1943,6 +1944,328 @@ static int setup_sync_queues(Muxer *mux, AVFormatContext *oc, int64_t 
buf_size_u return 0; } +static int of_parse_iamf_audio_element_layers(Muxer *mux, AVStreamGroup *stg, char **ptr) +{ + AVIAMFAudioElement *audio_element = stg->params.iamf_audio_element; + AVDictionary *dict = NULL; + const char *token; + int ret = 0; + + audio_element->demixing_info = + avformat_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_DEMIXING, NULL, 1, NULL, NULL); + audio_element->recon_gain_info = + avformat_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_RECON_GAIN, NULL, 1, NULL, NULL); + + if (!audio_element->demixing_info || + !audio_element->recon_gain_info) + return AVERROR(ENOMEM); + + /* process manually set layers and parameters */ + token = av_strtok(NULL, ",", ptr); + while (token) { + const AVDictionaryEntry *e; + int demixing = 0, recon_gain = 0; + int layer = 0; + + if (av_strstart(token, "layer=", &token)) + layer = 1; + else if (av_strstart(token, "demixing=", &token)) + demixing = 1; + else if (av_strstart(token, "recon_gain=", &token)) + recon_gain = 1; + + av_dict_free(&dict); + ret = av_dict_parse_string(&dict, token, "=", ":", 0); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error parsing audio element specification %s\n", token); + goto fail; + } + + if (layer) { + ret = avformat_iamf_audio_element_add_layer(audio_element, &dict); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error adding layer to stream group %d\n", stg->index); + goto fail; + } + } else if (demixing || recon_gain) { + AVIAMFParamDefinition *param = demixing ? 
audio_element->demixing_info + : audio_element->recon_gain_info; + void *subblock = avformat_iamf_param_definition_get_subblock(param, 0); + + av_opt_set_dict(param, &dict); + av_opt_set_dict(subblock, &dict); + + /* Hardcode spec parameters */ + param->param_definition_mode = 0; + param->parameter_rate = stg->streams[0]->codecpar->sample_rate; + param->duration = + param->constant_subblock_duration = stg->streams[0]->codecpar->frame_size; + } + + // make sure that no entries are left in the dict + e = NULL; + if (e = av_dict_iterate(dict, e)) { + av_log(mux, AV_LOG_FATAL, "Unknown layer key %s.\n", e->key); + ret = AVERROR(EINVAL); + goto fail; + } + token = av_strtok(NULL, ",", ptr); + } + +fail: + av_dict_free(&dict); + if (!ret && !audio_element->num_layers) { + av_log(mux, AV_LOG_ERROR, "No layer in audio element specification\n"); + ret = AVERROR(EINVAL); + } + + return ret; +} + +static int of_parse_iamf_submixes(Muxer *mux, AVStreamGroup *stg, char **ptr) +{ + AVFormatContext *oc = mux->fc; + AVIAMFMixPresentation *mix = stg->params.iamf_mix_presentation; + AVDictionary *dict = NULL; + const char *token; + char *submix_str = NULL; + int ret = 0; + + /* process manually set submixes */ + token = av_strtok(NULL, ",", ptr); + while (token) { + AVIAMFSubmix *submix = NULL; + const char *subtoken; + char *subptr = NULL; + + if (!av_strstart(token, "submix=", &token)) { + av_log(mux, AV_LOG_ERROR, "No submix in mix presentation specification \"%s\"\n", token); + goto fail; + } + + submix_str = av_strdup(token); + if (!submix_str) + goto fail; + + ret = avformat_iamf_mix_presentation_add_submix(mix, NULL); + if (!ret) { + submix = mix->submixes[mix->num_submixes - 1]; + submix->output_mix_config = + avformat_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL, 0, NULL, NULL); + if (!submix->output_mix_config) + ret = AVERROR(ENOMEM); + } + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error adding submix to stream group %d\n", stg->index); + goto 
fail; + } + + submix->output_mix_config->parameter_rate = stg->streams[0]->codecpar->sample_rate; + + subptr = NULL; + subtoken = av_strtok(submix_str, "|", &subptr); + while (subtoken) { + const AVDictionaryEntry *e; + int element = 0, layout = 0; + + if (av_strstart(subtoken, "element=", &subtoken)) + element = 1; + else if (av_strstart(subtoken, "layout=", &subtoken)) + layout = 1; + + av_dict_free(&dict); + ret = av_dict_parse_string(&dict, subtoken, "=", ":", 0); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error parsing submix specification \"%s\"\n", subtoken); + goto fail; + } + + if (element) { + AVIAMFSubmixElement *submix_element; + int idx = -1; + + if (e = av_dict_get(dict, "stg", NULL, 0)) + idx = strtol(e->value, NULL, 0); + av_dict_set(&dict, "stg", NULL, 0); + if (idx < 0 || idx >= oc->nb_stream_groups) { + av_log(mux, AV_LOG_ERROR, "Invalid or missing stream group index in " + "submix element specification \"%s\"\n", subtoken); + ret = AVERROR(EINVAL); + goto fail; + } + ret = avformat_iamf_submix_add_element(submix, NULL); + if (ret < 0) + av_log(mux, AV_LOG_ERROR, "Error adding element to submix\n"); + + submix_element = submix->elements[submix->num_elements - 1]; + submix_element->audio_element_id = oc->stream_groups[idx]->id; + + submix_element->element_mix_config = + avformat_iamf_param_definition_alloc(AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN, NULL, 0, NULL, NULL); + if (!submix_element->element_mix_config) + ret = AVERROR(ENOMEM); + av_opt_set_dict2(submix_element, &dict, AV_OPT_SEARCH_CHILDREN); + submix_element->element_mix_config->parameter_rate = stg->streams[0]->codecpar->sample_rate; + } else if (layout) { + ret = avformat_iamf_submix_add_layout(submix, &dict); + if (ret < 0) + av_log(mux, AV_LOG_ERROR, "Error adding layout to submix\n"); + } else + av_opt_set_dict2(submix, &dict, AV_OPT_SEARCH_CHILDREN); + + if (ret < 0) { + goto fail; + } + + // make sure that no entries are left in the dict + e = NULL; + while (e = 
av_dict_iterate(dict, e)) { + av_log(mux, AV_LOG_FATAL, "Unknown submix key %s.\n", e->key); + ret = AVERROR(EINVAL); + goto fail; + } + subtoken = av_strtok(NULL, "|", &subptr); + } + av_freep(&submix_str); + + if (!submix->num_elements) { + av_log(mux, AV_LOG_ERROR, "No audio elements in submix specification \"%s\"\n", token); + ret = AVERROR(EINVAL); + } + token = av_strtok(NULL, ",", ptr); + } + +fail: + av_dict_free(&dict); + av_free(submix_str); + + return ret; +} + +static int of_add_groups(Muxer *mux, const OptionsContext *o) +{ + AVFormatContext *oc = mux->fc; + int ret = 0; + + /* process manually set groups */ + for (int i = 0; i < o->nb_stream_groups; i++) { + AVDictionary *dict = NULL, *tmp = NULL; + const AVDictionaryEntry *e; + AVStreamGroup *stg = NULL; + int type; + const char *token; + char *str, *ptr = NULL; + const AVOption opts[] = { + { "type", "Set group type", offsetof(AVStreamGroup, type), AV_OPT_TYPE_INT, + { .i64 = 0 }, 0, INT_MAX, AV_OPT_FLAG_ENCODING_PARAM, "type" }, + { "iamf_audio_element", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT }, .unit = "type" }, + { "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST, + { .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" }, + { NULL }, + }; + const AVClass class = { + .class_name = "StreamGroupType", + .item_name = av_default_item_name, + .option = opts, + .version = LIBAVUTIL_VERSION_INT, + }; + const AVClass *pclass = &class; + + str = av_strdup(o->stream_groups[i].u.str); + if (!str) + return AVERROR(ENOMEM); + + token = av_strtok(str, ",", &ptr); + if (token) { + ret = av_dict_parse_string(&dict, token, "=", ":", AV_DICT_MULTIKEY); + if (ret < 0) { + av_log(mux, AV_LOG_ERROR, "Error parsing group specification %s\n", token); + goto end; + } + + // "type" is not a user settable option in AVStreamGroup + e = av_dict_get(dict, "type", NULL, 0); + if (!e) { + av_log(mux, AV_LOG_ERROR, "No type defined for Stream Group %d\n", i); + ret = AVERROR(EINVAL); +
goto end; + } + + ret = av_opt_eval_int(&pclass, opts, e->value, &type); + if (ret < 0 || type == AV_STREAM_GROUP_PARAMS_NONE) { + av_log(mux, AV_LOG_ERROR, "Invalid group type \"%s\"\n", e->value); + goto end; + } + + av_dict_copy(&tmp, dict, 0); + stg = avformat_stream_group_create(oc, type, &tmp); + if (!stg) { + ret = AVERROR(ENOMEM); + goto end; + } + av_dict_set(&tmp, "type", NULL, 0); + + e = NULL; + while (e = av_dict_get(dict, "st", e, 0)) { + unsigned int idx = strtol(e->value, NULL, 0); + if (idx >= oc->nb_streams) { + av_log(mux, AV_LOG_ERROR, "Invalid stream index %d\n", idx); + ret = AVERROR(EINVAL); + goto end; + } + avformat_stream_group_add_stream(stg, oc->streams[idx]); + } + while (e = av_dict_get(dict, "stg", e, 0)) { + unsigned int idx = strtol(e->value, NULL, 0); + if (idx >= oc->nb_stream_groups || idx == stg->index) { + av_log(mux, AV_LOG_ERROR, "Invalid stream group index %d\n", idx); + ret = AVERROR(EINVAL); + goto end; + } + for (int j = 0; j < oc->stream_groups[idx]->nb_streams; j++) + avformat_stream_group_add_stream(stg, oc->stream_groups[idx]->streams[j]); + } + + switch(type) { + case AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT: + ret = of_parse_iamf_audio_element_layers(mux, stg, &ptr); + break; + case AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION: + ret = of_parse_iamf_submixes(mux, stg, &ptr); + break; + default: + av_log(mux, AV_LOG_FATAL, "Unknown group type %d.\n", type); + ret = AVERROR(EINVAL); + break; + } + + if (ret < 0) + goto end; + + // make sure that nothing but "st" and "stg" entries are left in the dict + e = NULL; + while (e = av_dict_iterate(tmp, e)) { + if (!strcmp(e->key, "st") || !strcmp(e->key, "stg")) + continue; + + av_log(mux, AV_LOG_FATAL, "Unknown group key %s.\n", e->key); + ret = AVERROR(EINVAL); + goto end; + } + } + +end: + av_dict_free(&dict); + av_dict_free(&tmp); + av_free(str); + if (ret < 0) + return ret; + } + + return 0; +} + static int of_add_programs(Muxer *mux, const OptionsContext *o) { 
AVFormatContext *oc = mux->fc; @@ -2740,6 +3063,10 @@ int of_open(const OptionsContext *o, const char *filename) if (err < 0) return err; + err = of_add_groups(mux, o); + if (err < 0) + return err; + err = of_add_programs(mux, o); if (err < 0) return err; diff --git a/fftools/ffmpeg_opt.c b/fftools/ffmpeg_opt.c index 304471dd03..1144f64f89 100644 --- a/fftools/ffmpeg_opt.c +++ b/fftools/ffmpeg_opt.c @@ -1491,6 +1491,8 @@ const OptionDef options[] = { "add metadata", "string=string" }, { "program", HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(program) }, "add program with specified streams", "title=string:st=number..." }, + { "stream_group", HAS_ARG | OPT_STRING | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(stream_groups) }, + "add stream group with specified streams and group type-specific arguments", "id=number:st=number..." }, { "dframes", HAS_ARG | OPT_PERFILE | OPT_EXPERT | OPT_OUTPUT, { .func_arg = opt_data_frames }, "set the number of data frames to output", "number" },